sql server - Stored procedure for merging tables -


i doing work @ client on sales datamart, adding new connection erp. client have kept old database ('888') manufacturing , created new '999' database sales. history remains in 888 until 31 december , new sales in 999 1 jan 2016 onwards.

i have sql server tables , views set interrogate erp both 888 , 999 using new tables , views replicas of had 888. want union of 888 , 999 data sets have single data set.

i pretty comfortable creating unions fact tables there should no duplicate records, guidance on dim tables.

the customers , products replicated out of 888 999 in erp absolutely will have duplicates, want have single version 999. however, customer gets deleted 999, still have history 888 need include it.

what thinking can't union (because of duplicates), rather thinking need create sort of procedure inserts records 999 table , processes 888 against that, adding records not exist in target.

my problem don't know how write such stored procedure. if think of customer dimension example, primary key [order_debtor] , example of descriptive column [order_debtor_description].

what looking guidance on writing code insertion 999.customer merged.customer , checking , insertion 888.customer merged.customer when not exist in merged.customer.

given tables z888customer , z999customer have been populated

as understand it, here cases you've described de-duplicating data:

case    z888customer        z999customer         use data       123                 (doesn't exist)      888 b       (doesn't exist)     789                  999 c       456                 456                  888 

so 1 way create view containing this:

-- case - 888 records not in 999 select debtor_id, field1, field2 z888customer not exists (     select *     z999customer     z999customer.debtor_id =  z888customer.debtor_id     ) union -- case b , c select debtor_id, field1, field2 z999customer 

say view called loadcustomers can load new ones with:

insert customerdimension (field1,field2,debtorid) select field1,field2,debtorid loadcustomers not exists (     select *      dimcustomer      dimcustomer.debtorid = loadcustomer.debtorid  ) 

so keep adding stuff. if gets deleted source, wont deleted dimension because there no delete statement being run

we using debtor_id match key here. need watch these things:

  • if debtorid 63 gets deleted db gets recreated new debtor in db reusing same id, won't reflected - old 1 stay same
  • if have 2 different database guarantee diverge. suggest change 'historical' database read (alter database set read_only, set files in file system read only) otherwise people fiddle , you'll have problems
  • i've provided insert case above. no update. means if debtor changes names in source won't reflected in data warehouse, because there insert step not update one

these related 'replication' i.e. merging data in , need decide on rules every case

oh , 1 more thing consider using schemas seperate tables (not z prefix)

so dimension might dbo.dimcustomer , staging might staging.888customer. staging different database altogether though.


Comments

Popular posts from this blog

sql - VB.NET Operand type clash: date is incompatible with int error -

SVG stroke-linecap doesn't work for circles in Firefox? -

python - TypeError: Scalar value for argument 'color' is not numeric in openCV -