![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
#3
| |||
| |||
|
|
Hi all, I'm currently implementing a Data Warehouse in SQL Server 2000, but being quite new to the more advanced features of DTS, am struggling to create packages to support Slowly Changing Dimensions. Three of my dimensions are hybrid type 1 and type 2 SCDs, so I need to i) check the incoming source data against the data in the DW to see if the records already exist - if not, create a new row, and new surrogate key for that record; ii) if the new record already exists, check to see if any of the attributes have changed, and if so, either overwrite the existing data (I've already determined which fields are SCD1 and SCD2), or create a new record depending on SCD type. I know what I need to do in principal, but it's just the physical codeing I'm struggling with, and I'm not sure where to start - Does anyone know of any code skeletons/templates/resources that could point me in the right direction, as I'm struggling with the code I've created so far? I have started some transformation coding for one dimension. So far, I have a DTS package which takes incoming data from a CSV, and places it into a staging table for that dimension. I then have a transformation task between the new source staging table, and the destination dimension table. The transformation currently only checks for any new dimension records, and if a new record is present in the incoming data, a new surrogate key is issued and the new record is inserted into the dimension table. However, the script currently also tries to insert a number of completely blank records (initially, it fell over at this point, since NULL values broke referential integrity, so I'm allowing all fields in the table to have NULL values for now, while in the initial stages of development, and for additional debugging), equivalent to the number of remaining records in the imported data (at least it's entering the new records, with updated surrogate key, which is a start). For the script, I'm currently using two Lookups, called LastKey (retrieves the current highest surrogate key value for the dimension), and ExistTest (which I'm trying to use to check if the incoming data already exists in the dimension table), and an ActiveX script, as below: Lookup "LastKey" selects the surrogate key value from a lookup table containing two fields - with the values of the dimension name, and the current highest surrogate key used in that dimension table: SELECT Surr_Key FROM SurrKeys WHERE (Dimension = ?) Lookup "ExistTest" (StaffDimLookup is another lookup table in the database, with two fields recording the Natural Key and the current Surrogate key applied to it for the dimension): SELECT Staff_Key FROM StaffDimLookup WHERE (Staff_ID = ?) And the transformation script itself, so far: Function Main() Dim HighKey If IsEmpty(DTSLookups("ExistTest").Execute(DTSSource( "Staff_ID"))) Then HighKey = DTSLookups("LastKey").Execute("StaffDim") HighKey = HighKey + 1 DTSDestination("Staff_Key") = HighKey 'New Surrogate Key DTSDestination("Staff_ID") = DTSSource("Staff_ID") 'Natural Key Other attributes.. ... ... DTSDestination("Curr_Row") = 1 'a flag to indicate the current live record DTSDestination("Row_Begin_Date") = Date 'The date the record is active from, for type 2 SCD changes DTSDestination("Row_End_Date") = "some value" 'The date the row was last valid for type 2 SCD changes End If Main = DTSTransformStat_OK End Function I'm pretty new to using DTS to this extent, so I'm really racking my brains about how to implement it all. I still need to incorporate Update sctipts, to check and update rows for type 1 SCD changes, and to create new records for type 2 changes (for which I've thought about creating an additional field in the dimension table, for a CRC value, which could be used to check changes as opposed to checking every field, although I'm not sure how to create a CRC for a row at the moment). If I crack it for this dimension, it shouldn't be too much of an issue to apply the pricipals to the other dimensions. If anyone has any ideas, I'd be incredibly grateful. Thanks for any help/information Paul. P.S - would it be considered rude to post this request for help on microsoft.public.sqlserver.datawarehouse too? |
![]() |
| Thread Tools | |
| Display Modes | |
| |