![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I am trying to do a load on a data warehouse staging table using SSIS where I want to insert new records and update existing records based on existence of the key values in the table. The input records are the change records coming from an OLTP system. This would seem to be a very typical scenario for loading a data warehouse table. Is there an example or best practices to best handle this scenario? It would be great if there was a far more simplified version of the Slowly Changing Dimension or an Update transform that would be far easier to map parameters to than the OLE DB Command. I have been able to get a lookup transform determine if the key exists in the table, and set the error output to redirect the row when it does not find the key and insert the new records into the staging table. On the output flow, I try to get the SQL Command transform with an update statement to modify the existing records, but with a large number of columns, this is extremely problematic to map and maintain the parameters. Last alternative is to create a stored procedure, to handle this load scenario, which is the same method which we had to do in DTS. |
#3
| |||
| |||
|
|
Hello vnapoli, You could use the Merge Join to accomplish the New row, existing row match. You then could use a Conditional split to split off the new rows to an output and you can also test in there for rows that have changed but are matches in source and destination. If you do not like the idea of the OLE DB transform then you could stage these "Changed Rows" and then after the Data Flow you can issue a set based UPDATE. This is how i would do it. Best practices? My tests and experiences show that the SCD does not scale to huge inputs with latge outputs so I tend to use the Merge Join. You can still use the SCD in my example and then pipe the "Updated Rows" output to a table for staging then set based updates and remove the OLE DB Transform. Allan Mitchell www.SQLDTS.com www.SQLIS.com www.Konesans.com I am trying to do a load on a data warehouse staging table using SSIS where I want to insert new records and update existing records based on existence of the key values in the table. The input records are the change records coming from an OLTP system. This would seem to be a very typical scenario for loading a data warehouse table. Is there an example or best practices to best handle this scenario? It would be great if there was a far more simplified version of the Slowly Changing Dimension or an Update transform that would be far easier to map parameters to than the OLE DB Command. I have been able to get a lookup transform determine if the key exists in the table, and set the error output to redirect the row when it does not find the key and insert the new records into the staging table. On the output flow, I try to get the SQL Command transform with an update statement to modify the existing records, but with a large number of columns, this is extremely problematic to map and maintain the parameters. Last alternative is to create a stored procedure, to handle this load scenario, which is the same method which we had to do in DTS. |
#4
| |||
| |||
|
|
Thanks, the Merge Join or Lookup transfrom will adequately handle the lookup / check for existing records in a destination table for me. The OLE DB command transform will be a major issue for updates in having to hand code it for every update operation that will take place after determining if a record exists and must be updated. Where is an UPSERT Transform that are seem in other ETL products that more reliably and intuitively handle update operations in a data flow? It seems that SSIS is falling just short in this aspect. "Allan Mitchell" wrote: Hello vnapoli, You could use the Merge Join to accomplish the New row, existing row match. You then could use a Conditional split to split off the new rows to an output and you can also test in there for rows that have changed but are matches in source and destination. If you do not like the idea of the OLE DB transform then you could stage these "Changed Rows" and then after the Data Flow you can issue a set based UPDATE. This is how i would do it. Best practices? My tests and experiences show that the SCD does not scale to huge inputs with latge outputs so I tend to use the Merge Join. You can still use the SCD in my example and then pipe the "Updated Rows" output to a table for staging then set based updates and remove the OLE DB Transform. Allan Mitchell www.SQLDTS.com www.SQLIS.com www.Konesans.com I am trying to do a load on a data warehouse staging table using SSIS where I want to insert new records and update existing records based on existence of the key values in the table. The input records are the change records coming from an OLTP system. This would seem to be a very typical scenario for loading a data warehouse table. Is there an example or best practices to best handle this scenario? It would be great if there was a far more simplified version of the Slowly Changing Dimension or an Update transform that would be far easier to map parameters to than the OLE DB Command. I have been able to get a lookup transform determine if the key exists in the table, and set the error output to redirect the row when it does not find the key and insert the new records into the staging table. On the output flow, I try to get the SQL Command transform with an update statement to modify the existing records, but with a large number of columns, this is extremely problematic to map and maintain the parameters. Last alternative is to create a stored procedure, to handle this load scenario, which is the same method which we had to do in DTS. |
![]() |
| Thread Tools | |
| Display Modes | |
| |