dbTalk Databases Forums  

SSIS and DW loads

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss SSIS and DW loads in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
vnapoli
 
Posts: n/a

Default SSIS and DW loads - 02-24-2006 , 03:36 PM






I am trying to do a load on a data warehouse staging table using SSIS where I
want to insert new records and update existing records based on existence of
the key values in the table. The input records are the change records
coming from an OLTP system. This would seem to be a very typical scenario
for loading a data warehouse table. Is there an example or best practices
to best handle this scenario?

It would be great if there was a far more simplified version of the Slowly
Changing Dimension or an Update transform that would be far easier to map
parameters to than the OLE DB Command.

I have been able to get a lookup transform determine if the key exists in
the table, and set the error output to redirect the row when it does not find
the key and insert the new records into the staging table. On the output
flow, I try to get the SQL Command transform with an update statement to
modify the existing records, but with a large number of columns, this is
extremely problematic to map and maintain the parameters.

Last alternative is to create a stored procedure, to handle this load
scenario, which is the same method which we had to do in DTS.



Reply With Quote
  #2  
Old   
Allan Mitchell
 
Posts: n/a

Default Re: SSIS and DW loads - 02-25-2006 , 02:09 PM






Hello vnapoli,

You could use the Merge Join to accomplish the New row, existing row match.
You then could use a Conditional split to split off the new rows to an output
and you can also test in there for rows that have changed but are matches
in source and destination. If you do not like the idea of the OLE DB transform
then you could stage these "Changed Rows" and then after the Data Flow you
can issue a set based UPDATE. This is how i would do it. Best practices?
My tests and experiences show that the SCD does not scale to huge inputs
with latge outputs so I tend to use the Merge Join.

You can still use the SCD in my example and then pipe the "Updated Rows"
output to a table for staging then set based updates and remove the OLE DB
Transform.



Allan Mitchell
www.SQLDTS.com
www.SQLIS.com
www.Konesans.com

Quote:
I am trying to do a load on a data warehouse staging table using SSIS
where I want to insert new records and update existing records based
on existence of the key values in the table. The input records are
the change records coming from an OLTP system. This would seem to be
a very typical scenario for loading a data warehouse table. Is there
an example or best practices to best handle this scenario?

It would be great if there was a far more simplified version of the
Slowly Changing Dimension or an Update transform that would be far
easier to map parameters to than the OLE DB Command.

I have been able to get a lookup transform determine if the key exists
in the table, and set the error output to redirect the row when it
does not find the key and insert the new records into the staging
table. On the output flow, I try to get the SQL Command transform
with an update statement to modify the existing records, but with a
large number of columns, this is extremely problematic to map and
maintain the parameters.

Last alternative is to create a stored procedure, to handle this load
scenario, which is the same method which we had to do in DTS.




Reply With Quote
  #3  
Old   
vnapoli
 
Posts: n/a

Default Re: SSIS and DW loads - 03-03-2006 , 01:37 PM



Thanks, the Merge Join or Lookup transfrom will adequately handle the lookup
/ check for existing records in a destination table for me. The OLE DB
command transform will be a major issue for updates in having to hand code it
for every update operation that will take place after determining if a record
exists and must be updated. Where is an UPSERT Transform that are seem in
other ETL products that more reliably and intuitively handle update
operations in a data flow? It seems that SSIS is falling just short in this
aspect.






"Allan Mitchell" wrote:

Quote:
Hello vnapoli,

You could use the Merge Join to accomplish the New row, existing row match.
You then could use a Conditional split to split off the new rows to an output
and you can also test in there for rows that have changed but are matches
in source and destination. If you do not like the idea of the OLE DB transform
then you could stage these "Changed Rows" and then after the Data Flow you
can issue a set based UPDATE. This is how i would do it. Best practices?
My tests and experiences show that the SCD does not scale to huge inputs
with latge outputs so I tend to use the Merge Join.

You can still use the SCD in my example and then pipe the "Updated Rows"
output to a table for staging then set based updates and remove the OLE DB
Transform.



Allan Mitchell
www.SQLDTS.com
www.SQLIS.com
www.Konesans.com

I am trying to do a load on a data warehouse staging table using SSIS
where I want to insert new records and update existing records based
on existence of the key values in the table. The input records are
the change records coming from an OLTP system. This would seem to be
a very typical scenario for loading a data warehouse table. Is there
an example or best practices to best handle this scenario?

It would be great if there was a far more simplified version of the
Slowly Changing Dimension or an Update transform that would be far
easier to map parameters to than the OLE DB Command.

I have been able to get a lookup transform determine if the key exists
in the table, and set the error output to redirect the row when it
does not find the key and insert the new records into the staging
table. On the output flow, I try to get the SQL Command transform
with an update statement to modify the existing records, but with a
large number of columns, this is extremely problematic to map and
maintain the parameters.

Last alternative is to create a stored procedure, to handle this load
scenario, which is the same method which we had to do in DTS.





Reply With Quote
  #4  
Old   
Allan Mitchell
 
Posts: n/a

Default Re: SSIS and DW loads - 03-03-2006 , 02:03 PM



Hello vnapoli,

UPSERTs are notoriously poorly performing. SSIS may not have an UPSERT Transform
but with a little ingenuity you can accomplish pretty much what you want
more efficiently

Allan Mitchell
www.SQLDTS.com
www.SQLIS.com
www.Konesans.com

Quote:
Thanks, the Merge Join or Lookup transfrom will adequately handle the
lookup / check for existing records in a destination table for me.
The OLE DB command transform will be a major issue for updates in
having to hand code it for every update operation that will take place
after determining if a record exists and must be updated. Where is
an UPSERT Transform that are seem in other ETL products that more
reliably and intuitively handle update operations in a data flow? It
seems that SSIS is falling just short in this aspect.

"Allan Mitchell" wrote:

Hello vnapoli,

You could use the Merge Join to accomplish the New row, existing row
match. You then could use a Conditional split to split off the new
rows to an output and you can also test in there for rows that have
changed but are matches in source and destination. If you do not
like the idea of the OLE DB transform then you could stage these
"Changed Rows" and then after the Data Flow you can issue a set based
UPDATE. This is how i would do it. Best practices? My tests and
experiences show that the SCD does not scale to huge inputs with
latge outputs so I tend to use the Merge Join.

You can still use the SCD in my example and then pipe the "Updated
Rows" output to a table for staging then set based updates and remove
the OLE DB Transform.

Allan Mitchell
www.SQLDTS.com
www.SQLIS.com
www.Konesans.com
I am trying to do a load on a data warehouse staging table using
SSIS where I want to insert new records and update existing records
based on existence of the key values in the table. The input
records are the change records coming from an OLTP system. This
would seem to be a very typical scenario for loading a data
warehouse table. Is there an example or best practices to best
handle this scenario?

It would be great if there was a far more simplified version of the
Slowly Changing Dimension or an Update transform that would be far
easier to map parameters to than the OLE DB Command.

I have been able to get a lookup transform determine if the key
exists in the table, and set the error output to redirect the row
when it does not find the key and insert the new records into the
staging table. On the output flow, I try to get the SQL Command
transform with an update statement to modify the existing records,
but with a large number of columns, this is extremely problematic to
map and maintain the parameters.

Last alternative is to create a stored procedure, to handle this
load scenario, which is the same method which we had to do in DTS.




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.