![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
|
Looking for opinions on this: In designing a DataWarehouse in SQL 2005, I am developing the SSIS Package that will MODIFY changed rows in a FACT table. As far as I can see, there are two general options: Option 1 is to EXTRACT all modified rows (since the last extract) and dump them into a staging table in the Warehouse. Then use a stored procedure to update the FACT Table with data from the staging table based on a join between the two (joining KEY fields, of course). Option 2 is to use an OLE DB Command in the SSIS data flow to execute a stored procedure for each row in the stream. Is there an Option 3 involving a Script Transform? Not sure how that would work. My guess is that the set based operation of the first option would be faster than a row based operation of any others. Any opinions out there? Is there some 'best practice' when updating rows in the Destination in SSIS (updating, not adding new rows) Thanks. |
#2
| |||
| |||
|
|
Allan, This particular Fact table is not a "pure" transactional system. It is for a Help-Desk ticketing system and has properties such as "Status" and "Assignee" which change. Updating with a staging table and one sp will not pose any problems. I do have other Fact tables that are strictly appends (no modifications) so they don't have this dilema. As far as a religious debate, I am the only one designing the DW, so what I say goes, even if it's wrong! Thanks for your input. Always appreciated. "Allan Mitchell" wrote: Hello Todd, Modifying fact table rows can become a religious debate. Most shops I have worked in simply enter a delta row to cancel out the "Error" Can you identify changed rows in the source so that you know their fact table row will be out of date? If you can and you must update the fact table then yes the first approach is the better of the two. Regards Allan Mitchell Konesans Ltd T +44 7966 476 572 F +44 2071 008 479 http://www.konesans.com Looking for opinions on this: In designing a DataWarehouse in SQL 2005, I am developing the SSIS Package that will MODIFY changed rows in a FACT table. As far as I can see, there are two general options: Option 1 is to EXTRACT all modified rows (since the last extract) and dump them into a staging table in the Warehouse. Then use a stored procedure to update the FACT Table with data from the staging table based on a join between the two (joining KEY fields, of course). Option 2 is to use an OLE DB Command in the SSIS data flow to execute a stored procedure for each row in the stream. Is there an Option 3 involving a Script Transform? Not sure how that would work. My guess is that the set based operation of the first option would be faster than a row based operation of any others. Any opinions out there? Is there some 'best practice' when updating rows in the Destination in SSIS (updating, not adding new rows) Thanks. |
#3
| |||
| |||
|
|
I'm still new at designing Fact table loads for a DW so I started with the smallest I could find in my environment: 1000 rows existing and 10 rows updated (OK, you can stop laughing now!) The next Fact table will be a lot bigger and I just wanted to get an idea / set some guidelines for future design work. I went with the staging table and SP and it works fine. Although, always eager to learn something new, how would I set up a Script Transform to modify rows? Thanks. "Allan Mitchell" wrote: Hello Todd, "so what I say goes, even if it's wrong!" - :-) The way i see it is, it is a toss up between the OLE DB cmd transform to do the updates and the staged rows and set based update post data flow task. For me the answer is in how many updates you are likely to do. If it is 10 or so then the perf hit of the OLE DB transform will be negligible. If it is 100K then up are going to know about it and should choose the staging of the rows with SET based update. That said I would probably stage in any event and do the SET based update. Cleaner and should you get wild fluctuations in counts to be updated then you are prepared. How many rows do you have as an input dataset and a lookup dataset. This needs thinnking abount as well because there are things we can do here. Regards Allan Mitchell Konesans Ltd T +44 7966 476 572 F +44 2071 008 479 http://www.konesans.com |
![]() |
| Thread Tools | |
| Display Modes | |
| |