![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
From: "Scott M" <scott_M (AT) nospam (DOT) nospam Subject: Can I pick your brains for a second? transferring a large database problem Date: Mon, 6 Jun 2005 17:33:03 -0500 Lines: 72 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2900.2527 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527 Message-ID: <ui2eBfuaFHA.584 (AT) TK2MSFTNGP15 (DOT) phx.gbl Newsgroups: microsoft.public.sqlserver.datawarehouse NNTP-Posting-Host: adsl-068-209-157-050.sip.lft.bellsouth.net 68.209.157.50 Path: TK2MSFTNGXA01.phx.gbl!TK2MSFTNGP08.phx.gbl!TK2MSFT NGP15.phx.gbl Xref: TK2MSFTNGXA01.phx.gbl microsoft.public.sqlserver.datawarehouse:1784 X-Tomcat-NG: microsoft.public.sqlserver.datawarehouse -=-=- Sorry for the dup post, I'm posting it under my MSDN universal acct now -=-=-Okay, I have a problem that I need some suggestions with. I have a 30 Gig database that I need to sync across a T1 every night. This database is a data dump of our billing system data, the tables are dropped ad re-created every night. I can do a very long and laborious process on the server of creating tables with the primary keys from each table and a binary checksum of a row, then complaining it to a copy of the database from the day before. Then I generate SQL statements for the updates (Actually deletes and inserts). All of this on the source server. I'd like to compact this down using a DTS package on the destination side, however I can't use a dynamic query in the "openRowSet" command to get only the row that I want. Here is what I'm doing (Just for one table, I'll have data driven steps for each table in my DB) insert into @tempChecker select BSum, chgNo from openquery(EMBAN2, 'Select BINARY_CHECKSUM(*) BSum, chgno from charge_t') Declare Charge_t_Cursor Cursor FOR select a.chgno from (select BSum, chgno from @tempChecker ) a where not exists ( select 'x' from charge_t b where a.chgno = b.chgno and a.BSum <> BINARY_CHECKSUM(*) ) Once I get all the chgno's (The key in the table) I want to only pull back the data that has changes. So I have this (Which I know I can't do put you get the idea) Open Charge_t_Cursor Fetch next from Charge_t_cursor into @ChgNo While @@FETCH_STATUS = 0 BEGIN insert into @ChargeT select * from OPENROWSET(EMBAN2, 'SELECT * FROM charge_t WHERE chgno = ' + cast(isnull(@ChgNo, 0) as varchar))) Fetch next from Charge_t_cursor into @ChgNo END Close Charge_t_cursor deallocate Charge_t_cursor select * from @ChargeT this way I'm only pulling over the wire the data that is updated and it will be in a nice, compact binary format. ANY help would be VERY appreciated!!! Thanks much! Scott |
#3
| |||
| |||
|
|
Hello Scott, It is not easy to use a DTS package to do this job. I think database replication is more appropriate for this situation. You may consider transaction or merge replication according to your requirements. If you still want to consider DTS, lookup query feature might be helpful http://msdn.microsoft.com/library/de...us/dtssql/dts_ addf_misc_2dix.asp Regards, Peter Yang MCSE2000/2003, MCSA, MCDBA Microsoft Online Partner Support When responding to posts, please "Reply to Group" via your newsreader so that others may learn and benefit from your issue. ================================================== === This posting is provided "AS IS" with no warranties, and confers no rights. -------------------- | From: "Scott M" <scott_M (AT) nospam (DOT) nospam | Subject: Can I pick your brains for a second? transferring a large database problem | Date: Mon, 6 Jun 2005 17:33:03 -0500 | Lines: 72 | MIME-Version: 1.0 | Content-Type: text/plain; | format=flowed; | charset="iso-8859-1"; | reply-type=original | Content-Transfer-Encoding: 7bit | X-Priority: 3 | X-MSMail-Priority: Normal | X-Newsreader: Microsoft Outlook Express 6.00.2900.2527 | X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527 | Message-ID: <ui2eBfuaFHA.584 (AT) TK2MSFTNGP15 (DOT) phx.gbl | Newsgroups: microsoft.public.sqlserver.datawarehouse | NNTP-Posting-Host: adsl-068-209-157-050.sip.lft.bellsouth.net 68.209.157.50 | Path: TK2MSFTNGXA01.phx.gbl!TK2MSFTNGP08.phx.gbl!TK2MSFT NGP15.phx.gbl | Xref: TK2MSFTNGXA01.phx.gbl microsoft.public.sqlserver.datawarehouse:1784 | X-Tomcat-NG: microsoft.public.sqlserver.datawarehouse | | -=-=- Sorry for the dup post, I'm posting it under my MSDN universal acct | now -=-=-| | Okay, I have a problem that I need some suggestions with. | | I have a 30 Gig database that I need to sync across a T1 every night. This | database is a data dump of our billing system data, the tables are dropped | ad re-created every night. | | I can do a very long and laborious process on the server of creating tables | with the primary keys from each table and a binary checksum of a row, then | complaining it to a copy of the database from the day before. Then I | generate SQL statements for the updates (Actually deletes and inserts). All | of this on the source server. | | I'd like to compact this down using a DTS package on the destination side, | however I can't use a dynamic query in the "openRowSet" command to get only | the row that I want. | | Here is what I'm doing (Just for one table, I'll have data driven steps for | each table in my DB) | | insert into @tempChecker | select BSum, chgNo from openquery(EMBAN2, 'Select BINARY_CHECKSUM(*) BSum, | chgno from charge_t') | | Declare Charge_t_Cursor Cursor | FOR | select | a.chgno | from | (select | BSum, | chgno | from @tempChecker | ) a | where | not exists | ( | select 'x' | from charge_t b | where a.chgno = b.chgno and a.BSum <> BINARY_CHECKSUM(*) | ) | | Once I get all the chgno's (The key in the table) I want to only pull back | the data that has changes. So I have this (Which I know I can't do put you | get the idea) | | Open Charge_t_Cursor | | Fetch next from Charge_t_cursor into @ChgNo | | While @@FETCH_STATUS = 0 | BEGIN | insert into @ChargeT select * | from OPENROWSET(EMBAN2, 'SELECT * FROM charge_t WHERE chgno = ' + | cast(isnull(@ChgNo, 0) as varchar))) | | Fetch next from Charge_t_cursor into @ChgNo | END | | Close Charge_t_cursor | deallocate Charge_t_cursor | | select * from @ChargeT | | this way I'm only pulling over the wire the data that is updated and it will | be in a nice, compact binary format. ANY help would be VERY appreciated!!! | Thanks much! | | Scott | | |
#4
| |||
| |||
|
#5
| |||
| |||
|
#6
| |||
| |||
|
|
Hello Scott, I think you have the best idea using the check sum. For performance I would look at creating a stored proc which DTS can call from with in the Data pump task in your source database. Have you considered deletes as well? This is easy to do against your audit table just do a left outer join against your source tables to see if the keys exist. I would move away from the cursor as you don't need it. You can do all you comparisons in a Select statement. I would also look at creating three DTS data pump tasks. 1. New Records 2. Updated Records 3. Deleted Records. I use Binary checksum all the time. It works well. There is a good article on using this as a practice see: http://msdn.microsoft.com/library/de...tbpwithdts.asp Look under the section titled 'Data Transformation and Cleansing Approach' For everyone else I have noted some points have a look at the following. 1. Triggers would be to slow against large inserts into the source tables. Great for small amounts of data, but remember a trigger will create an over head on you OLTP system. 2. Replication, although on paper this looks like a great idea again you have an over head and new Services have to be enabled if you are not doing replication already. Also I believe (not 100%) SQL server 2000 Replication does not support DDL changes. Still Replication would be a great solution if the transactions are of a reasonable size. 3. One other option to look at if the tables where not being drop is log shipping. Shipping the transaction log and processing the changes against a database on another server is a great way of mirroring the database. Hope this helps Myles Matheson Data Warehouse Architect |
#7
| |||
| |||
|
|
Hello Scott, I think you have the best idea using the check sum. For performance I would look at creating a stored proc which DTS can call from with in the Data pump task in your source database. Have you considered deletes as well? This is easy to do against your audit table just do a left outer join against your source tables to see if the keys exist. I would move away from the cursor as you don't need it. You can do all you comparisons in a Select statement. I would also look at creating three DTS data pump tasks. 1. New Records 2. Updated Records 3. Deleted Records. I use Binary checksum all the time. It works well. There is a good article on using this as a practice see: http://msdn.microsoft.com/library/de...tbpwithdts.asp Look under the section titled 'Data Transformation and Cleansing Approach' For everyone else I have noted some points have a look at the following. 1. Triggers would be to slow against large inserts into the source tables. Great for small amounts of data, but remember a trigger will create an over head on you OLTP system. 2. Replication, although on paper this looks like a great idea again you have an over head and new Services have to be enabled if you are not doing replication already. Also I believe (not 100%) SQL server 2000 Replication does not support DDL changes. Still Replication would be a great solution if the transactions are of a reasonable size. 3. One other option to look at if the tables where not being drop is log shipping. Shipping the transaction log and processing the changes against a database on another server is a great way of mirroring the database. Hope this helps Myles Matheson Data Warehouse Architect |
#8
| |||
| |||
|
#9
| |||
| |||
|
|
Hello Scott, Are you doing any transformations in ActiveX? (Formatting dates, strings, etc) If not use the copy column transformation in the data pump task. If you are doing transformations format data using TSQL and use the data pump to load the destination tables. ActiveX has a reputation for being slow with large amounts of data. Check out: http://www.sql-server-performance.com/ for tuning tips when using DTS. Also check out www.SQLDTS.com great site for DTS programming. One more thing, with Data pump task you can get great performance if you do the following: 1. Format data using a stored proc 2. Call the stored proc using data pump task 3. pump the data using one transformation 4. use fast load option with table lock Hope this helps Myles |
#10
| |||
| |||
|
![]() |
| Thread Tools | |
| Display Modes | |
| |