dbTalk Databases Forums  

Can I pick your brains for a second? transferring a large database problem

microsoft.public.sqlserver.datawarehouse microsoft.public.sqlserver.datawarehouse


Discuss Can I pick your brains for a second? transferring a large database problem in the microsoft.public.sqlserver.datawarehouse forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Scott M
 
Posts: n/a

Default Can I pick your brains for a second? transferring a large database problem - 06-06-2005 , 05:33 PM






-=-=- Sorry for the dup post, I'm posting it under my MSDN universal acct
now -=-=-

Okay, I have a problem that I need some suggestions with.

I have a 30 Gig database that I need to sync across a T1 every night. This
database is a data dump of our billing system data, the tables are dropped
ad re-created every night.

I can do a very long and laborious process on the server of creating tables
with the primary keys from each table and a binary checksum of a row, then
complaining it to a copy of the database from the day before. Then I
generate SQL statements for the updates (Actually deletes and inserts). All
of this on the source server.

I'd like to compact this down using a DTS package on the destination side,
however I can't use a dynamic query in the "openRowSet" command to get only
the row that I want.

Here is what I'm doing (Just for one table, I'll have data driven steps for
each table in my DB)

insert into @tempChecker
select BSum, chgNo from openquery(EMBAN2, 'Select BINARY_CHECKSUM(*) BSum,
chgno from charge_t')

Declare Charge_t_Cursor Cursor
FOR
select
a.chgno
from
(select
BSum,
chgno
from @tempChecker
) a
where
not exists
(
select 'x'
from charge_t b
where a.chgno = b.chgno and a.BSum <> BINARY_CHECKSUM(*)
)

Once I get all the chgno's (The key in the table) I want to only pull back
the data that has changes. So I have this (Which I know I can't do put you
get the idea)

Open Charge_t_Cursor

Fetch next from Charge_t_cursor into @ChgNo

While @@FETCH_STATUS = 0
BEGIN
insert into @ChargeT select *
from OPENROWSET(EMBAN2, 'SELECT * FROM charge_t WHERE chgno = ' +
cast(isnull(@ChgNo, 0) as varchar)))

Fetch next from Charge_t_cursor into @ChgNo
END

Close Charge_t_cursor
deallocate Charge_t_cursor

select * from @ChargeT

this way I'm only pulling over the wire the data that is updated and it will
be in a nice, compact binary format. ANY help would be VERY appreciated!!!
Thanks much!

Scott


Reply With Quote
  #2  
Old   
Peter Yang [MSFT]
 
Posts: n/a

Default RE: Can I pick your brains for a second? transferring a large database problem - 06-06-2005 , 10:13 PM






Hello Scott,

It is not easy to use a DTS package to do this job. I think database
replication is more appropriate for this situation. You may consider
transaction or merge replication according to your requirements.

If you still want to consider DTS, lookup query feature might be helpful

http://msdn.microsoft.com/library/de...us/dtssql/dts_
addf_misc_2dix.asp

Regards,

Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Partner Support

When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.

================================================== ===


This posting is provided "AS IS" with no warranties, and confers no rights.




--------------------
Quote:
From: "Scott M" <scott_M (AT) nospam (DOT) nospam
Subject: Can I pick your brains for a second? transferring a large
database problem
Date: Mon, 6 Jun 2005 17:33:03 -0500
Lines: 72
MIME-Version: 1.0
Content-Type: text/plain;
format=flowed;
charset="iso-8859-1";
reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.2527
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527
Message-ID: <ui2eBfuaFHA.584 (AT) TK2MSFTNGP15 (DOT) phx.gbl
Newsgroups: microsoft.public.sqlserver.datawarehouse
NNTP-Posting-Host: adsl-068-209-157-050.sip.lft.bellsouth.net
68.209.157.50
Path: TK2MSFTNGXA01.phx.gbl!TK2MSFTNGP08.phx.gbl!TK2MSFT NGP15.phx.gbl
Xref: TK2MSFTNGXA01.phx.gbl microsoft.public.sqlserver.datawarehouse:1784
X-Tomcat-NG: microsoft.public.sqlserver.datawarehouse

-=-=- Sorry for the dup post, I'm posting it under my MSDN universal acct
now -=-=-

Okay, I have a problem that I need some suggestions with.

I have a 30 Gig database that I need to sync across a T1 every night.
This
database is a data dump of our billing system data, the tables are dropped
ad re-created every night.

I can do a very long and laborious process on the server of creating
tables
with the primary keys from each table and a binary checksum of a row, then
complaining it to a copy of the database from the day before. Then I
generate SQL statements for the updates (Actually deletes and inserts).
All
of this on the source server.

I'd like to compact this down using a DTS package on the destination side,
however I can't use a dynamic query in the "openRowSet" command to get
only
the row that I want.

Here is what I'm doing (Just for one table, I'll have data driven steps
for
each table in my DB)

insert into @tempChecker
select BSum, chgNo from openquery(EMBAN2, 'Select BINARY_CHECKSUM(*) BSum,
chgno from charge_t')

Declare Charge_t_Cursor Cursor
FOR
select
a.chgno
from
(select
BSum,
chgno
from @tempChecker
) a
where
not exists
(
select 'x'
from charge_t b
where a.chgno = b.chgno and a.BSum <> BINARY_CHECKSUM(*)
)

Once I get all the chgno's (The key in the table) I want to only pull back
the data that has changes. So I have this (Which I know I can't do put
you
get the idea)

Open Charge_t_Cursor

Fetch next from Charge_t_cursor into @ChgNo

While @@FETCH_STATUS = 0
BEGIN
insert into @ChargeT select *
from OPENROWSET(EMBAN2, 'SELECT * FROM charge_t WHERE chgno = ' +
cast(isnull(@ChgNo, 0) as varchar)))

Fetch next from Charge_t_cursor into @ChgNo
END

Close Charge_t_cursor
deallocate Charge_t_cursor

select * from @ChargeT

this way I'm only pulling over the wire the data that is updated and it
will
be in a nice, compact binary format. ANY help would be VERY
appreciated!!!
Thanks much!

Scott




Reply With Quote
  #3  
Old   
Scott M
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-06-2005 , 11:35 PM



Would replication work if the source tables are dropped every time it's
built?

"Peter Yang [MSFT]" <petery (AT) online (DOT) microsoft.com> wrote

Quote:
Hello Scott,

It is not easy to use a DTS package to do this job. I think database
replication is more appropriate for this situation. You may consider
transaction or merge replication according to your requirements.

If you still want to consider DTS, lookup query feature might be helpful

http://msdn.microsoft.com/library/de...us/dtssql/dts_
addf_misc_2dix.asp

Regards,

Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Partner Support

When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.

================================================== ===


This posting is provided "AS IS" with no warranties, and confers no
rights.




--------------------
| From: "Scott M" <scott_M (AT) nospam (DOT) nospam
| Subject: Can I pick your brains for a second? transferring a large
database problem
| Date: Mon, 6 Jun 2005 17:33:03 -0500
| Lines: 72
| MIME-Version: 1.0
| Content-Type: text/plain;
| format=flowed;
| charset="iso-8859-1";
| reply-type=original
| Content-Transfer-Encoding: 7bit
| X-Priority: 3
| X-MSMail-Priority: Normal
| X-Newsreader: Microsoft Outlook Express 6.00.2900.2527
| X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527
| Message-ID: <ui2eBfuaFHA.584 (AT) TK2MSFTNGP15 (DOT) phx.gbl
| Newsgroups: microsoft.public.sqlserver.datawarehouse
| NNTP-Posting-Host: adsl-068-209-157-050.sip.lft.bellsouth.net
68.209.157.50
| Path: TK2MSFTNGXA01.phx.gbl!TK2MSFTNGP08.phx.gbl!TK2MSFT NGP15.phx.gbl
| Xref: TK2MSFTNGXA01.phx.gbl
microsoft.public.sqlserver.datawarehouse:1784
| X-Tomcat-NG: microsoft.public.sqlserver.datawarehouse
|
| -=-=- Sorry for the dup post, I'm posting it under my MSDN universal
acct
| now -=-=-
|
| Okay, I have a problem that I need some suggestions with.
|
| I have a 30 Gig database that I need to sync across a T1 every night.
This
| database is a data dump of our billing system data, the tables are
dropped
| ad re-created every night.
|
| I can do a very long and laborious process on the server of creating
tables
| with the primary keys from each table and a binary checksum of a row,
then
| complaining it to a copy of the database from the day before. Then I
| generate SQL statements for the updates (Actually deletes and inserts).
All
| of this on the source server.
|
| I'd like to compact this down using a DTS package on the destination
side,
| however I can't use a dynamic query in the "openRowSet" command to get
only
| the row that I want.
|
| Here is what I'm doing (Just for one table, I'll have data driven steps
for
| each table in my DB)
|
| insert into @tempChecker
| select BSum, chgNo from openquery(EMBAN2, 'Select BINARY_CHECKSUM(*)
BSum,
| chgno from charge_t')
|
| Declare Charge_t_Cursor Cursor
| FOR
| select
| a.chgno
| from
| (select
| BSum,
| chgno
| from @tempChecker
| ) a
| where
| not exists
| (
| select 'x'
| from charge_t b
| where a.chgno = b.chgno and a.BSum <> BINARY_CHECKSUM(*)
| )
|
| Once I get all the chgno's (The key in the table) I want to only pull
back
| the data that has changes. So I have this (Which I know I can't do put
you
| get the idea)
|
| Open Charge_t_Cursor
|
| Fetch next from Charge_t_cursor into @ChgNo
|
| While @@FETCH_STATUS = 0
| BEGIN
| insert into @ChargeT select *
| from OPENROWSET(EMBAN2, 'SELECT * FROM charge_t WHERE chgno = ' +
| cast(isnull(@ChgNo, 0) as varchar)))
|
| Fetch next from Charge_t_cursor into @ChgNo
| END
|
| Close Charge_t_cursor
| deallocate Charge_t_cursor
|
| select * from @ChargeT
|
| this way I'm only pulling over the wire the data that is updated and it
will
| be in a nice, compact binary format. ANY help would be VERY
appreciated!!!
| Thanks much!
|
| Scott
|
|



Reply With Quote
  #4  
Old   
naka55n@hotmail.com
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-07-2005 , 09:43 AM



How many tables are you talking about?

Why not just use triggers and create "shadow tables" that contain the
changes to your data?


Reply With Quote
  #5  
Old   
Myles.Matheson@gmail.com
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-07-2005 , 04:33 PM



Hello Scott,

I think you have the best idea using the check sum. For performance I
would look at creating a stored proc which DTS can call from with in
the Data pump task in your source database.

Have you considered deletes as well? This is easy to do against your
audit table just do a left outer join against your source tables to see
if the keys exist.

I would move away from the cursor as you don't need it. You can do
all you comparisons in a Select statement.

I would also look at creating three DTS data pump tasks.

1. New Records
2. Updated Records
3. Deleted Records.

I use Binary checksum all the time. It works well. There is a good
article on using this as a practice see:

http://msdn.microsoft.com/library/de...tbpwithdts.asp

Look under the section titled 'Data Transformation and Cleansing
Approach'


For everyone else I have noted some points have a look at the
following.

1. Triggers would be to slow against large inserts into the source
tables. Great for small amounts of data, but remember a trigger will
create an over head on you OLTP system.
2. Replication, although on paper this looks like a great idea again
you have an over head and new Services have to be enabled if you are
not doing replication already. Also I believe (not 100%) SQL server
2000 Replication does not support DDL changes. Still Replication would
be a great solution if the transactions are of a reasonable size.
3. One other option to look at if the tables where not being drop is
log shipping. Shipping the transaction log and processing the changes
against a database on another server is a great way of mirroring the
database.


Hope this helps

Myles Matheson
Data Warehouse Architect


Reply With Quote
  #6  
Old   
Scott M
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-08-2005 , 05:20 PM



VERY awesome. I'm learning a lot from the article. But now I have this
issue...

It's taking .5 (almost) seconds to insert each record and each record is in
it's own transaction (Wouldn't that increase overhead?). The table is 243
columns wide and I'm loading it all through an ActiveX script. Any other
suggestions?

Hey, At least I can insert data!!!!! The Update doesn't seem to want to
run in tandem yet

<Myles.Matheson (AT) gmail (DOT) com> wrote

Quote:
Hello Scott,

I think you have the best idea using the check sum. For performance I
would look at creating a stored proc which DTS can call from with in
the Data pump task in your source database.

Have you considered deletes as well? This is easy to do against your
audit table just do a left outer join against your source tables to see
if the keys exist.

I would move away from the cursor as you don't need it. You can do
all you comparisons in a Select statement.

I would also look at creating three DTS data pump tasks.

1. New Records
2. Updated Records
3. Deleted Records.

I use Binary checksum all the time. It works well. There is a good
article on using this as a practice see:

http://msdn.microsoft.com/library/de...tbpwithdts.asp

Look under the section titled 'Data Transformation and Cleansing
Approach'


For everyone else I have noted some points have a look at the
following.

1. Triggers would be to slow against large inserts into the source
tables. Great for small amounts of data, but remember a trigger will
create an over head on you OLTP system.
2. Replication, although on paper this looks like a great idea again
you have an over head and new Services have to be enabled if you are
not doing replication already. Also I believe (not 100%) SQL server
2000 Replication does not support DDL changes. Still Replication would
be a great solution if the transactions are of a reasonable size.
3. One other option to look at if the tables where not being drop is
log shipping. Shipping the transaction log and processing the changes
against a database on another server is a great way of mirroring the
database.


Hope this helps

Myles Matheson
Data Warehouse Architect



Reply With Quote
  #7  
Old   
Scott M
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-08-2005 , 05:22 PM



Sorry, inserts are on average of .1 seconds....

<Myles.Matheson (AT) gmail (DOT) com> wrote

Quote:
Hello Scott,

I think you have the best idea using the check sum. For performance I
would look at creating a stored proc which DTS can call from with in
the Data pump task in your source database.

Have you considered deletes as well? This is easy to do against your
audit table just do a left outer join against your source tables to see
if the keys exist.

I would move away from the cursor as you don't need it. You can do
all you comparisons in a Select statement.

I would also look at creating three DTS data pump tasks.

1. New Records
2. Updated Records
3. Deleted Records.

I use Binary checksum all the time. It works well. There is a good
article on using this as a practice see:

http://msdn.microsoft.com/library/de...tbpwithdts.asp

Look under the section titled 'Data Transformation and Cleansing
Approach'


For everyone else I have noted some points have a look at the
following.

1. Triggers would be to slow against large inserts into the source
tables. Great for small amounts of data, but remember a trigger will
create an over head on you OLTP system.
2. Replication, although on paper this looks like a great idea again
you have an over head and new Services have to be enabled if you are
not doing replication already. Also I believe (not 100%) SQL server
2000 Replication does not support DDL changes. Still Replication would
be a great solution if the transactions are of a reasonable size.
3. One other option to look at if the tables where not being drop is
log shipping. Shipping the transaction log and processing the changes
against a database on another server is a great way of mirroring the
database.


Hope this helps

Myles Matheson
Data Warehouse Architect



Reply With Quote
  #8  
Old   
Myles.Matheson@gmail.com
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-08-2005 , 06:41 PM



Hello Scott,

Are you doing any transformations in ActiveX? (Formatting dates,
strings, etc) If not use the copy column transformation in the data
pump task. If you are doing transformations format data using TSQL and
use the data pump to load the destination tables.

ActiveX has a reputation for being slow with large amounts of data.

Check out: http://www.sql-server-performance.com/ for tuning tips when
using DTS.

Also check out www.SQLDTS.com great site for DTS programming.


One more thing, with Data pump task you can get great performance if
you do the following:

1. Format data using a stored proc
2. Call the stored proc using data pump task
3. pump the data using one transformation
4. use fast load option with table lock

Hope this helps

Myles


Reply With Quote
  #9  
Old   
Scott M
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-08-2005 , 07:01 PM



Well, I'm using a linked server to retrieve the data. I was just pulling
back the keys that I need to insert and then performing a lookup.

It was my understanding that if I perform a query on a linked server it will
pull the entire dataset over to the working server. (Think I read it on
MSDN article somewhere)

If that is the case then I would just do a select from the remote server to
pull all the data.

Is my thinking correct?

<Myles.Matheson (AT) gmail (DOT) com> wrote

Quote:
Hello Scott,

Are you doing any transformations in ActiveX? (Formatting dates,
strings, etc) If not use the copy column transformation in the data
pump task. If you are doing transformations format data using TSQL and
use the data pump to load the destination tables.

ActiveX has a reputation for being slow with large amounts of data.

Check out: http://www.sql-server-performance.com/ for tuning tips when
using DTS.

Also check out www.SQLDTS.com great site for DTS programming.


One more thing, with Data pump task you can get great performance if
you do the following:

1. Format data using a stored proc
2. Call the stored proc using data pump task
3. pump the data using one transformation
4. use fast load option with table lock

Hope this helps

Myles



Reply With Quote
  #10  
Old   
Myles.Matheson@gmail.com
 
Posts: n/a

Default Re: Can I pick your brains for a second? transferring a large database problem - 06-08-2005 , 07:56 PM



Good question, I have read something similar around joining tables.

To be honest I try and avoid a pull process from any source system
(although is may not be possible in all cases). I prefer the source to
push the required files (data extracts) to the data warehouse.

This way if the source system goes down the data warehouse is not
affected and like wise if the DW server goes down the source server is
not affected.

In your case can you create the audit table on the source system? This
way you will process everything on the source server and only pull back
the new, updated records. To make this work DTS would call the Store
procedures from a SQL Execute Task or in the data pump task. This would
stop you from using Linked Servers.

Myles


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.