dbTalk Databases Forums  

Optimize large INSERT

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss Optimize large INSERT in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Dave
 
Posts: n/a

Default Optimize large INSERT - 02-14-2006 , 04:19 PM






Can DTS outperform pure T-SQL on inserts when the source and
destination are in the same database?

I have a staging table with the same schema as my destination table.

My destination has 200 million records and my staging table has 30
million records.

Do you have any suggestions on indexing strategies or optimization
tips?

Currently the clustered index is a covering index on (col2_int,
col3_int, col1_char_55)

SELECT count(*)
FROM myTable_stage a with(nolock)
LEFT OUTER JOIN myTable b with(nolock) ON b.col1_int = a.col1_int
AND b.col2_int = a. col2_int
AND b.col3_char_55 = a.col3_char_55
WHERE b.col1_int IS NULL

col1_int has 1500 unique values
col2_int has 50000 unique values
col3_char_55 has 50000 unique values


Currently I am using one copy column transformation in 50000 batches.


Reply With Quote
  #2  
Old   
Evergray
 
Posts: n/a

Default Re: Optimize large INSERT - 02-15-2006 , 09:19 AM






The problem is 50K inserts? Then the way may be Source --> transform -->
temporary table --> insert (dest) select * from temp. table

What transforms and inserts are performed? Speaking abount indexing strategy
without knowlege about queries is nonsense.
Also, clustering index can't cover queries by definition.

--
WBR, Evergray
--
Words mean nothing...


"Dave" <daveg.01 (AT) gmail (DOT) com> wrote

Quote:
Can DTS outperform pure T-SQL on inserts when the source and
destination are in the same database?

I have a staging table with the same schema as my destination table.

My destination has 200 million records and my staging table has 30
million records.

Do you have any suggestions on indexing strategies or optimization
tips?

Currently the clustered index is a covering index on (col2_int,
col3_int, col1_char_55)

SELECT count(*)
FROM myTable_stage a with(nolock)
LEFT OUTER JOIN myTable b with(nolock) ON b.col1_int = a.col1_int
AND b.col2_int = a. col2_int
AND b.col3_char_55 = a.col3_char_55
WHERE b.col1_int IS NULL

col1_int has 1500 unique values
col2_int has 50000 unique values
col3_char_55 has 50000 unique values


Currently I am using one copy column transformation in 50000 batches.




Reply With Quote
  #3  
Old   
Dave
 
Posts: n/a

Default Re: Optimize large INSERT - 02-15-2006 , 10:59 AM



No the problem is inserting 30 million records in general and
understanding how the DTS copy column transformation using the fast
load option compares to native the T-SQL insert.

Books online claims that DTS uses a buffer when it commits the
transformation in batches. However it is taking so long to do the
insert, I wonder if the transformation really writes 30 million records
to a buffer or if it re-queries the data after 1 or more batches.

I was only interested in the best index strategy for the insert step so
I could test and have a baseline.

Also, why can't clustered indexes be covering indexes? This makes no
sense to me at all.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.