![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi, we have a table 'tableA' in 'databaseA' that has 400 million rows and 60 gig big, without any index on it. We need to build a cluster index on tableA, and a non- clustered index on 2 of its columns. Hence we created tableB which is identical to tableA only tableB will with index built on it. TableB will also be in a different database databaseB on the same server, because we also need to reorganize the filegroup for databaseA. after I done all this, I simply use DTS wizard created a dts package, in the saved package, I modified the 'select' query to have 'order by' clause on the clustered index column, and then run the dts package. this package have been running 40 hours now, it's writting data to the new table because the file size of the new table has been increasing. my estimation of this packege will need another 10 hr to complete. did I do something wrong here? is there a better way to build indexes on such a big table? will building the clustered indexes directly on the original table faster? if so, it will create more tran-log, isn't it? desperatly need your advice!! thanks a ton!! JJ |
#3
| |||
| |||
|
|
Hi, we have a table 'tableA' in 'databaseA' that has 400 million rows and 60 gig big, without any index on it. We need to build a cluster index on tableA, and a non- clustered index on 2 of its columns. Hence we created tableB which is identical to tableA only tableB will with index built on it. TableB will also be in a different database databaseB on the same server, because we also need to reorganize the filegroup for databaseA. after I done all this, I simply use DTS wizard created a dts package, in the saved package, I modified the 'select' query to have 'order by' clause on the clustered index column, and then run the dts package. this package have been running 40 hours now, it's writting data to the new table because the file size of the new table has been increasing. my estimation of this packege will need another 10 hr to complete. did I do something wrong here? is there a better way to build indexes on such a big table? will building the clustered indexes directly on the original table faster? if so, it will create more tran-log, isn't it? desperatly need your advice!! thanks a ton!! JJ |
#4
| |||
| |||
|
|
-----Original Message----- JJ, Here is a good guideline excerpt from BOL: As a general guide, the following table shows suggested figures for the amount of data to be added to a table for various types of indexes. If you exceed these percentages, you may find it faster to drop and re-create the indexes. Indexes Amount of data added Clustered index only 30% Clustered and one nonclustered index 25% Clustered and two nonclustered indexes 25% Single nonclustered index only 100% Two nonclustered indexes 60% -- -oj RAC v2.2 & QALite! http://www.rac4sql.net "JJ Wang" <anonymous (AT) discussions (DOT) microsoft.com> wrote in message news:02d801c3c733$fc929430$a101280a (AT) phx (DOT) gbl... Hi, we have a table 'tableA' in 'databaseA' that has 400 million rows and 60 gig big, without any index on it. We need to build a cluster index on tableA, and a non- clustered index on 2 of its columns. Hence we created tableB which is identical to tableA only tableB will with index built on it. TableB will also be in a different database databaseB on the same server, because we also need to reorganize the filegroup for databaseA. after I done all this, I simply use DTS wizard created a dts package, in the saved package, I modified the 'select' query to have 'order by' clause on the clustered index column, and then run the dts package. this package have been running 40 hours now, it's writting data to the new table because the file size of the new table has been increasing. my estimation of this packege will need another 10 hr to complete. did I do something wrong here? is there a better way to build indexes on such a big table? will building the clustered indexes directly on the original table faster? if so, it will create more tran-log, isn't it? desperatly need your advice!! thanks a ton!! JJ . |
#5
| |||
| |||
|
|
-----Original Message----- Well JJ, You do have a lot of data in your table... I think you have done the right thing cause it would have taken a long time re-arranging the pages while creating the clustered index. 40 hours and 10 more to go would can probably be explained by a combination of things * you original table not having any indexes as such (that would make it very long to sort the results and actually fetch the data considering it will probably belong to a lot of data pages) * if you have the new database on the same physical drive. (different partition doesnt make a difference) I guess DBAs might have a better answer Hope this helps HD "JJ Wang" <anonymous (AT) discussions (DOT) microsoft.com> wrote in message news:02d801c3c733$fc929430$a101280a (AT) phx (DOT) gbl... Hi, we have a table 'tableA' in 'databaseA' that has 400 million rows and 60 gig big, without any index on it. We need to build a cluster index on tableA, and a non- clustered index on 2 of its columns. Hence we created tableB which is identical to tableA only tableB will with index built on it. TableB will also be in a different database databaseB on the same server, because we also need to reorganize the filegroup for databaseA. after I done all this, I simply use DTS wizard created a dts package, in the saved package, I modified the 'select' query to have 'order by' clause on the clustered index column, and then run the dts package. this package have been running 40 hours now, it's writting data to the new table because the file size of the new table has been increasing. my estimation of this packege will need another 10 hr to complete. did I do something wrong here? is there a better way to build indexes on such a big table? will building the clustered indexes directly on the original table faster? if so, it will create more tran-log, isn't it? desperatly need your advice!! thanks a ton!! JJ . |
#6
| ||||
| ||||
|
|
we have a table 'tableA' in 'databaseA' that has 400 million rows and 60 gig big, without any index on it. |
|
We need to build a cluster index on tableA, and a non- clustered index on 2 of its columns. Hence we created tableB which is identical to tableA only tableB will with index built on it. TableB will also be in a different database databaseB on the same server, because we also need to reorganize the filegroup for databaseA. after I done all this, I simply use DTS wizard created a dts package, in the saved package, I modified the 'select' query to have 'order by' clause on the clustered index column, and then run the dts package. |
|
this package have been running 40 hours now, it's writting data to the new table because the file size of the new table has been increasing. my estimation of this packege will need another 10 hr to complete. |
|
did I do something wrong here? is there a better way to build indexes on such a big table? will building the clustered indexes directly on the original table faster? if so, it will create more tran-log, isn't it? |

#7
| |||
| |||
|
|
You could have simply used: Insert Into databaseB..tableB With(TABLOCKX) (...) Select ... From DatabaseA..tableA With(TABLOCK) |
#8
| |||
| |||
|
|
Insert Into databaseB..tableB With(TABLOCKX) (...) Select ... From DatabaseA..tableA With(TABLOCK) I wouldn't do that with 40 million rows, much less 400 million. This is exactly why they created BCP. |
#9
| |||
| |||
|
|
-----Original Message----- On Sat, 20 Dec 2003 22:17:27 -0500, "James Hokes" wrote: Insert Into databaseB..tableB With(TABLOCKX) (...) Select ... From DatabaseA..tableA With(TABLOCK) I wouldn't do that with 40 million rows, much less 400 million. This is exactly why they created BCP. Hmmm, good point - I was thinking that the recovery model would alleviate the problem of transaction logs, but thinking now, that only works for SELECT INTO ... and then you still have the clustered index to create. So, just BCP the source table out sorted, and BULK INSERT it into the destination. DTS is kind of doing this on-the-fly, so maybe it is the better solution. Either way, it's that 'order by' on the source table that probably contributed most to the 40 hours. An index would greatly improve the performance. Since it sounded like it was a once-off activity, it's all probably moot now. cheers, Ross. -- Ross McKay, WebAware Pty Ltd "The lawn could stand another mowing; funny, I don't even care" - Elvis Costello . |
#10
| |||
| |||
|
|
in my situation, BCP is better or DTS packege is better in terms of speed and tran-log growth management? |

![]() |
| Thread Tools | |
| Display Modes | |
| |