![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. |
#3
| |||
| |||
|
|
Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. |
#4
| |||
| |||
|
|
On Nov 10, 2:11 pm, "Tester"<teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. Or if you can't bypass the disk completely and don't have the CIS plugin, I've had good results on some O/S's by doing this: 1) Create a FIFO (named pipe). 2) Start up a bcp in from the FIFO. 3) Run a bcp out to the FIFO. Keith |
#5
| |||
| |||
|
|
Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. |
#6
| |||
| |||
|
|
SELECT INTO does look like the fastest method. But in our case, for instance, we have a table with approx. 150M rows. While SELECT INTO took roughly 3 hours, it's the subsequent build of clustered index that times out. Is there a way to boost that, maybe run something in parallel? Partitioning this table is not an open option at this time, since we are prohibited to change the model by the vendor. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:09a014e8-87d6-4421-98d2-e3c043ec4195 (AT) t2g2000yqn (DOT) googlegroups.com... On Nov 10, 10:11 pm, "Tester" <teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp outputon disk. Hi Assuming that you need to move the data between different dataservers, you can use CIS for this purpose. Create a proxy table in the destination database that points to your source table in the source dataserver. Then, you will be able to run SELECT INTO statement into a new table in the destination dataserver from the proxy table with minimal logging. You may want to tune some CIS-related parameters on source and destination dataservers for better performance, such as "cis packet size", "cis bulk insert array size", "cis cursor rows" etc. Hope it helps Leonid Gvirtzhttp://www.gvirtz-consulting.com |
#7
| |||
| |||
|
|
SELECT INTO does look like the fastest method. But in our case, for instance, we have a table with approx. 150M rows. While SELECT INTO took roughly 3 hours, it's the subsequent build of clustered index that times out. Is there a way to boost that, maybe run something in parallel? Partitioning this table is not an open option at this time, since we are prohibited to change the model by the vendor. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:09a014e8-87d6-4421-98d2-e3c043ec4195 (AT) t2g2000yqn (DOT) googlegroups.com... On Nov 10, 10:11 pm, "Tester" <teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. Hi Assuming that you need to move the data between different dataservers, you can use CIS for this purpose. Create a proxy table in the destination database that points to your source table in the source dataserver. Then, you will be able to run SELECT INTO statement into a new table in the destination dataserver from the proxy table with minimal logging. You may want to tune some CIS-related parameters on source and destination dataservers for better performance, such as "cis packet size", "cis bulk insert array size", "cis cursor rows" etc. Hope it helps Leonid Gvirtzhttp://www.gvirtz-consulting.com |
#8
| |||
| |||
|
|
That's what I was thinking exactly, however, this seem to completely fill the logsegment in tempdb unless we make it the size of this enormous table we got. This happens during the SELECT INTO FROM src_table ORDER BY step. "Leonid Gvirtz" <lgvirtz (AT) yahoo (DOT) com> wrote in message news:670903cc-2e4d-47ae-85ff-8ed548a98113 (AT) d5g2000yqm (DOT) googlegroups.com... Hi Axel Did you try WITH SORTED_DATA option for creation of your clustered index? If not - then it may be the way. Add ORDER BY clause according to the clustered index fields to your SELECT INTO statement. Then, you will be able to skip the data copying during the clustered index creation by using WITH SORTED_DATA option since the data in the resulting table is already sorted. See this link for more details: http://infocenter.sybase.com/help/to...868.htm#X24774 Hope it helps Leonid Gvirtz http://www.gvirtz-consulting.com On Nov 18, 2:45 am, "Axel" <tes... (AT) tester (DOT) cc> wrote: SELECT INTO does look like the fastest method. But in our case, for instance, we have a table with approx. 150M rows. While SELECT INTO took roughly 3 hours, it's the subsequent build of clustered index that times out. Is there a way to boost that, maybe run something in parallel? Partitioning this table is not an open option at this time, since we are prohibited to change the model by the vendor. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:09a014e8-87d6-4421-98d2-e3c043ec4195 (AT) t2g2000yqn (DOT) googlegroups.com... On Nov 10, 10:11 pm, "Tester" <teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. Hi Assuming that you need to move the data between different dataservers, you can use CIS for this purpose. Create a proxy table in the destination database that points to your source table in the source dataserver. Then, you will be able to run SELECT INTO statement into a new table in the destination dataserver from the proxy table with minimal logging. You may want to tune some CIS-related parameters on source and destination dataservers for better performance, such as "cis packet size", "cis bulk insert array size", "cis cursor rows" etc. Hope it helps Leonid Gvirtzhttp://www.gvirtz-consulting.com |
#9
| |||
| |||
|
|
If the source table has an index identical to the desired clustered index for the new table, consider having the SELECT/INTO use said index (force it if necessary) from the source table. Then use the 'with sorted_data' option when creating the clustered index on the new table. Upside: should eliminate the overhead of sorting the table (either via select/into/order-by or create/index) Downside: SELECT could take longer to run and require more physical/logical IOs, especially if having to make 150 million hops from index leaf pages to data pages Obviously (?): - if the source table is APL with a clustered index, then insuring the SELECT uses said clustered index should be (relatively) quick *and* allow for the use of the 'with sorted_data' option - use of parallel/worker threads for the SELECT could cause the data to be scrambled durint the INSERT step, thus negating the ability to use the 'with sorted_data' option Axel wrote: That's what I was thinking exactly, however, this seem to completely fill the logsegment in tempdb unless we make it the size of this enormous table we got. This happens during the SELECT INTO FROM src_table ORDER BY step. "Leonid Gvirtz" <lgvirtz (AT) yahoo (DOT) com> wrote in message news:670903cc-2e4d-47ae-85ff-8ed548a98113 (AT) d5g2000yqm (DOT) googlegroups.com... Hi Axel Did you try WITH SORTED_DATA option for creation of your clustered index? If not - then it may be the way. Add ORDER BY clause according to the clustered index fields to your SELECT INTO statement. Then, you will be able to skip the data copying during the clustered index creation by using WITH SORTED_DATA option since the data in the resulting table is already sorted. See this link for more details: http://infocenter.sybase.com/help/to...868.htm#X24774 Hope it helps Leonid Gvirtz http://www.gvirtz-consulting.com On Nov 18, 2:45 am, "Axel" <tes... (AT) tester (DOT) cc> wrote: SELECT INTO does look like the fastest method. But in our case, for instance, we have a table with approx. 150M rows. While SELECT INTO took roughly 3 hours, it's the subsequent build of clustered index that times out. Is there a way to boost that, maybe run something in parallel? Partitioning this table is not an open option at this time, since we are prohibited to change the model by the vendor. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:09a014e8-87d6-4421-98d2-e3c043ec4195 (AT) t2g2000yqn (DOT) googlegroups.com... On Nov 10, 10:11 pm, "Tester" <teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. Hi Assuming that you need to move the data between different dataservers, you can use CIS for this purpose. Create a proxy table in the destination database that points to your source table in the source dataserver. Then, you will be able to run SELECT INTO statement into a new table in the destination dataserver from the proxy table with minimal logging. You may want to tune some CIS-related parameters on source and destination dataservers for better performance, such as "cis packet size", "cis bulk insert array size", "cis cursor rows" etc. Hope it helps Leonid Gvirtzhttp://www.gvirtz-consulting.com |
#10
| |||
| |||
|
|
Instead of: * * *if the source table is APL with a clustered index Should have been: * * *if the source table is APL with the same desired clustered index Mark A. Parsons wrote: If the source table has an index identical to the desired clustered index for the new table, consider having the SELECT/INTO use said index (force it if necessary) from the source table. Then use the 'with sorted_data' option when creating the clustered index on the new table. Upside: *should eliminate the overhead of sorting the table (either via select/into/order-by or create/index) Downside: *SELECT could take longer to run and require more physical/logical IOs, especially if having to make 150 million hops from index leaf pages to data pages Obviously (?): - if the source table is APL with a clustered index, then insuring the SELECT uses said clustered index should be (relatively) quick *and* allow for the use of the 'with sorted_data' option - use of parallel/worker threads for the SELECT could cause the data to be scrambled durint the INSERT step, thus negating the ability to use the 'with sorted_data' option Axel wrote: That's what I was thinking exactly, however, this seem to completely fill the logsegment in tempdb unless we make it the size of this enormous table we got. This happens during the SELECT INTO FROM src_table ORDER BY step. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:670903cc-2e4d-47ae-85ff-8ed548a98113 (AT) d5g2000yqm (DOT) googlegroups.com.... Hi Axel Did you try WITH SORTED_DATA option for creation of your clustered index? If not - then it may be the way. Add ORDER BY clause according to the clustered index fields to your SELECT INTO statement. Then, you will be able to skip the data copying during the clustered index creation by using WITH SORTED_DATA option since the data in the resulting table is already sorted. See this link for more details: http://infocenter.sybase.com/help/to...center.dc36272.... Hope it helps Leonid Gvirtz http://www.gvirtz-consulting.com On Nov 18, 2:45 am, "Axel" <tes... (AT) tester (DOT) cc> wrote: SELECT INTO does look like the fastest method. But in our case, for instance, we have a table with approx. 150M rows. While SELECT INTO took roughly 3 hours, it's the subsequent build of clustered index that times out. Is there a way to boost that, maybe run something in parallel? Partitioning this table is not an open option at this time, since we are prohibited to change the model by the vendor. "Leonid Gvirtz" <lgvi... (AT) yahoo (DOT) com> wrote in message news:09a014e8-87d6-4421-98d2-e3c043ec4195 (AT) t2g2000yqn (DOT) googlegroups.com.... On Nov 10, 10:11 pm, "Tester" <teste... (AT) tester (DOT) cc> wrote: Bypassing the disk. Something like a batch select into dest_db..dest_table from src_db..src_table or insert select. Anything that does not involve excessive logging and storing bcp output on disk. Hi Assuming that you need to move the data between different dataservers, you can use CIS for this purpose. Create a proxy table in the destination database that points to your source table in the source dataserver. Then, you will be able to run SELECT INTO statement into a new table in the destination dataserver from the proxy table with minimal logging. You may want to tune some CIS-related parameters on source and destination dataservers for better performance, such as "cis packet size", "cis bulk insert array size", "cis cursor rows" etc. Hope it helps Leonid Gvirtzhttp://www.gvirtz-consulting.com |
![]() |
| Thread Tools | |
| Display Modes | |
| |