dbTalk Databases Forums  

updating 1.5 millions rows slow...

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss updating 1.5 millions rows slow... in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Jéjé
 
Posts: n/a

Default updating 1.5 millions rows slow... - 06-26-2005 , 10:04 PM






Hi,

I have a DTS package where I load 1.5 millions of rows in a staging table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.



Reply With Quote
  #2  
Old   
Adam Machanic
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-26-2005 , 10:12 PM






Have you tried not updating the entire table -- but only the modified rows?
(Is that possible in your situation?)


--ANSI-standard method:

UPDATE YourTable
SET YourCol =
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)
WHERE
YourTable.YourCol <>
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)



--T-SQL proprietary method:

UPDATE YourTable
SET YourTable.YourCol = OtherTable.SomeVal
FROM YourTable
JOIN OtherTable ON YourTable.ID = OtherTable.ID
WHERE
YourTable.YourCol <> OtherTable.SomeVal


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote

Quote:
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.





Reply With Quote
  #3  
Old   
Jéjé
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 07:14 AM



yes, but I have to update all rows,
I must update a "value" column (which is null by default) by doing a lookup
in another table.
I can't join the lookup table during the first bulk insert because the big
table come from 1 system and the lookup table from another.
But finally, the update appear to be too slow.

"Adam Machanic" <amachanic (AT) hotmail (DOT) _removetoemail_.com> wrote

Quote:
Have you tried not updating the entire table -- but only the modified
rows?
(Is that possible in your situation?)


--ANSI-standard method:

UPDATE YourTable
SET YourCol =
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)
WHERE
YourTable.YourCol
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)



--T-SQL proprietary method:

UPDATE YourTable
SET YourTable.YourCol = OtherTable.SomeVal
FROM YourTable
JOIN OtherTable ON YourTable.ID = OtherTable.ID
WHERE
YourTable.YourCol <> OtherTable.SomeVal


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote in message
news:em3$qTseFHA.3028 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging
table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce
the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.







Reply With Quote
  #4  
Old   
Rick Gittins
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 11:30 AM



Have you tried to create indexes in your staging table on the field you are
using to link to your reference table after the data is imported, but before
you start the updates?

Rick

"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote

Quote:
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.





Reply With Quote
  #5  
Old   
Adam Machanic
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 11:48 AM



Is the column you're updating clustered and/or heavily indexed? Can you
drop some of the indexes before the update, if that's the case? Or if it's
clustered, can you cluster on something else?


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart_A_ (AT) hotmail_A_ (DOT) com> wrote

Quote:
yes, but I have to update all rows,
I must update a "value" column (which is null by default) by doing a
lookup
in another table.
I can't join the lookup table during the first bulk insert because the big
table come from 1 system and the lookup table from another.
But finally, the update appear to be too slow.

"Adam Machanic" <amachanic (AT) hotmail (DOT) _removetoemail_.com> wrote in message
news:%23c74YYseFHA.4040 (AT) TK2MSFTNGP14 (DOT) phx.gbl...
Have you tried not updating the entire table -- but only the modified
rows?
(Is that possible in your situation?)


--ANSI-standard method:

UPDATE YourTable
SET YourCol =
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)
WHERE
YourTable.YourCol
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)



--T-SQL proprietary method:

UPDATE YourTable
SET YourTable.YourCol = OtherTable.SomeVal
FROM YourTable
JOIN OtherTable ON YourTable.ID = OtherTable.ID
WHERE
YourTable.YourCol <> OtherTable.SomeVal


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote in message
news:em3$qTseFHA.3028 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging
table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in
a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce
the
process time.

during the update the processors are not used, or just a little (25% of
1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.









Reply With Quote
  #6  
Old   
JT
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 01:54 PM



I believe that transaction logging what you are fighting against. Here is my
reply to similar post in *sqlserver.programmer last week.

Performance Issues for Huge Data import/insert
http://groups-beta.google.com/group/...43a0e83b096523

"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote

Quote:
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.





Reply With Quote
  #7  
Old   
Adam Machanic
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 03:56 PM



JT,

I agree with your post, except the part about setting autogrow to 10%! That
can be very problematic. A better option is to manually grow the database
large enough to begin with, and not worry about shrinking it back down
(which can cause fragmentation) -- disk space is cheap. The problem with
percentage autogrow is seen in larger databases. What happens if your
database is 100 GB? How long will a 10 GB autogrow take? How about if
you're working on a 1 TB+ database? At least on my somewhat underpowered
disk systems, I don't want to wait on a 10 GB grow in the middle of a
transaction, let alone 100 GB! I have mine set to grow 10 MB -- and I have
a monitor set up to alert me if the database ever gets within 20% full.
I'll go in and manually grow it at that point... I don't ever want to see
auto-growth during a production transaction. The 10 MB is a failsafe --
should I hit the wall (or miss the alert), the database won't crash, and
10MB is small enough that it won't cause any timeouts or annoy users.


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"JT" <someone (AT) microsoft (DOT) com> wrote

Quote:
I believe that transaction logging what you are fighting against. Here is
my
reply to similar post in *sqlserver.programmer last week.

Performance Issues for Huge Data import/insert

http://groups-beta.google.com/group/...43a0e83b096523

"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote in message
news:em3$qTseFHA.3028 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging
table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce
the
process time.

during the update the processors are not used, or just a little (25% of
1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.







Reply With Quote
  #8  
Old   
JT
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 04:13 PM



My post about autogrowing was in reply to someone who was inserting 6GB of
data. If the data being imported has a lot of varchar columns, then the
density of the populated values can have a significant impact on the total
space consumed (50% - >200% larger), and sometimes it is up to the user
about how densly populated their non-required columns are. I try to set the
filesize to something that can handle the expected record volume, but leave
autogrow just in case to prevent the process aborting due to lack of
available space. One large 1GB growth on a 10GB database can take less time
overall than 100 smaller 10mb increments.

"Adam Machanic" <amachanic (AT) hotmail (DOT) _removetoemail_.com> wrote

Quote:
JT,

I agree with your post, except the part about setting autogrow to 10%!
That
can be very problematic. A better option is to manually grow the database
large enough to begin with, and not worry about shrinking it back down
(which can cause fragmentation) -- disk space is cheap. The problem with
percentage autogrow is seen in larger databases. What happens if your
database is 100 GB? How long will a 10 GB autogrow take? How about if
you're working on a 1 TB+ database? At least on my somewhat underpowered
disk systems, I don't want to wait on a 10 GB grow in the middle of a
transaction, let alone 100 GB! I have mine set to grow 10 MB -- and I
have
a monitor set up to alert me if the database ever gets within 20% full.
I'll go in and manually grow it at that point... I don't ever want to see
auto-growth during a production transaction. The 10 MB is a failsafe --
should I hit the wall (or miss the alert), the database won't crash, and
10MB is small enough that it won't cause any timeouts or annoy users.


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"JT" <someone (AT) microsoft (DOT) com> wrote in message
news:eFFGAo0eFHA.3880 (AT) tk2msftngp13 (DOT) phx.gbl...
I believe that transaction logging what you are fighting against. Here
is
my
reply to similar post in *sqlserver.programmer last week.

Performance Issues for Huge Data import/insert


http://groups-beta.google.com/group/...43a0e83b096523

"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote in message
news:em3$qTseFHA.3028 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging
table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in
a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce
the
process time.

during the update the processors are not used, or just a little (25%
of
1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.









Reply With Quote
  #9  
Old   
David Browne
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 05:44 PM




"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote

Quote:
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce the
process time.

during the update the processors are not used, or just a little (25% of 1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?
A couple of things. First, try to get an in-place update by using a
fixed-width data type and/or preloading dummy values into the staging table.
An update which expands all the rows is much more expensive than one which
does not.

Second, you are on the right track with replacing update with insert. Just
use a insert-select-join to load a second staging table with the updated
values.


David




Reply With Quote
  #10  
Old   
Jéjé
 
Posts: n/a

Default Re: updating 1.5 millions rows slow... - 06-27-2005 , 05:44 PM



there is no index in this table

adding some don't provide any performance improvment, The system do a full
scan of the table.

I work with the sysadmin, I think there is an hardware issue, I think my
file is not on a dedicated Raid 0 like requested, but on a shared raid 5!!!

"Adam Machanic" <amachanic (AT) hotmail (DOT) _removetoemail_.com> wrote

Quote:
Is the column you're updating clustered and/or heavily indexed? Can you
drop some of the indexes before the update, if that's the case? Or if
it's
clustered, can you cluster on something else?


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart_A_ (AT) hotmail_A_ (DOT) com> wrote in message
news:uho%23lGxeFHA.2984 (AT) TK2MSFTNGP15 (DOT) phx.gbl...
yes, but I have to update all rows,
I must update a "value" column (which is null by default) by doing a
lookup
in another table.
I can't join the lookup table during the first bulk insert because the
big
table come from 1 system and the lookup table from another.
But finally, the update appear to be too slow.

"Adam Machanic" <amachanic (AT) hotmail (DOT) _removetoemail_.com> wrote in message
news:%23c74YYseFHA.4040 (AT) TK2MSFTNGP14 (DOT) phx.gbl...
Have you tried not updating the entire table -- but only the modified
rows?
(Is that possible in your situation?)


--ANSI-standard method:

UPDATE YourTable
SET YourCol =
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)
WHERE
YourTable.YourCol
(
SELECT SomeVal
FROM OtherTable
WHERE YourTable.ID = OtherTable.ID
)



--T-SQL proprietary method:

UPDATE YourTable
SET YourTable.YourCol = OtherTable.SomeVal
FROM YourTable
JOIN OtherTable ON YourTable.ID = OtherTable.ID
WHERE
YourTable.YourCol <> OtherTable.SomeVal


--
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--


"Jéjé" <willgart (AT) BBBhotmailAAA (DOT) com> wrote in message
news:em3$qTseFHA.3028 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Hi,

I have a DTS package where I load 1.5 millions of rows in a staging
table,
and I perform multiple updates.
1 of these updates update the entire table by doing a simple lookup in
a
small table.
this update takes 30 minutes !

my file is on a Raid 0

I have tried to update by batch (set rowcount option), but this reduce
the
process time.

during the update the processors are not used, or just a little (25%
of
1
proc on 4 available)
the system appear to wait for the disk pageiolatch wait type.

what can I do to improove this?

if there is no solution, I'll load the table a second time in another
temporary table, because a bulk insert takes only 5 minutes! (which is
bettre then 30 minutes)

thanks

jerome.











Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.