dbTalk Databases Forums  

sequential disk read speed


Discuss sequential disk read speed in the comp.databases.theory forum.

  #81  
David BL

Re: sequential disk read speed - 08-27-2008 , 08:46 PM

On Aug 24, 12:39 pm, "Brian Selzer" <br... (AT) selzer-software (DOT) com> wrote:
Quote:
If you have a 100GB database and you put it on a single
100GB disk drive, your best average seek time is the average seek time of
the disk drive, but if you put the database on four 100GB disk drives,
the best average seek time will only be a fraction of the seek time of the
single disk. Suppose that the full-stroke seek time on the 100GB disk is
7ms and the track-to-track seek time is 1ms. Well, with four disks, instead
of an average 4ms seek time, the individual seek time of each disk is
reduced to roughly 2.5ms

Is this because less of the disk is actually being used, so on a given
platter the head doesn't have such a large range of tracks to move
over?

Quote:
, and since there are four disks, the average seek
time for the disk subsystem is reduced to a quarter of that or roughly
.625ms.

In order for the effective seek time to be reduced to a quarter, the
seeking must be independent. To achieve that, I think the striping
would need to be very coarse (e.g. 512KB or 1MB).
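
To make the stripe-size point concrete, here is a minimal sketch (not from the thread) that counts how many disks a single read touches. It assumes plain round-robin striping, as in RAID 0. The function name `disks_touched` and the 64KB read size are hypothetical, chosen only for illustration.

```python
def disks_touched(offset: int, length: int, stripe_size: int, n_disks: int) -> int:
    """Number of distinct disks one read touches under round-robin striping (assumed layout)."""
    if length <= 0:
        return 0
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    # Consecutive stripes land on consecutive disks, so the count is capped at n_disks.
    return min(n_disks, last_stripe - first_stripe + 1)


KB = 1024
read_size = 64 * KB  # hypothetical read size

fine = disks_touched(0, read_size, 16 * KB, 4)      # fine stripe: 16KB
coarse = disks_touched(0, read_size, 1024 * KB, 4)  # coarse stripe: 1MB
print(fine, coarse)
```

Under these assumptions the 16KB stripe makes all four disks seek for the one read, while the 1MB stripe usually involves a single disk and leaves the other three free to serve other requests.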





  #83  
Brian Selzer

Re: sequential disk read speed - 08-27-2008 , 09:47 PM

"David BL" <davidbl (AT) iinet (DOT) net.au> wrote

Quote:
On Aug 24, 12:39 pm, "Brian Selzer" <br... (AT) selzer-software (DOT) com> wrote:

If you have a 100GB database and you put it on a single
100GB disk drive, your best average seek time is the average seek time of
the disk drive, but if you put the database on four 100GB disk drives,
the best average seek time will only be a fraction of the seek time of
the single disk. Suppose that the full-stroke seek time on the 100GB disk is
7ms and the track-to-track seek time is 1ms. Well, with four disks,
instead of an average 4ms seek time, the individual seek time of each disk is
reduced to roughly 2.5ms

Is this because less of the disk is actually being used, so on a given
platter the head doesn't have such a large range of tracks to move
over?

Yes. And the outer tracks of the platter generally hold more data than the
inner ones, so it generally takes fewer tracks to store the same information
there as opposed to near the center; consequently, simply dividing the
difference between the full-stroke seek and the track-to-track seek by four is
perhaps an overly conservative method of estimation. I want to stress that
this is not just a harebrained theory of mine: I've had significant success
using this mechanism to boost performance. In one application, by
installing a disk that was seven times the size required and creating a
partition on the outer edge of the disk, performance improved by over 6000%:
batch processes that had been taking over 25 hours to complete were
finishing in under 25 minutes.
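
The estimate discussed above can be reproduced with a few lines of arithmetic. This is a sketch of one reading of the numbers in the thread, not a drive model. It assumes seek time grows linearly with the span of tracks in use (real drives only approximate this) and that all disks seek independently with enough requests outstanding to keep them busy. The function name `estimate_seek_times` is hypothetical.

```python
def estimate_seek_times(full_stroke_ms: float, track_to_track_ms: float, n_disks: int):
    """Return (single-disk average, per-disk estimate, subsystem estimate) in ms."""
    # Midpoint of the shortest and longest seek on a fully used disk.
    single_disk_avg = (full_stroke_ms + track_to_track_ms) / 2
    # Each disk holds 1/n of the data, so only 1/n of the stroke is in use
    # (linear model, an assumption).
    per_disk = track_to_track_ms + (full_stroke_ms - track_to_track_ms) / n_disks
    # n disks seeking independently: average seek cost per request drops by n.
    subsystem = per_disk / n_disks
    return single_disk_avg, per_disk, subsystem


single_avg, per_disk, subsystem = estimate_seek_times(7.0, 1.0, 4)
print(single_avg, per_disk, subsystem)  # 4.0 2.5 0.625

# The batch example: 25 hours down to 25 minutes.
speedup = (25 * 60) / 25
print(speedup)  # 60.0
```

The last figure is the ratio of run times: going from 25 hours to 25 minutes is a 60-fold reduction.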

Quote:
, and since there are four disks, the average seek
time for the disk subsystem is reduced to a quarter of that or roughly
.625ms.

In order for the effective seek time to be reduced to a quarter, the
seeking must be independent. To achieve that, I think the striping
would need to be very coarse (e.g. 512KB or 1MB).

Drives that support disconnection or some other command queueing mechanism
are all that is needed for seeking to be independent.

I think using a coarse stripe is counterproductive. There would be a bigger
chance that a seek would be required in the middle of a read. Consider:
if 3.5 stripes fit on a track in one zone of the disk, then on average every
fourth read would require an additional seek to get the remaining half
stripe. If, on the other hand, 28 stripes fit on a track, then no additional
seeks would be necessary. Even if it were 28.5 stripes instead of 28, one
additional seek for every 29 reads is a whole lot better than one for every
4.
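
How often a read crosses a track boundary depends on how reads line up with stripes, which the post does not spell out. The sketch below (an illustration, not the poster's method) computes the crossing rate under two assumed models: stripes laid end to end with each read covering exactly one stripe, and reads of one stripe length starting at a uniformly random offset. Both function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def aligned_straddle_fraction(stripes_per_track, n_stripes: int) -> Fraction:
    """Fraction of stripes crossing a track boundary when stripes are laid end to end."""
    spt = Fraction(stripes_per_track)
    crossing = 0
    for k in range(n_stripes):
        # Stripe k covers [k, k + 1) in stripe units; tracks are spt units long.
        first_track = floor(Fraction(k) / spt)
        last_track = ceil(Fraction(k + 1) / spt) - 1
        if last_track != first_track:
            crossing += 1
    return Fraction(crossing, n_stripes)


def unaligned_crossing_probability(stripes_per_track) -> Fraction:
    """Chance that a one-stripe read at a uniformly random offset crosses a boundary."""
    return min(Fraction(1), Fraction(1) / Fraction(stripes_per_track))


# 1596 is a multiple of both repeat periods (7 and 57 stripes).
for spt in (Fraction(7, 2), Fraction(28), Fraction(57, 2)):
    print(spt, aligned_straddle_fraction(spt, 1596), unaligned_crossing_probability(spt))
```

Under the aligned model the rates come out as 1 in 7 for 3.5 stripes per track, zero for 28, and 1 in 57 for 28.5. Under the unaligned model they are 2 in 7, 1 in 28, and 2 in 57, which is closer to the "every fourth" and "every 29" figures quoted above. In both models the finer stripe crosses track boundaries far less often.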



