dbTalk Databases Forums  

question about btree's locality of reference

comp.databases.berkeley-db comp.databases.berkeley-db


Discuss question about btree's locality of reference in the comp.databases.berkeley-db forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
rong.xian@gmail.com
 
Posts: n/a

Default question about btree's locality of reference - 01-19-2006 , 10:02 AM






Hello, guys!
Things seems to quite illogical when I'm using berkeley db(4.3.28) to
store some big key/item pairs. Each key is an 4-byte integer and items
are some binary blocks around 40K to 60K Bytes.
I choose BTree as the accessing method. Pagesize was set to be 64KB and
cachesize was set to be 10MB.
As bdb's doc suggests, btree will provide a performance bonus because
the locality of reference.
So I guess the if I use a cursor to traverse the whole db from the
smallest key will minimize the I/O times.
However, things were not like what I imagined.
There are totallly 2000+ key/items in a database, each key/items is
around 40K~60K bytes, if the locality of reference of btree really
works, I/O read times will far less than 2000.(right?)
But in fact it still cost me 2000+ I/O read to finish going throw the
whole db.
I traced into the internal Dbc->get() method, and I found each
get(..,..,DB_NEXT) will cause an I/O read (65536 bytes,one page each
time).Because the key/item pair I searched is on some
'offpage'(overflow page)?
Does that mean when I finish get the smallest key, the second smallest
key was not loaded into cache?
Then does that mean btree's locality of reference doesn't work in that
case?
why?
Should I divide the key/item into some smaller block to enjoy locality
of reference?
thx a lot!


Reply With Quote
  #2  
Old   
Chris Newcombe
 
Posts: n/a

Default Re: question about btree's locality of reference - 01-19-2006 , 11:00 AM







rong.xian (AT) gmail (DOT) com wrote:
Quote:
Each key is an 4-byte integer and items
are some binary blocks around 40K to 60K Bytes.
I choose BTree as the accessing method. Pagesize was set to be 64KB
....
Because the key/item pair I searched is on some 'offpage'(overflow page)?
Does that mean when I finish get the smallest key, the second smallest
key was not loaded into cache?
Yes. See http://www.sleepycat.com/docs/ref/am...bt_minkey.html

"This size is a function of the page size and the fact that at
least two key/data pairs must fit on any Btree page. Whenever key or
data items exceed the calculated size, they are stored on overflow
pages instead of in the standard Btree leaf pages."

IIRC, the default configuration is to move a record to an overflow page
if it is > 25% of the page size.

Quote:
Then does that mean btree's locality of reference doesn't work in that
case? why?
Because BDB is page based, and with your record sizes, each key is on a
different page.

Quote:
Should I divide the key/item into some smaller block to enjoy locality
of reference?
Yes. Or consider compressing them (e.g. with zlib) before storing
them. It is often the case that the cpu overhead of
compression/decompression is more than outweighed by the reduced IO
(better cache utilization).

Also, I believe that Sleepycat are considering allowing signficantly
larger pages than the current 64KB limit in a future release, which
would also solve your problem (without changing your record size). If
you are interested in that, I would suggest sending an email to
support (AT) sleepycat (DOT) com to request that feature (to make sure it stays on
the priority list).

regards,

Chris



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.