dbTalk Databases Forums  

real data vs. db file

comp.databases.berkeley-db comp.databases.berkeley-db


Discuss real data vs. db file in the comp.databases.berkeley-db forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
AT
 
Posts: n/a

Default real data vs. db file - 05-29-2006 , 08:40 PM






I write a program that creates one database, and put lots of data into
it, then i calculate the real data and the pysical db files, i find
that the db files are so large.

the follows is the code:

int main(int argc, char **argv)
int rc;
DB_ENV* dbenv;
rc = db_env_create(&dbenv, 0);
dbenv->set_errfile(dbenv, stderr);

DBT key, data;
memset(&key, 0, sizeof(key));
memset(&data, 0, sizeof(data));

char sz_key[64];
char *sz_data;
int size;

unsigned int flags = DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL |
DB_THREAD;
int mode = 0;
rc = dbenv->open(dbenv, db_home, flags, mode);
if (0 == rc)
{
DB* pdb;
rc = db_create(&pdb, dbenv, 0);
flags = DB_CREATE | DB_EXCL | DB_THREAD;
mode = 0;
rc = pdb->open(pdb, NULL,
table_file_name, table_name,
DB_BTREE, flags, mode);

if (argc > 1)
{
size = atoi(argv[1]); // size is 16, 32, 64, ... , 1024,
2048, 4096
}
else
{
size = 1024;
}

sz_data = (char*)malloc(size);
memset(sz_data, 'k', size);
data.data = sz_data;
data.size = size;

for (int i=0; i<1024*128; i++)
{
sprintf(sz_key, "aaaa_key_%d", i);
key.data = sz_key;
key.size = strlen(sz_key) + 1;

rc = pdb->put(pdb, NULL, &key, &data, 0);
}

free(sz_data);
}

return 0;
}

for each data size from 16 to 4096, db files are too large. Let's
calculate the real data, average key size is 12, so real data is
(12+x)*1024*128 Bytes :
data size db files real data db files/real data
16 12M 3.5M 3.43
32 16M 5.5M 2.91
64 24M 9.5M 2.53
128 40M 17.5M 2.29
256 70M 33.5M 2.90
512 120M 65.5M 1.83
1024 522M 129.5M 4.03
2048 522M 257.5M 2.03
4096 1.1G 513.5M 2.14

All configurations are default, and the OS's page size is 4 KB.

The questiongs are: first, why db files/real data are so large? second,
look at the line when data size is 512 and 1024, the smallest and
biggest db files/real data. why?


Reply With Quote
  #2  
Old   
guenther (AT) gmail (DOT) com
 
Posts: n/a

Default Re: real data vs. db file - 05-29-2006 , 10:26 PM






likun.navipal (AT) gmail (DOT) com wrote:
....
Quote:
for (int i=0; i<1024*128; i++)
{
sprintf(sz_key, "aaaa_key_%d", i);
key.data = sz_key;
key.size = strlen(sz_key) + 1;
....
Let's calculate the real data, average key size is 12, so real data is
(12+x)*1024*128 Bytes :
Actually, the average key size is a bit over 15 bytes; of the 131072
keys, 90,000 are 15 bytes in length.


Quote:
data size db files real data db files/real data
16 12M 3.5M 3.43
32 16M 5.5M 2.91
64 24M 9.5M 2.53
128 40M 17.5M 2.29
256 70M 33.5M 2.90
512 120M 65.5M 1.83
1024 522M 129.5M 4.03
2048 522M 257.5M 2.03
4096 1.1G 513.5M 2.14
....
The questiongs are: first, why db files/real data are so large?
The DB format adds overhead in a number of places:
- space is needed to track the size of each key and value and the
'type' of each entry, and possibly a byte or two to align the key
and/or value (6 to 8 bytes per entry)
- B-trees are page based, so there will be some "internal
fragmentation" that occurs whenever a new page has to
allocated because the previous page didn't have enough free-space
for the next key/value pair. This is often expressed as the
'fill-factor', the precentage of total data space that's actually
holding information
- each page has a header that tracks various things, such as the
amount of free space on the page, the page-numbers of the pages
before and after it in the tree, and the LSNs for transaction
recovery (26 bytes)
- the B-tree has internal pages that track the locations of the pages
below them
- the first page of the B-tree holds metadata for the entire tree (1
page)


Also, your code inserts the keys out of sorted order (the key
"aaaa_key_13" sorts _before_ "aaaa_key_4") which will result more
internal fragmentation (a lower fill-factor) than if you inserted them
in sorted order.


Quote:
second, look at the line when data size is 512 and 1024, the smallest and
biggest db files/real data. why?
If a key or value would take up more than a quarter of the space on a
page, then it'll get placed on an 'overflow' page instead and a
reference to that page goes on the normal B-tree page.


Much of the above is described or discussed in the DB Reference Guide
in the DB source tree and at
http://www.sleepycat.com/docs/ref/toc.html

The db_stat utility can be used to get many statistics about a DB file,
such as the fill-factor and the number of overflow pages.


Philip Guenther



Reply With Quote
  #3  
Old   
AT
 
Posts: n/a

Default Re: real data vs. db file - 05-30-2006 , 03:30 AM



thank you very much!

guenther (AT) gmail (DOT) com 写道:

Quote:
likun.navipal (AT) gmail (DOT) com wrote:
...
for (int i=0; i<1024*128; i++)
{
sprintf(sz_key, "aaaa_key_%d", i);
key.data = sz_key;
key.size = strlen(sz_key) + 1;
...
Let's calculate the real data, average key size is 12, so real data is
(12+x)*1024*128 Bytes :

Actually, the average key size is a bit over 15 bytes; of the 131072
keys, 90,000 are 15 bytes in length.


data size db files real data db files/real data
16 12M 3.5M 3.43
32 16M 5.5M 2.91
64 24M 9.5M 2.53
128 40M 17.5M 2.29
256 70M 33.5M 2.90
512 120M 65.5M 1.83
1024 522M 129.5M 4.03
2048 522M 257.5M 2.03
4096 1.1G 513.5M 2.14
...
The questiongs are: first, why db files/real data are so large?

The DB format adds overhead in a number of places:
- space is needed to track the size of each key and value and the
'type' of each entry, and possibly a byte or two to align the key
and/or value (6 to 8 bytes per entry)
- B-trees are page based, so there will be some "internal
fragmentation" that occurs whenever a new page has to
allocated because the previous page didn't have enough free-space
for the next key/value pair. This is often expressed as the
'fill-factor', the precentage of total data space that's actually
holding information
- each page has a header that tracks various things, such as the
amount of free space on the page, the page-numbers of the pages
before and after it in the tree, and the LSNs for transaction
recovery (26 bytes)
- the B-tree has internal pages that track the locations of the pages
below them
- the first page of the B-tree holds metadata for the entire tree (1
page)


Also, your code inserts the keys out of sorted order (the key
"aaaa_key_13" sorts _before_ "aaaa_key_4") which will result more
internal fragmentation (a lower fill-factor) than if you inserted them
in sorted order.


second, look at the line when data size is 512 and 1024, the smallest and
biggest db files/real data. why?

If a key or value would take up more than a quarter of the space on a
page, then it'll get placed on an 'overflow' page instead and a
reference to that page goes on the normal B-tree page.


Much of the above is described or discussed in the DB Reference Guide
in the DB source tree and at
http://www.sleepycat.com/docs/ref/toc.html

The db_stat utility can be used to get many statistics about a DB file,
such as the fill-factor and the number of overflow pages.


Philip Guenther


Reply With Quote
  #4  
Old   
AT
 
Posts: n/a

Default Re: real data vs. db file - 05-31-2006 , 02:43 AM




but i still have some problems:

1. althouth it has some internal data structures in berkeley db, the db
file/real data be high to 4 is hard to understand.

2. you mentioned about "fill-factor", which means the precentage of
total data space that's actually holding information. can i control the
"fill-factor", e.g., set to 0.8?

3. you said that, If a key or value would take up more than a quarter
of the space on a
page, then it'll get placed on an 'overflow' page instead. why is a
quarter? is it a waste? how can i change it?

many thanks to you!

guenther (AT) gmail (DOT) com 写道:

Quote:
likun.navipal (AT) gmail (DOT) com wrote:
...
for (int i=0; i<1024*128; i++)
{
sprintf(sz_key, "aaaa_key_%d", i);
key.data = sz_key;
key.size = strlen(sz_key) + 1;
...
Let's calculate the real data, average key size is 12, so real data is
(12+x)*1024*128 Bytes :

Actually, the average key size is a bit over 15 bytes; of the 131072
keys, 90,000 are 15 bytes in length.


data size db files real data db files/real data
16 12M 3.5M 3.43
32 16M 5.5M 2.91
64 24M 9.5M 2.53
128 40M 17.5M 2.29
256 70M 33.5M 2.90
512 120M 65.5M 1.83
1024 522M 129.5M 4.03
2048 522M 257.5M 2.03
4096 1.1G 513.5M 2.14
...
The questiongs are: first, why db files/real data are so large?

The DB format adds overhead in a number of places:
- space is needed to track the size of each key and value and the
'type' of each entry, and possibly a byte or two to align the key
and/or value (6 to 8 bytes per entry)
- B-trees are page based, so there will be some "internal
fragmentation" that occurs whenever a new page has to
allocated because the previous page didn't have enough free-space
for the next key/value pair. This is often expressed as the
'fill-factor', the precentage of total data space that's actually
holding information
- each page has a header that tracks various things, such as the
amount of free space on the page, the page-numbers of the pages
before and after it in the tree, and the LSNs for transaction
recovery (26 bytes)
- the B-tree has internal pages that track the locations of the pages
below them
- the first page of the B-tree holds metadata for the entire tree (1
page)


Also, your code inserts the keys out of sorted order (the key
"aaaa_key_13" sorts _before_ "aaaa_key_4") which will result more
internal fragmentation (a lower fill-factor) than if you inserted them
in sorted order.


second, look at the line when data size is 512 and 1024, the smallest and
biggest db files/real data. why?

If a key or value would take up more than a quarter of the space on a
page, then it'll get placed on an 'overflow' page instead and a
reference to that page goes on the normal B-tree page.


Much of the above is described or discussed in the DB Reference Guide
in the DB source tree and at
http://www.sleepycat.com/docs/ref/toc.html

The db_stat utility can be used to get many statistics about a DB file,
such as the fill-factor and the number of overflow pages.


Philip Guenther


Reply With Quote
  #5  
Old   
Philip Guenther
 
Posts: n/a

Default Re: real data vs. db file - 05-31-2006 , 05:42 PM



likun.navipal (AT) gmail (DOT) com wrote:
Quote:
1. althouth it has some internal data structures in berkeley db, the db
file/real data be high to 4 is hard to understand.
What about it is hard to understand? Did you actually use db_stat and
do the math on what it returned to see how much overhead was caused by
each factor I mentioned?


Quote:
2. you mentioned about "fill-factor", which means the precentage of
total data space that's actually holding information. can i control the
"fill-factor", e.g., set to 0.8?
No. In B-trees the fill-factor is a consequence of the sizes of the
keys and values and the insertion order and not something you can
configure.


Quote:
3. you said that, If a key or value would take up more than a quarter
of the space on a
page, then it'll get placed on an 'overflow' page instead. why is a
quarter? is it a waste? how can i change it?
It's a quarter because the default for the bt_minkeys parameter is 2
and a key/value pair takes 2 slots. For information on how to change
that parameter and its effects, please read section 2.5.3 of the
reference guide.

As for whether that's a waste, that depends on what you're trying to
save. Space? Speed of searches? Speed of inserts and deletes?
Programming complexity? Since you haven't described the limits that
your application needs to run under nor your reasons for using Berkeley
DB to begin with, it's impossible to say whether the default
key-per-page value or even the factor of four growth is a good tradeoff
or a poor one. If the size of the data on disk is the overriding
concern than you should consider alternative storage formats.
Depending on your tradeoffs, you may be better off completely tossing
BDB and hand-coding the storage layer of your application. You're the
only one who knows what tradeoffs you're willing to make.


Philip Guenther



Reply With Quote
  #6  
Old   
AT
 
Posts: n/a

Default Re: real data vs. db file - 06-01-2006 , 09:36 AM



thans for your opinion.

I really used db_stat to try to find something, what i want to know is
why the db file/real data is from 1.3 to 4. The db_stat for the db
which db file/real data is 4 will put at the end.

I use berkeley db to store devices' historical data, which has some
characters:
1. every device will issuing data out continuously every 1 or several
seconds, and the device number may be large
2. several devices' data will be put into one berkeley db's database
3. normal access: write the data; query a device's data by time range;
nearly never change the data stored in berkeley db

The insert and query speed are met my requirement, but the space is not
now.

db_stat -e result for the database which has a data size of 1024

Thu Jun 1 22:38:39 2006 Local time
0x120897 Magic number
0 Panic value
4.4.16 Environment version
Thu Jun 1 16:20:26 2006 Creation time
0xd32ead0e Environment ID
1 Primary region allocation and reference count mutex [0/2 0% !Own]
1 References
Thread status blocks:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
272KB Mutex region size
0 The number of region locks that required waiting (0%)
4 Mutex alignment
1 Mutex test-and-set spins
2352 Mutex total count
2242 Mutex free count
110 Mutex in-use count
115 Mutex maximum in-use count
Mutex counts
2352 Unallocated
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
4 Last allocated locker ID
0x7fffffff Current maximum unused locker ID
5 Number of lock modes
1000 Maximum number of locks possible
1000 Maximum number of lockers possible
1000 Maximum number of lock objects possible
0 Number of current locks
3 Maximum number of locks at any one time
0 Number of current lockers
3 Maximum number of lockers at any one time
0 Number of current lock objects
3 Maximum number of lock objects at any one time
131082 Total number of locks requested
131082 Total number of locks released
1 Total number of locks upgraded
3 Total number of locks downgraded
0 Lock requests not available due to conflicts, for which we waited
0 Lock requests not available due to conflicts, for which we did not
wait
0 Number of deadlocks
0 Lock timeout value
0 Number of locks that have timed out
0 Transaction timeout value
0 Number of transactions that have timed out
344KB The size of the lock region
0 The number of region locks that required waiting (0%)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
257KB 868B Total cache size
1 Number of caches
264KB Pool individual cache size
0 Maximum memory-mapped file size
0 Maximum open file descriptors
0 Maximum sequential buffer writes
0 Sleep after writing maximum sequential buffers
0 Requested pages mapped into the process' address space
518367 Requested pages found in the cache (99%)
2 Requested pages not found in the cache
132395 Pages created in the cache
2 Pages read into the cache
132399 Pages written from the cache to the backing file
2 Clean pages forced from the cache
132330 Dirty pages forced from the cache
0 Dirty pages written by trickle-sync thread
65 Current total page count
65 Current clean page count
0 Current dirty page count
37 Number of hash buckets used for page location
650766 Total number of times hash chains searched for a page
5 The longest hash chain searched for a page
1439523 Total number of hash chain entries checked for page
0 The number of hash bucket locks that required waiting (0%)
0 The maximum number of times any hash bucket lock was waited for
0 The number of region locks that required waiting (0%)
132402 The number of page allocations
265370 The number of hash buckets examined during allocations
4 The maximum number of hash buckets examined for an allocation
132332 The number of pages examined during allocations
1 The max number of pages examined for an allocation
Pool File: aa_table_file.db
4096 Page size
0 Requested pages mapped into the process' address space
518367 Requested pages found in the cache (99%)
2 Requested pages not found in the cache
132395 Pages created in the cache
2 Pages read into the cache
132399 Pages written from the cache to the backing file


Philip Guenther 写道:

Quote:
likun.navipal (AT) gmail (DOT) com wrote:
1. althouth it has some internal data structures in berkeley db, the db
file/real data be high to 4 is hard to understand.

What about it is hard to understand? Did you actually use db_stat and
do the math on what it returned to see how much overhead was caused by
each factor I mentioned?


2. you mentioned about "fill-factor", which means the precentage of
total data space that's actually holding information. can i control the
"fill-factor", e.g., set to 0.8?

No. In B-trees the fill-factor is a consequence of the sizes of the
keys and values and the insertion order and not something you can
configure.


3. you said that, If a key or value would take up more than a quarter
of the space on a
page, then it'll get placed on an 'overflow' page instead. why is a
quarter? is it a waste? how can i change it?

It's a quarter because the default for the bt_minkeys parameter is 2
and a key/value pair takes 2 slots. For information on how to change
that parameter and its effects, please read section 2.5.3 of the
reference guide.

As for whether that's a waste, that depends on what you're trying to
save. Space? Speed of searches? Speed of inserts and deletes?
Programming complexity? Since you haven't described the limits that
your application needs to run under nor your reasons for using Berkeley
DB to begin with, it's impossible to say whether the default
key-per-page value or even the factor of four growth is a good tradeoff
or a poor one. If the size of the data on disk is the overriding
concern than you should consider alternative storage formats.
Depending on your tradeoffs, you may be better off completely tossing
BDB and hand-coding the storage layer of your application. You're the
only one who knows what tradeoffs you're willing to make.


Philip Guenther


Reply With Quote
  #7  
Old   
Philip Guenther
 
Posts: n/a

Default Re: real data vs. db file - 06-01-2006 , 05:01 PM



likun.navipal (AT) gmail (DOT) com wrote:
Quote:
I really used db_stat to try to find something, what i want to know is
why the db file/real data is from 1.3 to 4. The db_stat for the db
which db file/real data is 4 will put at the end.
Could you run that again, this time with the -d option and the name of
the .db file instead of the -e option? The -e option dumps the
environment statistics, while the -d options dumps the table statistics
for the given table.

Philip Guenther



Reply With Quote
  #8  
Old   
AT
 
Posts: n/a

Default Re: real data vs. db file - 06-01-2006 , 08:16 PM



ok

db_stat -d file-1024.db (data size is 1024), the result is :

Fri Jun 2 09:12:11 2006 Local time
53162 Btree magic number
9 Btree version number
Little-endian Byte order
multiple-databases Flags
2 Minimum keys per-page
4096 Underlying database page size
1 Number of levels in the tree
1 Number of unique keys in the tree
1 Number of data items in the tree
0 Number of tree internal pages
0 Number of bytes free in tree internal pages (0% ff)
1 Number of tree leaf pages
4046 Number of bytes free in tree leaf pages (1% ff)
0 Number of tree duplicate pages
0 Number of bytes free in tree duplicate pages (0% ff)
0 Number of tree overflow pages
0 Number of bytes free in tree overflow pages (0% ff)
0 Number of empty pages
0 Number of pages on the free list

and this is for the db_stat -d file-512.db (data size is 512) :

Fri Jun 2 09:19:06 2006 Local time
53162 Btree magic number
9 Btree version number
Little-endian Byte order
multiple-databases Flags
2 Minimum keys per-page
4096 Underlying database page size
1 Number of levels in the tree
1 Number of unique keys in the tree
1 Number of data items in the tree
0 Number of tree internal pages
0 Number of bytes free in tree internal pages (0% ff)
1 Number of tree leaf pages
4046 Number of bytes free in tree leaf pages (1% ff)
0 Number of tree duplicate pages
0 Number of bytes free in tree duplicate pages (0% ff)
0 Number of tree overflow pages
0 Number of bytes free in tree overflow pages (0% ff)
0 Number of empty pages
0 Number of pages on the free list

Philip Guenther 写道:

Quote:
likun.navipal (AT) gmail (DOT) com wrote:
I really used db_stat to try to find something, what i want to know is
why the db file/real data is from 1.3 to 4. The db_stat for the db
which db file/real data is 4 will put at the end.

Could you run that again, this time with the -d option and the name of
the .db file instead of the -e option? The -e option dumps the
environment statistics, while the -d options dumps the table statistics
for the given table.

Philip Guenther


Reply With Quote
  #9  
Old   
Philip Guenther
 
Posts: n/a

Default Re: real data vs. db file - 06-01-2006 , 09:56 PM



likun.navipal (AT) gmail (DOT) com wrote:
Quote:
db_stat -d file-1024.db (data size is 1024), the result is :
....
4096 Underlying database page size
1 Number of levels in the tree
1 Number of unique keys in the tree
1 Number of data items in the tree
Umm, you only have a single item in the table; of course it'll be
inefficient space-wise at that point: you don't even have enough data
to fill a single page! You should build a database of a size that
would/will be 'typical' and snag the stats on it before you start
worrying about fill-factors or overhead.


Philip Guenther



Reply With Quote
  #10  
Old   
AT
 
Posts: n/a

Default Re: real data vs. db file - 06-01-2006 , 10:27 PM



please take a look at my source code, 1024*128 items have been
inserted.

if there is only one item in the db, why the db file is so large?

Philip Guenther 写道:

Quote:
likun.navipal (AT) gmail (DOT) com wrote:
db_stat -d file-1024.db (data size is 1024), the result is :
...
4096 Underlying database page size
1 Number of levels in the tree
1 Number of unique keys in the tree
1 Number of data items in the tree

Umm, you only have a single item in the table; of course it'll be
inefficient space-wise at that point: you don't even have enough data
to fill a single page! You should build a database of a size that
would/will be 'typical' and snag the stats on it before you start
worrying about fill-factors or overhead.


Philip Guenther


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.