dbTalk Databases Forums  

How do I optimize loading into DB ?

comp.databases.berkeley-db comp.databases.berkeley-db


Discuss How do I optimize loading into DB ? in the comp.databases.berkeley-db forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Radzi
 
Posts: n/a

Default How do I optimize loading into DB ? - 11-04-2006 , 06:03 PM






I've just starting to use BDB and was trying to find out how to
optimize loading data into db. Below is the code i used to load data;
but this one could not beat my SQLite version. I had 6million records
to load. In SQLite it took 4~5 mins yet in BDB it took me 12~15 mins.
Could someone give me some hints to optimize this.

Thanks.

Radzi.


Db master(NULL, 0);
master.open(NULL, "bdbMaster.db", NULL, DB_BTREE, DB_CREATE, 0);
Db detail(NULL, 0);
detail.set_flags(DB_DUP);
detail.open(NULL, "bdbDetail.db", NULL, DB_BTREE, DB_CREATE, 0);

std::ifstream inputFile(asciiName.c_str());
unsigned long long offset = 0;
union {
char buf[4096];
Taiwan2004::OPD r;
} opdData;
FactTableRecord factRec;
long id = 0;
char key[2048], prevKey[2048];
memset(prevKey, 0, 2048);
while (inputFile.good() && inputFile.peek() != EOF) {
offset = inputFile.tellg();
inputFile.getline(opdData.buf, sizeof(opdData.buf));
strcpy(key, GetKey(opdData.r).c_str());
if (std::strcmp(key, prevKey) != 0) {
nCons++;
Dbt data(key, std::strlen(key));
Dbt key(&nCons, sizeof(nCons));
master.put(NULL, &key, &data, 0);
std::strcpy(prevKey, key);
}
nDet++;
FactTableDetailRecord dr;
AssignValues(dr, opdData.r);
Dbt data(&dr, sizeof(dr));
Dbt key(&nCons, sizeof(nCons));
detail.put(NULL, &key, &data, 0);
}
master.close(0);
detail.close(0);


Reply With Quote
  #2  
Old   
Radzi
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-07-2006 , 07:15 AM






Hi,
I've done some couple of tests. And it seems that SQLite is faster than
BDB. How can it be? I still believe that BDB *should* be faster than
SQLite. Or am I wrong?

regards,
Radzi.


Radzi wrote:
Quote:
I've just starting to use BDB and was trying to find out how to
optimize loading data into db. Below is the code i used to load data;
but this one could not beat my SQLite version. I had 6million records
to load. In SQLite it took 4~5 mins yet in BDB it took me 12~15 mins.
Could someone give me some hints to optimize this.

Thanks.

Radzi.


Db master(NULL, 0);
master.open(NULL, "bdbMaster.db", NULL, DB_BTREE, DB_CREATE, 0);
Db detail(NULL, 0);
detail.set_flags(DB_DUP);
detail.open(NULL, "bdbDetail.db", NULL, DB_BTREE, DB_CREATE, 0);

std::ifstream inputFile(asciiName.c_str());
unsigned long long offset = 0;
union {
char buf[4096];
Taiwan2004::OPD r;
} opdData;
FactTableRecord factRec;
long id = 0;
char key[2048], prevKey[2048];
memset(prevKey, 0, 2048);
while (inputFile.good() && inputFile.peek() != EOF) {
offset = inputFile.tellg();
inputFile.getline(opdData.buf, sizeof(opdData.buf));
strcpy(key, GetKey(opdData.r).c_str());
if (std::strcmp(key, prevKey) != 0) {
nCons++;
Dbt data(key, std::strlen(key));
Dbt key(&nCons, sizeof(nCons));
master.put(NULL, &key, &data, 0);
std::strcpy(prevKey, key);
}
nDet++;
FactTableDetailRecord dr;
AssignValues(dr, opdData.r);
Dbt data(&dr, sizeof(dr));
Dbt key(&nCons, sizeof(nCons));
detail.put(NULL, &key, &data, 0);
}
master.close(0);
detail.close(0);


Reply With Quote
  #3  
Old   
Klaas
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-07-2006 , 01:26 PM




Radzi wrote:
Quote:
Hi,
I've done some couple of tests. And it seems that SQLite is faster than
BDB. How can it be? I still believe that BDB *should* be faster than
SQLite. Or am I wrong?
Yes, but you often have to be careful to get the best performance. Are
the records sorted?

-Mike



Reply With Quote
  #4  
Old   
Radzi
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-07-2006 , 09:23 PM




Klaas wrote:
Quote:
Radzi wrote:
Hi,
I've done some couple of tests. And it seems that SQLite is faster than
BDB. How can it be? I still believe that BDB *should* be faster than
SQLite. Or am I wrong?

Yes, but you often have to be careful to get the best performance. Are
the records sorted?

-Mike

No. It was not sorted, since the records has to be read in sequence. I
was using the same input file for both. In SQLite I used Prepare and
bracket all insert in a Begin-Commit transaction. When I display record
no processed; I saw SQLite running at the same speed for all records;
whereas BDB started to slow down after 1 million records...


regards,

Radzi.



Reply With Quote
  #5  
Old   
Alex
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-07-2006 , 10:41 PM




Radzi wrote:
Quote:
No. It was not sorted, since the records has to be read in sequence. I
was using the same input file for both. In SQLite I used Prepare and
bracket all insert in a Begin-Commit transaction. When I display record
no processed; I saw SQLite running at the same speed for all records;
whereas BDB started to slow down after 1 million records...
Which access method are you using? Btree? Hash?

Have you configured a large enough cache?
http://www.oracle.com/technology/doc...cachesize.html

It might help to tune the page size, so that an optimal number of
records can fit on each page.
http://www.oracle.com/technology/doc.../pagesize.html

Are you using transactions? If so are you committing each record or
grouping the commits together?

- Alex



Reply With Quote
  #6  
Old   
Radzi
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-08-2006 , 02:44 AM



Hi,

Alex wrote:
Quote:
Radzi wrote:

No. It was not sorted, since the records has to be read in sequence. I
was using the same input file for both. In SQLite I used Prepare and
bracket all insert in a Begin-Commit transaction. When I display record
no processed; I saw SQLite running at the same speed for all records;
whereas BDB started to slow down after 1 million records...

Which access method are you using? Btree? Hash?
I was using Btree.

Quote:
Have you configured a large enough cache?
No. Just accept the default one. Similarly with the pagesize.

Quote:
http://www.oracle.com/technology/doc...cachesize.html

It might help to tune the page size, so that an optimal number of
records can fit on each page.
http://www.oracle.com/technology/doc.../pagesize.html

Are you using transactions? If so are you committing each record or
grouping the commits together?

Not using transaction with BDB...


Quote:
- Alex


Reply With Quote
  #7  
Old   
Alex
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-08-2006 , 02:43 PM




Quote:
Have you configured a large enough cache?

No. Just accept the default one. Similarly with the pagesize.

Does increasing the cache size solve the performance problems you were
seeing?



Reply With Quote
  #8  
Old   
Klaas
 
Posts: n/a

Default Re: How do I optimize loading into DB ? - 11-09-2006 , 01:09 PM




Radzi wrote:

Quote:
No. It was not sorted, since the records has to be read in sequence. I
was using the same input file for both. In SQLite I used Prepare and
bracket all insert in a Begin-Commit transaction. When I display record
no processed; I saw SQLite running at the same speed for all records;
whereas BDB started to slow down after 1 million records...
Radzi, I have had similar experiences with BDB. Cache makes a huge
difference--it essentially allows a certain portion of the records to
be inserted as fast as if they were sorted. Unfortunately, I was never
able to maintain that rate--perhaps it is a problem with the cache
flushing policy.

One strategy that I have found quite effective is the following: build
databases in chunks as large as you can at full speed (probably not as
large as 1 million records). Next, open cursors over all the chunks,
and insert the records into a new database in sorted order (the more
efficient means of doing this is maintain the current records from each
database in a heap).

I'm able to build arbitrarily-large databases in this manner.

-Mike



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.