
Database logs are huge.

comp.databases.berkeley-db


#1
christopher james lahey
Database logs are huge. - 06-20-2006, 12:15 PM

I work on a Python-based video downloader/player, and I'm replacing a
pickle-based on-disk database with a Berkeley DB database that stores
one pickle per object (so that I can update each object without having
to rewrite the whole database).

So, I started by creating my database without using any of the
transaction stuff, but then I had some corruption problems. So I
turned on transactions and logging and such, and now it seems to be
proof against crashing, but the log files get very large very fast. As
in 450MB within an hour or two of actively using the program.

The question is, what am I doing wrong and how do I fix it? Creating
such a huge database is not acceptable, and neither is losing data.
Can berkeley-db fill my needs in this case?

So, how do I currently create the database? I create the dbenv and I
call set_flags with DB_AUTO_COMMIT | DB_TXN_NOSYNC. Then I open the
environment with DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER
| DB_CREATE. After that, I open the database with no flags. If that
fails, I open the database with flags=DB_CREATE and dbtype=DB_HASH.

I use a transaction the first time I create the database, but after
that I depend on DB_AUTO_COMMIT. I use a cursor when loading the
database, but close it when done.

My behavior during use consists of lots and lots of writing of the same
key with updating values.

Finally, I close the db and the dbenv on program shutdown.
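
The setup described in this post can be sketched roughly as follows. This is a minimal sketch, not the poster's actual code: the function name `open_store`, the file name `store.db`, and passing the `db` binding module in as an argument are assumptions made for illustration (with pybsddb it would typically be `from bsddb3 import db`). It also collapses the "open, then create on failure" sequence into a single open with DB_CREATE.

```python
def open_store(db, home, filename="store.db"):
    """Open a transactional environment and one hash database inside it.

    `db` is the Berkeley DB binding module (for example bsddb3.db); it is
    passed in so the sketch does not assume which binding is installed.
    """
    env = db.DBEnv()
    # Wrap each write that has no explicit transaction in its own one, and
    # skip the log flush at commit (a crash may undo the latest commits).
    env.set_flags(db.DB_AUTO_COMMIT | db.DB_TXN_NOSYNC, 1)
    env.open(
        home,
        db.DB_CREATE | db.DB_RECOVER | db.DB_INIT_LOG
        | db.DB_INIT_MPOOL | db.DB_INIT_TXN,
    )
    store = db.DB(env)
    # Assumption: one open call with DB_CREATE stands in for the
    # two-step open described in the post.
    store.open(filename, dbtype=db.DB_HASH, flags=db.DB_CREATE)
    return env, store
```

On shutdown the handles would be closed in reverse order: `store.close()` first, then `env.close()`.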



#2
Florian Weimer
Re: Database logs are huge. - 06-20-2006, 01:38 PM

* christopher james lahey:

Quote:
> The question is, what am I doing wrong and how do I fix it? Creating
> such a huge database is not acceptable, and neither is losing data.
> Can berkeley-db fill my needs in this case?
If losing data is not acceptable, you really should archive those log
files. 8-)

Quote:
> My behavior during use consists of lots and lots of writing of the same
> key with updating values.
You need to run checkpoints periodically. After that, some log files
can be safely removed (if necessary after archival). See:

<http://www.sleepycat.com/docs/ref/transapp/checkpoint.html>
<http://www.sleepycat.com/docs/ref/transapp/archival.html>


#3
christopher james lahey
Re: Database logs are huge. - 06-20-2006, 03:18 PM

Thanks for the pointers. This helps tremendously, but I wanted to
confirm my solution below, if you don't mind checking it out.

Florian Weimer wrote:
Quote:
> If losing data is not acceptable, you really should archive those log
> files. 8-)
Well, it's not acceptable to lose data because of a crash or a machine
losing power. Dealing with disk corruption is beyond the scope that I
have to deal with. It's all about user expectations. The user can
copy all the files of the database while the app is not running and
that will suffice.

Quote:
> You need to run checkpoints periodically. After that, some log files
> can be safely removed (if necessary after archival). See:
>
> http://www.sleepycat.com/docs/ref/tr...heckpoint.html
> http://www.sleepycat.com/docs/ref/tr.../archival.html
So, I've coded this up and all looks good, but there is much talk of
catastrophic failure. I'm pretty sure this isn't a big deal as in the
case of catastrophic failure, the end user wouldn't have the tools to
fix the problem anyway. Does catastrophic failure mean disk
corruption? The sequence I've coded is stop database transactions,
dbenv.txn_checkpoint(), db.sync(), delete all files returned by
dbenv.log_archive(), allow database transactions again. Is this safe
in the cases I described above (crash, power outage)?
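
The sequence described above might look like the following sketch. It assumes `env` is an open DBEnv from the Python bindings and `env_home` is the environment directory; the function name `checkpoint_and_prune` is made up for illustration. It deletes only the files that `log_archive()` reports, rather than guessing from file names or ages.

```python
import os


def checkpoint_and_prune(env, env_home):
    """Checkpoint, then delete the log files Berkeley DB reports as unused."""
    env.txn_checkpoint()
    removed = []
    for name in env.log_archive():
        # Names may come back as bytes and relative to the environment home.
        name = os.fsdecode(name)
        path = name if os.path.isabs(name) else os.path.join(env_home, name)
        os.remove(path)
        removed.append(path)
    return removed
```

Files removed this way are no longer available for restoring from an older backup, so anyone who needs that should copy them elsewhere before deleting.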



#4
christopher james lahey
Re: Database logs are huge. - 06-20-2006, 04:50 PM

Hubert Schmid wrote:
Quote:
> (1) You can also set the flag DB_LOG_AUTOREMOVE on the database
> environment. "If set, Berkeley DB will automatically remove log files
> that are no longer needed."
> (http://www.sleepycat.com/docs/api_c/...OG_AUTOREMOVE).
I tried this but it didn't seem to remove them. The code I have to
remove them by hand works just fine.

Quote:
> (2) You can use the function DB_ENV->txn_checkpoint without stopping
> other transactions.
The database access is single threaded anyway, so stopping transactions
is automatic, but that's useful to know anyway.

The final thing I'd like to do is to make the log files smaller. As it
is, it builds up to 10MB, but if I could delete the log file every 1MB
or 5MB, that would be more space efficient, in exchange for being a bit
less time efficient. I looked and didn't find anything, so I suspect
this isn't possible, so if not, it might be something useful to add in
the future. If it is possible, even better. However, this is just
icing on the cake. How it is now is great.

Thanks again for all the help. This has made using berkeley-db
possible for this project.



#5
Alex
Re: Database logs are huge. - 06-20-2006, 11:13 PM

Quote:
> The final thing I'd like to do is to make the log files smaller. As it
> is, it builds up to 10MB, but if I could delete the log file every 1MB
> or 5MB, that would be more space efficient, in exchange for being a bit
> less time efficient.
It sounds like you are looking for set_lg_max(size). Search for the API
name here for more info:
http://pybsddb.sourceforge.net/bsddb3.html

- Alex



#6
Patrick
Re: Database logs are huge. - 06-21-2006, 01:36 AM

christopher.james.lahey (AT) gmail (DOT) com writes:

Quote:
> So, I started by creating my database without using any of the
> transaction stuff, but then I had some corruption problems. So I
> turned on transactions and logging and such, and now it seems to be
> proof against crashing, but the log files get very large very fast. As
> in 450MB within an hour or two of actively using the program.
>
> The question is, what am I doing wrong
You are doing lots of modifications to the database. Transaction logs
grow with each modification, at a rate roughly proportional to the
number of bytes changed in the database.

Quote:
> and how do I fix it?
Make fewer modifications, or smaller ones.

best regards
Patrick
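
One way to make fewer modifications, sketched under the assumption that many of the repeated writes store a value identical to the one already written: remember the last pickled bytes per key and skip the `put` when nothing changed. The class name `ChangeOnlyWriter` is hypothetical, and `store` is assumed to be any object with a `put(key, data)` method, such as an open DB handle.

```python
import pickle


class ChangeOnlyWriter:
    """Skip writes whose pickled value matches the last one written."""

    def __init__(self, store):
        self.store = store
        self._last = {}  # key -> pickled bytes most recently written

    def save(self, key, obj):
        data = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
        if self._last.get(key) == data:
            return False  # unchanged: no put, so no new log record
        self.store.put(key, data)
        self._last[key] = data
        return True
```

The trade-off is memory: the cache keeps one pickled copy of every value written through it. Whether it helps depends on how often the application really rewrites identical data.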


#7
christopher james lahey
Re: Database logs are huge. - 06-21-2006, 10:46 AM

Alex wrote:
Quote:
> It sounds like you are looking for set_lg_max(size), search for the API
> name here for more info:
> http://pybsddb.sourceforge.net/bsddb3.html
>
> - Alex
Ah, I didn't find that because I was looking at
http://www.sleepycat.com/docs/api_c/env_list.html . From now on I'll
use http://www.sleepycat.com/docs/api_c/api_core.html for my list of
available APIs or perhaps the python page you listed. Thanks for the
pointer on set_lg_max. This is working great. It's awesome that every
time I ask for a feature, you guys just say, "That's already there." I
think I've got it all working exactly as I want it.

Thanks much,
Chris



#8
Florian Weimer
Re: Database logs are huge. - 06-24-2006, 09:22 AM

* christopher james lahey:

Quote:
> I tried this but it didn't seem to remove them.
Log removal can only occur after checkpoints.
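
A rough sketch of combining the two pieces: turn on automatic removal, then checkpoint regularly so there is something to remove. It assumes the bindings expose `DB_LOG_AUTOREMOVE` under the name used in this thread (other Berkeley DB releases may name or configure it differently) and that `txn_checkpoint` accepts the `kbyte` and `min` arguments of the C API. Both function names are invented for illustration, and `db` is the binding module passed in by the caller.

```python
def enable_auto_log_removal(db, env):
    """Ask Berkeley DB to delete log files it no longer needs."""
    env.set_flags(db.DB_LOG_AUTOREMOVE, 1)


def checkpoint_if_log_grew(env, kbyte=1024):
    """Request a checkpoint, letting Berkeley DB skip it unless more than
    `kbyte` kilobytes of log were written since the previous one."""
    env.txn_checkpoint(kbyte, 0)
```

Calling `checkpoint_if_log_grew(env)` after each batch of writes keeps the checkpoint cost tied to how much log has actually accumulated.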

Quote:
> The code I have to remove them by hand works just fine.
I hope you query Berkeley DB to get a list of obsolete log files,
instead of guessing which ones to delete. 8-)

Quote:
> The final thing I'd like to do is to make the log files smaller. As it
> is, it builds up to 10MB, but if I could delete the log file every 1MB
> or 5MB, that would be more space efficient, in exchange for being a bit
> less time efficient.
Call DB_ENV->set_lg_max, or put an appropriate directive into the
DB_CONFIG file. See <http://www.sleepycat.com/docs/ref/log/config.html>
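
For the DB_CONFIG route, a small sketch that writes the directive into the environment's home directory. The helper name `set_lg_max_in_db_config` is hypothetical; the `set_lg_max <bytes>` line follows the usual DB_CONFIG "name value" form. Any existing `set_lg_max` line is replaced and other lines are kept.

```python
import os


def set_lg_max_in_db_config(env_home, max_bytes):
    """Write a set_lg_max line to DB_CONFIG, replacing any existing one."""
    path = os.path.join(env_home, "DB_CONFIG")
    kept = []
    if os.path.exists(path):
        with open(path) as f:
            kept = [line.rstrip("\n") for line in f
                    if line.split()[:1] != ["set_lg_max"]]
    kept.append("set_lg_max %d" % max_bytes)
    with open(path, "w") as f:
        f.write("\n".join(kept) + "\n")
    return path
```

For example, `set_lg_max_in_db_config(home, 1024 * 1024)` asks for 1MB log files. DB_CONFIG is read when the environment is opened, so the change applies from the next open.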

