I work on a Python-based video downloader/player, and I'm replacing a
pickle-based on-disk database with a Berkeley DB database that stores
one pickle per object (so that I can update each object without having
to rewrite the whole database).
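
To make the one-pickle-per-object idea concrete, here is a minimal
sketch. It assumes the database handle behaves like a mapping from
byte-string keys to byte-string values; a plain dict stands in for it
here, and the function names and sample records are made up for
illustration, not taken from the real program.

```python
import pickle

def save_object(store, key, obj):
    """Pickle one object and write it under its own key."""
    store[key] = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

def load_object(store, key):
    """Read one record back and unpickle it."""
    return pickle.loads(store[key])

# A plain dict stands in for the real database handle (assumption).
store = {}
save_object(store, b"item-1", {"title": "Example video", "state": "downloading"})
save_object(store, b"item-2", {"title": "Another video", "state": "queued"})

# Updating one object rewrites only that record, not the whole database.
item = load_object(store, b"item-1")
item["state"] = "finished"
save_object(store, b"item-1", item)
```
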
I started by creating my database without using any of the transaction
features, but then I had some corruption problems. So I turned on
transactions and logging, and now it seems to survive crashes, but the
log files get very large very fast: about 450MB within an hour or two
of actively using the program.
The question is, what am I doing wrong and how do I fix it? Log files
that large are not acceptable, and neither is losing data. Can
Berkeley DB meet my needs in this case?
So, how do I currently create the database? I create the dbenv and
call set_flags with DB_AUTO_COMMIT | DB_TXN_NOSYNC. Then I open the
environment with DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN |
DB_RECOVER | DB_CREATE. After that, I open the database with no flags.
If that fails, I open the database with flags=DB_CREATE and
dbtype=DB_HASH.
I use a transaction the first time I create the database, but after
that I depend on DB_AUTO_COMMIT. I use a cursor when loading the
database, but close it when done.
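
The setup described above can be sketched roughly as follows. This is
a minimal illustration, not the actual application code: it assumes
the bsddb3-style Python bindings, and the names open_store, load_all,
and "objects.db" are hypothetical. The cursor loop also assumes the
bindings' default behavior of returning None once the cursor runs out
of records.

```python
import os

def open_store(home, filename="objects.db"):
    """Open (or create) the environment and database as described above."""
    # Imported here so the sketch loads even without the bindings installed.
    from bsddb3 import db  # assumption: bsddb3-style Python bindings

    os.makedirs(home, exist_ok=True)

    env = db.DBEnv()
    # Auto-commit each write; do not flush the log synchronously on commit.
    env.set_flags(db.DB_AUTO_COMMIT | db.DB_TXN_NOSYNC, 1)
    env.open(home,
             db.DB_INIT_LOG | db.DB_INIT_MPOOL | db.DB_INIT_TXN |
             db.DB_RECOVER | db.DB_CREATE)

    handle = db.DB(env)
    try:
        # First attempt: open an existing database with no flags.
        handle.open(filename)
    except db.DBNoSuchFileError:
        # A handle is unusable after a failed open, so start with a fresh one
        # and create the database inside an explicit transaction.
        handle = db.DB(env)
        txn = env.txn_begin()
        handle.open(filename, dbtype=db.DB_HASH, flags=db.DB_CREATE, txn=txn)
        txn.commit()
    return env, handle

def load_all(handle):
    """Walk every record with a cursor, closing the cursor when done."""
    records = {}
    cursor = handle.cursor()
    try:
        rec = cursor.first()
        while rec is not None:  # assumes None signals the end of the data
            key, value = rec
            records[key] = value
            rec = cursor.next()
    finally:
        cursor.close()
    return records
```
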
During use, the program does lots and lots of writes to the same key
with updated values.
Finally, I close the db and the dbenv on program shutdown.