#1
#2
From this I have concluded that only one process should be allowed to create a cursor on a given database using DB_WRITECURSOR, so I wrote a Perl script to test it. The code is included at the end of this post. It opens an environment, opens a database, opens a write cursor, then forks a new process, which attempts to open a write cursor on the same database.
#3
From the Berkeley DB Reference Guide (ref/build_unix/notes.html):

4. I get core dumps when running programs that fork children.

Berkeley DB handles should not be shared across process forks, each forked child should acquire its own Berkeley DB handles.

If you do call fork() in a process that has open DB handles, then only one of the processes may use or close any open handles. The other must act like the handles don't exist at all. If the BerkeleyDB Perl module reference-counts and automatically closes the BDB handles, then it may be necessary to explicitly call exec() or POSIX::_exit() in the process that must leave the handles alone, so that they are not closed there.
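To make the fork rule concrete, here is a minimal sketch in Python rather than Perl, using only the standard library. `FakeHandle` and `closers_after_fork` are hypothetical names invented for illustration; they are not part of Berkeley DB or the BerkeleyDB Perl module. The `atexit` registration is an assumption that mimics a binding which closes handles automatically at exit. The child leaves through `os._exit()`, which plays the role of POSIX::_exit(), so the automatic close never runs there and only the parent closes the handle.

```python
import atexit
import os
import tempfile


class FakeHandle:
    """Hypothetical stand-in for a DB handle; not part of any Berkeley DB API."""

    def __init__(self, log_path):
        self.log_path = log_path

    def close(self):
        # Append the PID of whichever process performs the close.
        with open(self.log_path, "a") as log:
            log.write(f"{os.getpid()}\n")


def closers_after_fork():
    """Fork with an open handle and return the PIDs that closed it."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "close.log")
        handle = FakeHandle(log_path)
        # Assumption: mimic a binding that closes handles automatically at exit.
        atexit.register(handle.close)
        try:
            pid = os.fork()
            if pid == 0:
                # Child: never touch the inherited handle. os._exit() ends the
                # process at once, skipping atexit handlers and finally blocks.
                os._exit(0)
            os.waitpid(pid, 0)
            handle.close()  # Parent: the only process allowed to close.
        finally:
            atexit.unregister(handle.close)
        with open(log_path) as log:
            return [int(line) for line in log]


if __name__ == "__main__":
    print(closers_after_fork() == [os.getpid()])
```

If the child returned normally or called `sys.exit()` instead, the registered close would also run in the child, which is the double-close situation the Reference Guide warns about.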
#4
Thanks, that helps clear it up. After looking around a little, it appears that the preferred method is to use the same environment and open a new DB handle in each process. My concern is that I thought I read somewhere that a call to DB->open() has overhead similar to DB->stat(), which has to traverse the entire tree. For a large database, that would be a significant penalty for each new process that is forked to handle a database request.

Am I wrong about this, or is there another way to get a new DB handle that will avoid this overhead?
#5
With the exception of recno tables for which the DB_SNAPSHOT flag has been set, I don't *think* the overhead of DB->open() is related to table size. On the other hand, DB->close() will call DB->sync() unless you pass it the DB_NOSYNC flag, which can get quite expensive as it writes out *all* dirty pages, not just the ones dirtied by the process calling DB->sync(). So, if you're using transactions and logging or if this is a transient table, then you should consider using that flag.

Thanks, I'll try that. I am surprised that this feels a little like

Am I wrong about this or is there another way to get a new DB handle that will avoid this overhead?

Put it all in one threaded process. Using multiple processes with a shared environment is no more robust than a single threaded process, so this isn't a reliability hit...assuming your other libraries are thread-safe...

I'm not sure that the BerkeleyDB Perl module supports this?
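For what the "one threaded process" layout looks like in outline, here is a rough Python sketch using only the standard library. `SharedStore` and `serve` are hypothetical stand-ins, not Berkeley DB calls, and the explicit lock is an assumption about a handle that needs external serialization. The point is only that the open cost is paid once and every worker thread reuses the same handle, instead of each forked child opening its own.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedStore:
    """Hypothetical stand-in for a handle opened once per process."""

    opens = 0  # counts how many times the "open" cost was paid

    def __init__(self):
        SharedStore.opens += 1
        self._data = {}
        # Assumption: this stand-in handle needs external locking.
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def serve(requests, workers=4):
    """Handle every (key, value) request on a thread pool sharing one store."""
    store = SharedStore()  # opened once, before any worker starts
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Drain the iterator so every request finishes before returning.
        list(pool.map(lambda kv: store.put(*kv), requests))
    return store
```

Calling `serve([(i, i * i) for i in range(20)])` opens the store once however many requests arrive; whether the BerkeleyDB Perl module can be used this way is the open question raised above.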
#6
With the exception of recno tables for which the DB_SNAPSHOT flag has been set, I don't *think* the overhead of DB->open() is related to table size.

Am I wrong about this or is there another way to get a new DB handle that will avoid this overhead?

Put it all in one threaded process. Using multiple processes with a shared environment is no more robust than a single threaded process, so this isn't a reliability hit...assuming your other libraries are thread-safe...
I suppose a real masochist could try writing a memory allocator that gave out chunks in a shared memory region, then tell DB to use that for its internal allocations using db_env_set_func_{malloc,realloc,free}(), hack DB to always use inter-process mutex locks, and see whether the handles could then be shared between forked child processes, but that would hardly be a supportable setup.