ubell (AT) sleepycat (DOT) com writes:
Quote:
Locker ids in the "top half" of the range belong to transactions while
those in the "bottom half" are for non-transactional cursor operations. |
Thanks, it makes a lot more sense now.
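For my own notes, here is a minimal sketch of how I read that split, in C. It assumes locker ids are 32 bits wide and that the boundary sits at 0x80000000; both are my assumptions, not something stated above, and the names `is_txn_locker` and `ASSUMED_TXN_ID_MIN` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: locker ids are 32-bit and the "top half" starts here. */
#define ASSUMED_TXN_ID_MIN UINT32_C(0x80000000)

/* Returns true if the locker id falls in the transactional (top) half,
 * false if it falls in the non-transactional (bottom) half. */
static bool is_txn_locker(uint32_t locker_id)
{
    return locker_id >= ASSUMED_TXN_ID_MIN;
}
```

By that reading, locker id 1cc12 from the lock dump falls in the bottom half, i.e. it would belong to a non-transactional cursor operation.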
Quote:
The fact that you have a PENDING lock means that the thread which is
using locker id 1cc12 has been granted the lock but has not been
scheduled yet. It should be difficult to see a lock in this state
unless the thread has exited or there is some problem with the thread
scheduler that is preventing it from running. |
I've been debugging the situation (it can take a couple of days to
reproduce...) and I'm beginning to think it must be some kind of
mutex problem in BDB or in my build of it.
When the app gets stuck, the processes, all single-threaded, are stuck
waiting on a semaphore (msem_lock() keeps returning EAGAIN). Even
after I shut down all processes, the situation persists -- e.g. db_stat
hangs doing the same thing. Running recovery gets the application going
again, but of course the database locks remain. I'm planning to try
again, this time with --with-mutex=HPPA/gcc-assembly, but I'm afraid
I'd only be masking a problem.
Using the pthread library for mutexes didn't work; the utilities spew
errors like "db_stat: unable to lock mutex: Invalid argument" and the
database gets corrupted during concurrent access. I recall db-3.3
worked with pthreads, but that was with HP's compiler; I have gcc now.
The application BTW is Cyrus IMAP 2.2.12, and the problem is in the
transactional mailbox database. I believe it has seen so much use that
I wouldn't be the only one experiencing a database handling bug, if
there is one. All processes seem to exit cleanly as designed and I
don't see anything interesting in the logs.
Quote:
Note that threads should
not handle interrupts while waiting on events inside the Berkeley DB
library unless they return from the interrupt without blocking or
making other Berkeley DB library calls. |
Just for clarity, are we talking about "threads" as in real multithreaded
applications, or "threads of control" as in several separate processes?
I'm reviewing the code, but I don't think it does that.
--
http://www.hut.fi/u/iisakkil/ --Foo.