bogdan_dorian (AT) yahoo (DOT) com wrote:
Quote:
Hi Mike,
First of all, could you describe what your key/data pairs look like,
and what your cache size is? |
This is a purely synthetic test. Keys are SHA-1 hashes of random data
(20 random bytes), values consist of the single character '0'. The
access method is BTREE, only one thread is being used, and the cache size
varies. The performance characteristics I describe require more data
to manifest with a larger cache, but the same thing eventually occurs.
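
For reference, the key/value pairs described above can be sketched roughly
as follows. This is only an illustration of the data shape, not the actual
test harness; the helper names make_pair and make_pairs are hypothetical,
and the sketch does not touch Berkeley DB itself.

```python
import hashlib
import os


def make_pair():
    """Build one synthetic (key, value) pair.

    The key is the 20-byte SHA-1 digest of 20 random bytes; the value is
    the single character '0'.
    """
    key = hashlib.sha1(os.urandom(20)).digest()
    value = b"0"
    return key, value


def make_pairs(count):
    """Build `count` synthetic pairs (hypothetical helper for illustration)."""
    return [make_pair() for _ in range(count)]
```

Because the keys are hash output, consecutive insertions land at effectively
random positions in the key space rather than in sorted order.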
Quote:
Maybe you can tell us which Berkeley DB product you are using
and on what platform. |
BDB 4.4.20 on Fedora Core 4. The underlying storage is software RAID-5
on SATA-2 disks.
Quote:
Indeed, disk I/O is very painful from the performance point of view.
Yes, it is possible to flush the cache by using the DB->sync method (
http://www.sleepycat.com/docs/api_c/db_sync.html ), which flushes any
cached information to disk. |
Indeed, but it does not empty the cache or clear out pages that are
unlikely to be used again.
Quote:
Also, depending on your application: if you are using checkpoints,
note that they write dirty pages from the cache to the files, but be
aware that checkpointing is at the same time very I/O intensive: |
No worries about that.
Quote:
For example, you can try db_stat -m -h HOMEDIR for mpool summary
statistics and per-file statistics, or db_stat -MA -h HOMEDIR for
detailed info per file and a summary of each page in the pool. |
I'll try that. Is there any way to influence the way BDB decides which
pages to evict? I think a put()-centric eviction policy might perform
better for large batches of insertions.
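
To illustrate why eviction matters for this workload, here is a toy
simulation. It assumes a plain LRU cache and assumes that each random key
touches one uniformly chosen page; both are simplifications and not a model
of Berkeley DB's actual memory pool. The function name lru_hit_rate and its
parameters are hypothetical.

```python
import random
from collections import OrderedDict


def lru_hit_rate(total_pages, cache_pages, accesses, seed=0):
    """Return the fraction of page accesses served from a toy LRU cache.

    Assumption: every access picks a page uniformly at random, which
    stands in for inserting uniformly distributed hash keys.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # page number -> True, oldest entry first
    hits = 0
    for _ in range(accesses):
        page = rng.randrange(total_pages)
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used
    return hits / accesses
```

Under these assumptions, comparing lru_hit_rate(100, 100, 10000) with
lru_hit_rate(10000, 100, 20000) shows the hit rate collapsing once the
number of pages far exceeds the cache, whatever the cache size is.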
-Mike