VLDB: Linear and Random Access -
06-05-2006
, 02:11 AM
Hello,
I hope this is the right place to ask a question about database
research... I am looking for a data structure that stores a set of
keys which supports the following features:
- can store sets larger than a few times the available RAM memory (say,
200GB) on second-level memory
- supports fast random access, i.e. the check if a given key is present,
without making assumptions about the distribution of queried keys
- supports fast linear access, i.e. for a another (large, but in-RAM) set
of keys, it can check if the keys are in the intersection
Keys are binary strings which are quite evenly distributed, i.e. it is
possible to calculate very smoothly distributed hash functions.
I try to stay way below disk latency time for linear access, and it would
be nice to reduce random access to one page read at most. I do think that
hash partitioning is the way to go, but I am not sure how to do the disk
layout, and whether an adapting layout is feasible (so that early linear
checks are cheap).
Thank you very much for any hints,
H. |