Kyoto Cabinet / Berkeley DB : Hash table size limitations

I am having a hard time storing hundreds of millions of key/value pairs (16-byte keys, 32-byte values) in a hash table on my SSD.

With Kyoto Cabinet: when it works fine, it inserts at 70,000 records/s. Once the rate drops, it goes down to 10-500 records/s. With the default settings, the drop happens after around a million records. Looking at the documentation, that is the default number of buckets in the array, so it makes sense. I increased this number to 25 million and indeed, it works fine until around 25 million records. The problem is that as soon as I push the number of buckets to 30 million or more, the insert rate is down to 10-500 records/s from the beginning. Kyoto Cabinet is not designed to increase the number of buckets after the database is created, so I cannot insert more than 25 million records.
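For reference, the bucket count is set when the file is opened: Kyoto Cabinet accepts tuning parameters appended to the path passed to open(). Below is a minimal sketch assuming the kyotocabinet Java binding; the bnum and msiz values are placeholders, not my exact settings.

```java
import kyotocabinet.DB;

public class KcBucketTuning {
    public static void main(String[] args) {
        DB db = new DB();
        // bnum = number of hash buckets (ideally around the expected record
        // count or more), msiz = size of the memory-mapped region. Both are
        // passed as suffixes of the file path; these values are placeholders.
        if (!db.open("store.kch#bnum=25000000#msiz=1073741824",
                     DB.OWRITER | DB.OCREATE)) {
            throw new RuntimeException("open error: " + db.error());
        }
        byte[] key = new byte[16];   // 16-byte keys, as described above
        byte[] value = new byte[32]; // 32-byte values
        if (!db.set(key, value)) {
            System.err.println("set error: " + db.error());
        }
        db.close();
    }
}
```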

1/ Why would KC's insert rate get very low once the bucket number exceeds 25M?

With Berkeley DB: the best speed I got is slightly lower than KC's, closer to 50,000 records/s, but still OK. With the default settings, just like KC, the speed drops suddenly after around a million records. I know BDB is designed to extend its number of buckets gradually. Regardless of that, I tried to increase the initial number, playing with HashNumElements and FillFactor, but every attempt made the situation worse. So I still cannot insert more than 1-2 million records with BDB. I tried activating non-synchronized transactions, tried different checkpoint rates, and increased the caches. Nothing prevents the drop.
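For context, this is roughly how the hash database gets pre-sized, assuming the com.sleepycat.db Java binding; the cache size, HashNumElements and FillFactor values below are placeholders rather than the exact values I tried.

```java
import java.io.File;
import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseType;
import com.sleepycat.db.Environment;
import com.sleepycat.db.EnvironmentConfig;

public class BdbHashTuning {
    public static void main(String[] args) throws Exception {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        envConfig.setInitializeCache(true);
        envConfig.setInitializeLocking(true);
        envConfig.setInitializeLogging(true);
        envConfig.setCacheSize(512L * 1024 * 1024); // placeholder 512 MB cache
        envConfig.setTxnNoSync(true);               // "non-synchronized transactions"
        Environment env = new Environment(new File("bdb-env"), envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);
        dbConfig.setType(DatabaseType.HASH);
        // Pre-size the hash table so it is not grown one split at a time.
        dbConfig.setHashNumElements(200000000); // expected record count (placeholder)
        dbConfig.setHashFillFactor(72);         // rough value for 16-byte keys /
                                                // 32-byte values on 4 KB pages
        Database db = env.openDatabase(null, "store.db", null, dbConfig);

        // ... database.put(txn, key, value) in the insert loop ...

        db.close();
        env.close();
    }
}
```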

2/ What could cause BDB's insert rate to drop after 1-2 million inserts?

Note: I'm working in Java, and when the speed drops, CPU usage falls to 0-30%, whereas it stays at 100% while inserting at the correct speed.
Note: Stopping the process and resuming the insertion changes nothing, so I don't think this is related to memory limits or garbage collection.

Thanks.

Covariance answered 24/10, 2012 at 17:48 Comment(2)
What does your BDB environment look like? Are you using transactions, replication, etc.? Also, can you post some sample code? – Gelasius
Its current state is: pastebin.com/bWJpbipZ. I insert with database.put(transaction, k, v), read with database.get(transaction, k, v, LockMode.DEFAULT), and checkpoint every 500,000 inserts with environment.checkpoint(null). – Covariance

Below is how I managed to store billions of records despite the writing limitations encountered with KC.

Despite much effort, I still haven't solved the problem for either Kyoto Cabinet or Berkeley DB. However, I came up with an interesting workaround using Kyoto Cabinet.

I noticed I cannot write more than 25M records in one KC file, but reading has no such limitation: it is always fast, regardless of the size of the database. The solution I found is to create a new KC file (a new database) for every 25M new records. That way, reading happens across many KC files and is still fast, and writing happens only on the most recently created file and is fast as well. The only remaining problem was allowing updates/deletions of records in the previous files. For that, I copied the SSTable approach (a minimal Java sketch follows the list below), which is:

  • Files 0 to N-1 are read-only; file N is read+write.
  • Any insert/update/deletion is written to file N.
  • Reads look into files N down to 0 and return the first version found, i.e. the most recently written insert/update/deletion.
  • A bloom filter is attached to each file to avoid accessing a file that doesn't have the wanted record.
  • As soon as file N reaches 25M records, it becomes read-only and file N+1 is created.

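A minimal sketch of that scheme, assuming the kyotocabinet Java binding. Class and file names, the roll-over check and the tombstone encoding are all illustrative, and the bloom filters are left out for brevity (see the notes below).

```java
import java.util.ArrayList;
import java.util.List;
import kyotocabinet.DB;

public class SegmentedKcStore {
    private static final long MAX_RECORDS = 25000000L;
    private static final byte[] TOMBSTONE = new byte[0]; // marks a deletion

    private final List<DB> readOnly = new ArrayList<DB>(); // files 0 .. N-1
    private DB current;                                     // file N (read+write)
    private int nextIndex = 0;

    public SegmentedKcStore() {
        rollOver();
    }

    // Open a fresh KC file once the current one reaches the record limit.
    // In a full implementation the rolled-over file would be closed and
    // reopened with DB.OREADER; here it simply stops being written to.
    private void rollOver() {
        if (current != null) {
            readOnly.add(current);
        }
        current = new DB();
        String path = "segment-" + (nextIndex++) + ".kch#bnum=" + MAX_RECORDS;
        if (!current.open(path, DB.OWRITER | DB.OCREATE)) {
            throw new IllegalStateException("open error: " + current.error());
        }
    }

    // Inserts, updates and deletions all go to the newest file only.
    public void put(byte[] key, byte[] value) {
        if (current.count() >= MAX_RECORDS) {
            rollOver();
        }
        current.set(key, value);
    }

    public void delete(byte[] key) {
        put(key, TOMBSTONE); // the tombstone shadows older versions of the key
    }

    // Reads scan from the newest file back to the oldest; the first hit wins.
    public byte[] get(byte[] key) {
        byte[] v = current.get(key);
        if (v != null) {
            return v.length == 0 ? null : v; // empty value = deleted
        }
        for (int i = readOnly.size() - 1; i >= 0; i--) {
            v = readOnly.get(i).get(key);
            if (v != null) {
                return v.length == 0 ? null : v;
            }
        }
        return null;
    }
}
```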
Notes:

  • Just like with SSTables, if a lot of updates/deletions are performed, we might want to perform compaction. However, contrary to SSTables, compaction here doesn't require rewriting the file. Outdated records are simply removed from the KC files, and if a KC file gets very small, it can either be removed (reinserting its records into file N) or reopened for new insertions (provided the files after it are compact).
  • A deletion does not remove the record; it writes a special value that marks the record as deleted. During compaction, deleted records are removed for real.
  • Checking whether a record exists usually requires a database lookup. Thanks to the bloom filters, most negative answers can be given without any disk access (see the sketch below).
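As an illustration of that check, here is a sketch using Guava's BloomFilter (one possible implementation; any bloom filter library would do). One filter is kept per KC file, sized for the records that file holds:

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class SegmentBloomFilter {
    public static void main(String[] args) {
        // One filter per KC file, sized for the 25M records it will hold.
        // The 1% false-positive rate is a placeholder.
        BloomFilter<byte[]> filter =
                BloomFilter.create(Funnels.byteArrayFunnel(), 25000000, 0.01);

        byte[] key = new byte[16];
        filter.put(key); // done for every key written to that file

        // On lookup, only query the KC file if the filter says the key may be
        // present; a negative answer never touches the disk.
        if (filter.mightContain(key)) {
            // ... db.get(key) against that segment ...
        }
    }
}
```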
Covariance answered 25/10, 2012 at 22:48 Comment(0)
