What is special about internal design of LMDB?
Asked Answered
N

1

10

What would be the performance difference (reads/writes) between some C++ implementation of in-memory B-Tree (for example google btree) and the LMDB (without taking into consideration all the feacures of LMDB such as transactions, isolation, shared access etc.)?

Nietzsche answered 8/2, 2016 at 21:32 Comment(0)
D
17

This 2014 lmdb design presentation by its architect Howard Chu covers the design and tradeoffs of lmdb.

To summarize: lmdb is a copy-on-write, bottom-up updated, double-buffered, b-tree where the implementation always favors simplicity whenever it clashes with other considerations.

The smart design choices make it one of the highest performance and corruption-resistant B-tree implementations out there.

  • copy-on-write means that data is never overwritten avoiding many possible corruption scenarios
  • bottom-up updates from leaf to root make the root update equivalent to a commit
  • double buffering of past versions keeps only the last-two roots in the db file
  • complex multi-level caching schemes are avoided - lmdb relies on the underlying OS for caching
  • The whole code base is very small compared to other DBs avoiding CPU cache misses

Obviously, these choices mean that lmdb is not friendly to complex scenarios such as:

  • multi-version DB rollbacks (only last 2 are available)
  • long-lived transactions and delayed commits: these lead to append-only behavior and potentially unlimited growth of the db file
  • multiple concurrent writers: lmdb favors simpler multiple readers and single writer schemes

There's much more in the full (over 100 pages) presentation. The above is just a summary of the spirit of lmdb.

lmdb is used as the core storage engine in prominent open source projects such as Open LDAP and Memcached and in both cases speed-ups of orders of magnitude have been observed compared to alternatives as can be seen in micro-benchmark results.

Dinka answered 1/3, 2016 at 20:18 Comment(3)
Can you clarify why LMDB is faster for writes conidering the need (in lmdb) of copy a number of pages before you actually write? In classic B-Tree I don't spend CPU for page copying (modify data in place).Nietzsche
Based on reading the slide-deck, my understanding is that lmdb simply optimizes for the most common case and avoids complexity. Modifying data in place (vs COW) is not free: it involves: 1) complex locking mechanisms 2) maintaining a transaction log for recovery etc. which lmdb avoids. Based on the lmdb experience such complexities eventually outweight the benefits (for the most common cases, not for all scenarios, of course). The authors also mention small code size as improving speed (full code fits in L1 cache of typical current CPUs)Dinka
the first link is dead. found it here: schd.ws/hosted_files/buildstuff14/96/…Talkathon

© 2022 - 2024 — McMap. All rights reserved.