lmdb.BadRslotError: mdb_txn_begin: MDB_BAD_RSLOT: Invalid reuse of reader locktable slot?

Asked 5/7, 2019 at 15:3 Answered 14/8, 2021 at 1:38

I've been experimenting with a nearest-neighbor algorithm for images in the style presented in this post (i.e. the goal is to see how many very similar images there are). After adapting the example to my case and getting it running, I have seen the error "lmdb.BadRslotError: mdb_txn_begin: MDB_BAD_RSLOT: Invalid reuse of reader locktable slot" a couple of times, and I am wondering what causes it.

My hypothesis is that it was caused by opening the (same) lmdb twice in the same run (at least it hasn't appeared since I fixed that), but I'm not totally sure. One of the few search hits is in another forum, but the answer there is not definitive.

So the error came from the .begin statement:

import lmdb

fn_lmdb = fn + '.lmdb'  # stores word <-> id mapping
env = lmdb.open(fn_lmdb, map_size=int(1e9))

with env.begin() as txn:
    ...

For now, after moving the open call next to the begin call, the error has not reappeared, but I'm not sure whether I fixed the cause or just a symptom. Has anyone run into this, and what was the solution?

Splat answered 5/7, 2019 at 15:3 Comment(0)

I encountered the same issue when running with multiple processes in Python. Since this is perhaps the only question on SO about this error, a solution wasn't easy to find. Eventually I found this pull request on GitHub and, following the documentation, made this change in my code:

lmdb.open(db_dir, create=False, subdir=True, readonly=True, lock=False)

lock: If False, don’t do any locking. If concurrent access is anticipated, the caller must manage all concurrency itself. For proper operation the caller must enforce single-writer semantics, and must ensure that no readers are using old transactions while a writer is active. The simplest approach is to use an exclusive lock so that no readers may be active at all when a writer begins.

My transactions are readonly, so that solution works for me.

I don't know what was causing the issue. My understanding from the documentation is that the locks in the lock file aren't managed by the lmdb package or by Python, and that the transactions were simply trying to write to the same place in the file.

I hope this helps someone. Since making the fix I haven't encountered the problem again, so for now it seems to work.
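For a multi-process reader like the one described in this answer, the idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code I actually ran: read_values and its opener parameter are hypothetical names, and opener exists only so the function can be exercised without a real database. Each worker process opens the environment itself with the flags shown above, reads its keys, and closes the environment again.

```python
def read_values(db_dir, keys, opener=None):
    """Open the database read-only in the calling process and fetch keys.

    read_values and opener are hypothetical names used for illustration;
    by default the real lmdb.open is used.
    """
    if opener is None:
        import lmdb  # third-party package (pip install lmdb)
        opener = lmdb.open
    # Same flags as in the answer. lock=False is only safe while no writer
    # is active, so the caller has to guarantee that.
    env = opener(db_dir, create=False, subdir=True, readonly=True, lock=False)
    try:
        with env.begin() as txn:  # transactions are read-only by default
            return [txn.get(key) for key in keys]  # None for missing keys
    finally:
        env.close()  # do not leave a second handle to the same path open
```

With multiprocessing, each worker would call read_values(db_dir, chunk_of_keys) itself, so that no environment object is shared between processes.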

Darvon answered 15/5, 2020 at 7:51 Comment(6)

Thanks! Can confirm: running with multiple processes in Python caused the mdb_txn_begin error, and using lock=False fixed the issue. – Surly 15/12, 2020 at 23:14

While not locking might work for you, it is not a general solution to the problem. – Questionary 4/8, 2021 at 21:55

@DirkGroeneveld Thanks for the comment (and the downvote I guess :) I don't believe I made a claim about this being a general solution to the issue, and if it was implied in my answer then I explicitly state here that it is a solution to a specific scenario. In fact, as long as you manage the concurrency properly it is a solution to the error. But if you have any suggestions on how to improve my answer to make it clearer I'm happy to take your feedback – Darvon 11/8, 2021 at 10:42

Your answer as such is fine, but it shouldn't be checkmarked, because it's not a solution to the problem. I have found since that the Python version of LMDB has this problem when you open the same DB more than once from the same process. You can work around it by making absolutely sure you never do that. I put a solution like that into github.com/allenai/allennlp/blob/main/allennlp/common/…, but it's pretty complicated. I guess I should put that into its own answer. – Questionary 14/8, 2021 at 1:36

I see only one flag option for mdb_txn_begin in the LMDB documentation: lmdb.tech/doc/… There is no flag such as MDB_NOLOCK or lock=False. – Breathe 3/7 at 7:25

@Breathe I assume the library was updated, and perhaps that issue was already solved. But in the current documentation (lmdb.readthedocs.io/en/release/#interface), lmdb.open is a shortcut for the Environment class constructor, which has a lock parameter. – Darvon 4/7 at 2:40

This problem occurs when you open the same file twice from the same process. The only solution is to not do that. Either make sure to close() the file before you open another one, or re-use the LMDB env object.

I put a solution that does the latter into AllenNLP at https://github.com/allenai/allennlp/blob/main/allennlp/common/file_utils.py#L589. You could model your own code on that. But it's often easier to just make sure you never open the same file twice.
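If you want to re-use the env object, a minimal sketch could look like the one below. It is not the AllenNLP code, just an illustration of the idea: get_env, close_env and the opener parameter are hypothetical names, and opener is only there so the cache can be tried without a real database. Keying the cache on the process id as well as the resolved path is an extra precaution I am assuming here, so that a forked child opens its own environment instead of using the parent's.

```python
import os
import threading

_envs = {}  # (pid, real path) -> open environment
_envs_lock = threading.Lock()


def _key(path):
    # Resolve the path so "db" and "./db" map to the same cache entry.
    return (os.getpid(), os.path.realpath(path))


def get_env(path, opener=None, **open_kwargs):
    """Return the single environment this process holds for path.

    open_kwargs are only used on the first call for a given path;
    later calls return the cached environment unchanged.
    """
    if opener is None:
        import lmdb  # third-party package (pip install lmdb)
        opener = lmdb.open
    key = _key(path)
    with _envs_lock:
        env = _envs.get(key)
        if env is None:
            env = opener(key[1], **open_kwargs)
            _envs[key] = env
        return env


def close_env(path):
    """Close and forget the cached environment for path, if there is one."""
    with _envs_lock:
        env = _envs.pop(_key(path), None)
    if env is not None:
        env.close()
```

Every part of the program then calls get_env(fn_lmdb, map_size=int(1e9)) instead of lmdb.open, and close_env is called before the file is opened with different options.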

Questionary answered 14/8, 2021 at 1:38 Comment(2)

That's interesting. I haven't worked on my LMDB code for a while now, but I'm pretty sure it happened to me when I opened the file for reading only (a single time in each process, with multiple processes). In the code, I had separate parts for writing (no parallelization there) and reading. The writing process would run once and terminate before the reading started. That's why setting lock=False solved it. – Darvon 15/8, 2021 at 12:43

I think you're right. It will also happen when you open for reading. LMDB will still use locks in that case because it doesn't know whether someone will come along and open the file for writing later. I'll update the answer. – Questionary 16/8, 2021 at 18:18
