How to determine which package was chosen?
The built-in module whichdb may be used for that. For example:
In [34]: db = anydbm.open('test.db', 'c')
In [35]: db['test'] = '123'
In [36]: db.close()
In [37]: import whichdb
In [38]: dir(whichdb)
Out[38]:
['__builtins__',
'__doc__',
'__file__',
'__name__',
'__package__',
'_dbmerror',
'dbm',
'os',
'struct',
'sys',
'whichdb']
In [39]: whichdb.whichdb('test.db')
Out[39]: 'dbhash'
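The session above uses the Python 2 module names. In Python 3, anydbm and whichdb were folded into the dbm package (dbm.open and dbm.whichdb), so a rough equivalent might look like the sketch below. It assumes Python 3 and a throwaway temporary directory; the helper name create_and_identify is made up for illustration, and the backend reported depends on what is installed on your machine.

```python
import dbm
import os
import tempfile


def create_and_identify(directory):
    """Create a small database with dbm.open and report which backend made it."""
    path = os.path.join(directory, "test.db")
    # 'c' opens the database for reading and writing, creating it if needed.
    db = dbm.open(path, "c")
    try:
        db["test"] = "123"
    finally:
        db.close()
    # whichdb returns the name of the backend module as a string
    # (for example 'dbm.gnu' or 'dbm.dumb'), '' if the format is not
    # recognized, or None if the file cannot be read.
    return dbm.whichdb(path)


with tempfile.TemporaryDirectory() as tmp:
    backend = create_and_identify(tmp)
    print(backend)
```

Note that some backends append their own extensions to the file name you pass in, which is why the path given to whichdb should be the same one given to dbm.open rather than a name read back from the directory listing.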
Which database implementation is best to use?
The shelve module talks about some restrictions if the underlying DB engine is dbm (i.e., the Python module called dbm, which interfaces with Unix ndbm or the BSD DB or the GNU GDBM compatibility interfaces for ndbm):
[...] this means that (the pickled representation of) the objects stored in the database should be fairly small, and in rare cases key collisions may cause the database to refuse updates.
It's not clear whether this applies only to ndbm proper or to the compatibility interfaces as well, what "fairly small" means in numbers, or how rare those cases are.
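Since the restriction is stated in terms of the pickled representation, one way to get a feel for the numbers is to measure that representation directly. This is only a sketch: pickled_size is a hypothetical helper, the sample record is made up, and the protocol should match whatever you pass to shelve.open, because different pickle protocols produce different sizes.

```python
import pickle


def pickled_size(obj, protocol=pickle.DEFAULT_PROTOCOL):
    """Return the number of bytes shelve would hand to the backend for obj."""
    return len(pickle.dumps(obj, protocol))


# A made-up record, just to have something to measure.
record = {"name": "test", "values": list(range(100))}
size = pickled_size(record)
print(size)
```

Comparing the result against a figure such as the 512 bytes mentioned for some DBM libraries gives a rough idea of whether a given object is anywhere near the danger zone.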
Actually, Ruby, which also has bindings for DBM, has this to say:
Original Berkeley DB was limited to 2GB of data. Dbm libraries also sometimes limit the total size of a key/value pair, and the total size of all the keys that hash to the same value. These limits can be as little as 512 bytes. That said, gdbm and recent versions of Berkeley DB do away with these limits.
I'm assuming this is nothing to worry about, because it's quite unlikely that ndbm will be used and because hitting any of these limitations would (hopefully) throw a descriptive exception, at which point we'll need to mess around further.
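If you would rather check than assume, the backend can be probed empirically. The sketch below assumes Python 3 (where the package is called dbm). It writes values of increasing size and stops at the first one that raises dbm.error or does not read back intact. The function name largest_storable_value and the list of sizes are arbitrary choices for illustration, and a passing probe only says something about the backend on the machine where it ran.

```python
import dbm
import os
import tempfile


def largest_storable_value(path, sizes):
    """Return the largest size in `sizes` that round-trips, or None if none do."""
    largest = None
    db = dbm.open(path, "c")
    try:
        for size in sizes:
            payload = b"x" * size
            try:
                db["probe"] = payload
                intact = db["probe"] == payload
            except dbm.error:
                # dbm.error is a tuple of the exceptions the backends can raise.
                break
            if not intact:
                break
            largest = size
    finally:
        db.close()
    return largest


probe_sizes = [512, 4096, 65536]
with tempfile.TemporaryDirectory() as tmp:
    result = largest_storable_value(os.path.join(tmp, "probe.db"), probe_sizes)
    print(result)
```

If the probe stops early, dbm.whichdb on the same path tells you which backend imposed the limit.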
shelve.BsdDbShelf, you don't need to have the bsddbm package available? – Emphasize