Shelve: choice of database
E

3

8

shelve documentation says:

The choice of which database package will be used (such as dbm, gdbm or bsddb) depends on which interface is available.

What does that mean? How can I determine which package was chosen? How can I force a specific one to be used? Which database implementation is best to use?

Emphasize answered 30/12, 2012 at 16:50 Comment(0)
E
9

Found it here:
http://www.gossamer-threads.com/lists/python/python/13891

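import shelve
import gdbm  # Python 2 module; requires GNU dbm to be installed

def gdbm_shelve(filename, flag="c"):
    # Open the file with gdbm explicitly and wrap it in a Shelf,
    # bypassing anydbm's automatic backend selection.
    return shelve.Shelf(gdbm.open(filename, flag))

db = gdbm_shelve("dbfile")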
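For Python 3, where gdbm was renamed to dbm.gnu, the same wrapping trick might look like the sketch below. It is not the linked poster's code: it assumes Python 3, the helper name open_shelf_with is hypothetical, and dbm.dumb is used only because it ships with every CPython build (dbm.gnu.open could be passed instead where GNU dbm is installed).

```python
import dbm
import dbm.dumb
import os
import shelve
import tempfile


def open_shelf_with(opener, filename, flag="c"):
    """Wrap a database opened by one specific dbm module in a Shelf.

    `opener` is any dbm-style open function, e.g. dbm.dumb.open
    (always available) or dbm.gnu.open (only if GNU dbm is installed).
    """
    return shelve.Shelf(opener(filename, flag))


# Usage: force the pure-Python dbm.dumb backend in a scratch directory.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "dbfile")

shelf = open_shelf_with(dbm.dumb.open, path)
shelf["answer"] = {"value": 42}  # shelve pickles the value for us
shelf.close()

shelf = open_shelf_with(dbm.dumb.open, path)
stored = shelf["answer"]
shelf.close()

backend = dbm.whichdb(path)  # reports which module wrote the files
```

Passing the opener as a parameter keeps the backend choice explicit at the call site instead of leaving it to automatic selection.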

P.S. The poster on the linked page also found this somewhere else, but their link is dead.

Emphasize answered 2/2, 2013 at 18:35 Comment(0)
T
5

I think there is no way to specify the underlying database through shelve.open itself. shelve uses anydbm: for an existing file, anydbm asks the whichdb module to detect the format; for a new file, it tries the following underlying implementations in this order:

  • dbhash
  • gdbm
  • dbm
  • dumbdbm

You may use the shelve.BsdDbShelf subclass of Shelf to force the use of the bsddb implementation.
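The fallback order described above can be sketched as a "first importable module wins" loop. This is only an illustration of the idea, not the standard library's actual code: the function name first_available and the CANDIDATES list are hypothetical, and the list uses the Python 3 module names (dbm.gnu for gdbm, dbm.ndbm for dbm, dbm.dumb for dumbdbm; dbhash has no Python 3 standard-library equivalent).

```python
import importlib

# Assumed candidate list, modelled on the order quoted above but written
# with Python 3 module names. It is illustrative, not read from the stdlib.
CANDIDATES = ["dbm.gnu", "dbm.ndbm", "dbm.dumb"]


def first_available(names=CANDIDATES):
    """Return the first module name in `names` that imports, else None."""
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            continue  # this backend is not installed; try the next one
        return name
    return None


chosen = first_available()
print(chosen)  # depends on which dbm libraries this Python was built with
```

Because dbm.dumb is pure Python, a loop like this always finds something, which is why it sits last in the list.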

Thrombosis answered 30/12, 2012 at 17:9 Comment(1)
So, when using shelve.BsdDbShelf, you don't need to have the bsddb package available? - Emphasize
C
3

How can I determine which package was chosen?

The standard-library module whichdb can be used for that. For example:

In [33]: import anydbm

In [34]: db = anydbm.open('test.db', 'c')

In [35]: db['test'] = '123'

In [36]: db.close()

In [37]: import whichdb

In [38]: dir(whichdb)
Out[38]: 
['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '_dbmerror',
 'dbm',
 'os',
 'struct',
 'sys',
 'whichdb']

In [39]: whichdb.whichdb('test.db')
Out[39]: 'dbhash'
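In Python 3, anydbm and whichdb were merged into the dbm package, so the session above would roughly translate to the sketch below. It assumes Python 3; the helper name create_and_detect is hypothetical, and the reported backend depends on how the interpreter was built, so no particular output is implied.

```python
import dbm
import os
import tempfile


def create_and_detect(path):
    """Create a small database at `path` and report which module made it."""
    with dbm.open(path, "c") as db:  # "c": create the file if it is missing
        db["test"] = "123"
    return dbm.whichdb(path)


scratch_dir = tempfile.mkdtemp()
backend = create_and_detect(os.path.join(scratch_dir, "test.db"))
print(backend)  # a module name such as "dbm.gnu"; varies by installation
```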

Which database implementation is best to use?

The shelve documentation mentions some restrictions that apply if the underlying DB engine is dbm (i.e., the Python module called dbm, which interfaces with Unix ndbm or with the ndbm compatibility interfaces of BSD DB or GNU GDBM):

[...] this means that (the pickled representation of) the objects stored in the database should be fairly small, and in rare cases key collisions may cause the database to refuse updates.

It's not clear whether this applies only to ndbm proper or also to the compatibility interfaces, what "fairly small" means in numbers, or how rare those cases are.

Actually, Ruby, which also has bindings for DBM, has this to say:

Original Berkeley DB was limited to 2GB of data. Dbm libraries also sometimes limit the total size of a key/value pair, and the total size of all the keys that hash to the same value. These limits can be as little as 512 bytes. That said, gdbm and recent versions of Berkeley DB do away with these limits.

I'm assuming this is nothing to worry about, because it's quite unlikely that ndbm will be used and because hitting any of these limitations would (hopefully) throw a descriptive exception, at which point we'll need to mess around further.
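Rather than assuming, the size question can be probed empirically for a given backend. The sketch below is one way to do that under stated assumptions: it targets Python 3, the helper name can_store is hypothetical, and it is demonstrated with dbm.dumb only because that module is always present; another opener such as dbm.ndbm.open could be passed where available.

```python
import dbm
import dbm.dumb
import os
import shelve
import tempfile


def can_store(opener, path, size):
    """Try to store and read back a `size`-byte value through shelve.

    Returns True on success, False if the backend raises one of the
    dbm error types (for example because a size limit was hit).
    """
    payload = b"x" * size
    try:
        shelf = shelve.Shelf(opener(path, "c"))
        try:
            shelf["big"] = payload
            return shelf["big"] == payload
        finally:
            shelf.close()
    except dbm.error:  # dbm.error is a tuple of exception classes
        return False


scratch_dir = tempfile.mkdtemp()
ok = can_store(dbm.dumb.open, os.path.join(scratch_dir, "probe"), 1_000_000)
print(ok)
```

Running the same probe with a few sizes against the backend actually in use gives a concrete answer for that installation.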

Cylindrical answered 12/12, 2017 at 10:39 Comment(0)
