The correct method to associate multiple values to a single key is to pack a list or dictionary value using for instance "json.dumps".
Here is an exemple:
#!/usr/bin/python
from json import dumps
from json import loads
import bsddb
# write
users = bsddb.hashopen("users.db","w")
primarykey = 'amz'
users[primarykey] = dumps(dict(username="amz", age=30, bio="craftsman"))
users.close()
# read
users = bsddb.hashopen("users.db","r")
for key in users.keys():
print loads(users[key])
users.close()
This is the basic pattern to be used with bsddb and is applicable to other key/value dbs like leveldb.
Extra:
Given the fact that bsddb hashmap keys are ordered lexicographically (ie. like python 2 strings) you can build hashmaps with keys having a predicatable order saving you the trouble of going through all the table.
To put that feature to good use you have to build useful keys. Again you need a packing function that translates python sort order to lexigraphic order (ie. 11 > 2
but "11" < "2"
) . Here is an exemple such packing function:
def pack(*values):
def __pack(value):
if type(value) is int:
return '1' + struct.pack('>q', value)
elif type(value) is str:
return '2' + struct.pack('>q', len(value)) + value
else:
data = dumps(value, encoding='utf-8')
return '3' + struct.pack('>q', len(data)) + data
return ''.join(map(__pack, values))
This is a kind of naive, you could go the extra mile and support float
and better pack int
to save space.
For instance, given the simplified schema User(username, age)
you can build another hashmap we call age_index
with which you can easily retrieve every user at
age of 30. The hashmap can look like the following:
key | value
-----------------
29 tom | X
30 amz | X
30 joe | X
30 moh | X
This is a human readable view of the hasmap: the key is actually packed with the above pack
function. As you can see the key is the composition of the age
and the primarykey
of the the item stored previously. In this case the value is not used because we don't need it. Mind the fact that each key is and must be unique.
Once that schema in place, you do "select queries", called range queries in bsddb using Cursor.set_range(key)
. This will set the cursor at the nearest key and return the key/value pair associated (the semantic can be a bit different depending on the database).
For instance to retrieve the first person that has age=30
, one use the following code:
def get_first_person_with_age_thirty()
key, _ = age_index.set_range(pack(30)) # we don't need value
age, pk = unpack(key)
# set_range, will set the key to "30" and a pk
# or with a key prefix superior to 30.
# So we need to check that age is really 30.
if age == 30:
return loads(users[pk])
This will return the document associated with user amz
.
To go further one needs to make use of the other bsddb interface which entry point is bsddb.db.DB
and bsddb.db.DBEnv
(documentation]. With this interface you can have several hashmap bound to the same transaction context even if it's not required to use transactions.
bsddb
module, since it is deprecated. If you want adict
-like interface to a DB then use theshelve
module. Also: if you want to save more than a single object as a value then use alist
as value andappend
the values when you insert them. – Lemayshelve
? A database in general? – Lemay