What is the difference between pickle and shelve?

When is it appropriate to use pickle, and when is it appropriate to use shelve? That is to say, what do they do differently from each other?

From my research, I understand that pickle can turn a Python object into a stream of bytes, which can then be persisted to a file. Then why do we need shelve as well? Isn't pickle faster?

Nervous answered 5/11, 2010 at 3:42 Comment(3)
Is it the case that pickle is a very low-level tool and shelve gives us more ways to store complex objects? - Nervous
shelve provides a dictionary-style interface to pickling. A dictionary interface to pickling is convenient for implementing things like caching of results (so you don't ever recalculate), with the keys being *args, **kwds and the values being the calculated results. - Simper
Note that shelve DOES NOT enable one to store objects that pickle cannot pickle. If you are looking for a better version of shelve that can both store the majority of Python objects and provide a more flexible dictionary interface to disk or a database, then you might want to look at klepto. See: https://mcmap.net/q/212500/-pickle-versus-shelve-storing-large-dictionaries-in-python - Simper
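The caching idea from the comment above could be sketched roughly as follows. This is only an illustration, assuming Python 3 (where a shelf can be used as a context manager); shelf_cached, slow_square and the cache path are hypothetical names, and building the key with repr() is just one simple way to satisfy shelve's requirement that keys be strings.

```python
import functools
import os
import shelve
import tempfile


def shelf_cached(shelf_path):
    """Hypothetical decorator: keep results in a shelf stored at shelf_path."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwds):
            # Shelf keys must be strings, so derive one from the arguments.
            key = repr((args, sorted(kwds.items())))
            with shelve.open(shelf_path, 'c') as shelf:
                if key not in shelf:
                    # First call with these arguments: compute and store.
                    shelf[key] = func(*args, **kwds)
                return shelf[key]
        return wrapper
    return decorator


calls = []  # records every time the real function body runs


@shelf_cached(os.path.join(tempfile.mkdtemp(), 'square-cache'))
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(12)   # computed and written to the shelf
second = slow_square(12)  # read back from the shelf, not recomputed
```

Because the shelf lives on disk, results cached this way would also survive between runs of the program, provided the same shelf path is reused.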
O
128

pickle is for serializing some object (or objects) as a single bytestream in a file.

shelve builds on top of pickle and implements a persistent, dictionary-like object: each value is pickled and stored under a key (which must be a string), so you can open your shelf file and access individual pickled objects by key. This is more convenient when you are serializing many objects.

Here is an example of each. (These should work in recent versions of Python 2.7 and Python 3.x.)

pickle Example

import pickle

integers = [1, 2, 3, 4, 5]

with open('pickle-example.p', 'wb') as pfile:
    pickle.dump(integers, pfile)

This will dump the integers list to a binary file called pickle-example.p.

Now try reading the pickled file back.

import pickle

with open('pickle-example.p', 'rb') as pfile:
    integers = pickle.load(pfile)
    print(integers)

The above should output [1, 2, 3, 4, 5].

shelve Example

import shelve

integers = [1, 2, 3, 4, 5]

# If you're using Python 2.7, import contextlib and use
# the line:
# with contextlib.closing(shelve.open('shelf-example', 'c')) as shelf:
with shelve.open('shelf-example', 'c') as shelf:
    shelf['ints'] = integers

Notice how you add objects to the shelf via dictionary-like access.

Read the object back in with code like the following:

import shelve

# If you're using Python 2.7, import contextlib and use
# the line:
# with contextlib.closing(shelve.open('shelf-example', 'r')) as shelf:
with shelve.open('shelf-example', 'r') as shelf:
    for key in shelf.keys():
        print(repr(key), repr(shelf[key]))

In Python 3, the output will be 'ints' [1, 2, 3, 4, 5].
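One practical difference is how much has to be read back to get at a single object. The sketch below assumes Python 3 and uses made-up data and file names (records, records.p, records-shelf) purely for illustration: the pickled dictionary has to be loaded as a whole, while the shelf lets you fetch one value by key.

```python
import os
import pickle
import shelve
import tempfile

# Hypothetical data and file locations, for illustration only.
records = {'alice': [1, 2, 3], 'bob': [4, 5], 'carol': [6]}
workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, 'records.p')
shelf_path = os.path.join(workdir, 'records-shelf')

# pickle: the whole dictionary is written, and later read, as one unit.
with open(pickle_path, 'wb') as pfile:
    pickle.dump(records, pfile)

with open(pickle_path, 'rb') as pfile:
    loaded = pickle.load(pfile)  # every entry is deserialized here
bob_from_pickle = loaded['bob']

# shelve: each value is pickled separately under its own string key.
with shelve.open(shelf_path, 'c') as shelf:
    shelf.update(records)

with shelve.open(shelf_path, 'r') as shelf:
    bob_from_shelf = shelf['bob']  # only this value is unpickled
    shelf_keys = sorted(shelf.keys())
```

For a handful of small objects the two approaches are roughly interchangeable; the shelf tends to pay off when there are many objects and you only need a few of them at a time.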

Ornstead answered 5/11, 2010 at 3:47 Comment(6)
Can you shed some light on what the 'c' flag is for in the call to shelve.open? I googled but could not find an answer. Thanks in advance. - Ernest
@Ernest the 'c' flag tells shelve to open the file for reading and writing, creating the file if needed. It's also described in shelve.open's documentation, which shares the same flags as anydbm.open (dbm.open in Python 3). - Ornstead
@birryree What version of Python does your example work with? On 2.7.10, it's indicating that there is no context manager implemented for shelve: ----> 5 with shelve.open('shelf-example-file.txt', 'c') as shelf: 6 for k, v in my_dict: 7 shelf[k] = v AttributeError: DbfilenameShelf instance has no attribute '__exit__' - Rapper
@Rapper I must have verified my code in 3.x but not 2.7.x, probably assuming context manager usage wouldn't be different between them. I updated my examples to say that if you're using Python 2.7, you should import contextlib in your code and then wrap your shelve.open with contextlib.closing(). - Ornstead
This answer would benefit from a paragraph about when to prefer which, and whether pickle is actually significantly faster. - Rapp
This answer fails to explain: why not just build the dictionary and store it with pickle (since it can serialize dictionaries)? To my understanding, the actual reason (shown by the code, but not addressed in the explanation) is that you can work with the shelf "as a dictionary" while the underlying file is still open, and without reading the whole thing into memory. That is to say, shelve doesn't have separate dump/load steps; access is provided by opening the shelf. - Murmur
G
10

According to the pickle documentation:

Serialization is a more primitive notion than persistence; although pickle reads and writes file objects, it does not handle the issue of naming persistent objects, nor the (even more complicated) issue of concurrent access to persistent objects. The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure. Perhaps the most obvious thing to do with these byte streams is to write them onto a file, but it is also conceivable to send them across a network or store them in a database. The shelve module provides a simple interface to pickle and unpickle objects on DBM-style database files.
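To illustrate the point that the byte stream does not have to go to a file, here is a minimal sketch using pickle.dumps and pickle.loads, which work on in-memory bytes. The settings dictionary is a made-up example; the resulting bytes could just as well be sent over a socket or stored in a database column.

```python
import pickle

# Hypothetical object, for illustration only.
settings = {'name': 'example', 'sizes': (1, 2, 3), 'nested': {'ok': True}}

# Serialize to a bytes object instead of writing to a file.
payload = pickle.dumps(settings)

# Rebuild an equal, but distinct, object from those bytes.
restored = pickle.loads(payload)
```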

Gardal answered 27/1, 2020 at 21:24 Comment(0)
