Saving and loading multiple objects in pickle file?
O

9

100

I have a class that serves players in a game, creates them and other things.

I need to save these player objects in a file so I can use them later. I've tried the pickle module, but I don't know how to save multiple objects and then load them again. Is there a way to do that, or should I put my objects in a container such as a list and save and load that instead?

Is there a better way?

Outlying answered 21/12, 2013 at 7:47 Comment(3)
Using list as container seems reasonable.Gleam
Asking a year later: can't we use Python's shelve library for the same task? If not, what would be the drawback?Peripeteia
Something you can use to inspect pickle files : pickleviewer.comBurgle
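
On the shelve question raised in the comments: a minimal sketch of what that could look like, assuming each player can be stored under a unique string key. The player records, the key names, and the file path below are hypothetical and only there for illustration. Known trade-offs from the standard library documentation: keys must be strings, the on-disk format depends on which dbm backend is available, and shelve does not support concurrent read/write access.

```python
import os
import shelve
import tempfile

# Hypothetical player records; plain dicts keep the sketch short.
players = {
    "alice": {"level": 3, "score": 120},
    "bob": {"level": 1, "score": 15},
}

# shelve may create one or more files derived from this base path.
db_path = os.path.join(tempfile.mkdtemp(), "players_shelf")

# Store each player under its own string key.
with shelve.open(db_path) as db:
    for name, record in players.items():
        db[name] = record

# Later: fetch a single player by key instead of loading everything.
with shelve.open(db_path) as db:
    bob = db["bob"]
    stored_names = sorted(db.keys())

print(bob)
print(stored_names)
```

Each value is still pickled under the hood, so the usual pickle caveats (for example, only load data you trust) apply here too.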
C
109

Using a list, tuple, or dict is by far the most common way to do this:

import pickle
PIK = "pickle.dat"

data = ["A", "b", "C", "d"]
with open(PIK, "wb") as f:
    pickle.dump(data, f)
with open(PIK, "rb") as f:
    print(pickle.load(f))

That prints:

['A', 'b', 'C', 'd']

However, a pickle file can contain any number of pickles. Here's code producing the same output. But note that it's harder to write and to understand:

with open(PIK, "wb") as f:
    pickle.dump(len(data), f)
    for value in data:
        pickle.dump(value, f)
data2 = []
with open(PIK, "rb") as f:
    for _ in range(pickle.load(f)):
        data2.append(pickle.load(f))
print(data2)

If you do this, you're responsible for knowing how many pickles are in the file you write out. The code above does that by pickling the number of list objects first.
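
Applied to the original question, the list approach might look like the sketch below. The Player class, its attributes, and the file name are assumptions made up for illustration, since the question does not show the real class.

```python
import os
import pickle
import tempfile


class Player:
    """Hypothetical stand-in for the asker's player class."""

    def __init__(self, name, score):
        self.name = name
        self.score = score


players = [Player("alice", 120), Player("bob", 15)]
path = os.path.join(tempfile.mkdtemp(), "players.pkl")

# One dump call stores the whole list, including every Player in it.
with open(path, "wb") as f:
    pickle.dump(players, f)

# One load call restores the list. The Player class must be defined or
# importable in the program that loads the file.
with open(path, "rb") as f:
    restored = pickle.load(f)

print([(p.name, p.score) for p in restored])
```

Pickle stores the instance attributes plus a reference to the class by name, not the class code itself, which is why the loading side needs the same class available.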

Camel answered 22/12, 2013 at 2:38 Comment(4)
Thanks, I get your idea, but I thought a list of many objects might cause memory issues, so I decided to save each player in a separate file. Do you think keeping the pickled objects in one list may cause memory problems?Outlying
Don't have enough info. How many players? How big is each player's pickle? How much RAM is available? If you have a great many players, it would be best to incorporate a database and store pickles in that (instead of inventing your own database, one painful step at a time).Camel
Why do all pickle examples always use binary mode? Binary file writing is one frontier my work has not yet broached whatsoever...I know nothing about it or why anyone uses it anywhere.Mark
@Aerovistae binary mode is used because Windows will mess with end-of-line characters in text mode.Genia
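
To make the binary-mode point concrete, here is a small sketch, assuming Python 3. A pickle is a sequence of bytes rather than text, so a file opened in text mode refuses it. The file name and variable names are illustrative only.

```python
import os
import pickle
import tempfile

data = ["A", "b", "C", "d"]
path = os.path.join(tempfile.mkdtemp(), "pickle.dat")

# A pickle is a byte string, not text.
payload = pickle.dumps(data)
print(type(payload))

# In Python 3, a file opened in text mode rejects bytes with a TypeError.
try:
    with open(path, "w") as f:
        pickle.dump(data, f)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc
print(text_mode_error)

# Binary mode writes the bytes through unchanged.
with open(path, "wb") as f:
    pickle.dump(data, f)
with open(path, "rb") as f:
    restored = pickle.load(f)
print(restored)
```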
S
158

Two additions to Tim Peters' accepted answer.

First, you need not store the number of items you pickled separately if you stop loading when you hit the end of the file:

import pickle

def loadall(filename):
    with open(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                break

items = loadall(myfilename)

This assumes the file contains only pickles; the generator will try to unpickle anything else it finds in the file too, which could be dangerous.

Second, this way, you do not get a list but rather a generator. This will load only one item into memory at a time, which is useful if the dumped data is very large -- one possible reason why you may have wanted to pickle multiple items separately in the first place. You can still iterate over items with a for loop as if it were a list.
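
A possible end-to-end use of this generator is sketched below. The file name and the sample items are made up for illustration. It also shows that opening the file in append mode ("ab") adds a further pickle without rewriting the ones already stored, which is one practical reason to pickle items separately.

```python
import os
import pickle
import tempfile


def loadall(filename):
    """Yield each pickled object in the file, one at a time."""
    with open(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                break


path = os.path.join(tempfile.mkdtemp(), "items.pkl")

# Write three separate pickles to one file.
with open(path, "wb") as f:
    for item in ({"id": 1}, {"id": 2}, {"id": 3}):
        pickle.dump(item, f)

# Append mode adds another pickle after the existing ones.
with open(path, "ab") as f:
    pickle.dump({"id": 4}, f)

# Iterate lazily: only one item is unpickled per loop step.
ids = [item["id"] for item in loadall(path)]
print(ids)
```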

Sexdecillion answered 26/2, 2015 at 15:9 Comment(7)
This should be the top answerSubtorrid
Just be aware that calling loadall(myfilename) does not actually load the data or read from the file until you iterate over the result. If you want to load them immediately, use something like list(loadall(myfilename)) or a for loop.Galer
Will this approach not leave the file handle open until the generator happens to be garbage collected, leading to potential locking issues? To solve this, should we put the yield outside the with open() block? Granted this leads to unnecessary reads to iterate through the pickle file, but I think I'd prefer this to dangling file handles. Unless we are sure this method will always be called quickly to EOF, and we close the file when the end of the file is reached. (But if we're bothering to yield individual elements it is probably because we don't need to unpickle all objects in a file.)Zealotry
@Chris: If the iterator is used to its end, the with open will terminate and properly close the file. If it may not be used to its end, we will often not care about the open file. If it may not be used to its end and we don't like the open file, then, yes, the above construction is not the best way to go.Sexdecillion
IMO, we don't need a generator. Loading 1 item at a time is done by pickle.load, not by the generator, isn't it? As Chris and Lutz mentioned, the loadall method is supposed to be used until EOF because of closing, but if that's the case, why do we use a generator in the first place? :)Linguiform
@starriet: because it saves the caller the hassle of opening and closing a file themselves and so allows for simple and idiomatic code at the place where the pickle items are used.Sexdecillion
@starriet - also because some APIs require that a generator be passed as an argument.Rehnberg
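
For the open-file concern discussed in these comments, one option is to close the generator explicitly when stopping early. The sketch below wraps it in contextlib.closing; the opened list is demo-only bookkeeping added here so the file's state can be inspected, and is not part of the original answer.

```python
import contextlib
import os
import pickle
import tempfile

opened = []  # demo only: remembers the file object so we can inspect it


def loadall(filename):
    """Same generator as in the answer, plus the demo-only bookkeeping."""
    with open(filename, "rb") as f:
        opened.append(f)
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                break


path = os.path.join(tempfile.mkdtemp(), "items.pkl")
with open(path, "wb") as f:
    for n in range(5):
        pickle.dump(n, f)

# closing() calls items.close() on exit. That raises GeneratorExit at the
# paused yield, which unwinds the with block inside loadall and closes the file.
with contextlib.closing(loadall(path)) as items:
    first = next(items)
    still_open = not opened[0].closed

print(first, still_open, opened[0].closed)
```

This keeps the convenience of the generator while making the moment the file is closed deterministic, even when only the first few items are consumed.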
H
29

Try this:

import pickle

obj_1 = ['test_1', {'ability', 'mobility'}]
obj_2 = ['test_2', {'ability', 'mobility'}]
obj_3 = ['test_3', {'ability', 'mobility'}]

# Dump the three objects one after another into the same file.
with open('test.pkl', 'wb') as f:
    pickle.dump(obj_1, f)
    pickle.dump(obj_2, f)
    pickle.dump(obj_3, f)

# Load them back in the same order they were dumped.
with open('test.pkl', 'rb') as f:
    obj_1 = pickle.load(f)
    obj_2 = pickle.load(f)
    obj_3 = pickle.load(f)

print(obj_1)
print(obj_2)
print(obj_3)
Halophyte answered 19/2, 2018 at 16:56 Comment(0)
L
13

If you're dumping it iteratively, you'd have to read it iteratively as well.

You can run a loop (as the accepted answer shows) to keep unpickling rows until you reach the end-of-file (at which point an EOFError is raised).

data = []
with open("data.pickle", "rb") as f:
    while True:
        try:
            data.append(pickle.load(f))
        except EOFError:
            break

Minimal Verifiable Example

import pickle

# Dumping step
data = [{'a': 1}, {'b': 2}]
with open('test.pkl', 'wb') as f:
    for d in data:
        pickle.dump(d, f)

# Loading step
data2 = []
with open('test.pkl', 'rb') as f:
    while True:
        try:
            data2.append(pickle.load(f))
        except EOFError:
            break

data2
# [{'a': 1}, {'b': 2}]

data == data2
# True

Of course, this assumes that your objects have to be pickled individually. You can also store your data as a single list of objects, then use a single pickle/unpickle call (no need for loops).

data = [{'a':1}, {'b':2}]  # list of dicts as an example
with open('test.pkl', 'wb') as f:
    pickle.dump(data, f)

with open('test.pkl', 'rb') as f:
    data2 = pickle.load(f)

data2
# [{'a': 1}, {'b': 2}]
Latonya answered 26/6, 2019 at 6:6 Comment(0)
D
8

Here is an object-oriented demo that uses pickle to store and restore one or multiple objects:

import pickle

class Worker(object):

    def __init__(self, name, addr):
        self.name = name
        self.addr = addr

    def __str__(self):
        string = u'[<Worker> name:%s addr:%s]' %(self.name, self.addr)
        return string

# output one item
with open('testfile.bin', 'wb') as f:
    w1 = Worker('tom1', 'China')
    pickle.dump(w1, f)

# input one item
with open('testfile.bin', 'rb') as f:
    w1_restore = pickle.load(f)
print('item: %s' % w1_restore)

# output multi items
with open('testfile.bin', 'wb') as f:
    w1 = Worker('tom2', 'China')
    w2 = Worker('tom3', 'China')
    pickle.dump([w1, w2], f)

# input multi items
with open('testfile.bin', 'rb') as f:
    w_list = pickle.load(f)

for w in w_list:
    print('item-list: %s' % w)

output:

item: [<Worker> name:tom1 addr:China]
item-list: [<Worker> name:tom2 addr:China]
item-list: [<Worker> name:tom3 addr:China]
Delicacy answered 9/5, 2015 at 1:23 Comment(0)
T
0

It's easy if you use klepto, which gives you the ability to transparently store objects in files or databases. It uses a dict API, and allows you to dump and/or load specific entries from an archive (in the case below, serialized objects stored one entry per file in a directory called scores).

>>> import klepto
>>> scores = klepto.archives.dir_archive('scores', serialized=True)
>>> scores['Guido'] = 69 
>>> scores['Fernando'] = 42
>>> scores['Polly'] = 101
>>> scores.dump()
>>> # access the archive, and load only one 
>>> results = klepto.archives.dir_archive('scores', serialized=True)
>>> results.load('Polly')
>>> results
dir_archive('scores', {'Polly': 101}, cached=True)
>>> results['Polly']
101
>>> # load all the scores
>>> results.load()
>>> results['Guido']
69
>>>
Thebes answered 19/2, 2016 at 13:16 Comment(0)
P
0

Here is how to dump two (or more) dictionaries using pickle, and then load them back:

import pickle

dict_1 = {1: 'one', 2: 'two'}
dict_2 = {1: {1: 'one'}, 2: {2: 'two'}}

F = open('data_file1.pkl', 'wb')
pickle.dump(dict_1, F)
pickle.dump(dict_2, F)
F.close()

=========================================

import pickle

dict_1 = {1: 'one', 2: 'two'}
dict_2 = {1: {1: 'one'}, 2: {2: 'two'}}

F = open('data_file1.pkl', 'rb')
G = pickle.load(F)
print(G)
H = pickle.load(F)
print(H)
F.close()
Ptomaine answered 22/9, 2020 at 10:43 Comment(0)
P
0

Suppose objects of an Employee class have been saved in a file. Here is code to read all of them, one by one, from the file:

import pickle

# Employee (with its ShowRecord method) must be defined or imported before loading.
with open(filename, 'rb') as a:
    while True:
        try:
            e = pickle.load(a)
            e.ShowRecord()
        except EOFError:
            break
Phosphaturia answered 15/12, 2020 at 18:59 Comment(0)
W
0

Dictionaries are handy for saving any sort of object with pickle, since you can retrieve each one by its key.

import pickle

def save_to_pickle(filepath, **kwargs):
    objects_dict = kwargs
    with open(filepath, 'wb') as file:
        pickle.dump(objects_dict, file)


a = [1, 2, 3]
b = {"x": 10, "y": 20}
filepath = 'saved_objects.pkl'
save_to_pickle(filepath, array_a=a, dictionary_b=b)

with open(filepath, "rb") as file:
    print(pickle.load(file)["array_a"])

Wilterdink answered 28/2, 2024 at 14:18 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.