Dictionary vs Object - which is more efficient and why?
Asked Answered
M

8

150

What is more efficient in Python in terms of memory usage and CPU consumption - Dictionary or Object?

Background: I have to load huge amount of data into Python. I created an object that is just a field container. Creating 4M instances and putting them into a dictionary took about 10 minutes and ~6GB of memory. After dictionary is ready, accessing it is a blink of an eye.

Example: To check the performance I wrote two simple programs that do the same - one is using objects, other dictionary:

Object (execution time ~18sec):

class Obj(object):
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

Dictionary (execution time ~12sec):

all = {}
for i in range(1000000):
  o = {}
  o['i'] = i
  o['l'] = []
  all[i] = o

Question: Am I doing something wrong or dictionary is just faster than object? If indeed dictionary performs better, can somebody explain why?

Millrace answered 26/8, 2009 at 18:55 Comment(2)
You should really use xrange instead of range when generating large sequences like that. Of course, since you're dealing with seconds of execution time, it won't make much difference, but still, it's a good habit.Sympathy
unless if it is python3Mealy
B
183

Have you tried using __slots__?

From the documentation:

By default, instances of both old and new-style classes have a dictionary for attribute storage. This wastes space for objects having very few instance variables. The space consumption can become acute when creating large numbers of instances.

The default can be overridden by defining __slots__ in a new-style class definition. The __slots__ declaration takes a sequence of instance variables and reserves just enough space in each instance to hold a value for each variable. Space is saved because __dict__ is not created for each instance.

So does this save time as well as memory?

Comparing the three approaches on my computer:

test_slots.py:

class Obj(object):
  __slots__ = ('i', 'l')
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

test_obj.py:

class Obj(object):
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

test_dict.py:

all = {}
for i in range(1000000):
  o = {}
  o['i'] = i
  o['l'] = []
  all[i] = o

test_namedtuple.py (supported in 2.6):

import collections

Obj = collections.namedtuple('Obj', 'i l')

all = {}
for i in range(1000000):
  all[i] = Obj(i, [])

Run benchmark (using CPython 2.5):

$ lshw | grep product | head -n 1
          product: Intel(R) Pentium(R) M processor 1.60GHz
$ python --version
Python 2.5
$ time python test_obj.py && time python test_dict.py && time python test_slots.py 

real    0m27.398s (using 'normal' object)
real    0m16.747s (using __dict__)
real    0m11.777s (using __slots__)

Using CPython 2.6.2, including the named tuple test:

$ python --version
Python 2.6.2
$ time python test_obj.py && time python test_dict.py && time python test_slots.py && time python test_namedtuple.py 

real    0m27.197s (using 'normal' object)
real    0m17.657s (using __dict__)
real    0m12.249s (using __slots__)
real    0m12.262s (using namedtuple)

So yes (not really a surprise), using __slots__ is a performance optimization. Using a named tuple has similar performance to __slots__.

Barbarbarbara answered 26/8, 2009 at 19:17 Comment(8)
That is great - thanks! I've tried the same on my machine - object with slots is the most efficient approach (I got ~7sec).Millrace
There are also named tuples, docs.python.org/library/collections.html#collections.namedtuple , a class factory for objects with slots. It definitely neater and maybe even more optimized.Encode
I tested named tuples as well, and updated the answer with the results.Barbarbarbara
I ran your code a few times and was surprised my results differ - slots=3sec obj=11sec dict=12sec namedtuple=16sec. I'm using CPython 2.6.6 on Win7 64bitHelbonna
To emphasize the punchline - namedtuple got the worst results instead of the bestHelbonna
Strange. I'll try the benchmark on Windows when I get the chance.Barbarbarbara
Why is using a normal object so much slower than using __dict__ if, under the hood, the normal object is using dictionaries?Nadene
Hi. I tried the benchmark proposed here and the best time was obtained using a plain dictionary (Wall time: 1.98 s). I did the test in a jupyter notebook using %%time and %%timeit. Python version 3.8.5 and OS Windows 8.1. Is this "normal"? I thought that the __slots__ test (Wall time: 8.8 s) would outperform the other tests.Behlau
P
17

Attribute access in an object uses dictionary access behind the scenes - so by using attribute access you are adding extra overhead. Plus in the object case, you are incurring additional overhead because of e.g. additional memory allocations and code execution (e.g. of the __init__ method).

In your code, if o is an Obj instance, o.attr is equivalent to o.__dict__['attr'] with a small amount of extra overhead.

Porcine answered 26/8, 2009 at 19:3 Comment(3)
Did you test this? o.__dict__["attr"] is the one with extra overhead, taking an extra bytecode op; obj.attr is faster. (Of course attribute access isn't going to be slower than subscription access--it's a critical, heavily optimized code path.)Diallage
Obviously if you actually do o.__dict__["attr"] it will be slower - I only meant to say that it was equivalent to that, not that it was implemented exactly in that way. I guess it's not clear from my wording. I also mentioned other factors such as memory allocations, constructor call time etc.Porcine
Is this still the case with recent versions of python3, 11 years later?Ungula
B
11

Have you considered using a namedtuple? (link for python 2.4/2.5)

It's the new standard way of representing structured data that gives you the performance of a tuple and the convenience of a class.

It's only downside compared with dictionaries is that (like tuples) it doesn't give you the ability to change attributes after creation.

Banditry answered 26/8, 2009 at 23:39 Comment(0)
A
10

Here is a copy of @hughdbrown answer for python 3.6.1, I've made the count 5x larger and added some code to test the memory footprint of the python process at the end of each run.

Before the downvoters have at it, Be advised that this method of counting the size of objects is not accurate.

from datetime import datetime
import os
import psutil

process = psutil.Process(os.getpid())


ITER_COUNT = 1000 * 1000 * 5

RESULT=None

def makeL(i):
    # Use this line to negate the effect of the strings on the test 
    # return "Python is smart and will only create one string with this line"

    # Use this if you want to see the difference with 5 million unique strings
    return "This is a sample string %s" % i

def timeit(method):
    def timed(*args, **kw):
        global RESULT
        s = datetime.now()
        RESULT = method(*args, **kw)
        e = datetime.now()

        sizeMb = process.memory_info().rss / 1024 / 1024
        sizeMbStr = "{0:,}".format(round(sizeMb, 2))

        print('Time Taken = %s, \t%s, \tSize = %s' % (e - s, method.__name__, sizeMbStr))

    return timed

class Obj(object):
    def __init__(self, i):
       self.i = i
       self.l = makeL(i)

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
       self.i = i
       self.l = makeL(i)

from collections import namedtuple
NT = namedtuple("NT", ["i", 'l'])

@timeit
def profile_dict_of_nt():
    return [NT(i=i, l=makeL(i)) for i in range(ITER_COUNT)]

@timeit
def profile_list_of_nt():
    return dict((i, NT(i=i, l=makeL(i))) for i in range(ITER_COUNT))

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': makeL(i)}) for i in range(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': makeL(i)} for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_slot():
    return dict((i, SlotObj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_slot():
    return [SlotObj(i) for i in range(ITER_COUNT)]

profile_dict_of_nt()
profile_list_of_nt()
profile_dict_of_dict()
profile_list_of_dict()
profile_dict_of_obj()
profile_list_of_obj()
profile_dict_of_slot()
profile_list_of_slot()

And these are my results

Time Taken = 0:00:07.018720,    provile_dict_of_nt,     Size = 951.83
Time Taken = 0:00:07.716197,    provile_list_of_nt,     Size = 1,084.75
Time Taken = 0:00:03.237139,    profile_dict_of_dict,   Size = 1,926.29
Time Taken = 0:00:02.770469,    profile_list_of_dict,   Size = 1,778.58
Time Taken = 0:00:07.961045,    profile_dict_of_obj,    Size = 1,537.64
Time Taken = 0:00:05.899573,    profile_list_of_obj,    Size = 1,458.05
Time Taken = 0:00:06.567684,    profile_dict_of_slot,   Size = 1,035.65
Time Taken = 0:00:04.925101,    profile_list_of_slot,   Size = 887.49

My conclusion is:

  1. Slots have the best memory footprint and are reasonable on speed.
  2. dicts are the fastest, but use the most memory.
August answered 5/7, 2017 at 1:40 Comment(5)
Man, you should turn this into a question. I ran it on my own computer, too, just to make sure (I didn't have psutil installed, so I took that part out). Anyway, this is baffling to me, and means the original question is not fully answered. All the other answers are like "namedtuple is great" and "use slots", and apparently a brand new dict object every time is faster than them? I guess dicts are really well optimised?Micropyle
It seems to be the result of the makeL function returning a string. If you return an empty list, instead, the results roughly match up with hughdbrown's from python2. Except namedtuples are always slower than SlotObj :(Micropyle
There could be a small problem: makeL could run with different speeds in each '@timeit' round since strings are cached in python - but maybe I'm wrong.Mealy
@BarnabasSzabolcs should create a new string every time because it has to substitute in the value "This is a sample string %s" % iAugust
Yes, that's true within the loop, but in the second test i starts from 0 again.Mealy
C
4
from datetime import datetime

ITER_COUNT = 1000 * 1000

def timeit(method):
    def timed(*args, **kw):
        s = datetime.now()
        result = method(*args, **kw)
        e = datetime.now()

        print method.__name__, '(%r, %r)' % (args, kw), e - s
        return result
    return timed

class Obj(object):
    def __init__(self, i):
       self.i = i
       self.l = []

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
       self.i = i
       self.l = []

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': []}) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': []} for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_slotobj():
    return dict((i, SlotObj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_slotobj():
    return [SlotObj(i) for i in xrange(ITER_COUNT)]

if __name__ == '__main__':
    profile_dict_of_dict()
    profile_list_of_dict()
    profile_dict_of_obj()
    profile_list_of_obj()
    profile_dict_of_slotobj()
    profile_list_of_slotobj()

Results:

hbrown@hbrown-lpt:~$ python ~/Dropbox/src/StackOverflow/1336791.py 
profile_dict_of_dict ((), {}) 0:00:08.228094
profile_list_of_dict ((), {}) 0:00:06.040870
profile_dict_of_obj ((), {}) 0:00:11.481681
profile_list_of_obj ((), {}) 0:00:10.893125
profile_dict_of_slotobj ((), {}) 0:00:06.381897
profile_list_of_slotobj ((), {}) 0:00:05.860749
Coated answered 26/8, 2009 at 19:19 Comment(0)
P
3

There is no question.
You have data, with no other attributes (no methods, nothing). Hence you have a data container (in this case, a dictionary).

I usually prefer to think in terms of data modeling. If there is some huge performance issue, then I can give up something in the abstraction, but only with very good reasons.
Programming is all about managing complexity, and the maintaining the correct abstraction is very often one of the most useful way to achieve such result.

About the reasons an object is slower, I think your measurement is not correct.
You are performing too little assignments inside the for loop, and therefore what you see there is the different time necessary to instantiate a dict (intrinsic object) and a "custom" object. Although from the language perspective they are the same, they have quite a different implementation.
After that, the assignment time should be almost the same for both, as in the end members are maintained inside a dictionary.

Passenger answered 26/8, 2009 at 19:28 Comment(0)
L
0

There is yet another way with the help of recordclass library to reduce memory usage if data structure isn't supposed to contain reference cycles.

Let's compare two classes:

class DataItem:
    __slots__ = ('name', 'age', 'address')
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address

and

$ pip install recordclass

>>> from recordclass import make_dataclass
>>> DataItem2 = make_dataclass('DataItem', 'name age address')
>>> inst = DataItem('Mike', 10, 'Cherry Street 15')
>>> inst2 = DataItem2('Mike', 10, 'Cherry Street 15')
>>> print(inst2)
DataItem(name='Mike', age=10, address='Cherry Street 15')
>>> print(sys.getsizeof(inst), sys.getsizeof(inst2))
64 40

It became possible since dataobject-based subclasses doesn't support cyclic garbage collection, which is not needed in such cases.

Lomond answered 25/12, 2018 at 10:48 Comment(0)
M
0

Here are my test runs of the very nice script of @Jarrod-Chesney. For comparison, I also run it against python2 with "range" replaced by "xrange".

By curiosity, I also added similar tests with OrderedDict (ordict) for comparison.

Python 3.6.9:

Time Taken = 0:00:04.971369,    profile_dict_of_nt,     Size = 944.27
Time Taken = 0:00:05.743104,    profile_list_of_nt,     Size = 1,066.93
Time Taken = 0:00:02.524507,    profile_dict_of_dict,   Size = 1,920.35
Time Taken = 0:00:02.123801,    profile_list_of_dict,   Size = 1,760.9
Time Taken = 0:00:05.374294,    profile_dict_of_obj,    Size = 1,532.12
Time Taken = 0:00:04.517245,    profile_list_of_obj,    Size = 1,441.04
Time Taken = 0:00:04.590298,    profile_dict_of_slot,   Size = 1,030.09
Time Taken = 0:00:04.197425,    profile_list_of_slot,   Size = 870.67

Time Taken = 0:00:08.833653,    profile_ordict_of_ordict, Size = 3,045.52
Time Taken = 0:00:11.539006,    profile_list_of_ordict, Size = 2,722.34
Time Taken = 0:00:06.428105,    profile_ordict_of_obj,  Size = 1,799.29
Time Taken = 0:00:05.559248,    profile_ordict_of_slot, Size = 1,257.75

Python 2.7.15+:

Time Taken = 0:00:05.193900,    profile_dict_of_nt,     Size = 906.0
Time Taken = 0:00:05.860978,    profile_list_of_nt,     Size = 1,177.0
Time Taken = 0:00:02.370905,    profile_dict_of_dict,   Size = 2,228.0
Time Taken = 0:00:02.100117,    profile_list_of_dict,   Size = 2,036.0
Time Taken = 0:00:08.353666,    profile_dict_of_obj,    Size = 2,493.0
Time Taken = 0:00:07.441747,    profile_list_of_obj,    Size = 2,337.0
Time Taken = 0:00:06.118018,    profile_dict_of_slot,   Size = 1,117.0
Time Taken = 0:00:04.654888,    profile_list_of_slot,   Size = 964.0

Time Taken = 0:00:59.576874,    profile_ordict_of_ordict, Size = 7,427.0
Time Taken = 0:10:25.679784,    profile_list_of_ordict, Size = 11,305.0
Time Taken = 0:05:47.289230,    profile_ordict_of_obj,  Size = 11,477.0
Time Taken = 0:00:51.485756,    profile_ordict_of_slot, Size = 11,193.0

So, on both major versions, the conclusions of @Jarrod-Chesney are still looking good.

Mature answered 18/4, 2020 at 12:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.