How to get the difference between two dictionaries in Python?
K

21

168

I have two dictionaries, and I need to find the difference between the two, which should give me both a key and a value.

I have searched and found some addons/packages like datadiff and dictdiff-master, but when I try to import them in Python 2.7, it says that no such modules are defined.

I used a set here:

first_dict = {}
second_dict = {}
 
value = set(second_dict) - set(first_dict)
print value

My output is:

>>> set(['SCD-3547', 'SCD-3456'])

I am getting only keys, and I need to also get the values.

Karyn answered 28/9, 2015 at 4:20 Comment(1)
Would you also need to find a difference if the keys are identical but their values differ?Talca
G
115

Try the following snippet, using a dictionary comprehension:

value = { k : second_dict[k] for k in set(second_dict) - set(first_dict) }

In the above code we find the difference of the keys and then rebuild a dict taking the corresponding values.
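
As a minimal sketch of the snippet above, here it is on sample data (the two dictionaries are made up for illustration). On Python 3, `dict.keys()` returns a set-like view, so the explicit `set(...)` calls are optional:

```python
# Sample data, invented for illustration only.
first_dict = {'SCD-1000': 'open'}
second_dict = {'SCD-1000': 'open', 'SCD-3547': 'closed', 'SCD-3456': 'open'}

# Keys present in second_dict but not in first_dict, with their values.
value = {k: second_dict[k] for k in set(second_dict) - set(first_dict)}

# Same result using the key views directly (Python 3).
value_from_views = {k: second_dict[k] for k in second_dict.keys() - first_dict.keys()}

print(value)
```

Note that this only reports keys that are new in `second_dict`; keys that exist in both dictionaries with different values are not included.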

Gametophore answered 28/9, 2015 at 4:26 Comment(14)
Since both dict and set are hashmaps, I don't know why dict can't support a difference() method, given that set does.Broadside
That just gives you the dict for the keys which were in the second dict but not in the first. What about the things which were in the first but not in second?Sicular
@Sicular You can do something like the following to also compare values: value = { k : second_dict[k] for k, _ in set(second_dict.items()) - set(first_dict.items()) } Doing dict.items() gives tuples from keys to values, and those are compared in the set difference. So this will give all new keys as well as changed values.Vaporing
@Vaporing - Yes! I'll update my answer to reflect this... https://mcmap.net/q/143503/-how-to-get-the-difference-between-two-dictionaries-in-pythonSicular
I am getting TypeError: unhashable type: 'dict'Becker
To be sure to see a difference that might be there you have to do it again while switching second_dict and first_dict making the result symmetric.Nunnally
I think you don't need to convert the dict keys to sets, you can just use the set operations on the keys, like so: second_dict.keys() - first_dict.keys()Unstuck
How about { k: v for k,v in dict1.items() if k not in dict2} ?Baily
I'm impressed how @Sicular used these comments to enhance his submitted answer to the OP's question. Nice 'circle back around'.Stonemason
@Óscar López I have a question, what if it has the same keys but a different hash.Vickers
Set elements must be immutable. For example, a tuple may be included in a set: x = { 10 , ('a', 'b', 'c'), 'hello' , 2.71} But lists and dictionaries are mutable, so they can't be set elements, so you get TypeError: unhashable type: 'list' or TypeError: unhashable type: 'dict'Jacintojack
@MuneebAhmadKhurram The hash of a key is used for lookup, not for determining if a key is already present. There is always the possibility of a hash collision, as a hash function only guarantees that hash(x) == hash(y) if x == y; it says nothing about the relationship between hash(x) and hash(y) if x != y. When distinct keys have the same hash, it can slow down key access, but not break it.Morbilli
This doesn't work for dictionaries with same keys but different valuesMessiah
@Messiah lol, and it will never, ever work like that. Dictionaries by their own definition map a single key to a unique value, there's no such thing as having "same keys with different values", you want a multimap for that.Caracara
C
220

I think it's better to use the symmetric difference operation of sets for this (see the Python documentation on set types).

>>> dict1 = {1:'donkey', 2:'chicken', 3:'dog'}
>>> dict2 = {1:'donkey', 2:'chimpansee', 4:'chicken'}
>>> set1 = set(dict1.items())
>>> set2 = set(dict2.items())
>>> set1 ^ set2
{(2, 'chimpansee'), (4, 'chicken'), (2, 'chicken'), (3, 'dog')}

It is symmetric because:

>>> set2 ^ set1
{(2, 'chimpansee'), (4, 'chicken'), (2, 'chicken'), (3, 'dog')}

This is not the case when using the difference operator.

>>> set1 - set2
{(2, 'chicken'), (3, 'dog')}
>>> set2 - set1
{(2, 'chimpansee'), (4, 'chicken')}

However it may not be a good idea to convert the resulting set to a dictionary because you may lose information:

>>> dict(set1 ^ set2)
{2: 'chicken', 3: 'dog', 4: 'chicken'}
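
One way to avoid losing information is to group the differing pairs by key and record which dictionary each value came from. This is only a sketch; the name `changed` and the `'dict1'`/`'dict2'` labels are illustrative choices, not part of the answer above:

```python
from collections import defaultdict

dict1 = {1: 'donkey', 2: 'chicken', 3: 'dog'}
dict2 = {1: 'donkey', 2: 'chimpansee', 4: 'chicken'}

# For every key that differs, remember the value on each side.
changed = defaultdict(dict)
for key, value in set(dict1.items()) - set(dict2.items()):
    changed[key]['dict1'] = value
for key, value in set(dict2.items()) - set(dict1.items()):
    changed[key]['dict2'] = value

result = dict(changed)
print(result)
```

Key 2 now keeps both `'chicken'` and `'chimpansee'`, while keys 3 and 4 each carry a single entry showing which side they came from. Like the set-based code above, this assumes all values are hashable.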
Carlton answered 23/1, 2017 at 14:37 Comment(12)
Excellent, this is more or less the same solution suggested by Raymond Hettinger nearly 9 years ago on another forum: code.activestate.com/recipes/576644-diff-two-dictionaries/#c1Uncurl
Elegant solution. But it cannot apply on the dicts with unhashable values.Pileus
I suspect this answer is what OP really needed and should be the accepted answer.Sybilla
Helps me to do the math in one line : dict(set(a.items()) ^ set(b.items()))Sherrard
TypeError: unhashable type: 'dict'Resurgent
Nice. But you probably also want to know which dict was responsible for which key/value pair.Barbosa
@LeiYang your dictionary maybe surrounded by "[]' which means it is list. So first make dictionary from list.Mas
@LeiYang you get TypeError: unhashable type: 'dict' because one of the values inside your "top level dict" is another dict. This proposed solution only works for flat dictionariesSwineherd
I am using dictionary inside dictionary and it is failing with error message : TypeError: unhashable type: 'dict'Getter
Set elements must be immutable. For example, a tuple may be included in a set: x = { 10 , ('a', 'b', 'c'), 'hello' , 2.71} But lists and dictionaries are mutable, so they can't be set elements, so you get TypeError: unhashable type: 'list' or TypeError: unhashable type: 'dict'Jacintojack
None of the presented solutions which utilize set(...) will operate on unhashable types present in either dictionary keys or values.Cassiani
it won't work if you have nested dictionariesMaestro
D
87

Another solution would be dictdiffer (https://github.com/inveniosoftware/dictdiffer).

import dictdiffer                                          

a_dict = {                                                 
  'a': 'foo',
  'b': 'bar',
  'd': 'barfoo'
}                                                          

b_dict = {                                                 
  'a': 'foo',                                              
  'b': 'BAR',
  'c': 'foobar'
}                                                          

for diff in list(dictdiffer.diff(a_dict, b_dict)):         
    print(diff)

Each diff is a tuple containing the type of change, the path to the entry, and the changed value(s).

('change', 'b', ('bar', 'BAR'))
('add', '', [('c', 'foobar')])
('remove', '', [('d', 'barfoo')])
Donation answered 22/11, 2017 at 11:9 Comment(2)
The most practical solution for debugging.Gatling
The other solutions did not solve my issue because of my dict-in-dict structure. This solution could handle it. Thanks!Excruciating
C
23

You can use DeepDiff:

pip install deepdiff

Among other things, it lets you recursively calculate the difference of dictionaries, iterables, strings and other objects:

>>> from deepdiff import DeepDiff

>>> d1 = {1:1, 2:2, 3:3, "foo":4}
>>> d2 = {1:1, 2:4, 3:3, "bar":5, 6:6}
>>> DeepDiff(d1, d2)
{'dictionary_item_added': [root['bar'], root[6]],
 'dictionary_item_removed': [root['foo']],
 'values_changed': {'root[2]': {'new_value': 4, 'old_value': 2}}}

It lets you see what changed (even types), what was added and what was removed. It also lets you do many other things like ignoring duplicates and ignoring paths (defined by regex).

Creamcups answered 9/1, 2022 at 16:54 Comment(0)
P
20

A solution is to use the unittest module:

from unittest import TestCase
TestCase().assertDictEqual(expected_dict, actual_dict)

Obtained from How can you test that two dictionaries are equal with pytest in python
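
Outside a test run, the assertion raises an `AssertionError` whose message contains a line-by-line diff. A small sketch of capturing that message on Python 3 (the two sample dictionaries are invented for illustration):

```python
from unittest import TestCase

# Sample data, invented for illustration only.
expected_dict = {'a': 1, 'b': 2}
actual_dict = {'a': 1, 'b': 3}

message = None
try:
    TestCase().assertDictEqual(expected_dict, actual_dict)
except AssertionError as exc:
    # The exception text includes both dicts and an ndiff-style comparison.
    message = str(exc)

print(message)
```

This gives a human-readable report rather than a data structure, so it is better suited to debugging than to further processing.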

Pact answered 3/8, 2021 at 20:11 Comment(3)
This is pretty cool but doesn't work in Python 2: ValueError: no such test method in <class 'unittest.case.TestCase'>: runTestGratification
@Gratification All my condolences if you're still using Python 2 !Pact
Please drop python 2 dude - it's no more supported officially by pythonSwirsky
G
12

You were right to look at using a set; we just need to dig a little deeper to get your method to work.

First, the example code:

test_1 = {"foo": "bar", "FOO": "BAR"}
test_2 = {"foo": "bar", "f00": "b@r"}

We can see right now that both dictionaries contain a similar key/value pair:

{"foo": "bar", ...}

Each dictionary also contains a completely different key/value pair. But how do we detect the difference? Dictionaries don't support that. Instead, you'll want to use a set.

Here is how to turn each dictionary into a set we can use:

set_1 = set(test_1.items())
set_2 = set(test_2.items())

This returns a set containing a series of tuples. Each tuple represents one key/value pair from your dictionary.

Now, to find the difference between set_1 and set_2:

print(set_1 - set_2)
>>> {('FOO', 'BAR')}

Want a dictionary back? Easy, just:

dict(set_1 - set_2)
>>> {'FOO': 'BAR'}
Giovanna answered 28/9, 2015 at 4:45 Comment(1)
Please note this is not symmetrical, you'll need to do (set 2 - set 1) in addition to (set 1 - set 2). Else you will not capture all the differences, such as {("f00": "b@r")}, which is missing here in the output.Maniemanifest
A
12

I would recommend using something already written by good developers, like pytest. It can deal with any data type, not only dicts. And, BTW, pytest is very good at testing.

from _pytest.assertion.util import _compare_eq_any

print('\n'.join(_compare_eq_any({'a': 'b'}, {'aa': 'vv'}, verbose=3)))

Output is:

Left contains 1 more item:
{'a': 'b'}
Right contains 1 more item:
{'aa': 'vv'}
Full diff:
- {'aa': 'vv'}
?    -    ^^
+ {'a': 'b'}
?        ^

If you don't like using private functions (those whose names start with _), just have a look at the source code and copy/paste the function into your code.

P.S.: Tested with pytest==6.2.4

Alleyn answered 3/11, 2021 at 19:50 Comment(1)
Thanks so much for this! Incredibly useful for environments where you have pytest but don't want to install a new package just to do a diff in a notebook!Headline
M
10

This is my own version, from combining https://mcmap.net/q/143503/-how-to-get-the-difference-between-two-dictionaries-in-python with https://mcmap.net/q/143503/-how-to-get-the-difference-between-two-dictionaries-in-python, and now I see it is quite similar to https://mcmap.net/q/143503/-how-to-get-the-difference-between-two-dictionaries-in-python:

def dict_diff(dict_a, dict_b, show_value_diff=True):
  result = {}
  result['added']   = {k: dict_b[k] for k in set(dict_b) - set(dict_a)}
  result['removed'] = {k: dict_a[k] for k in set(dict_a) - set(dict_b)}
  if show_value_diff:
    common_keys =  set(dict_a) & set(dict_b)
    result['value_diffs'] = {
      k:(dict_a[k], dict_b[k])
      for k in common_keys
      if dict_a[k] != dict_b[k]
    }
  return result
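
A short usage sketch follows. The function is repeated so the block runs on its own, and the sample dictionaries `old` and `new` are invented for illustration:

```python
def dict_diff(dict_a, dict_b, show_value_diff=True):
    # Same logic as the function above, repeated so this block is self-contained.
    result = {
        'added': {k: dict_b[k] for k in set(dict_b) - set(dict_a)},
        'removed': {k: dict_a[k] for k in set(dict_a) - set(dict_b)},
    }
    if show_value_diff:
        common_keys = set(dict_a) & set(dict_b)
        result['value_diffs'] = {
            k: (dict_a[k], dict_b[k])
            for k in common_keys
            if dict_a[k] != dict_b[k]
        }
    return result


# Sample data; note that the values under 'b' are lists (unhashable).
old = {'a': 1, 'b': [1, 2], 'c': 3}
new = {'a': 1, 'b': [1, 2, 3], 'd': 4}

report = dict_diff(old, new)
print(report)
```

Because only the keys are placed in sets and the values are compared with `!=`, this approach also works when the values are lists or nested dicts.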
Mueller answered 20/8, 2021 at 11:38 Comment(2)
first_dict is not definedLithesome
Thanks for noticing, @Crystal, should be fixed now.Mueller
S
8

This function gives you all the diffs (and what stayed the same) based on the dictionary keys only. It also highlights some nice dict comprehensions, set operations and Python 3.6 type annotations :)

from typing import Dict, Any, Tuple
def get_dict_diffs(a: Dict[str, Any], b: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any], Dict[str, Any]]:

    added_to_b_dict: Dict[str, Any] = {k: b[k] for k in set(b) - set(a)}
    removed_from_a_dict: Dict[str, Any] = {k: a[k] for k in set(a) - set(b)}
    common_dict_a: Dict[str, Any] = {k: a[k] for k in set(a) & set(b)}
    common_dict_b: Dict[str, Any] = {k: b[k] for k in set(a) & set(b)}
    return added_to_b_dict, removed_from_a_dict, common_dict_a, common_dict_b

If you want to compare the dictionary values:

values_in_b_not_a_dict = {k : b[k] for k, _ in set(b.items()) - set(a.items())}
Sicular answered 31/1, 2018 at 14:23 Comment(2)
Wouldn't common_dict_a and common_dict_b be the same? Whatever is common to A and B is one set of key:value pairs. No need to duplicate.Notecase
The keys are the same; but the values might be different. That is why that is there.Mueller
C
8

A function using the symmetric difference set operator, as mentioned in other answers, which preserves the origins of the values:

def diff_dicts(a, b, missing=KeyError):
    """
    Find keys and values which differ from `a` to `b` as a dict.

    If a value differs from `a` to `b` then the value in the returned dict will
    be: `(a_value, b_value)`. If either is missing then the token from 
    `missing` will be used instead.

    :param a: The from dict
    :param b: The to dict
    :param missing: A token used to indicate the dict did not include this key
    :return: A dict of keys to tuples with the matching value from a and b
    """
    return {
        key: (a.get(key, missing), b.get(key, missing))
        for key in dict(
            set(a.items()) ^ set(b.items())
        ).keys()
    }

Example

print(diff_dicts({'a': 1, 'b': 1}, {'b': 2, 'c': 2}))

# {'c': (<class 'KeyError'>, 2), 'a': (1, <class 'KeyError'>), 'b': (1, 2)}

How this works

We use the symmetric difference set operator on the tuples generated from taking items. This generates a set of distinct (key, value) tuples from the two dicts.

We then make a new dict from that to collapse the keys together and iterate over these. These are the only keys that have changed from one dict to the next.

We then compose a new dict using these keys with a tuple of the values from each dict substituting in our missing token when the key isn't present.
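
The three steps can be sketched one at a time, using the same inputs as the example and `KeyError` as the missing token (the intermediate names `pairs` and `changed_keys` are introduced here only for illustration):

```python
a = {'a': 1, 'b': 1}
b = {'b': 2, 'c': 2}

# Step 1: (key, value) pairs that appear in exactly one of the two dicts.
pairs = set(a.items()) ^ set(b.items())

# Step 2: collapse the pairs down to the keys that differ.
changed_keys = set(dict(pairs))

# Step 3: look each key up in both dicts, substituting the missing token.
result = {
    key: (a.get(key, KeyError), b.get(key, KeyError))
    for key in changed_keys
}
print(result)
```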

Clevie answered 2/12, 2019 at 15:14 Comment(1)
This works great! But it not working when one or both dicts contains lists: set(a.items()) ^ set(b.items()) TypeError: unhashable type: 'list'Drusilla
C
8

Not sure this is what the OP asked for, but this is what I was looking for when I came across this question - specifically, how to show key by key the difference between two dicts:

Pitfall: when one dict has a missing key, and the second has it with a None value, the function would assume they are similar

This is not optimized at all - suitable for small dicts

def diff_dicts(a, b, drop_similar=True):
    res = a.copy()

    for k in res:
        if k not in b:
            res[k] = (res[k], None)

    for k in b:
        if k in res:
            res[k] = (res[k], b[k])
        else:
            res[k] = (None, b[k])

    if drop_similar:
        res = {k:v for k,v in res.items() if v[0] != v[1]}

    return res


print(diff_dicts({'a': 1}, {}))
print(diff_dicts({'a': 1}, {'a': 2}))
print(diff_dicts({'a': 2}, {'a': 2}))
print(diff_dicts({'a': 2}, {'b': 2}))
print(diff_dicts({'a': 2}, {'a': 2, 'b': 1}))

Output:

{'a': (1, None)}
{'a': (1, 2)}
{}
{'a': (2, None), 'b': (None, 2)}
{'b': (None, 1)}
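
The pitfall mentioned above (a missing key looks the same as a key set to None) can be avoided with a dedicated sentinel object. This is a sketch of a variant, not the function above; `MISSING` and `diff_dicts_with_sentinel` are hypothetical names:

```python
# Hypothetical sentinel: a unique object that cannot collide with real values.
MISSING = object()


def diff_dicts_with_sentinel(a, b):
    res = {}
    # Walk over every key that appears in either dict.
    for k in a.keys() | b.keys():
        left = a.get(k, MISSING)
        right = b.get(k, MISSING)
        # Keep the key if it is absent on one side or the values differ.
        if left is MISSING or right is MISSING or left != right:
            res[k] = (left, right)
    return res


none_vs_missing = diff_dicts_with_sentinel({'a': None}, {})
added_key = diff_dicts_with_sentinel({'a': 2}, {'a': 2, 'b': 1})
```

Here `{'a': None}` compared with `{}` is reported as a difference, because `None` on the left is distinct from `MISSING` on the right.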
Coldblooded answered 26/4, 2021 at 8:19 Comment(0)
I
5

What about this? Not as pretty but explicit.

orig_dict = {'a' : 1, 'b' : 2}
new_dict = {'a' : 2, 'v' : 'hello', 'b' : 2}

updates = {}
for k2, v2 in new_dict.items():
    if k2 in orig_dict:    
        if v2 != orig_dict[k2]:
            updates.update({k2 : v2})
    else:
        updates.update({k2 : v2})

#test it
#value of 'a' was changed
#'v' is a completely new entry
assert all(k in updates for k in ['a', 'v'])
Ibo answered 15/6, 2018 at 15:21 Comment(0)
D
5
def flatten_it(d):
    if isinstance(d, list) or isinstance(d, tuple):
        return tuple([flatten_it(item) for item in d])
    elif isinstance(d, dict):
        return tuple([(flatten_it(k), flatten_it(v)) for k, v in sorted(d.items())])
    else:
        return d

dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'a': 1, 'b': 1}

print set(flatten_it(dict1)) - set(flatten_it(dict2)) # set([('b', 2), ('c', 3)])
# or 
print set(flatten_it(dict2)) - set(flatten_it(dict1)) # set([('b', 1)])
Disperse answered 10/10, 2018 at 8:8 Comment(0)
M
3

Old question, but thought I'd share my solution anyway. Pretty simple.

dicta = {'a': 1, 'b': 2} # sample data for illustration
dictb = {'a': 1, 'b': 3}
dicta_set = set(dicta.items()) # creates a set of tuples (k/v pairs)
dictb_set = set(dictb.items())
setdiff = dictb_set.difference(dicta_set) # any set method you want for comparisons
for k, v in setdiff: # unpack the tuples for processing
    print(f"k/v differences = {k}: {v}")

This code creates two sets of tuples representing the k/v pairs. It then uses a set method of your choosing to compare the tuples. Lastly, it unpacks the tuples (k/v pairs) for processing.

Moussaka answered 25/5, 2020 at 0:25 Comment(0)
W
1

This will return a new dict (only changed data).

def get_difference(obj_1: dict, obj_2: dict) -> dict:
    result = {}

    for key in obj_1.keys():
        value = obj_1[key]

        if isinstance(value, dict):
            # Recurse into nested dicts and keep only non-empty differences.
            difference = get_difference(value, obj_2.get(key, {}))

            if difference:
                result[key] = difference

        elif value != obj_2.get(key):
            # Report the value from obj_2 (None if the key is missing there).
            result[key] = obj_2.get(key, None)

    return result
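
A usage sketch on nested data follows. The function is repeated here so the block runs on its own, and the sample dictionaries are invented for illustration:

```python
def get_difference(obj_1: dict, obj_2: dict) -> dict:
    result = {}
    for key in obj_1.keys():
        value = obj_1[key]
        if isinstance(value, dict):
            # Recurse into nested dicts and keep only non-empty differences.
            difference = get_difference(value, obj_2.get(key, {}))
            if difference:
                result[key] = difference
        elif value != obj_2.get(key):
            # Report the value from obj_2 (None if the key is missing there).
            result[key] = obj_2.get(key, None)
    return result


# Sample data, invented for illustration only.
old = {'name': 'x', 'settings': {'theme': 'dark', 'size': 10}}
new = {'name': 'x', 'settings': {'theme': 'light', 'size': 10}, 'extra': 1}

forward = get_difference(old, new)
backward = get_difference(new, old)
print(forward)
print(backward)
```

Only the keys of the first argument are visited, so `'extra'` does not show up in `forward`; in `backward` it appears with the value None because it is missing from `old`. To see every difference, the function has to be called in both directions.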
Whangee answered 26/3, 2021 at 14:35 Comment(0)
M
1

For a one-sided comparison you can use a dict comprehension:

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'a': 'OMG', 'b': 2, 'c': 3, 'd': 4}

data = {a:dict1[a] for a in dict1 if dict1[a] != dict2[a]}

output: {'a': 1}

Myotome answered 12/8, 2022 at 14:4 Comment(0)
E
0

Here is a variation that lets you update dict1 values if you know the values in dict2 are right.

Consider:

dict1.update((k, dict2.get(k)) for k, v in dict1.items())
Egression answered 23/6, 2022 at 18:31 Comment(0)
R
0
a_dic={'a':1, 'b':2}
b_dic={'a':1, 'b':20}

sharedmLst = set(a_dic.items()).intersection(b_dic.items())
diff_from_b = set(a_dic.items()) - sharedmLst
diff_from_a = set(b_dic.items()) - sharedmLst

print("Among the items in a_dic, the item different from b_dic",diff_from_b)
print("Among the items in b_dic, the item different from a_dic",diff_from_a)

Result :
Among the items in a_dic, the item different from b_dic {('b', 2)}
Among the items in b_dic, the item different from a_dic {('b', 20)}
Raimundo answered 1/12, 2022 at 14:56 Comment(1)
Please include some explanation into your answer besides code.Opalina
T
0

This solution works with dicts whose values are unhashable, which avoids this error:

TypeError: unhashable type: 'dict'

Start with the top-ranked solution from @Roedy. We create a dictionary of lists, which are a good example of something that is non-hashable:

>>> dict1 = {1:['donkey'], 2:['chicken'], 3:['dog']}
>>> dict2 = {1:['donkey'], 2:['chimpansee'], 4:['chicken']}

Then we preprocess to make each value hashable using str(value):

>>> set1 = set([(key, str(value)) for key, value in dict1.items()])
>>> set2 = set([(key, str(value)) for key, value in dict2.items()])

Then we continue as per the answer from @Roedy:

>>> set1 ^ set2
{(3, "['dog']"), (4, "['chicken']"), (2, "['chimpansee']"), (2,"['chicken']")}
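
Since the stringified values are only useful for comparison, the differing keys can be used to look the original values up again. A sketch under that assumption (`changed_keys` and `originals` are illustrative names, and None marks a key that is absent from one dict):

```python
dict1 = {1: ['donkey'], 2: ['chicken'], 3: ['dog']}
dict2 = {1: ['donkey'], 2: ['chimpansee'], 4: ['chicken']}

# Make each value hashable by converting it to a string.
set1 = {(key, str(value)) for key, value in dict1.items()}
set2 = {(key, str(value)) for key, value in dict2.items()}

# Keys that are involved in any difference.
changed_keys = {key for key, _ in set1 ^ set2}

# Fetch the original, unconverted values for those keys from both dicts.
originals = {key: (dict1.get(key), dict2.get(key)) for key in changed_keys}
print(originals)
```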
Trudy answered 31/1, 2023 at 13:43 Comment(2)
Note: The values are now display-only, as they have been converted to a string so they can be hashed. However, now we know the differences, we can refer back to the original dictionary.Trudy
Note: Efficiency wise, this is not all that fast as it converts all values to a string, it may be quicker to use the hash function on each value.Trudy
Y
0

For testing, the datatest package will check for differences in dictionaries, numpy arrays, pandas dataframes, etc. Datatest also lets you set a tolerance for floating point comparisons.

from datatest import validate, accepted
def test_compare_dict():
    expected = {"key1": 0.5}
    actual = {"key1": 0.499}
    with accepted.tolerance(0.1):
        validate(expected, actual)

Differences result in a datatest.ValidationError that contains the relevant Invalid, Deviation, Missing, or Extra items.

Yeti answered 30/4, 2023 at 19:37 Comment(0)
J
0

On Python 3 I'm getting unhashable type: 'dict' errors. I know the OP asked about Python 2.7, but since it is no longer supported, here is a Python 3 compatible function:

def dict_diff(a, b):
    diff = {}
    for k,v in a.items():
        if k not in b:
            diff[k] = v
        elif v != b[k]:
            diff[k] = '%s != %s' % (v, b[k])
    for k,v in b.items():
        if k not in a:
            diff[k] = v

    return diff

The output is as follows:

  d1 = {1:'donkey', 2:'chicken', 3:'dog'}
  d2 = {1:'donkey', 2:'chimpansee', 4:'chicken'}

  diff = dict_diff(d1, d2)
  # {2: 'chicken != chimpansee', 3: 'dog', 4: 'chicken'}
Jaime answered 1/3 at 11:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.