Compare dictionaries ignoring specific keys
M

9

41

How can I test if two dictionaries are equal while taking some keys out of consideration? For example,

equal_dicts(
    {'foo':1, 'bar':2, 'x':55, 'y': 77 },
    {'foo':1, 'bar':2, 'x':66, 'z': 88 },
    ignore_keys=('x', 'y', 'z')
)

should return True.

UPD: I'm looking for an efficient, fast solution.

UPD2. I ended up with this code, which appears to be the fastest:

def equal_dicts_1(a, b, ignore_keys):
    ka = set(a).difference(ignore_keys)
    kb = set(b).difference(ignore_keys)
    return ka == kb and all(a[k] == b[k] for k in ka)

Timings: https://gist.github.com/2651872
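As a quick sanity check, here is a minimal sketch that runs equal_dicts_1 on the example from the question. The function body is copied from above; the second call, with a shorter ignore list, is an illustrative assumption and is not part of the linked timings.

```python
def equal_dicts_1(a, b, ignore_keys):
    # Compare the sets of non-ignored keys first, then the values for those keys.
    ka = set(a).difference(ignore_keys)
    kb = set(b).difference(ignore_keys)
    return ka == kb and all(a[k] == b[k] for k in ka)

d1 = {'foo': 1, 'bar': 2, 'x': 55, 'y': 77}
d2 = {'foo': 1, 'bar': 2, 'x': 66, 'z': 88}

print(equal_dicts_1(d1, d2, ignore_keys=('x', 'y', 'z')))  # True
print(equal_dicts_1(d1, d2, ignore_keys=('x',)))           # False: 'y' and 'z' are unmatched
```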

Miksen answered 7/5, 2012 at 10:55 Comment(1)
I appreciate that you compiled these various answers into a timings gist. However, one thing I noticed when looking them over is that they aren't all doing the same thing. Some compare both dictionaries' keys, some just compare the keys from the first dictionary, and some handle key errors while others don't. I wish they were all reduced to the minimal case, or extended to the maximal one, to show which is truly faster, but either way I appreciate your compilation as it was helpful.Heall
M
43
def equal_dicts(d1, d2, ignore_keys):
    d1_filtered = {k:v for k,v in d1.items() if k not in ignore_keys}
    d2_filtered = {k:v for k,v in d2.items() if k not in ignore_keys}
    return d1_filtered == d2_filtered

EDIT: This might be faster and more memory-efficient:

def equal_dicts(d1, d2, ignore_keys):
    ignored = set(ignore_keys)
    for k1, v1 in d1.iteritems():
        if k1 not in ignored and (k1 not in d2 or d2[k1] != v1):
            return False
    for k2, v2 in d2.iteritems():
        if k2 not in ignored and k2 not in d1:
            return False
    return True
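For Python 3, the same loop can be sketched with items() in place of iteritems(). This is an adaptation rather than the code that was timed, and the test values in the usage lines are taken from the comment thread below.

```python
def equal_dicts(d1, d2, ignore_keys):
    ignored = set(ignore_keys)
    # Every non-ignored key of d1 must be in d2 with an equal value.
    for k1, v1 in d1.items():
        if k1 not in ignored and (k1 not in d2 or d2[k1] != v1):
            return False
    # d2 must not carry non-ignored keys that d1 lacks.
    for k2 in d2:
        if k2 not in ignored and k2 not in d1:
            return False
    return True

print(equal_dicts({'a': 3, 'b': 5}, {'a': 3, 'b': 6}, 'b'))  # True
print(equal_dicts({'a': 3}, {'a': 3, 'c': 1}, ()))           # False
```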
Morning answered 7/5, 2012 at 11:1 Comment(7)
+1 (better than my answer!) Also, if one happens to be using Python 3, you can use a dict comprehension (scroll down a bit) in place of the dict(<generator expression>) idiom.Consistory
This is a straightforward solution, but in my situation efficiency matters.Miksen
The second one appears to be buggy: equal_dicts({'a':3,'b':5}, {'a':3,'b':6}, 'b') == False (should be True).Miksen
Just testing d[k1] != v1 without the k1 not in d2 check, and catching KeyError is possibly faster (avoids hashing k1 the third time).Consistory
@thg435 - fixed and/or priority with parentheses and now your test returns True.Morning
@dbaupp the big performance gain is the use of set() ... with 100,000 values in both dictionaries and 50,000 ignore_keys, this method averages 0.066 seconds on my laptop (Windows and Python 2.6). The dictionaries are the same, to force the full iteration. That's fast enough I guess. My method (updated with set) averages 0.072 seconds! In real circumstances, this check wouldn't be noticeable!Swayback
Don't forget: iteritems() must be replaced by items() in Python 3+.Menjivar
D
16

Using dict comprehensions:

>>> {k: v for k,v in d1.items() if k not in ignore_keys} == \
... {k: v for k,v in d2.items() if k not in ignore_keys}

Use .viewitems() instead on Python 2.

Dreadfully answered 7/5, 2012 at 11:11 Comment(3)
Thanks, but see my comment to eumiro's answer. I prefer not to build two expensive memory structures just to compare them.Miksen
Then you can write the loop out manually, but you might find the comprehension faster anyway because of the C implementation.Dreadfully
Comparison of two dict comprehensions is a beautiful one-liner. And I agree this method may even be faster, depending on data size.Grasshopper
P
5

If you need this check when testing, you can use the ANY from the unittest.mock library. Here is an example.

from unittest.mock import ANY
actual = {'userName':'bob', 'lastModified':'2012-01-01'}
expected = {'userName':'bob', 'lastModified': ANY}
assert actual == expected
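One caveat worth checking before relying on this: ANY stands in for the value only, so the key still has to be present on both sides. A minimal sketch, reusing the dictionaries from the example above:

```python
from unittest.mock import ANY

expected = {'userName': 'bob', 'lastModified': ANY}

# ANY compares equal to any value, so the timestamp is effectively ignored.
assert {'userName': 'bob', 'lastModified': '2012-01-01'} == expected

# The key itself must still be present: a missing key makes the dicts unequal.
assert {'userName': 'bob'} != expected
```

So this suits tests where a field always exists but its value is unpredictable, rather than cases where the key may be absent from one dictionary.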


Perverse answered 24/6, 2022 at 12:52 Comment(2)
Is there an equivalent of this for PyTest?Webber
you can use this in pytestPerverse
M
4

Here's another variant:

set(ignore_keys).issuperset(k for (k, v) in d1.items() ^ d2.items())

Its virtues:

  • C speed identification of differences between the dicts
  • C speed check for membership in the set of ignored keys
  • Early-out if a single mismatch is found
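Wrapped in a function, the expression can be tried like this. It is only a sketch: the name equal_dicts is borrowed from the question, and the nested-dict case is an illustrative assumption that shows the limitation when values are not hashable.

```python
def equal_dicts(d1, d2, ignore_keys):
    # The symmetric difference holds every (key, value) pair that appears
    # in only one of the dicts; all such keys must be ignorable.
    return set(ignore_keys).issuperset(k for k, v in d1.items() ^ d2.items())

d1 = {'foo': 1, 'bar': 2, 'x': 55, 'y': 77}
d2 = {'foo': 1, 'bar': 2, 'x': 66, 'z': 88}
print(equal_dicts(d1, d2, ('x', 'y', 'z')))  # True

# The differing pairs are collected into a set, so they must be hashable.
# A nested dict whose contents differ therefore raises TypeError.
try:
    equal_dicts({'a': {'b': 1}}, {'a': {'b': 2}}, ())
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```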
Moray answered 28/11, 2021 at 17:0 Comment(4)
Any reason why d1^d2 doesn't work but d1.items() ^ d2.items() does?Shaeffer
@Shaeffer See python.org/dev/peps/pep-3106 . That is the only public document explaining the rationale for the design choices.Moray
Is there a reason this one is not the accepted or highly voted answer? This is extremely memory efficient and accomplishes the same task in fewer lines.Achernar
This doesn't work when the dict contains another dictChalcis
C
1

Very very crudely, you could just delete any ignored keys and compare those dictionaries:

def equal_dicts(d1, d2, ignore_keys=()):
    d1_, d2_ = d1.copy(), d2.copy()
    for k in ignore_keys:
        try:
            del d1_[k]
        except KeyError: 
            pass
        try:
            del d2_[k]
        except KeyError: 
            pass

    return d1_ == d2_

(Note that we don't need a deep copy here, we just need to avoid modifying d1 and d2.)
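The same idea can be written a little more compactly with dict.pop and a default value, which avoids the try/except blocks. A minimal sketch with the same behaviour:

```python
def equal_dicts(d1, d2, ignore_keys=()):
    # Shallow copies keep the callers' dicts untouched.
    d1_, d2_ = d1.copy(), d2.copy()
    for k in ignore_keys:
        # pop() with a default never raises KeyError for a missing key.
        d1_.pop(k, None)
        d2_.pop(k, None)
    return d1_ == d2_

a = {'foo': 1, 'bar': 2, 'x': 55, 'y': 77}
b = {'foo': 1, 'bar': 2, 'x': 66, 'z': 88}
print(equal_dicts(a, b, ('x', 'y', 'z')))  # True
print('x' in a)                            # True: the original is not modified
```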

Consistory answered 7/5, 2012 at 11:3 Comment(0)
S
1
def compare_dict(d1, d2, ignore):
    for k in d1:
        if k in ignore:
            continue
        try:
            if d1[k] != d2[k]:
                return False
        except KeyError:
            return False
    return True

Comment edit: You can do something like compare_dict(d1, d2, ignore) and compare_dict(d2, d1, ignore), or duplicate the for loop:

def compare_dict(d1, d2, ignore):
    ignore = set(ignore)
    for k in d1:
        if k in ignore:
            continue
        try:
            if d1[k] != d2[k]:
                return False
        except KeyError:
            return False

    for k in d2:
        if k in ignore:
            continue
        try:
            if d1[k] != d2[k]:
                return False
        except KeyError:
            return False
    return True

Whatever is faster and cleaner! Update: cast set(ignore)

Swayback answered 7/5, 2012 at 11:10 Comment(1)
Thanks, but I don't think this will work when d2 has extra keys.Miksen
D
0

If we know that both dictionaries have the same keys:

def equal_dicts(dic1: dict, dict2: dict, keys_to_ignore: set) -> bool:
    return all(dic1[key] == dict2[key] for key in dic1.keys() if key not in keys_to_ignore)

If we don't know that both dictionaries have the same keys, the above method will fail if dict2 has some non-ignored keys that are missing from dic1, so we can alter the method to first check that dict2 doesn't have any extra keys:

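def equal_dicts(dic1: dict, dict2: dict, keys_to_ignore: set) -> bool:
    return (
        # dict2 must not have non-ignored keys that dic1 lacks
        all(key in dic1 for key in dict2.keys() if key not in keys_to_ignore)
        # every non-ignored key of dic1 must exist in dict2 with an equal value
        and all(key in dict2 and dic1[key] == dict2[key] for key in dic1.keys() if key not in keys_to_ignore)
    )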
Disillusion answered 4/1 at 7:28 Comment(0)
L
-2

Optimal solution for the case of ignoring only one key

return all(
    (x == y or (x[0] == y[0] == 'key to ignore')) for x, y in itertools.izip(
          d1.iteritems(), d2.iteritems()))
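Pairing the items positionally assumes that both dictionaries iterate their keys in the same order. The sketch below is a Python 3 adaptation (zip and items() in place of izip and iteritems()); the helper name equal_except_one and the length check are assumptions added for illustration. It compares keys through index 0 of each (key, value) pair and shows what happens when the insertion order differs.

```python
def equal_except_one(d1, d2, ignored_key):
    # Pair the items positionally; x and y are (key, value) tuples.
    return len(d1) == len(d2) and all(
        x == y or (x[0] == y[0] == ignored_key)
        for x, y in zip(d1.items(), d2.items())
    )

same_order = equal_except_one({'a': 1, 'x': 5}, {'a': 1, 'x': 6}, 'x')
# Equal non-ignored contents, but inserted in a different order.
other_order = equal_except_one({'a': 1, 'x': 5}, {'x': 6, 'a': 1}, 'x')
print(same_order, other_order)  # True False
```

Since dictionaries preserve insertion order in Python 3.7 and later, the second call reports a mismatch even though the non-ignored contents are equal.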
Lupe answered 14/7, 2016 at 20:16 Comment(1)
Beware: this probably didn't work correctly in all cases in earlier Python versions (e.g. differently sized hash tables), and an analogous implementation certainly no longer works in Python 3.6+, because dict.items() and similar methods now return items in insertion order, not hash-table order.Skiles
S
-2

In case your dictionary contains lists or other dictionaries:

def equal_dicts(d1, d2, ignore_keys, equal):
    # print('got d1', d1)
    # print('got d2', d2)
    if isinstance(d1, str):
        if not isinstance(d2, str):
            return False
        return d1 == d2
    for k in d1:
        if k in ignore_keys:
            continue
        if not isinstance(d1[k], dict) and not isinstance(d1[k], list) and d2.get(k) != d1[k]:
            print(k)
            equal = False
        elif isinstance(d1[k], list):
            # Bail out early: d2[k] may be missing or not a list at all.
            if not isinstance(d2.get(k), list):
                return False
            if len(d1[k]) != len(d2[k]):
                return False
            if len(d1[k]) > 0 and isinstance(d1[k][0], dict):
                if not isinstance(d2[k][0], dict):
                    return False
                d1_sorted = sorted(d1[k], key=lambda item: item.get('created'))
                d2_sorted = sorted(d2[k], key=lambda item: item.get('created'))
                equal = all(equal_dicts(x, y, ignore_keys, equal) for x, y in zip(d1_sorted, d2_sorted)) and equal
            else:
                equal = all(equal_dicts(x, y, ignore_keys, equal) for x, y in zip(d1[k], d2[k])) and equal
        elif isinstance(d1[k], dict):
            # Bail out early: d2[k] may be missing or not a dict at all.
            if not isinstance(d2.get(k), dict):
                return False
            equal = equal_dicts(d1[k], d2[k], ignore_keys, equal) and equal
    return equal
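A more compact recursive variant is sketched below for comparison. It is not a drop-in replacement: the name equal_ignoring is hypothetical, lists are compared element by element in their existing order (no sorting by 'created'), and the ignored keys apply at every nesting level.

```python
def equal_ignoring(a, b, ignore_keys):
    ignored = set(ignore_keys)
    if isinstance(a, dict) and isinstance(b, dict):
        # Same non-ignored keys on both sides, and equal values underneath.
        ka = set(a) - ignored
        kb = set(b) - ignored
        return ka == kb and all(equal_ignoring(a[k], b[k], ignored) for k in ka)
    if isinstance(a, list) and isinstance(b, list):
        # Lists must match in length and, pairwise, in content.
        return len(a) == len(b) and all(
            equal_ignoring(x, y, ignored) for x, y in zip(a, b)
        )
    # Anything else (str, int, None, ...) falls back to plain equality.
    return a == b

left = {'id': 1, 'items': [{'n': 1, 'ts': 10}], 'meta': {'ts': 5, 'v': 'a'}}
right = {'id': 1, 'items': [{'n': 1, 'ts': 99}], 'meta': {'ts': 7, 'v': 'a'}}
print(equal_ignoring(left, right, ('ts',)))  # True
print(equal_ignoring(left, right, ()))       # False
```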
Scrivings answered 1/2, 2018 at 11:11 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.