Python. How to subtract 2 dictionaries
N

9

18

I have 2 dictionaries, A and B. A has 700000 key-value pairs and B has 560000 key-value pairs. All key-value pairs from B are present in A, but some keys in A are duplicates with different values, and some have duplicated values but unique keys. I would like to subtract B from A, so I can get the remaining 140000 key-value pairs. When I subtract key-value pairs based on key identity, I remove, let's say, 150000 key-value pairs because of the repeated keys. I want to subtract key-value pairs based on the identity of BOTH key AND value for each key-value pair, so I get 140000. Any suggestion would be welcome.

This is an example:

A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}

I DO want to get: A-B = {'10':1, '12':1, '10':2, '11':3}

I DO NOT want to get:

a) When based on keys:

{'10':1, '12':1, '10':2}

or

b) When based on values:

{'11':3}
Nautical answered 3/2, 2016 at 20:34 Comment(15)
Possible duplicate of How to remove a key from a dictionary? - Vigilant
No @Code-Apprendice, that post does not answer my question. I don't want to remove keys from a dict, but to subtract key-value pairs. - Nautical
@Lucas: Isn't that just semantics? Removing the key removes the value. - Epigraphic
@Nautical try difference in set. - Cantina
@Nautical How is removing a key different from subtracting key-value pairs? What do you mean by "subtract key-value"? Apparently your question is not entirely clear. Please add more details so that we can understand what you want to do. - Vigilant
Hi @Steven Rumbalski, the problem is that some keys are duplicates but with different values, so when I remove the keys in the way you say, I will remove key-value pairs that have the same keys but different values. I don't want that. - Nautical
@viakondratiuk That isn't quite what is asked for. In your link, what is wanted is to find the difference between the values for each key. Here Lucas wants to remove every duplicate key. - Dignify
@Lucas: Are your values integers? If so, your collections can be collections.Counter, a subclass of dict. collections.Counter has a subtract method. - Epigraphic
@Lucas: Your question would be well served with a small example of what you are asking for. - Epigraphic
I just edited the post. I hope it is clearer this time. - Nautical
@Lucas: If A = {'x':10, 'y':5, 'z':1} and B = {'x':10, 'y':3} should the result be {'y':2, 'z':1} or {'y':5, 'z':1}? - Epigraphic
@Lucas: How can k:v pairs from B be in A and then A also have duplicated keys with different values? A key can only appear once in a dictionary. - Lashawnda
@Steven Rumbalsky and the others: I just added an example in the edited post. Thank you all for your feedback. - Nautical
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3} is not possible. If you do this at the Python prompt, you will get something like {'11': 3, '10': 2, '12': 1} for A. - Leisurely
@Nautical Why not accept the answer that gave the solution? - Tintoretto
R
11
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}

You can't have duplicate keys in Python. If you run the above, it will get reduced to:

A={'11': 3, '10': 2, '12': 1}
B={'11': 2}

But to answer your question, to do A - B (based on dict keys):

all(map(A.pop, B))   # use all() so it works for Python 2 and 3.
print(A) # {'10': 2, '12': 1}
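A plain loop is one way to sidestep the iterator-consumption question entirely. The sketch below is an illustration rather than the code posted above; it assumes the reduced A and B shown earlier and uses pop with a None default so that keys missing from A are simply ignored.

```python
# Reduced dictionaries, as Python actually stores them (duplicate keys collapse).
A = {'10': 2, '11': 3, '12': 1}
B = {'11': 2}

# Remove every key of B from A in place; the None default avoids a KeyError
# when a key from B is missing from A.
for key in B:
    A.pop(key, None)

print(A)  # {'10': 2, '12': 1}
```

The loop behaves the same on Python 2 and 3 and does not depend on the truthiness of the popped values.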
Rochellrochella answered 10/1, 2018 at 4:32 Comment(7)
At least in Python 3, map does not seem to work as described by Monty. After running map( A.pop, B ), A is unchanged. (Perhaps because in Python 3, map returns an iterator.) - Door
@mpb, good catch! You have to put it inside all() or something similar to consume the iterator. Works for Python 2 and 3. - Rochellrochella
This did not work for me since all() returns a bool. Did I miss something? - Weyermann
What version of Python do you have? By the way, why downvote, if I can still help you? - Rochellrochella
Sorry this is months after the fact, but I downvoted because "it did not work". If I made the mistake, I'm happy to turn that frown upside down. I'm using Python 3. - Weyermann
I'd avoid using map just for its side effect, and avoid using all to force evaluation. If any of your values are falsey, all will stop popping values from A prematurely! In general there's nothing wrong with an imperative for-loop, and in this case PaulMcG's non-mutative comprehension answer seems like the best solution. - Charie
If you need to consume an iterator, use collections.deque(maxlen=0), not all(), since all() will stop as soon as one of the popped values is falsy. - Restriction
L
47

To get items in A that are not in B, based just on key:

C = {k:v for k,v in A.items() if k not in B}

To get items in A that are not in B, based on key and value:

C = {k:v for k,v in A.items() if k not in B or v != B[k]}

To update A in place (as in A -= B) do:

from collections import deque
consume = deque(maxlen=0).extend
consume(A.pop(key, None) for key in B)

(Unlike using map() with A.pop, calling A.pop with a None default will not break if a key from B is not present in A. Also, unlike using all, this iterator consumer will iterate over all values, regardless of truthiness of the popped values.)
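As a quick check of the three variants above, the sketch below runs them on small made-up dictionaries. The sample data, including the falsy value 0 and the key 'x' that is missing from A, is an assumption chosen for illustration.

```python
from collections import deque

A = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
B = {'a': 0, 'b': 99, 'x': 5}

# Key only: drop every key that also appears in B.
by_key = {k: v for k, v in A.items() if k not in B}

# Key and value: drop a pair only if B holds the same key with an equal value.
by_item = {k: v for k, v in A.items() if k not in B or v != B[k]}

# In place, key only: a zero-length deque consumes the generator fully,
# even though A.pop returns falsy values (0 and None) along the way.
consume = deque(maxlen=0).extend
consume(A.pop(key, None) for key in B)

print(by_key)   # {'c': 2, 'd': 3}
print(by_item)  # {'b': 1, 'c': 2, 'd': 3}
print(A)        # {'c': 2, 'd': 3}
```

Note that the in-place form removes by key only; the pair ('b', 1) survives in by_item but not in the mutated A.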

Leisurely answered 3/2, 2016 at 20:41 Comment(1)
This is the most logical/readable, and probably the fastest, and it is easily tweakable as to whether the values have to be equal as well (or just the keys). - Chickenlivered
C
19

An easy, intuitive way to do this is

dict(set(a.items()) - set(b.items()))
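One caveat is that set() needs every value to be hashable. A minimal sketch of a fallback follows, assuming a hypothetical helper named subtract_items (it is not part of any library):

```python
def subtract_items(a, b):
    """Return the (key, value) pairs of a that are not also in b."""
    try:
        # Fast path: works when every value is hashable.
        return dict(set(a.items()) - set(b.items()))
    except TypeError:
        # Fallback for unhashable values such as lists or dicts.
        return {k: v for k, v in a.items() if k not in b or b[k] != v}

print(subtract_items({'x': 1, 'y': 2}, {'x': 1}))        # {'y': 2}
print(subtract_items({'x': [1], 'y': [2]}, {'x': [1]}))  # {'y': [2]}
```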
Chemoreceptor answered 3/2, 2016 at 20:43 Comment(5)
This won't work when any of the values is not hashable. - Chickenlivered
When would the values not be hashable? - Weyermann
Simply put, it's just when it isn't hashable. Builtin unhashables are lists and dicts. It's why, for example, you can't have a list as a dict key. - Chemoreceptor
@HuckIt If you store a list in a dict, you'll have problems. - Mambo
"If you store a list in a dict, you'll have problems" -> that's pretty common in dicts derived from JSON. - Hertford
M
5

dict-views:

Keys views are set-like since their entries are unique and hashable. If all values are hashable, so that (key, value) pairs are unique and hashable, then the items view is also set-like. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).

So you can:

>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> B = {'11':1, '11':2}
>>> A.items() - B.items()
{('11', 3), ('12', 1), ('10', 2)}
>>> dict(A.items() - B.items())
{'11': 3, '12': 1, '10': 2}

For Python 2, use dict.viewitems.

P.S. You can't have duplicate keys in a dict.

>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> A
{'10': 2, '11': 3, '12': 1}
>>> B = {'11':1, '11':2}
>>> B
{'11': 2}
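Because a dict cannot hold the duplicate keys from the question, one option is to keep the data as (key, value) tuples instead. The sketch below assumes the pairs are available as lists before they are ever turned into dicts; the names pairs_a and pairs_b are made up for illustration.

```python
# The question's data kept as (key, value) pairs, so repeated keys survive.
pairs_a = [('10', 1), ('11', 1), ('12', 1), ('10', 2), ('11', 2), ('11', 3)]
pairs_b = [('11', 1), ('11', 2)]

# A set gives average constant-time membership tests for the pairs to drop.
to_remove = set(pairs_b)
remaining = [pair for pair in pairs_a if pair not in to_remove]

print(remaining)
# [('10', 1), ('12', 1), ('10', 2), ('11', 3)]
```

This reproduces the A-B result the question asks for, which a plain dict cannot represent.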
Motch answered 11/8, 2021 at 12:47 Comment(0)
C
3

Another way of using the efficiency of sets. This might be more multipurpose than the answer by @brien. His answer is very nice and concise, so I upvoted it.

diffKeys = set(a.keys()) - set(b.keys())
c = dict()
for key in diffKeys:
  c[key] = a.get(key)

EDIT: There is the assumption here, based on the OP's question, that dict B is a subset of dict A, that the key/val pairs in B are in A. The above code will have unexpected results if you are not working strictly with a key/val subset. Thanks to Steven for pointing this out in his comment.

Counterscarp answered 3/2, 2016 at 20:58 Comment(2)
This is different from @brien's answer. This considers keys only, whereas the other answer considers key-value pairs. They will give different answers. - Epigraphic
@StevenRumbalski: Yes! True. I should have pointed that out, and will clarify it in my answer. I was working from the OP's stated presumption that all of the existing key/val pairs from b are in a. So B is a subset. - Counterscarp
B
2

Since I can not (yet) comment: the accepted answer will fail if there are some keys in B not present in A.

Using dict.pop with a default would circumvent it (borrowed from How to remove a key from a Python dictionary?):

all(A.pop(k, None) for k in B)

or

tuple(A.pop(k, None) for k in B)
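Note that the two forms are not equivalent: all() stops at the first falsy result, which includes the None default returned for a missing key, whereas tuple() always consumes the whole generator. A small sketch with made-up data (A1, A2 and their contents are assumptions for illustration):

```python
A1 = {'a': 0, 'b': 1, 'c': 2}
A2 = dict(A1)
B = {'a': 0, 'b': 1}

# all() short-circuits: popping 'a' returns 0 (falsy), so 'b' is never popped.
all(A1.pop(k, None) for k in B)

# tuple() exhausts the generator, so every key of B is removed.
tuple(A2.pop(k, None) for k in B)

print(A1)  # {'b': 1, 'c': 2}
print(A2)  # {'c': 2}
```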
Bernat answered 27/11, 2018 at 13:48 Comment(0)
D
1
result = A.copy()
for key, value in B.items():
    if key in A and A[key] == value:  # skip keys of B that are missing from A
        del result[key]
Dignify answered 3/2, 2016 at 20:41 Comment(0)
F
-1

Based on keys only, assuming B is a subset of A:

Python 3: c = {k:a[k] for k in a.keys() - b.keys()}

Python 2: c = {k:a[k] for k in list(set(a.keys())-set(b.keys()))}

For a key-based approach that can also update a in place, see @PaulMcG's answer.

Fabri answered 17/11, 2018 at 13:9 Comment(4)
TypeError: unsupported operand type(s) for -: 'list' and 'list'. You need to convert the key lists to sets. - Loosetongued
@jeffrycopps >>> a = {'f':5,'g':6,'c':7,'d':4} >>> b = {'f':5,'g':6,'d':4} >>> c = {k:a[k] for k in a.keys() - b.keys()} >>> c {'c': 7} - Fabri
Try running your code on 2.7. It would fail. By the way, the question doesn't say Python 3. - Loosetongued
You are correct: Python 2 cannot subtract lists but can subtract sets, so c = {k:a[k] for k in list(set(a.keys())-set(b.keys()))} - Fabri
J
-1

For subtracting the dictionaries, if A is a collections.Counter (a plain dict has no subtract method), you could do:

A.subtract(B)

Note: This will give you negative values in a situation where B has keys that A does not.
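A short sketch of that call with collections.Counter, a dict subclass for integer counts, using made-up counts. It subtracts the counts key by key rather than removing key-value pairs, so it only fits if the values really are counts.

```python
from collections import Counter

A = Counter({'x': 10, 'y': 5, 'z': 1})
B = Counter({'x': 10, 'y': 3, 'w': 2})

# subtract() works in place and returns None; counts may reach zero or go negative.
A.subtract(B)

print(A['x'], A['y'], A['z'], A['w'])  # 0 2 1 -2
```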

Josuejosy answered 27/11, 2020 at 7:32 Comment(0)
