`if key in dict` vs. `try/except` - which is the more readable idiom?
L

11

114

I have a question about idioms and readability, and there seems to be a clash of Python philosophies for this particular case:

I want to build dictionary A from dictionary B. If a specific key does not exist in B, then do nothing and continue on.

Which way is better?

try:
    A["blah"] = B["blah"]
except KeyError:
    pass

or

if "blah" in B:
    A["blah"] = B["blah"]

"Do and ask for forgiveness" vs. "simplicity and explicitness".

Which is better and why?

Liverwort answered 22/12, 2010 at 18:48 Comment(12)
The second example might be better written as if "blah" in B.keys(), or if B.has_key("blah").Organdy
does A.update(B) not work for you?Schoolroom
@Luke: has_key has been deprecated in favor of in and checking B.keys() changes an O(1) operation into an O(n) one.Acoustics
But how will update() check for specific keys?Auspex
@Luke: no, it's not. .has_key is deprecated, and keys() creates an unneeded list in py2k and is redundant in py3k.Schoolroom
@A A: what does check for specific keys mean? if the key doesn't exist in B it won't exist in A according to OP's code, the same happens with updateSchoolroom
possible duplicate of [Python if vs try-except ](https://mcmap.net/q/189914/-python-if-vs-try-except)Dualpurpose
'build' A, as in, A is empty to start out with? And we only want certain keys? Use a comprehension: A = dict((k, v) for (k, v) in B.items() if we_want_to_include(k)).Molality
How does if B.get("blah") compare to if "blah" in B ?Jacobah
Please avoid using the blanket term 'better' in question or title: if you mean performance (faster or less memory), then say that; if you mean readability or better idiom, then say that. A related question asking about speed: Python: is “except KeyError” faster than “if key in dict”?Coulter
But again, since all you're doing is merging dicts/sets of keys/Counters, use the dict() constructor, a dict comprehension, collections.Counter with the + operator, set.union() or the | operator, collections.defaultdict, etc.Coulter
How would the answer differ if you had a nested dict?Face
A
86

Exceptions are not conditionals.

The conditional version is clearer. That's natural: this is straightforward flow control, which is what conditionals are designed for, not exceptions.

The exception version is primarily used as an optimization when doing these lookups in a loop: for some algorithms it allows eliminating tests from inner loops. It doesn't have that benefit here. It has the small advantage that it avoids having to say "blah" twice, but if you're doing a lot of these you should probably have a helper move_key function anyway.

In general, I'd strongly recommend sticking with the conditional version by default unless you have a specific reason not to. Conditionals are the obvious way to do this, which is usually a strong recommendation to prefer one solution over another.
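
The move_key helper mentioned above is not spelled out in the answer. A minimal sketch of what it might look like, assuming a (src, dst, key) signature that is my own guess rather than anything the answer specifies:

```python
def move_key(src, dst, key):
    """Copy src[key] into dst[key] if the key exists; otherwise do nothing.

    Hypothetical helper: the name comes from the answer, the signature is assumed.
    """
    if key in src:
        dst[key] = src[key]


B = {"blah": 1, "other": 2}
A = {}
move_key(B, A, "blah")     # key present: copied
move_key(B, A, "missing")  # key absent: silently skipped
print(A)  # {'blah': 1}
```

With a helper like this, each key name is written only once at the call site, which addresses the "say "blah" twice" concern without resorting to try/except.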

Atalya answered 22/12, 2010 at 19:38 Comment(6)
I don't agree. The try version says "do X, and if that doesn't work, do Y". The main argument against the conditional solution here is that you have to write "blah" more often, which makes it more error-prone.Vivianna
And, especially in Python, EAFP is very widely used.Vivianna
This answer would be correct for any language I know except for Python.Dionnadionne
If you're using exceptions as if they're conditionals in Python, I hope nobody else has to read it.Atalya
So, what is the final verdict? : )Carmine
I'm a complete noob when it comes to python and threading, but I think using EAFP is an atomic action. Using conditionals is not. If an atomic transaction is important, then unless this dict is a threadsafe dict, these conditionals are not threadsafe too.Hawaiian
N
88

There is also a third way that avoids both exceptions and double-lookup, which can be important if the lookup is expensive:

value = B.get("blah", None)
if value is not None: 
    A["blah"] = value

In case you expect the dictionary to contain None values, you can use some more esoteric constants like NotImplemented, Ellipsis or make a new one:

MyConst = object()
def update_key(A, B, key):
    value = B.get(key, MyConst)
    if value is not MyConst: 
        A[key] = value

Anyway, using update() is the most readable option for me:

a.update((k, b[k]) for k in ("foo", "bar", "blah") if k in b)
Nash answered 22/12, 2010 at 20:4 Comment(0)
M
15

From what I understand, you want to update dict A with key,value pairs from dict B

update is a better choice.

A.update(B)

Example:

>>> A = {'a':1, 'b': 2, 'c':3}
>>> B = {'d': 2, 'b':5, 'c': 4}
>>> A.update(B)
>>> A
{'a': 1, 'c': 4, 'b': 5, 'd': 2}
>>> 
Martz answered 22/12, 2010 at 18:51 Comment(2)
"If a specific key does not exist in B" Sorry, should've been more clear, but I only want to copy over values if specific keys in B exist. Not all in B.Liverwort
@Liverwort - A.update({k: v for k, v in B.iteritems() if k in specificset})Latium
A
11

Direct quote from Python performance wiki:

Except for the first time, each time a word is seen the if statement's test fails. If you are counting a large number of words, many will probably occur multiple times. In a situation where the initialization of a value is only going to occur once and the augmentation of that value will occur many times it is cheaper to use a try statement.

So it seems that both options are viable depending on the situation. For more details you might like to check this link out: Try-except-performance
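
The quoted passage is about counting words. A rough reconstruction of the two variants it compares might look like this (the function names are mine, not the wiki's, and which one is faster depends on how often words repeat):

```python
def count_with_if(words):
    """Test for the key on every word, even after it has been initialized."""
    counts = {}
    for word in words:
        if word not in counts:
            counts[word] = 0
        counts[word] += 1
    return counts


def count_with_try(words):
    """Assume the key exists; pay for the exception only on first sight of a word."""
    counts = {}
    for word in words:
        try:
            counts[word] += 1
        except KeyError:
            counts[word] = 1
    return counts


words = "the cat and the hat and the bat".split()
print(count_with_if(words) == count_with_try(words))  # True
```

Both functions produce the same counts; the difference is only in where the cost lands. The try version raises once per distinct word, so it tends to pay off when most words are repeats.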

Albert answered 28/4, 2012 at 14:23 Comment(1)
that's an interesting read, but I think somewhat incomplete. The dict used only has 1 element and I suspect larger dicts will have a significant impact on performanceCaboose
V
3

I think the general rule here is: will B["blah"] normally exist? If so, try-except is good; if not, use if "blah" in B:

I think "try" is cheap in time but "except" is more expensive.

Vardhamana answered 22/12, 2010 at 18:52 Comment(2)
Don't approach code from an optimization perspective by default; approach it from a readability and maintainability perspective. Unless the goal is specifically optimization, this is the wrong criterion (and if it is optimization, the answer is benchmarking, not guessing).Atalya
I should probably have put the last point in brackets or somehow vaguer - my main point was the first one and I think it has the added advantage of the second.Vardhamana
L
3

I think the second example is what you should go for unless this code makes sense:

try:
    A["foo"] = B["foo"]
    A["bar"] = B["bar"]
    A["baz"] = B["baz"]
except KeyError:
    pass

Keep in mind that code will abort as soon as there is a key that isn't in B. If this code makes sense, then you should use the exception method, otherwise use the test method. In my opinion, because it's shorter and clearly expresses the intent, it's a lot easier to read than the exception method.
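
To make the abort behaviour concrete, here is a small sketch contrasting one try block around several assignments with one try per key. The sample data is made up for illustration, with "bar" deliberately missing from B:

```python
B = {"foo": 1, "baz": 3}  # "bar" is deliberately missing

# One try around several assignments: stops at the first missing key.
A1 = {}
try:
    A1["foo"] = B["foo"]
    A1["bar"] = B["bar"]  # raises KeyError, so the next line never runs
    A1["baz"] = B["baz"]
except KeyError:
    pass

# One try per key: a missing key skips only that key.
A2 = {}
for key in ["foo", "bar", "baz"]:
    try:
        A2[key] = B[key]
    except KeyError:
        pass

print(A1)  # {'foo': 1}
print(A2)  # {'foo': 1, 'baz': 3}
```

The first form only makes sense if "stop copying at the first missing key" is really the intended behaviour.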

Of course, the people telling you to use update are correct. If you are using a version of Python that supports dictionary comprehensions, I would strongly prefer this code:

updateset = {'foo', 'bar', 'baz'}
A.update({k: B[k] for k in updateset if k in B})
Latium answered 22/12, 2010 at 19:23 Comment(1)
"Keep in mind that code will abort as soon as there is a key that isn't in B." - this is why it's best practice to put only the absolute minimum in the try: block, usually this is a single line. The first example would be better as part of a loop, such as for key in ["foo", "bar", "baz"]: try: A[key] = B[key]Pauiie
A
2

The rule in other languages is to reserve exceptions for exceptional conditions, i.e. errors that don't occur in regular use. Don't know how that rule applies to Python, as StopIteration shouldn't exist by that rule.
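
For context on the StopIteration remark, here is a sketch of roughly what a for loop does with an iterator. The manual version is an approximation for illustration, not the interpreter's actual implementation:

```python
numbers = [1, 2, 3]

# The usual for loop.
total_for = 0
for n in numbers:
    total_for += n

# Roughly the same loop written out by hand: the end of the data
# is signalled by an exception, not by a sentinel return value.
total_manual = 0
iterator = iter(numbers)
while True:
    try:
        n = next(iterator)
    except StopIteration:
        break
    total_manual += n

print(total_for, total_manual)  # 6 6
```

So even ordinary iteration in Python relies on an exception for normal, non-error control flow.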

Arlina answered 22/12, 2010 at 18:56 Comment(3)
I think this chestnut originated from languages where exception handling is expensive and so can have a significant impact on performance. I've never seen any real justification or reasoning behind it.Tranche
@JohnLaRooy No, performance isn't really the reason. Exceptions are a kind of non-local goto, which some people consider to impede readability of the code. However, use of exceptions in this way is considered idiomatic in Python so the above doesn't apply.Bumpkin
conditional returns are also "non-local goto" and many people prefer that style instead of inspecting sentinels at the end of the code block.Trochaic
T
2

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), we can capture the result of dictB.get('hello', None) in a variable value, test it (dict.get('hello', None) returns either the associated value or None), and then use it within the body of the condition. Note that the test below is a truthiness test, so falsy values such as 0 are skipped as well as None:

# dictB = {'hello': 5, 'world': 42}
# dictA = {}
if value := dictB.get('hello', None):
  dictA["hello"] = value
# dictA is now {'hello': 5}
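
If the intent is to copy any value that is present, including 0 or an empty string, a variant that compares against None explicitly may be safer. This is a sketch requiring Python 3.8+; the _MISSING sentinel and the dictC sample data are hypothetical names introduced only for illustration:

```python
dictB = {"hello": 0, "world": 42}
dictA = {}

# Compare against None explicitly so that falsy values such as 0 are still copied.
if (value := dictB.get("hello")) is not None:
    dictA["hello"] = value

# If None itself can be a legitimate value, a private sentinel avoids the ambiguity.
_MISSING = object()  # hypothetical sentinel, not part of the original answer
dictC = {"maybe": None}
if (value := dictC.get("maybe", _MISSING)) is not _MISSING:
    dictA["maybe"] = value

print(dictA)  # {'hello': 0, 'maybe': None}
```

The parentheses around the assignment expression matter: without them, value would be bound to the result of the comparison instead of the looked-up value.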
Tulley answered 27/4, 2019 at 15:14 Comment(3)
This fails if value == 0Josey
Note that dict.get(key, None) is the same as dict.get(key). (docs)Benyamin
What if the target value is None exactly? e.g. dictA["A"]=NoneBales
O
1

Personally, I lean towards the second method (but using has_key):

if B.has_key("blah"):
  A["blah"] = B["blah"]

That way, each assignment operation is only two lines (instead of 4 with try/except), and any exceptions that get thrown will be real errors or things you've missed (instead of just trying to access keys that aren't there).

As it turns out (see the comments on your question), has_key is deprecated - so I guess it's better written as

if "blah" in B:
  A["blah"] = B["blah"]
Organdy answered 22/12, 2010 at 18:52 Comment(0)
A
1

Though the accepted answer's emphasis on the "look before you leap" principle might apply to most languages, the first approach might be more Pythonic, based on Python's principles. Not to mention it is a legitimate coding style in Python. The important thing is to make sure you are using the try/except block in the right context and following best practices, e.g. not doing too many things in a try block, not catching a very broad exception, and, worse, not using a bare except clause.

Easier to ask for forgiveness than permission. (EAFP)

See the python docs reference here.

Also, this blog from Brett, one of the core devs, touches most of this in brief.

See another SO discussion here:

Affirm answered 26/9, 2019 at 4:37 Comment(0)
J
1

In addition to discussing readability, I think performance also matters in some scenarios. A quick timeit benchmark indicates that a test (i.e. “asking permission”) is actually slightly faster than handling the exception (i.e. “asking forgiveness”).

Here’s the code to set up the benchmark, generating a largeish dictionary of random key-value pairs:

setup = """
import random, string
d = {"".join(random.choices(string.ascii_letters, k=3)): "".join(random.choices(string.ascii_letters, k=3)) for _ in range(10000)}
"""

Then the if test:

stmt1 = """
key = "".join(random.choices(string.ascii_letters, k=3))
if key in d:
    _ = d[key]
"""

gives us:

>>> timeit.timeit(stmt=stmt1, setup=setup, number=1000000)
1.6444563979999884

whereas the approach utilizing the exception

stmt2 = """
key = "".join(random.choices(string.ascii_letters, k=3))
try:
    _ = d[key]
except KeyError:
    pass
"""

gives us:

>>> timeit.timeit(stmt=stmt2, setup=setup, number=1000000)
1.8868465850000575

Interestingly, hoisting the key generation from the actual benchmark into the setup and therewith looking for the same key over and over, delivers vastly different numbers:

>>> timeit.timeit(stmt=stmt1, setup=setup, number=100000000)
2.3290171539999847
>>> timeit.timeit(stmt=stmt2, setup=setup, number=100000000)
26.412447488999987

I don’t want to speculate whether this emphasizes the benefits of a test vs. exception handling, or if the dictionary buffers the result of the previous lookup and thus biases the benchmark results towards testing… 🤔
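
One possible explanation that can be checked directly: with 3-letter keys there are 52**3 = 140,608 possible keys and at most 10,000 of them are in the dictionary, so a random (or hoisted) key is usually a miss and the try version pays for a raised exception almost every time. A sketch that times hits and misses separately, under my own assumed setup (string keys "0" to "9999", one known hit, one known miss, a smaller iteration count):

```python
import timeit

# Assumed setup: 10,000 string keys, one key known to be present, one known to be absent.
SETUP = """
d = {str(i): i for i in range(10000)}
hit = "5000"
miss = "no-such-key"
"""

IF_STMT = """
if {key} in d:
    _ = d[{key}]
"""

TRY_STMT = """
try:
    _ = d[{key}]
except KeyError:
    pass
"""


def bench(template, key, number=200_000):
    """Time one statement template against either the 'hit' or the 'miss' key."""
    return timeit.timeit(stmt=template.format(key=key), setup=SETUP, number=number)


results = {
    (name, key): bench(template, key)
    for name, template in (("if", IF_STMT), ("try", TRY_STMT))
    for key in ("hit", "miss")
}

for (name, key), seconds in results.items():
    print(f"{name:>3} / {key:<4}: {seconds:.4f} s")
```

I would expect the try/miss case to be clearly the slowest of the four, but the actual numbers depend on the machine and the Python version, so treat this as something to run yourself instead of taking it as a result.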

Jaal answered 5/1, 2021 at 10:12 Comment(0)
