How do I check if my loop never ran at all?
N

7

5


This somehow looks too complicated to me:

x = _empty = object()
for x in data:
    ... # process x
if x is _empty:
    raise ValueError("Empty data iterable: {!r:100}".format(data))

Isn't there an easier solution?

The above solution is from curiousefficiency.org

Update

  • data can contain None items.
  • data is an iterator, and I don't want to use it twice.
Nusku answered 2/2, 2015 at 10:3 Comment(5)
Is data a list or other such container?Leitmotif
If data is a list, why not use if not data:?Leitmotif
Why do you think this is too complicated? It's straightforward and readable.Edmanda
@HåkenLid Many reasons make the quoted code complicated: (1) You have to read and keep in mind the first line without understanding its purpose yet. (2) The loop normally changes x, which is unusual since we just set x: what is going on? (3) The test at the end does not work if the last element of data is object(): is this intended? can this happen? are we really testing for the emptiness of data? (4) The test intended to say "is data empty " actually reads "is x empty". Another reason why it is complicated is that there is a simpler solution (see my answer). :)Zipporah
As pointed out by @HåkenLid, (3) is actually not a problem, because object() creates a new object (it's not a singleton).Zipporah
Q
4

The original code is best.

x = _empty = object()

_empty is called a sentinel value. In Python it's common to create a sentinel with object(), since it makes it obvious that the only purpose of _empty is to be a dummy value. But you could have used any mutable, for instance an empty list [].

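A freshly created mutable object is guaranteed to be distinct from every other object when you compare them with is, so you can safely use one as a sentinel value. Immutables such as None or 0 give no such guarantee, because the interpreter may hand back the same shared object each time.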

>>> None is None
True
>>> object() is object()
False
>>> [] is []
False
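
As a minimal sketch of the sentinel idea, the loop from the question can be wrapped in a function. The names process_all and _MISSING are hypothetical, chosen only for illustration, and the append call stands in for whatever processing the real loop does:

```python
_MISSING = object()  # hypothetical name; a unique object that no data item can be


def process_all(data):
    """Collect the items of any iterable; raise ValueError if it yields nothing."""
    x = _MISSING
    seen = []
    for x in data:
        seen.append(x)  # stand-in for the real processing
    if x is _MISSING:
        # The loop body never rebound x, so data produced no items.
        raise ValueError("Empty data iterable: {!r}".format(data))
    return seen
```

Because the check uses is against a private object, a None item in data is not mistaken for "no items".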
Quadrangular answered 2/2, 2015 at 11:6 Comment(1)
I really don't agree that the original code is best: my comment to the question explains why (essentially: the code is complicated, and there is a much cleaner and clearer solution).Zipporah
A
3

By "never ran", do you mean that data had no elements?

If so, the simplest solution is to check it before running the loop:

if not data:
    raise Exception('Empty iterable')

for x in data:
    ...

However, as mentioned in the comments below, this does not work with some iterables, such as files and generators, so it should be applied with care.
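
A short sketch of why the truth test is unreliable for iterators: an empty list is falsy, but a generator object is truthy whether or not it will yield anything. The variable names are illustrative only:

```python
empty_list = []
empty_gen = (x for x in [])  # a generator that will never yield an item

list_is_truthy = bool(empty_list)  # empty sequences are falsy
gen_is_truthy = bool(empty_gen)    # generator objects define no length, so they are truthy

remaining = list(empty_gen)        # consuming it shows that it really was empty
```

So "if not data:" would let an empty generator through to the loop without raising.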

Acclimate answered 2/2, 2015 at 10:14 Comment(4)
If data is a sequence, this'll work - if it's an iterable it'll always be true even if it'll never yield any elementsSaliferous
That will work with some iterables, but not all (e.g. a file or a generator function/expression).Kitts
I think checking for a sentinel value is okay; not sure what problem the OP has with it.Thrombocyte
Since the updated question mentions that it is using an iterator, this solution does not work.Zipporah
Z
1

The following simple solution works with any iterable. It is based on the idea that we can check if there is a (first) element, and then keep iterating if there was one. The result is much clearer:

import itertools

try:
    first_elmt = next(data)
except StopIteration:
    raise ValueError("Empty data iterator: {!r:100}".format(data))

for x in itertools.chain([first_elmt], data):
    ...

PS: Note that it assumes that data is an iterator (as in the question). If it is merely an iterable, the code should be run on data_iter = iter(data) instead of on data (otherwise, say if data is a list, the loop would duplicate the first element).
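
One way to avoid the iterable/iterator pitfall mentioned in the PS is to put the pattern in a small helper that always calls iter() first. This is only a sketch, and require_nonempty is a hypothetical name, not part of itertools:

```python
import itertools


def require_nonempty(iterable):
    """Return an iterator over all items of iterable, or raise ValueError if there are none.

    Hypothetical helper; the emptiness check happens eagerly, at call time.
    """
    it = iter(iterable)  # works for lists and for one-shot iterators alike
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("Empty data iterable: {!r}".format(iterable)) from None
    # Put the consumed first item back in front of the rest.
    return itertools.chain([first], it)
```

Usage would then be "for x in require_nonempty(data): ...", with the ValueError raised before the loop starts.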

Zipporah answered 2/2, 2015 at 10:53 Comment(2)
object() creates a unique object every time, so that wouldn't be a problem.Edmanda
You're right, my bad. I removed the related comment.Zipporah
L
1

I propose the following:

loop_has_run = False
for x in data:
    loop_has_run = True
    ... # process x
if not loop_has_run:
    raise ValueError("Empty data iterable: {!r:100}".format(data))

I contend that this is better than the example in the question, because:

  • The intent is clearer (since the variable name specifies its meaning directly).
  • No objects are created or destroyed (which can have a negative performance impact).
  • It doesn't require paying attention to the subtle point that object() always returns a unique value.

Note that the loop_has_run = True assignment should be put at the start of the loop, in case (for example) the loop body contains break.
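
To illustrate the point about placement, here is a sketch in which the loop body uses both continue and break. The function first_negative is a made-up example, not something from the question:

```python
def first_negative(data):
    """Return the first negative number in data, or None if there is none.

    Raises ValueError if data yields nothing at all.
    """
    loop_has_run = False
    result = None
    for x in data:
        loop_has_run = True  # set first, so the continue/break below cannot skip it
        if x is None:
            continue
        if x < 0:
            result = x
            break
    if not loop_has_run:
        raise ValueError("Empty data iterable: {!r}".format(data))
    return result
```

Had the assignment been the last statement of the body, an input such as [None] would have been reported as empty.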

Lebrun answered 2/2, 2015 at 11:23 Comment(3)
This is not bad, but this has the unfortunate side effect of forcing a useless assignment at each iteration but the first one (which makes your second point about the penalty of creating an object moot). There is no need for such a penalty (see my answer, which is what I would use in real life :).Zipporah
@EOL: In general the penalty of an assignment is much cheaper than the penalty of creating/destroying an object. Obviously the effect of this on actual runtime will vary, especially depending on the typical number of iterations of the loop (always measure!). I believe your solution implicitly performs allocation due to the call to itertools.chain, but I haven't confirmed this.Lebrun
Agreed, the existence of a performance penalty compared to the solution quoted in the question depends on the size of the data. Now, this does not change the fact that doing the assignment over and over in the loop is wasteful and algorithmically dubious—again, I don't think it's too bad, though. :) I'm not sure what "allocation" that would be in my solution you can be referring to: maybe the creation of a list?Zipporah
C
0

You can add a loop_flag that defaults to False and set it to True when the loop executes:

loop_flag = False
x = _empty = object()

for x in data:
    loop_flag = True
    ... # process x

if loop_flag:
    print("loop executed...")
Circumambient answered 2/2, 2015 at 10:8 Comment(3)
Not all that helpful either.Leitmotif
Why have both a test on x and a loop_flag? This is unnecessarily redundant.Zipporah
loop_flag=True should be the first statement. Otherwise a continue in "process x" would create unintended results.Nusku
A
0

The intent of that code isn't immediately obvious. Sure, people would understand it after a while, but the code could be made clearer.

The solution I offer requires more lines of code, but that code lives in a class that can be stored elsewhere. In addition, this solution works for iterables and iterators as well as sized containers.

Your code would be changed to:

it = HadItemsIterable(data)
for x in it:
    ...
if it.had_items:
    ...

The code for the class is as follows:

from collections.abc import Iterable
class HadItemsIterable(Iterable):

    def __init__(self, iterable):
        self._iterator = iter(iterable)

    @property
    def had_items(self):
        try:
            return self._had_items
        except AttributeError:
            raise ValueError("Not iterated over items yet") from None

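    def __iter__(self):
        try:
            first = next(self._iterator)
        except StopIteration:
            # Record emptiness only on the first pass; a later pass over the
            # exhausted iterator must not overwrite an earlier True.
            if not hasattr(self, "_had_items"):
                self._had_items = False
            # End the generator with return: re-raising StopIteration inside
            # a generator becomes a RuntimeError (PEP 479).
            return
        self._had_items = True
        yield first
        yield from self._iterator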
Aboutship answered 2/2, 2015 at 10:32 Comment(0)
N
0

What about this solution?

data = []

count = None
for count, item in enumerate(data):
    print(item)

if count is None:
    raise ValueError('data is empty')
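
A variant of the same idea, sketched as a function: counting from 1 lets 0 mean "no items" and also gives the number of items processed. The name count_items and the start=1 choice are assumptions made for this example, not part of the code above:

```python
def count_items(data):
    """Process every item of data and return how many there were.

    Raises ValueError if data yields nothing. Works with one-shot iterators
    and with None items, since only the counter is inspected.
    """
    count = 0
    for count, item in enumerate(data, start=1):
        pass  # stand-in for the real processing of item
    if count == 0:
        raise ValueError("data is empty")
    return count
```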
Nusku answered 2/2, 2015 at 12:18 Comment(0)
