Finding the index of elements based on a condition using Python list comprehension

6

160

The following Python code seems very long-winded to someone coming from a Matlab background:

>>> a = [1, 2, 3, 1, 2, 3]
>>> [index for index,value in enumerate(a) if value > 2]
[2, 5]

In Matlab I can write:

>> a = [1, 2, 3, 1, 2, 3];
>> find(a>2)
ans =
     3     6

Is there a shorthand way of writing this in Python, or should I just stick with the long version?


Thank you for all the suggestions and explanation of the rationale for Python's syntax.

After finding the following on the numpy website, I think I have found a solution I like:

http://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays

Applying the information from that website to my problem above would give the following:

>>> from numpy import array
>>> a = array([1, 2, 3, 1, 2, 3])
>>> b = a > 2
>>> b
array([False, False,  True, False, False,  True], dtype=bool)
>>> r = array(range(len(b)))
>>> r[b]
array([2, 5])

The following should then work (but I haven't got a Python interpreter on hand to test it):

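import numpy

class my_array(numpy.ndarray):
    # numpy.array is a function; the class to inherit from is numpy.ndarray
    def __new__(cls, data):
        # view-casting keeps the data and changes the type to my_array
        return numpy.asarray(data).view(cls)

    def find(self, b):
        r = numpy.arange(len(b))
        return r[b]  # index with square brackets, not a call


>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a>2)
array([2, 5])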
Stampede answered 1/9, 2011 at 12:31 Comment(1)
How about [idx for idx in range(len(a)) if a[idx] > 2]? This is a bit awkward in Python because the language doesn't rely on indexes as much as other languages do. - Outstay
111

Another way:

>>> [i for i in range(len(a)) if a[i] > 2]
[2, 5]

In general, remember that while find is a ready-made function, list comprehensions are a general, and thus very powerful, solution. Nothing prevents you from writing a find function in Python and using it later as you wish. For example:

>>> def find_indices(lst, condition):
...   return [i for i, elem in enumerate(lst) if condition(elem)]
... 
>>> find_indices(a, lambda e: e > 2)
[2, 5]

Note that I'm using lists here to mimic Matlab. It would be more Pythonic to use generators and iterators.
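
The remark about generators can be made concrete. The following is a minimal sketch, not part of the original answer: iter_indices is a hypothetical name, and it yields matching indices lazily instead of building a whole list up front.

```python
def iter_indices(iterable, condition):
    """Lazily yield the index of every element that satisfies condition."""
    return (i for i, elem in enumerate(iterable) if condition(elem))


a = [1, 2, 3, 1, 2, 3]
matches = iter_indices(a, lambda e: e > 2)

first = next(matches)   # only as much of `a` is scanned as needed
rest = list(matches)    # consume the remaining matches
```

This is mostly useful when only the first few matches are needed or when the input is itself a large or lazy iterable.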

Hartzel answered 1/9, 2011 at 12:35 Comment(5)
The OP could've written it as [i for i,v in enumerate(a) if v > 2] instead. - Outstay
That's not shorter, it's longer. Replace index with i and value with v in the original and count the characters. - Disconcerted
@NullUser, agf: you're right, but the main point is the second part :) - Hartzel
Using enumerate over range(len(...)) is both more robust and more efficient. - Aspergillum
@Mike Graham: I agree, will change the find_indices function to use enumerate. - Hartzel
84
  • In Python, you wouldn't use indexes for this at all, but just deal with the values—[value for value in a if value > 2]. Usually dealing with indexes means you're not doing something the best way.

  • If you do need an API similar to Matlab's, you would use numpy, a package for multidimensional arrays and numerical math in Python which is heavily inspired by Matlab. You would be using a numpy array instead of a list.

     >>> import numpy
     >>> a = numpy.array([1, 2, 3, 1, 2, 3])
     >>> a
     array([1, 2, 3, 1, 2, 3])
     >>> numpy.where(a > 2)
     (array([2, 5]),)
     >>> a > 2
     array([False, False,  True, False, False,  True], dtype=bool)
     >>> a[numpy.where(a > 2)]
     array([3, 3])
     >>> a[a > 2]
     array([3, 3])
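
Indices computed from one array are often used to select from a second, parallel array. The following is a minimal sketch under stated assumptions: the arrays ranges and angles, their values, and threshold are made-up illustrations, and numpy.flatnonzero is used as one way to get plain integer indices (0-based, unlike Matlab's find).

```python
import numpy as np

ranges = np.array([0.5, 2.0, 1.2, 3.1])   # made-up sample data
angles = np.array([10, 20, 30, 40])       # parallel to ranges
threshold = 1.5                           # hypothetical cutoff

# Integer indices of the entries to keep
idx = np.flatnonzero(ranges <= threshold)

# The same indices select matching entries from both arrays
kept_ranges = ranges[idx]
kept_angles = angles[idx]
```

Indexing both arrays with the boolean mask ranges <= threshold directly would give the same selection; the integer indices are only needed when the positions themselves matter.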
    
Aspergillum answered 1/9, 2011 at 13:20 Comment(3)
Suppose you have two lists, one for ranges and one for angles, and you want to filter out the range values that are above some threshold. How do you also filter the angles corresponding to those ranges in a "best way" fashion? - Marquisette
filtered_ranges_and_angles = [(r, angle) for r, angle in zip(ranges, angles) if should_be_kept(r)] - Aspergillum
"In Python, you wouldn't use indexes for this at all, but just deal with the values": this statement shows you haven't done enough data analysis and machine learning modeling. Indices of one tensor that satisfy a certain condition are often used to filter another tensor. - Betsybetta
30

This works well for me:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 1, 2, 3])
>>> np.where(a > 2)[0]
array([2, 5])
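
The [0] is needed because np.where, called with only a condition, returns a tuple holding one index array per dimension. A minimal sketch of that behaviour follows; the 2-D array m is a made-up example.

```python
import numpy as np

a = np.array([1, 2, 3, 1, 2, 3])
(idx,) = np.where(a > 2)        # 1-D input: a tuple of length one

m = np.array([[1, 5],
              [7, 2]])          # made-up 2-D example
rows, cols = np.where(m > 2)    # 2-D input: one index array per axis
```

Here rows and cols together give the coordinates of each matching element of m.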
Cancel answered 11/9, 2017 at 0:9 Comment(0)
8

Maybe another question is, "what are you going to do with those indices once you get them?" If you are going to use them to create another list, then in Python, they are an unnecessary middle step. If you want all the values that match a given condition, just use the builtin filter:

# list() is needed on Python 3, where filter returns an iterator
matchingVals = list(filter(lambda x: x > 2, a))

Or write your own list comprehension:

matchingVals = [x for x in a if x > 2]

If you want to remove them from the list, the Pythonic way is not to delete items one by one, but to write a list comprehension as if you were creating a new list and assign the result back in place using listvar[:] on the left-hand side:

a[:] = [x for x in a if x <= 2]
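
The difference between slice assignment and plain assignment only shows up when another name refers to the same list. A minimal sketch, where alias and other are hypothetical names introduced for illustration:

```python
a = [1, 2, 3, 1, 2, 3]
alias = a                          # second name for the same list object

a[:] = [x for x in a if x <= 2]    # replaces the contents of the existing list
same_object = alias is a           # still one shared object, so alias is filtered too

b = [1, 2, 3, 1, 2, 3]
other = b
b = [x for x in b if x <= 2]       # rebinds b to a brand-new list
# other still refers to the old, unfiltered list
```

With a[:] = ... every reference to the list sees the filtered contents; with b = ... only the name b does.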

Matlab supplies find because its array-centric model works by selecting items using their array indices. You can do this in Python, certainly, but the more Pythonic way is using iterators and generators, as already mentioned by @EliBendersky.

Jelene answered 1/9, 2011 at 13:16 Comment(3)
Paul, I haven't yet come across a need for this in a script/function/class. It's more for interactive testing of a class I am writing. - Stampede
@Mike: thanks for the edit, but I really did mean a[:] = ...; see Alex Martelli's answer to this question #1353385. - Jelene
@Paul, I assumed (and hoped!) you didn't really mean it, given your description that you were going to "create a new list"; I find that programs tend to be easier to understand and maintain when they mutate existing data very sparingly. In any event, I'm sorry to overstep; you should certainly be able to edit your post back to whatever you want. - Aspergillum
7

Even if it's a late answer: I think this is still a very good question and IMHO Python (without additional libraries or toolkits like numpy) is still lacking a convenient method to access the indices of list elements according to a manually defined filter.

You could manually define a function, which provides that functionality:

def indices(lst, filtr=bool):
    # lst avoids shadowing the built-in list; bool keeps truthy elements by default
    return [i for i, x in enumerate(lst) if filtr(x)]

print(indices([1,0,3,5,1], lambda x: x==1))

Yields: [0, 4]

To my mind, the ideal approach would be to make a child class of list and add the indices function as a method. That way, only the filter would need to be passed:

class MyList(list):
    def __init__(self, *args):
        list.__init__(self, *args)
    def indices(self, filtr=lambda x: bool(x)):
        return [i for i,x in enumerate(self) if filtr(x)]

my_list = MyList([1,0,3,5,1])
my_list.indices(lambda x: x==1)

I elaborated a bit more on that topic here: http://tinyurl.com/jajrr87

Shwa answered 20/1, 2016 at 14:57 Comment(0)
0

The following should then work (but I haven't got a Python interpreter on hand to test it):

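import numpy

class my_array(numpy.ndarray):
    # numpy.array is a function; the class to inherit from is numpy.ndarray
    def __new__(cls, data):
        # view-casting keeps the data and changes the type to my_array
        return numpy.asarray(data).view(cls)

    def find(self, b):
        r = numpy.arange(len(b))
        return r[b]  # index with square brackets, not a call


>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a>2)
array([2, 5])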

That's a good solution. Subclassing NumPy arrays is possible, but it takes extra care (the class to inherit from is numpy.ndarray, since numpy.array is a function). You can use composition instead of inheritance. This should work:

import numpy

class my_array:
    def __init__(self, data):
        self.data = numpy.array(data)

    def find(self, b):
        # positions 0..n-1, selected by the boolean mask b
        r = numpy.arange(len(self.data))
        return r[b].tolist()

>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a.data>2)
[2, 5]
Shanney answered 28/12, 2022 at 19:38 Comment(0)
