Explicitly select items from a list or tuple
C

9

165

I have the following Python list (can also be a tuple):

myList = ['foo', 'bar', 'baz', 'quux']

I can say

>>> myList[0:3]
['foo', 'bar', 'baz']
>>> myList[::2]
['foo', 'baz']
>>> myList[1::2]
['bar', 'quux']

How do I explicitly pick out items whose indices have no specific patterns? For example, I want to select [0,2,3]. Or from a very big list of 1000 items, I want to select [87, 342, 217, 998, 500]. Is there some Python syntax that does that? Something that looks like:

>>> myBigList[87, 342, 217, 998, 500]
Chewink answered 9/7, 2011 at 1:49 Comment(2)
This appears to be a duplicate. The other question has more up votes but this seems like it has a better answer with timings.Forlini
Does this answer your question? Access multiple elements of list knowing their indexUprising
P
208
list( myBigList[i] for i in [87, 342, 217, 998, 500] )

I compared the answers with python 2.5.2:

  • 19.7 usec: [ myBigList[i] for i in [87, 342, 217, 998, 500] ]

  • 20.6 usec: map(myBigList.__getitem__, (87, 342, 217, 998, 500))

  • 22.7 usec: itemgetter(87, 342, 217, 998, 500)(myBigList)

  • 24.6 usec: list( myBigList[i] for i in [87, 342, 217, 998, 500] )

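Note that in Python 3, the 1st was changed to behave like the 4th: a list comprehension now runs in its own scope, so the loop variable no longer leaks. Also, in Python 3 map returns a lazy iterator, so the 2nd has to be wrapped in list() to produce a list.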
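
For anyone rerunning the comparison on Python 3, a minimal sketch using the timeit module might look like the following. The list contents, the repeat count, and the names candidates and time_all are illustrative assumptions, not part of the original measurement. Each approach is wrapped in a lambda so that all of them pay the same call overhead, and the numbers will differ by machine and interpreter version.

```python
import timeit
from operator import itemgetter

# Illustrative data: a 1000-item list, as in the question.
myBigList = list(range(1000))
indices = (87, 342, 217, 998, 500)

# The four approaches from the answer, each wrapped so it returns a list.
candidates = {
    "list comprehension": lambda: [myBigList[i] for i in indices],
    "map + __getitem__": lambda: list(map(myBigList.__getitem__, indices)),
    "itemgetter": lambda: list(itemgetter(*indices)(myBigList)),
    "list(generator)": lambda: list(myBigList[i] for i in indices),
}

def time_all(number=100_000):
    """Return {label: seconds for `number` calls} for every candidate."""
    return {label: timeit.timeit(func, number=number)
            for label, func in candidates.items()}

if __name__ == "__main__":
    for label, seconds in time_all().items():
        print(f"{label:20s} {seconds:.3f} s")
```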


Another option would be to start out with a numpy.array which allows indexing via a list or a numpy.array:

>>> import numpy
>>> myBigList = numpy.array(range(1000))
>>> myBigList[(87, 342, 217, 998, 500)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: invalid index
>>> myBigList[[87, 342, 217, 998, 500]]
array([ 87, 342, 217, 998, 500])
>>> myBigList[numpy.array([87, 342, 217, 998, 500])]
array([ 87, 342, 217, 998, 500])

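The tuple does not work the same way, because NumPy treats a tuple as a multidimensional index with one entry per axis, and this array has only one dimension.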

Psychro answered 9/7, 2011 at 1:53 Comment(6)
Preferably as a list comp, [myBigList[i] for i in [87, 342, 217, 998, 500]], but I like this approach the best.Buckskins
@MedhatHelmy That's already in the answer. The third option used from operator import itemgetter in the initialization part of python -mtimeit.Psychro
I wonder, just from a language design perspective, why myBigList[(87, 342, 217, 998, 500)] doesn't work when myBigList is a regular python list? When I try that I get TypeError: list indices must be integers or slices, not tuple. That would be so much easier than typing out the comprehension - is there a language design/implementation issue involved?Lauryn
@sparc_spread, this is because lists in Python only accept integers or slices. Passing an integer makes sure that only one item is retrieved from an existing list. Passing a slice makes sure a part of it is retrieved, but passing a tuple is like passing a data-type(tuple) as an argument to another data-type(list), which list indexing does not support, so it raises a TypeError.Francium
why list( myBigList[i] for i in [87, 342, 217, 998, 500] ) and not [ myBigList[i] for i in [87, 342, 217, 998, 500] ]Coprolite
@Coprolite Because in Python 2, it does not leak the loop variable, as a new scope is created for the generator expression. One does not need to do it for Python 3, as the list comprehension has been changed to do that. I prefer it though, as it makes it easier to swap out list for any other generator-consuming function. This is also the reason that I don't care for either dictionary comprehensions or set comprehensions.Psychro
D
60

What about this:

from operator import itemgetter
itemgetter(0,2,3)(myList)
('foo', 'baz', 'quux')
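
Two details are worth knowing before relying on this when the index list is built at run time. With several indices itemgetter returns a tuple, but with a single index it returns the bare item, and calling itemgetter() with no indices at all raises TypeError. The helper name pick below is hypothetical; this is only a sketch of one way to normalise those cases into a list.

```python
from operator import itemgetter

def pick(seq, indices):
    """Return the items of `seq` at `indices` as a list (hypothetical helper)."""
    indices = list(indices)
    if not indices:
        return []                      # itemgetter() with no arguments raises TypeError
    result = itemgetter(*indices)(seq)
    # A single index yields the bare item rather than a 1-tuple.
    return [result] if len(indices) == 1 else list(result)

myList = ['foo', 'bar', 'baz', 'quux']
print(pick(myList, [0, 2, 3]))   # ['foo', 'baz', 'quux']
print(pick(myList, [1]))         # ['bar']
```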
Dalton answered 9/7, 2011 at 1:52 Comment(1)
This is the sexiest so far. Love that operator module!Choi
P
18

Maybe a list comprehension is in order:

L = ['a', 'b', 'c', 'd', 'e', 'f']
print([L[index] for index in [1, 3, 5]])

Produces:

['b', 'd', 'f']

Is that what you are looking for?
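
The same comprehension works unchanged on tuples, accepts repeated and negative indices, and raises IndexError for an index that is out of range. A small sketch, wrapping it in a function named select (a name chosen here for illustration) that returns the same kind of container it was given:

```python
def select(seq, indices):
    """Return the items of `seq` at `indices`, as a tuple if `seq` is a tuple."""
    picked = [seq[i] for i in indices]
    return tuple(picked) if isinstance(seq, tuple) else picked

L = ['a', 'b', 'c', 'd', 'e', 'f']
print(select(L, [1, 3, 5]))          # ['b', 'd', 'f']
print(select(tuple(L), [0, 0, -1]))  # ('a', 'a', 'f')
```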

Phenylamine answered 9/7, 2011 at 2:0 Comment(0)
G
11

It isn't built-in, but you can make a subclass of list that takes tuples as "indexes" if you'd like:

class MyList(list):

    def __getitem__(self, index):
        if isinstance(index, tuple):
            return [self[i] for i in index]
        return super(MyList, self).__getitem__(index)


seq = MyList("foo bar baaz quux mumble".split())
print(seq[0])
print(seq[2,4])
print(seq[1::2])

printing

foo
['baaz', 'mumble']
['bar', 'quux']
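
On Python 3 the same idea can be written with the zero-argument form of super(). The sketch below also accepts a list of indices and returns instances of the subclass, so that a result can be indexed the same way again. The class name MultiIndexList and those two extensions are assumptions added here, not part of the original answer.

```python
class MultiIndexList(list):
    """A list that also accepts a tuple or list of indices (illustrative sketch)."""

    def __getitem__(self, index):
        get = super().__getitem__          # the plain list lookup
        if isinstance(index, (tuple, list)):
            # Look up each position individually and keep the subclass type.
            return type(self)(get(i) for i in index)
        result = get(index)
        # Built-in slicing returns a plain list, so re-wrap slices.
        return type(self)(result) if isinstance(index, slice) else result


seq = MultiIndexList("foo bar baaz quux mumble".split())
print(seq[0])            # foo
print(seq[2, 4])         # ['baaz', 'mumble']
print(seq[[2, 4]])       # ['baaz', 'mumble']
print(seq[1::2][0, 1])   # ['bar', 'quux']
```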
Guillerminaguillermo answered 9/7, 2011 at 1:57 Comment(1)
(+1) Neat solution! With this extension, handling arrays in Python starts to look much like R or Matlab.Radar
V
7
>>> list(map(myList.__getitem__, (2,2,1,3)))
['baz', 'baz', 'bar', 'quux']

You can also create your own List class which supports tuples as arguments to __getitem__ if you want to be able to do myList[(2,2,1,3)].
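
In Python 3, map returns a lazy iterator rather than a list, so the result has to be materialised explicitly. In the short sketch below, operator.getitem is shown as an equivalent that avoids naming the dunder method directly; that is a stylistic alternative suggested here, not something the answer proposed.

```python
from functools import partial
from operator import getitem

myList = ['foo', 'bar', 'baz', 'quux']
indices = (2, 2, 1, 3)

# Bound dunder method, as in the answer, wrapped in list() for Python 3.
via_dunder = list(map(myList.__getitem__, indices))

# Same lookups through operator.getitem(myList, i).
via_operator = list(map(partial(getitem, myList), indices))

print(via_dunder)     # ['baz', 'baz', 'bar', 'quux']
print(via_operator)   # ['baz', 'baz', 'bar', 'quux']
```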

Vindicable answered 9/7, 2011 at 2:2 Comment(6)
While this works it's usually not a good idea to directly invoke magic variables. You're better off using a list comprehension or a helper module like operator.Choi
@jathanism: I have to respectfully disagree. Though if you are concerned about forward compatibility (as opposed to public/private) I can definitely see where you're coming from.Vindicable
That is where I'm coming from. :) Following that, it's the same reason why it's better to use len(myList) over myList.__len__().Choi
A creative solution. I don't think it's a bad idea to invoke a magic variable. Programmers select their preferred way based on the circumstances.Abagail
Using magic methods is generally bad, so it's better to just avoid it. It's never necessary except maybe for performance reasons. IDK if there's anything specific about __getitem__(), but for other examples, see Why does calling Python's 'magic method' not do type conversion like it would for the corresponding operator? and Is there any case where len(someObj) does not call someObj's __len__ function?.Lunette
Those magic method counterexamples are all about how the magic method implements a "building block" that may be used as part of a more complicated protocol. I see no reason not to use it, since we are not expecting the more complicated protocol (or in this case the method is the protocol basically). Happy to be proven wrong though. Using this will make your code possibly throw IndexError on invalid indices etc. (but then again your code may also work with negative indexes which can be a positive) -- docs.python.org/3/reference/datamodel.html#object.__getitem__Vindicable
V
4

I just want to point out that, even though the itemgetter syntax looks really neat, it is kind of slow when run many times, as in this loop over the same list:

import timeit
from operator import itemgetter
start=timeit.default_timer()
for i in range(1000000):
    itemgetter(0,2,3)(myList)
print ("Itemgetter took ", (timeit.default_timer()-start))

Itemgetter took 1.065209062149279

start=timeit.default_timer()
for i in range(1000000):
    myList[0],myList[2],myList[3]
print ("Multiple slice took ", (timeit.default_timer()-start))

Multiple slice took 0.6225321444745759
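
Part of the cost measured above comes from constructing a new itemgetter object on every iteration. A fairer comparison, sketched below with timeit.timeit, also builds the getter once and reuses it. The function names and the repeat count are illustrative, and no timings are quoted because they depend on the machine and Python version.

```python
import timeit
from operator import itemgetter

myList = ['foo', 'bar', 'baz', 'quux']   # the list from the question
getter = itemgetter(0, 2, 3)             # built once, reused for every call

def rebuilt_each_time():
    # Mirrors the loop above: a fresh itemgetter per call.
    return itemgetter(0, 2, 3)(myList)

def reused():
    return getter(myList)

def direct():
    # Plain indexing of each item, packed into a tuple.
    return myList[0], myList[2], myList[3]

for func in (rebuilt_each_time, reused, direct):
    seconds = timeit.timeit(func, number=200_000)
    print(f"{func.__name__:18s} {seconds:.3f} s")
```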

Vuillard answered 1/11, 2016 at 14:50 Comment(1)
In the first snippet, please add myList = np.array(range(1000000)); otherwise you will get an error.Sulphurous
J
2

Another possible solution:

sek = []
L = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
for i in [2, 4, 7, 0, 3]:
    sek.append(L[i])  # collect the item at each requested index
print(sek)
Jefferson answered 18/11, 2017 at 20:32 Comment(0)
M
1

This often comes up when you have a boolean numpy array such as mask:

[mylist[i] for i in np.arange(len(mask), dtype=int)[mask]]

A lambda that works for any sequence or np.array:

subseq = lambda myseq, mask : [myseq[i] for i in np.arange(len(mask), dtype=int)[mask]]

newseq = subseq(myseq, mask)
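
If the goal is only to filter by a boolean mask, the standard library's itertools.compress does this without building an index array: it pairs items with mask entries and keeps the items whose entry is truthy. The sketch below uses a plain list of booleans, and the sample data is made up for illustration.

```python
from itertools import compress

myseq = ['a', 'b', 'c', 'd', 'e']
mask = [True, False, True, True, False]

# Keep the items whose corresponding mask entry is truthy.
newseq = list(compress(myseq, mask))
print(newseq)   # ['a', 'c', 'd']

# Index-based equivalent for a mask of the same length as the sequence.
same = [myseq[i] for i, keep in enumerate(mask) if keep]
```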

Muff answered 16/7, 2018 at 15:26 Comment(0)
A
1

Here is a one-line lambda:

list(map(lambda x: mylist[x],indices))

where:

mylist=['a','b','c','d','e','f','g','h','i','j']
indices = [3, 5, 0, 2, 6]

output:

['d', 'f', 'a', 'c', 'g']
Act answered 28/7, 2022 at 18:38 Comment(0)
