Access multiple elements of list knowing their index [duplicate]
D

11

343

I need to choose some elements from a given list, knowing their indexes. Let's say I would like to create a new list that contains the elements at indexes 1, 2, 5 from the given list [-2, 1, 5, 3, 8, 5, 6]. What I did is:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [ a[i] for i in b]

Is there a better way to do it? Something like c = a[b]?
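A built-in list does not accept a list of indexes, so a[b] raises a TypeError. As a rough sketch only, the syntax could be emulated with a small subclass; the name IndexableList is hypothetical and not part of any library:

```python
class IndexableList(list):
    """Hypothetical list subclass that also accepts a list or tuple of indexes."""

    def __getitem__(self, key):
        if isinstance(key, (list, tuple)):
            # Look up each index in turn and collect the results in a plain list.
            return [list.__getitem__(self, i) for i in key]
        # Integers and slices behave exactly as they do on a normal list.
        return list.__getitem__(self, key)


a = IndexableList([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
print(a[b])    # [1, 5, 5]
print(a[0])    # -2
print(a[1:3])  # [1, 5]
```

Whether this is worth the extra class over the plain list comprehension is a matter of taste.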

Dys answered 16/8, 2013 at 11:20 Comment(2)
By the way, I found another solution here. I haven't tested it yet, but I can post it here if you are interested: code.activestate.com/recipes/…Dys
That is the same solution as mentioned in the question, but wrapped in a lambda function.Huffy
C
327

You can use operator.itemgetter:

from operator import itemgetter 
a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
print(itemgetter(*b)(a))
# Result:
(1, 5, 5)

Or you can use numpy:

import numpy as np
a = np.array([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
print(a[b].tolist())
# Result:
[1, 5, 5]

But really, your current solution is fine. It's probably the neatest out of all of them.
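One caveat with itemgetter: given exactly one index it returns the bare element rather than a tuple, and given no indexes it raises a TypeError. A minimal sketch of a wrapper that always returns a list, assuming that is the behaviour you want (the name pick is hypothetical):

```python
from operator import itemgetter


def pick(seq, indexes):
    """Return the elements of seq at the given indexes, always as a list."""
    indexes = list(indexes)
    if not indexes:
        return []                      # itemgetter() with no arguments raises TypeError
    if len(indexes) == 1:
        return [seq[indexes[0]]]       # itemgetter(i)(seq) would return a bare element
    return list(itemgetter(*indexes)(seq))


a = [-2, 1, 5, 3, 8, 5, 6]
print(pick(a, [1, 2, 5]))  # [1, 5, 5]
print(pick(a, [2]))        # [5]
print(pick(a, []))         # []
```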

Catcher answered 16/8, 2013 at 11:25 Comment(5)
+1 for mentioning that c = [a[i] for i in b] is perfectly fine. Note that the itemgetter solution will not do the same thing if b has fewer than 2 elements.Quadric
Side Note: Using itemgetter while working in multi-process doesn't work. Numpy works great in multi-process.Wenn
Additional comment, a[b] works only when a is a numpy array, i.e. you create it with a numpy function.Carrasco
I have benchmarked the non numpy options and itemgetter appears to be the fastest, even slightly faster than simply typing out the desired indexes inside parentheses, using Python 3.44Bribe
@citizen2077, can you give an example of the syntax you describe?Bouldon
D
65

Alternatives:

>>> list(map(a.__getitem__, b))  # on Python 3, map returns an iterator, so wrap it in list()
[1, 5, 5]

>>> import operator
>>> operator.itemgetter(*b)(a)
(1, 5, 5)
Deerskin answered 16/8, 2013 at 11:24 Comment(6)
The problem w/ the first one is that __getitem__ doesn't seem to be composable, e.g. how to map the type of the item? map(type(a.__getitem__), b) Bouldon
@alancalvitti, lambda x: type(a.__getitem__(x)), b. In this case using [..] is more compact: lambda x: type(a[x]), bDeerskin
just convert back into a list: list(map(a.__getitem__, b))Fernandafernande
How can I use the same method for indices stored in a 2D list ? For example , I have main_arr =[27.5, 31.0, 29.8, 29.8, 32.3, 34.4, 28.8, 31.0, 32.2, 26.0, 29.4, 31.0, 29.3, 29.3, 30.9, 30.7, 29.9, 29.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 56.6, 0.0, 0.0, 0.0, 0.0] , and I want to get values at indices given by pixels = [[2,5,8,11,14,17], [1,4,7,10,13,16], [0,3,6,9,12,14]] . One way I can think of is calling your method in the loop. But is there a more elegant way ?Orrin
@Solen'ya, [[main_arr[p] for p in ps] for ps in pixels] or [operator.itemgetter(*ps)(main_arr) for ps in pixels]Deerskin
This is what I ended up doing but I was asking if there is a better way .Orrin
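The nested comprehension suggested above can be wrapped in a small helper. This is only a sketch: the name gather is hypothetical, and the data is a shortened stand-in rather than the arrays from the comment:

```python
def gather(values, index_groups):
    """For each group of indexes, collect the matching elements of values."""
    return [[values[i] for i in group] for group in index_groups]


main_arr = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # hypothetical stand-in data
pixels = [[2, 5], [1, 4], [0, 3]]                # hypothetical index groups

rows = gather(main_arr, pixels)
print(rows)  # [[12.0, 15.0], [11.0, 14.0], [10.0, 13.0]]
```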
S
15

Another solution could be via pandas Series:

import pandas as pd

a = pd.Series([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
c = a[b]

You can then convert c back to a list if you want:

c = list(c)
Speculator answered 16/9, 2017 at 17:56 Comment(0)
H
11

Basic and not very extensive testing comparing the execution time of the five supplied answers:

import operator
import numpy as np

def numpyIndexValues(a, b):
    na = np.array(a)
    nb = np.array(b)
    out = list(na[nb])
    return out

def mapIndexValues(a, b):
    out = map(a.__getitem__, b)
    return list(out)

def getIndexValues(a, b):
    out = operator.itemgetter(*b)(a)
    return out

def pythonLoopOverlap(a, b):
    c = [ a[i] for i in b]
    return c

multipleListItemValues = lambda searchList, ind: [searchList[i] for i in ind]

using the following input:

a = list(range(0, 10000000))
b = list(range(500, 500000))

The simple Python loop was the quickest, with the lambda version a close second. mapIndexValues and getIndexValues were consistently similar to each other, and the numpy method was significantly slower once the cost of converting the lists to numpy arrays is included. If the data is already in numpy arrays, the numpyIndexValues method with the numpy.array conversion removed is the quickest.

numpyIndexValues -> time:1.38940598 (including conversion of the lists to numpy arrays)
numpyIndexValues -> time:0.0193445 (using numpy array instead of python list as input, and conversion code removed)
mapIndexValues -> time:0.06477512099999999
getIndexValues -> time:0.06391049500000001
multipleListItemValues -> time:0.043773591
pythonLoopOverlap -> time:0.043021754999999995
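The timing code itself is not shown above. One way such a comparison could be run with the standard timeit module is sketched below; the input sizes and repeat count are assumptions chosen to keep the run short, not the ones used for the figures above:

```python
import operator
import timeit

a = list(range(100_000))     # smaller than the input above, to keep the run short
b = list(range(500, 5_000))

candidates = {
    "list comprehension": lambda: [a[i] for i in b],
    "map + __getitem__": lambda: list(map(a.__getitem__, b)),
    "itemgetter": lambda: list(operator.itemgetter(*b)(a)),
}

expected = a[500:5_000]
timings = {}
for name, func in candidates.items():
    assert func() == expected                       # every variant must select the same elements
    timings[name] = timeit.timeit(func, number=20)  # total seconds for 20 calls

# Print the variants from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:20s} {seconds:.6f} s")
```

Absolute numbers depend on the machine and Python version, so only the relative ordering is meaningful.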
Healy answered 11/9, 2015 at 6:54 Comment(2)
I do not know what Python interpreter you use, but the first method numpyIndexValues does not work since a, b are of type range. I am guessing that you meant to convert a, b to numpy.ndarrays first?Marquettamarquette
@Marquettamarquette Yes, I wasn't comparing apples with apples: I had created numpy arrays as input in the test case for numpyIndexValues. I have fixed this now and all use the same lists as input.Healy
S
4

List comprehension is clearly the most immediate and easiest to remember - in addition to being quite pythonic!

In any case, among the proposed solutions, it is not the fastest (I have run my test on Windows using Python 3.8.3):

import timeit
from itertools import compress
import random
from operator import itemgetter
import numpy as np
import pandas as pd

__N_TESTS__ = 10_000

vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)

# Different ways of selecting elements given indices

# list comprehension
def f1(v, f):
   return [v[i] for i in f]

# itemgetter
def f2(v, f):
   return itemgetter(*f)(v)

# using pandas.Series
# this is immensely slow
def f3(v, f):
   return list(pd.Series(v)[f])

# using map and __getitem__
def f4(v, f):
   return list(map(v.__getitem__, f))

# using enumerate!
def f5(v, f):
   return [x for i, x in enumerate(v) if i in f]

# using numpy array
def f6(v, f):
   return list(np.array(v)[f])

print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda:f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda:f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda:f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
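print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Numpy", timeit.timeit(lambda: f6(vector, filter_indeces), number=__N_TESTS__)))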

My results are:

List comprehension :0.007113 secs
Operator.itemgetter :0.003247 secs
Using Pandas series :2.977286 secs
Using map and __getitem__ :0.005029 secs
Enumeration (Why anyway?) :0.135156 secs
Numpy :0.157018 secs

Suu answered 9/10, 2021 at 14:35 Comment(0)
I
3

Here's a simpler way:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [e for i, e in enumerate(a) if i in b]
Ieyasu answered 6/9, 2019 at 16:49 Comment(2)
The OP way of [a[i] for i in b] is simpler than what you suggest.Jolee
I wonder how is that "simpler"? You are iterating over all elements of a, checking if their index is in b and add them. On the other hand the code in the question simply takes the elements from a which are at the indexes in b. Sounds simpler to me...Towhead
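Beyond simplicity, the two approaches are not equivalent: the enumerate filter returns elements in the order of a and ignores repeated indexes, while direct indexing follows b exactly. A short illustration, using hypothetical indexes that are out of order and contain a repeat:

```python
a = [-2, 1, 5, 3, 8, 5, 6]
b = [5, 1, 1]  # hypothetical indexes: out of order, with a repeat

by_index = [a[i] for i in b]                        # follows b exactly
by_filter = [e for i, e in enumerate(a) if i in b]  # follows the order of a

print(by_index)   # [5, 1, 1]
print(by_filter)  # [1, 5]
```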
I
2

I'm sure this has already been considered: if the number of indices in b is small and constant, one could just write the result like:

c = [a[b[0]]] + [a[b[1]]] + [a[b[2]]]

Or even simpler if the indices themselves are constants...

c = [a[1]] + [a[2]] + [a[5]]

Or if there is a consecutive range of indices...

c = a[1:3] + [a[5]]
Igraine answered 23/8, 2016 at 12:19 Comment(2)
Thank you for reminding me that [a] + [b] = [a, b]Sissie
Note though that + makes copies of the lists. You'd likely want extend instead to modify the list in place.Somaliland
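The difference raised in the last comment can be sketched as follows: + builds a new combined list from its operands, while append or extend grows an existing list in place. The variable names c and d are only for illustration:

```python
a = [-2, 1, 5, 3, 8, 5, 6]

c = a[1:3] + [a[5]]  # builds two temporary lists, then a third combined list

d = a[1:3]           # the slice is already a new list
d.append(a[5])       # grows d in place; extend() does the same for several items

print(c)  # [1, 5, 5]
print(d)  # [1, 5, 5]
```

For a handful of elements the cost difference is negligible, so readability is probably the better guide.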
D
1

Static indexes and small list?

Don't forget that if the list is small and the indexes don't change, as in your example, sometimes the best thing is to use sequence unpacking:

_,a1,a2,_,_,a3,_ = a

The performance is much better and you can also save one line of code:

%timeit _,a1,b1,_,_,c1,_ = a
10000000 loops, best of 3: 154 ns per loop
%timeit itemgetter(*b)(a)
1000000 loops, best of 3: 753 ns per loop
%timeit [ a[i] for i in b]
1000000 loops, best of 3: 777 ns per loop
%timeit map(a.__getitem__, b)
1000000 loops, best of 3: 1.42 µs per loop
Dividers answered 9/8, 2019 at 14:13 Comment(0)
T
0

The results for the latest pandas==1.4.2 as of June 2022 are as follows.

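Note that passing the indexes as a bare tuple, as in Series(...)[1, 2, 5], raises a KeyError; pass them as a list, and use .iloc to make the lookup explicitly positional. The benchmark also runs faster than the pandas timing reported earlier.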

import timeit
import pandas as pd
print(pd.__version__)
# 1.4.2

pd.Series([-2, 1, 5, 3, 8, 5, 6])[1, 2, 5]
# KeyError: 'key of type tuple not found and not a MultiIndex'

pd.Series([-2, 1, 5, 3, 8, 5, 6]).iloc[[1, 2, 5]].tolist()
# [1, 5, 5]

def extract_multiple_elements():
    return pd.Series([-2, 1, 5, 3, 8, 5, 6]).iloc[[1, 2, 5]].tolist()

__N_TESTS__ = 10_000
t1 = timeit.timeit(extract_multiple_elements, number=__N_TESTS__)
print(round(t1, 3), 'seconds')
# 1.035 seconds
Transitory answered 9/6, 2022 at 8:35 Comment(0)
W
-1

My answer does not use numpy or python collections.

One trivial way to find elements would be as follows:

a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
c = [i for i in a if i in b]

Drawback: This method may not work for larger lists. Using numpy is recommended for larger lists.

Wikiup answered 28/8, 2014 at 10:2 Comment(6)
No need to iterate a. [a[i] for i in b]Deerskin
This method doesn't even work in any other case. What if a had another 5 in it?Catcher
IMO, faster to do this sort of intersection using setsPlater
If you are worried about IndexErrors if b has numbers that exceed a's size, try [a[i] if i<len(a) else None for i in b]Tehuantepec
This doesn't answer the question. It is not even what was asked for. b is a list of indexes to take from a, not elements. You are simply taking the elements in a which also exist in b. Again, not what is asked for...Towhead
This post should be closed, the reason is explained by @TowheadMairemaise
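The bounds check suggested in the comments only guards the upper end; a negative index below -len(a) would still raise an IndexError. A minimal sketch that covers both ends, assuming None is an acceptable placeholder (the name get_many and its default parameter are hypothetical):

```python
def get_many(seq, indexes, default=None):
    """Return seq[i] for each index, or default where i is out of range."""
    n = len(seq)
    # Valid indexes run from -n to n - 1; negative ones count from the end.
    return [seq[i] if -n <= i < n else default for i in indexes]


a = [-2, 1, 5, 3, 8, 5, 6]
print(get_many(a, [1, 2, 5, 10]))  # [1, 5, 5, None]
print(get_many(a, [-1, -8]))       # [6, None]
```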
D
-1

Kind of pythonic way:

c = [x for x in a if a.index(x) in b]
Darby answered 26/3, 2020 at 18:55 Comment(2)
I would say this is less "pythonic" than even the OP's example -- you've managed to turn their O(n) solution into an O(n^2) solution while also nearly doubling the length of the code. You will also want to note that this approach will fail if the list contains objects with fuzzy or partial equality, e.g. if a contains float('nan'), this will always raise a ValueError.Perambulator
This will give wrong results if a has duplicate items (index returns the index of the first occurrence of the element)Towhead
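The duplicate problem is easy to reproduce with the list from the question, where the value 5 sits at both index 2 and index 5. The indexes below are chosen for illustration:

```python
a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 5]  # illustrative indexes; a[2] and a[5] both hold the value 5

wanted = [a[i] for i in b]
via_index = [x for x in a if a.index(x) in b]  # a.index(5) is always 2, never 5

print(wanted)     # [1, 5]
print(via_index)  # [1]
```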
