Access elements of a Matrix by a list of indices in Python to apply a max(val, 0.5) to each value without a for loop

M

3

5

I know how to access elements in a vector by indices doing:

import numpy

test = numpy.array([1,2,3,4,5,6])
indices = [1,3,5]
print(test[indices])

which gives the correct answer : [2 4 6]

But I am trying to do the same thing using a 2D matrix, something like:

currentGrid = numpy.array(  [[0,   0.1],
                             [0.9, 0.9],
                             [0.1, 0.1]])
indices = list([(0,0),(1,1)])
print(currentGrid[indices])

This should display "[0.0 0.9]", the values at (0,0) and (1,1) in the matrix. Instead it displays "[ 0.1 0.1]". Also, if I try to use 3 indices with:

indices = list([(0,0),(1,1),(0,2)])

I now get the following error:

Traceback (most recent call last):
  File "main.py", line 43, in <module>
    print(currentGrid[indices])
IndexError: too many indices for array

I ultimately need to apply a simple max() operation to all the elements at these indices, and I need the fastest way to do that for optimization purposes.

What am I doing wrong? How can I access specific elements in a matrix and operate on them very efficiently (without using a list comprehension or a loop)?

Malignity answered 28/2, 2019 at 16:27 Comment(1)

Today, I accidentally found the solution to your problem. See the Edit 4 in my answer. – Sakti 1/3, 2019 at 15:33

S

2

There are already some great answers to your problem. Here is just a quick-and-dirty solution for your particular code:

for i in indices:
    print(currentGrid[i[0],i[1]])

Edit:

If you do not want to use a for loop you need to do the following:

Assume you want to access 3 values of your 2D matrix (with the dimensions x1 and x2). The values have the "coordinates" (indices) V1(x11|x21), V2(x12|x22), V3(x13|x23). Then, for each dimension of your matrix (2 in your case), you need to create a list with the indices of your points along that dimension. In this example, you would create one list with the x1 indices, [x11,x12,x13], and one list with the x2 indices, [x21,x22,x23]. Then you combine these lists and use them as the index for the matrix:

indices = [[x11,x12,x13],[x21,x22,x23]]

or, in the style you wrote it:

indices = list([(x11,x12,x13),(x21,x22,x23)])

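Now with the points that you used, ((0,0),(1,1),(2,0)). Please note that you need to use (2,0) instead of (0,2), because (0,2) would be out of range. The outer container is a tuple here, because recent NumPy versions only treat a tuple (not a list) as "one index sequence per axis":

indices = ((0,1,2),(0,1,0))
print(currentGrid[indices])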

This gives you 0, 0.9 and 0.1. You can then apply max() to this result if you like (just to cover your whole question):

maxValue = max(currentGrid[indices])

Edit2:

Here is an example of how you can transform your original index list into the correct shape:

originalIndices = [(0,0),(1,1),(2,0)]

x1 = []
x2 = []

for i in originalIndices:
    x1.append(i[0])
    x2.append(i[1])

newIndices = (x1, x2)  # tuple: one index list per axis
print(currentGrid[newIndices])
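
The same transformation can also be written without the explicit loop. The following is only a minimal sketch that reuses currentGrid and originalIndices from above; the names rows, cols, idx, values and values_alt are illustrative and not part of the original answer.

```python
import numpy

currentGrid = numpy.array([[0,   0.1],
                           [0.9, 0.9],
                           [0.1, 0.1]])
originalIndices = [(0, 0), (1, 1), (2, 0)]

# zip(*...) transposes the list of (row, col) pairs into one tuple per axis
rows, cols = zip(*originalIndices)
values = currentGrid[rows, cols]

# Equivalent: build an (N, 2) integer array and index with its two columns
idx = numpy.array(originalIndices)
values_alt = currentGrid[idx[:, 0], idx[:, 1]]

print(values)
print(values_alt)
```

Both variants pick the elements at (0,0), (1,1) and (2,0); the array form is probably the more convenient one if the index list is large or already stored as an array.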

Edit3:

I don't know whether you can apply max(x, 0.5) to a numpy array without using a loop. But you could use pandas instead: cast the result to a pandas Series and then apply a lambda function:

import pandas as pd
maxValues = pd.Series(currentGrid[newIndices]).apply(lambda x: max(x,0.5))

This will give you a pandas Series containing 0.5, 0.9, 0.5, which you can simply cast back to a list with maxValues = list(maxValues).

Just one note: some kind of loop always runs in the background, with this command as well, so I doubt that you will get much better performance this way. If you really want to boost performance, use a for loop together with numba (you simply need to add a decorator to your function) and execute it in parallel. Or you can use the multiprocessing library and its Pool class, see here. Just to give you some inspiration.

Edit4:

I came across this page today by accident; it shows how to do exactly what you want with NumPy. The solution to your problem (using the newIndices from my Edit2) is:

maxfunction = numpy.vectorize(lambda i: max(i,0.5))
print(maxfunction(currentGrid[newIndices]))
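
As a possible alternative, numpy.vectorize is documented as a convenience wrapper that still loops in Python, so an element-wise ufunc such as numpy.maximum may be the faster choice. The sketch below assumes the same grid and index lists as above; idx and clipped are illustrative names, and the in-place assignment is an assumption about what "val = max(val, 0.5)" is meant to achieve.

```python
import numpy

currentGrid = numpy.array([[0,   0.1],
                           [0.9, 0.9],
                           [0.1, 0.1]])
newIndices = [[0, 1, 2], [0, 1, 0]]

# A tuple of index lists selects one element per (row, col) pair
idx = tuple(newIndices)

# Element-wise max(val, 0.5) on the selected values, no Python-level loop
clipped = numpy.maximum(currentGrid[idx], 0.5)
print(clipped)

# Write the result back so only the selected positions change in the grid
currentGrid[idx] = numpy.maximum(currentGrid[idx], 0.5)
print(currentGrid)
```

The first form returns a new array with the three floored values; the second modifies currentGrid itself and leaves all other entries untouched.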

Sakti answered 28/2, 2019 at 16:44 Comment(2)

Thanks for your answer but I specified that I do not want to use a for loop – Malignity 28/2, 2019 at 17:47

Yes it helps a lot! Thank you very much. Do you know if it's possible to apply the max individually on each value? I guess I wasn't clear enough. I want to do something like val = max (val, 0.5) for each value at each index. Is it possible without using a for loop for optimization concerns? – Malignity 28/2, 2019 at 21:18

H

4

The problem is the arrangement of the indices you're passing to the array. If your array is two-dimensional, your indices must be two lists, one containing the vertical indices and the other one the horizontal ones. For instance:

idx_i, idx_j = zip(*[(0, 0), (1, 1), (0, 2)])
print(currentGrid[idx_j, idx_i])
# [0.  0.9 0.1]

Note that when indexing a 2D array, the first index is the row (y) and the second is the column (x), i.e. (y, x). I assume you defined your points as (x, y); otherwise you'll get an IndexError.
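
To illustrate the axis order, here is a small sketch using the grid from the question; value and out_of_bounds are illustrative names only. The array has 3 rows and 2 columns, so [2, 0] is valid while [0, 2] is not.

```python
import numpy

currentGrid = numpy.array([[0,   0.1],
                           [0.9, 0.9],
                           [0.1, 0.1]])

# shape is (rows, columns): 3 rows, 2 columns
print(currentGrid.shape)

# Row 2, column 0 exists
value = currentGrid[2, 0]
print(value)

# Row 0, column 2 does not exist, because there are only 2 columns
try:
    currentGrid[0, 2]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
print(out_of_bounds)
```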

Harlamert answered 28/2, 2019 at 16:37 Comment(2)

Thanks for your answer. I understand using two lists is the right way to access the elements by indices. But is it possible then to do an operation on each of these elements without using a for loop? I am trying to find a less cpu intensive way to a max(val, 0.5) on all these elements – Malignity 28/2, 2019 at 17:46

If you only care about the extracted values you can either use the map function or a list comprehension. Both are way faster than a for loop, but avoid lambda functions + map (see this). If you want to operate those values "inside" the array, numpy may have some function that can map a function to those positions. – Harlamert 28/2, 2019 at 21:7

L

0

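2D indices have to be accessed like this (indices must be a NumPy array of shape (N, 2), not a list of tuples, for the column slicing to work):

indices = numpy.array(indices)
print(currentGrid[indices[:,0], indices[:,1]])

The row indices and the column indices have to be passed separately, one sequence per axis.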
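
A short sketch of this approach applied to the grid from the question, assuming the corrected point (2,0) in place of (0,2). It also shows one way the max(value, 0.5) step could be done without a Python loop, here with numpy.clip and only a lower bound; rows, cols and floored are illustrative names.

```python
import numpy

currentGrid = numpy.array([[0,   0.1],
                           [0.9, 0.9],
                           [0.1, 0.1]])

# (N, 2) array: one (row, col) pair per line
indices = numpy.array([(0, 0), (1, 1), (2, 0)])
rows, cols = indices[:, 0], indices[:, 1]

# Lower bound of 0.5, no upper bound: same effect as max(value, 0.5)
floored = numpy.clip(currentGrid[rows, cols], 0.5, None)
print(floored)
```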

Logistician answered 28/2, 2019 at 16:38 Comment(1)

Thanks, you are right, this works. But is it possible to do a max(value, 0.5) on all these elements without using a for loop? – Malignity 28/2, 2019 at 17:48
