How to slice numpy rows using start and end index
index = np.array([[1,2],[2,4],[1,5],[5,6]])
z = np.zeros(shape = [4,10], dtype = np.float32)

What is an efficient way to set z[np.arange(4), index[:,0]], z[np.arange(4), index[:,1]], and everything between them to 1?

expected output:

array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
Cypher answered 9/3, 2018 at 6:54 Comment(1)
Just to be clear, show z after the desired change. – July

We can leverage NumPy broadcasting for a vectorized solution: compare the start and end indices against a ranged array covering the columns. This yields a boolean mask marking every position in the output that should be assigned 1.

So, the solution would be something like this -

ncols = z.shape[1]
r = np.arange(ncols)
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1

Sample run -

In [39]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
    ...: z = np.zeros(shape = [4,10], dtype = np.float32)

In [40]: ncols = z.shape[1]
    ...: r = np.arange(z.shape[1])
    ...: mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    ...: z[mask] = 1

In [41]: z
Out[41]: 
array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 1., 0., 0., 0.]], dtype=float32)

If z is always a zeros-initialized array, we can directly get the output from mask -

z = mask.astype(int)

Sample run -

In [37]: mask.astype(int)
Out[37]: 
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
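The mask-only variant above can be packaged as a small function that builds the output array directly from the start/end pairs, without pre-allocating z. The helper name `range_mask` is made up for this sketch; the body is just the broadcasting comparison from the answer:

```python
import numpy as np

def range_mask(index, ncols):
    """Build a 0/1 float array from (start, end) pairs, ends inclusive.

    `range_mask` is a hypothetical helper name wrapping the broadcasting
    comparison shown above.
    """
    r = np.arange(ncols)                      # (ncols,) column positions
    # (n, 1) vs (ncols,) broadcasts to an (n, ncols) boolean mask
    mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
    return mask.astype(np.float32)

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
out = range_mask(index, 10)
```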

Benchmarking

Comparing @hpaulj's foo0 and my foo4 (as listed in @hpaulj's post) for many rows and a varying number of columns. We start with 10 columns, as in the input sample, but with far more rows (10,000), then increase the number of columns up to 1000 (dropping to 1,000 rows for the wider cases).

Here's the timings -

In [14]: ncols = 10
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [15]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.27 ms per loop
1000 loops, best of 3: 594 µs per loop

In [16]: ncols = 100
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [17]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.49 ms per loop
100 loops, best of 3: 2.74 ms per loop

In [38]: ncols = 300
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [39]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 657 µs per loop
1000 loops, best of 3: 600 µs per loop

In [40]: ncols = 1000
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [41]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 673 µs per loop
1000 loops, best of 3: 1.78 ms per loop

Thus, the best choice between the loopy solution and the broadcasting-based vectorized one depends on the number of columns in the problem set.
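One way to act on that observation is a simple dispatcher that picks a strategy by column count. This is only a sketch: the `col_threshold` cutoff is a made-up placeholder, since the crossover point depends on the machine and should be benchmarked locally. Both branches here treat the end index as inclusive, matching the mask-based answer:

```python
import numpy as np

def fill_ranges(z, index, col_threshold=300):
    """Fill 1s between start/end columns per row, ends inclusive.

    `col_threshold` is a hypothetical cutoff; benchmark on your own
    hardware to find the real crossover between the two approaches.
    """
    if z.shape[1] <= col_threshold:
        # few columns: the broadcasting mask is cheap to build
        r = np.arange(z.shape[1])
        mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
        z[mask] = 1
    else:
        # many columns: per-row slicing touches only the assigned cells
        for i in range(z.shape[0]):
            z[i, index[i, 0]:index[i, 1] + 1] = 1
    return z
```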

Square answered 9/3, 2018 at 9:14 Comment(3)
As clever as this is, it doesn't seem to be faster than plain row iteration. – July
@July Needed a bit more extensive benchmarking. Check out the updated post. – Square
Yes, my Index was wide, [100,500] for a (n,1000) array, giving an advantage to a sliced selection over a masked one. – July

I think this is what you want to do - but with the loop:

In [35]: z=np.zeros((4,10),int)
In [36]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
In [37]: for i in range(4):
    ...:     z[i,index[i,0]:index[i,1]] = 1
    ...:     
In [38]: z
Out[38]: 
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]])
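Note that Python slices exclude the stop index, so this loop leaves the end column at 0, which is why this output differs from the one in the question. As the comment thread points out, adding 1 to the stop makes the end inclusive; a small variant that reproduces the question's expected output:

```python
import numpy as np

z = np.zeros((4, 10), int)
index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
for i in range(4):
    z[i, index[i, 0]:index[i, 1] + 1] = 1   # +1 makes the end index inclusive
```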

Since there are differing length slices, this will be tricky to do with one array expression. Maybe not impossible, but tricky enough that it might not be worth trying.

Look at the indices of the 1s in this z:

In [40]: np.where(z)
Out[40]: 
(array([0, 1, 1, 2, 2, 2, 2, 3], dtype=int32),
 array([1, 2, 3, 1, 2, 3, 4, 5], dtype=int32))

Is there a regular pattern that could be generated from [0,1,2,3] and index?

I can generate the 2nd array of the where result with a concatenation of slices:

In [39]: np.r_[1:2, 2:4, 1:5, 5:6]
Out[39]: array([1, 2, 3, 1, 2, 3, 4, 5])

But notice that r_ involves several iterations - to generate the input, to generate the expanded slices, and to concatenate them.

I can generate the first array of the where result with:

In [41]: index[:,1]-index[:,0]
Out[41]: array([1, 2, 4, 1])
In [42]: np.arange(4).repeat(_)
Out[42]: array([0, 1, 1, 2, 2, 2, 2, 3])

and as expected, those 2 index arrays give us all the 1s:

In [43]: z[Out[42],Out[39]]
Out[43]: array([1, 1, 1, 1, 1, 1, 1, 1])

Or to generate Out[39] from index:

In [50]: np.concatenate([np.arange(i,j) for i,j in index])
Out[50]: array([1, 2, 3, 1, 2, 3, 4, 5])
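Putting those pieces together gives a complete assignment (still partly loopy, via the list comprehension that builds the column indices). A sketch that reproduces the exclusive-end z from In[37] above:

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
z = np.zeros((4, 10), int)

lengths = index[:, 1] - index[:, 0]                         # slice length per row
rows = np.arange(len(index)).repeat(lengths)                # row index for each 1
cols = np.concatenate([np.arange(i, j) for i, j in index])  # column index for each 1
z[rows, cols] = 1                                           # exclusive-end fill
```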

Comparing my solutions with @Divakar's

def foo0(z,index):
    for i in range(z.shape[0]):
        z[i,index[i,0]:index[i,1]] = 1
    return z

def foo4(z,index):
    r = np.arange(z.shape[1])
    mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    z[mask] = 1
    return z

For this small example, row iteration is faster:

In [155]: timeit foo0(z,index)
7.12 µs ± 224 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [156]: timeit foo4(z,index)
19.8 µs ± 890 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Even for larger arrays, the row iteration approach is faster:

In [157]: Z.shape
Out[157]: (1000, 1000)
In [158]: Index.shape
Out[158]: (1000, 2)
In [159]: timeit foo0(Z,Index)
1.72 ms ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [160]: timeit foo4(Z,Index)
7.47 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
July answered 9/3, 2018 at 7:1 Comment(4)
I too think the best would be to just go for the loop. I don't like that, though. If it simply cannot be done efficiently without a loop, what should be the answer accepted by me? – Cypher
I put both approaches into functions. The In[37] one is consistently faster, for this and larger samples. – July
Not quite sure why numpy doesn't have this indexing mode. Adding a slice version to range indexing sounds very reasonable. – Danieladaniele
@July – Just one trivial change to be consistent with the question I asked: add 1 at the end index as well. – Cypher