How to slice numpy rows using start and end index
index = np.array([[1,2],[2,4],[1,5],[5,6]])
z = np.zeros(shape = [4,10], dtype = np.float32)

What is an efficient way to set z[np.arange(4), index[:,0]], z[np.arange(4), index[:,1]], and everything between them to 1?

expected output:

array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
Cypher answered 9/3, 2018 at 6:54 Comment(1)
Just to be clear, show z after the desired change. – July

We can leverage NumPy broadcasting for a vectorized solution: compare the start and end indices against a ranged array covering the columns. This yields a boolean mask marking every position in the output that should be assigned 1.

So, the solution would be something like this -

ncols = z.shape[1]
r = np.arange(ncols)
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1

Sample run -

In [39]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
    ...: z = np.zeros(shape = [4,10], dtype = np.float32)

In [40]: ncols = z.shape[1]
    ...: r = np.arange(z.shape[1])
    ...: mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    ...: z[mask] = 1

In [41]: z
Out[41]: 
array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 1., 0., 0., 0.]], dtype=float32)

If z is always a zeros-initialized array, we can directly get the output from mask -

z = mask.astype(int)

Sample run -

In [37]: mask.astype(int)
Out[37]: 
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
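The mask-only variant above can be packaged as a small function that builds the output array directly from the start/end pairs, without pre-allocating z. The helper name `range_mask` is made up for this sketch; the body is just the broadcasting comparison from the answer:

```python
import numpy as np

def range_mask(index, ncols):
    """Build a 0/1 float array from (start, end) pairs, ends inclusive.

    `range_mask` is a hypothetical helper name wrapping the broadcasting
    comparison shown above.
    """
    r = np.arange(ncols)                      # (ncols,) column positions
    # (n, 1) vs (ncols,) broadcasts to an (n, ncols) boolean mask
    mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
    return mask.astype(np.float32)

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
out = range_mask(index, 10)
```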

Benchmarking

Comparing @hpaulj's foo0 and my foo4 (as listed in @hpaulj's post) for many rows and a varying number of columns. We start with 10 columns, as in the input sample, but with far more rows (10,000), then increase the number of columns up to 1000 (dropping to 1,000 rows for the wider cases).

Here's the timings -

In [14]: ncols = 10
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [15]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.27 ms per loop
1000 loops, best of 3: 594 µs per loop

In [16]: ncols = 100
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [17]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.49 ms per loop
100 loops, best of 3: 2.74 ms per loop

In [38]: ncols = 300
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [39]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 657 µs per loop
1000 loops, best of 3: 600 µs per loop

In [40]: ncols = 1000
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [41]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 673 µs per loop
1000 loops, best of 3: 1.78 ms per loop

Thus, the best choice between the loopy solution and the broadcasting-based vectorized one depends on the number of columns in the problem set.
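One way to act on that observation is a simple dispatcher that picks a strategy by column count. This is only a sketch: the `col_threshold` cutoff is a made-up placeholder, since the crossover point depends on the machine and should be benchmarked locally. Both branches here treat the end index as inclusive, matching the mask-based answer:

```python
import numpy as np

def fill_ranges(z, index, col_threshold=300):
    """Fill 1s between start/end columns per row, ends inclusive.

    `col_threshold` is a hypothetical cutoff; benchmark on your own
    hardware to find the real crossover between the two approaches.
    """
    if z.shape[1] <= col_threshold:
        # few columns: the broadcasting mask is cheap to build
        r = np.arange(z.shape[1])
        mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
        z[mask] = 1
    else:
        # many columns: per-row slicing touches only the assigned cells
        for i in range(z.shape[0]):
            z[i, index[i, 0]:index[i, 1] + 1] = 1
    return z
```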

Square answered 9/3, 2018 at 9:14 Comment(3)
As clever as this is, it doesn't seem to be faster than plain row iteration. – July
@July Needed a bit more extensive benchmarking. Check out the updated post. – Square
Yes, my Index was wide, [100,500] for a (n,1000) array, giving an advantage to a sliced selection over a masked one. – July

I think this is what you want to do - but with the loop:

In [35]: z=np.zeros((4,10),int)
In [36]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
In [37]: for i in range(4):
    ...:     z[i,index[i,0]:index[i,1]] = 1
    ...:     
In [38]: z
Out[38]: 
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]])
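Note that Python slices exclude the stop index, so this loop leaves the end column at 0, which is why this output differs from the one in the question. As the comment thread points out, adding 1 to the stop makes the end inclusive; a small variant that reproduces the question's expected output:

```python
import numpy as np

z = np.zeros((4, 10), int)
index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
for i in range(4):
    z[i, index[i, 0]:index[i, 1] + 1] = 1   # +1 makes the end index inclusive
```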

Since there are differing length slices, this will be tricky to do with one array expression. Maybe not impossible, but tricky enough that it might not be worth trying.

Look at the indices of the 1s in this z:

In [40]: np.where(z)
Out[40]: 
(array([0, 1, 1, 2, 2, 2, 2, 3], dtype=int32),
 array([1, 2, 3, 1, 2, 3, 4, 5], dtype=int32))

Is there a regular pattern that could be generated from [0,1,2,3] and index?

I can generate the 2nd array of the where result with a concatenation of slices:

In [39]: np.r_[1:2, 2:4, 1:5, 5:6]
Out[39]: array([1, 2, 3, 1, 2, 3, 4, 5])

But notice that r_ involves several iterations - to generate the input, to generate the expanded slices, and to concatenate them.

I can generate the first array of the where result with:

In [41]: index[:,1]-index[:,0]
Out[41]: array([1, 2, 4, 1])
In [42]: np.arange(4).repeat(_)
Out[42]: array([0, 1, 1, 2, 2, 2, 2, 3])

and as expected, those 2 index arrays give us all the 1s:

In [43]: z[Out[42],Out[39]]
Out[43]: array([1, 1, 1, 1, 1, 1, 1, 1])

Or to generate Out[39] from index:

In [50]: np.concatenate([np.arange(i,j) for i,j in index])
Out[50]: array([1, 2, 3, 1, 2, 3, 4, 5])
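Putting those pieces together gives a complete assignment (still partly loopy, via the list comprehension that builds the column indices). A sketch that reproduces the exclusive-end z from In[37] above:

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
z = np.zeros((4, 10), int)

lengths = index[:, 1] - index[:, 0]                         # slice length per row
rows = np.arange(len(index)).repeat(lengths)                # row index for each 1
cols = np.concatenate([np.arange(i, j) for i, j in index])  # column index for each 1
z[rows, cols] = 1                                           # exclusive-end fill
```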

Comparing my solutions with @Divakar's

def foo0(z,index):
    for i in range(z.shape[0]):
        z[i,index[i,0]:index[i,1]] = 1
    return z

def foo4(z,index):
    r = np.arange(z.shape[1])
    mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    z[mask] = 1
    return z

For this small example, row iteration is faster:

In [155]: timeit foo0(z,index)
7.12 µs ± 224 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [156]: timeit foo4(z,index)
19.8 µs ± 890 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Even for larger arrays, the row iteration approach is faster:

In [157]: Z.shape
Out[157]: (1000, 1000)
In [158]: Index.shape
Out[158]: (1000, 2)
In [159]: timeit foo0(Z,Index)
1.72 ms ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [160]: timeit foo4(Z,Index)
7.47 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
July answered 9/3, 2018 at 7:1 Comment(4)
I too think the best would be to just go for the loop. I don't like that, though. If it simply cannot be done efficiently without a loop, what should be the answer accepted by me? – Cypher
I put both approaches into functions. The In[37] one is consistently faster, for this and larger samples. – July
Not quite sure why numpy doesn't have this indexing mode. Adding a slice version to range indexing sounds very reasonable. – Danieladaniele
@July – Just one trivial change to be consistent with the question I asked: add 1 at the end index as well. – Cypher