I want to make a simple neural network which uses the ReLU function. Can someone give me a clue about how I can implement the function using numpy?
There are a couple of ways.
>>> x = np.random.random((3, 2)) - 0.5
>>> x
array([[-0.00590765, 0.18932873],
[-0.32396051, 0.25586596],
[ 0.22358098, 0.02217555]])
>>> np.maximum(x, 0)
array([[ 0. , 0.18932873],
[ 0. , 0.25586596],
[ 0.22358098, 0.02217555]])
>>> x * (x > 0)
array([[-0. , 0.18932873],
[-0. , 0.25586596],
[ 0.22358098, 0.02217555]])
>>> (abs(x) + x) / 2
array([[ 0. , 0.18932873],
[ 0. , 0.25586596],
[ 0.22358098, 0.02217555]])
Timing the results with the following code:
import numpy as np
x = np.random.random((5000, 5000)) - 0.5
print("max method:")
%timeit -n10 np.maximum(x, 0)
print("multiplication method:")
%timeit -n10 x * (x > 0)
print("abs method:")
%timeit -n10 (abs(x) + x) / 2
We get:
max method:
10 loops, best of 3: 239 ms per loop
multiplication method:
10 loops, best of 3: 145 ms per loop
abs method:
10 loops, best of 3: 288 ms per loop
So the multiplication seems to be the fastest.
The last x in maximum(x, 0, x) means "please change x in place rather than allocating a new matrix". (source) – Listlessness

You can do it in a much easier way:
def ReLU(x):
    return x * (x > 0)

def dReLU(x):
    return 1. * (x > 0)
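A quick usage sketch, assuming the ReLU and dReLU definitions above (the sample values are arbitrary):

import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

out = ReLU(x)    # negatives become (signed) zero, positives pass through
grad = dReLU(x)  # 1.0 where x > 0, otherwise 0.0

print(out)   # e.g. [-0.  -0.   0.   0.5  2. ]
print(grad)  # e.g. [0. 0. 0. 1. 1.]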
The 0 will automatically be broadcast to the same shape as the tensor x. The boolean result will be turned into 0 or 1 and then multiplied elementwise. There is no magic :). – Bumbledom

1. * (x >= 0) – Polyptych

I'm completely revising my original answer because of points raised in the other answers and comments. Here is the new benchmark script:
import time
import numpy as np

def fancy_index_relu(m):
    m[m < 0] = 0

relus = {
    "max": lambda x: np.maximum(x, 0),
    "in-place max": lambda x: np.maximum(x, 0, x),
    "mul": lambda x: x * (x > 0),
    "abs": lambda x: (abs(x) + x) / 2,
    "fancy index": fancy_index_relu,
}

for name, relu in relus.items():
    n_iter = 20
    x = np.random.random((n_iter, 5000, 5000)) - 0.5
    t1 = time.time()
    for i in range(n_iter):
        relu(x[i])
    t2 = time.time()
    print("{:>12s} {:3.0f} ms".format(name, (t2 - t1) / n_iter * 1000))
It takes care to use a different ndarray for each implementation and iteration. Here are the results:
max 126 ms
in-place max 107 ms
mul 136 ms
abs 86 ms
fancy index 132 ms
The return value of np.maximum(x, 0, x) is ignored and the result is directly written to x. – Mho
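A small check of that in-place behaviour (a minimal sketch, separate from the benchmark above):

import numpy as np

x = np.random.random((3, 3)) - 0.5
y = np.maximum(x, 0, x)   # the third positional argument is the output array

print(y is x)             # True: the result was written into x itself
print((x < 0).any())      # False: x has been rectified in place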
EDIT: As jirassimok mentioned below, my function changes the data in place; after that it runs a lot faster in timeit, which is what caused the good results. It's a kind of cheating. Sorry for the inconvenience.
I found a faster method for ReLU with numpy. You can use numpy's fancy indexing feature as well.
fancy index:
20.3 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> x = np.random.random((5,5)) - 0.5
>>> x
array([[-0.21444316, -0.05676216, 0.43956365, -0.30788116, -0.19952038],
[-0.43062223, 0.12144647, -0.05698369, -0.32187085, 0.24901568],
[ 0.06785385, -0.43476031, -0.0735933 , 0.3736868 , 0.24832288],
[ 0.47085262, -0.06379623, 0.46904916, -0.29421609, -0.15091168],
[ 0.08381359, -0.25068492, -0.25733763, -0.1852205 , -0.42816953]])
>>> x[x<0]=0
>>> x
array([[ 0. , 0. , 0.43956365, 0. , 0. ],
[ 0. , 0.12144647, 0. , 0. , 0.24901568],
[ 0.06785385, 0. , 0. , 0.3736868 , 0.24832288],
[ 0.47085262, 0. , 0.46904916, 0. , 0. ],
[ 0.08381359, 0. , 0. , 0. , 0. ]])
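Since fancy indexing rectifies the array in place (the "cheating" mentioned in the edit above), a non-destructive variant has to copy first. A minimal sketch:

import numpy as np

def relu_fancy(x):
    # copy first, so the caller's array keeps its negative values
    out = x.copy()
    out[out < 0] = 0
    return out

x = np.random.random((5, 5)) - 0.5
y = relu_fancy(x)
print((x < 0).any())   # True: the original is untouched
print((y < 0).any())   # False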
Here is my benchmark:
import numpy as np
x = np.random.random((5000, 5000)) - 0.5
print("max method:")
%timeit -n10 np.maximum(x, 0)
print("max inplace method:")
%timeit -n10 np.maximum(x, 0, x)
print("multiplication method:")
%timeit -n10 x * (x > 0)
print("abs method:")
%timeit -n10 (abs(x) + x) / 2
print("fancy index:")
%timeit -n10 x[x < 0] = 0
max method:
241 ms ± 3.53 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
max inplace method:
38.5 ms ± 4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
multiplication method:
162 ms ± 3.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
abs method:
181 ms ± 4.18 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
fancy index:
20.3 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
The fancy indexing method (a[a < 0] = 0) performed worst of the methods, with np.maximum doing best. – Rosiarosicrucian

Richard Möhn's comparison is not fair. As Andrea Di Biagio's comment points out, the in-place method np.maximum(x, 0, x) will modify x in the first loop.
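One way to see the effect (a rough sketch): after the first in-place call, x no longer contains negatives, so every later timing iteration operates on an already-rectified array.

import numpy as np

x = np.random.random((5000, 5000)) - 0.5
print((x < 0).sum())   # roughly half of the 25 million entries are negative

np.maximum(x, 0, x)    # the first timed call rectifies x in place
print((x < 0).sum())   # 0: the remaining timeit iterations see an already-rectified array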
So here is my benchmark:
import numpy as np
def baseline():
    x = np.random.random((5000, 5000)) - 0.5
    return x

def relu_mul():
    x = np.random.random((5000, 5000)) - 0.5
    out = x * (x > 0)
    return out

def relu_max():
    x = np.random.random((5000, 5000)) - 0.5
    out = np.maximum(x, 0)
    return out

def relu_max_inplace():
    x = np.random.random((5000, 5000)) - 0.5
    np.maximum(x, 0, x)
    return x
Timing it:
print("baseline:")
%timeit -n10 baseline()
print("multiplication method:")
%timeit -n10 relu_mul()
print("max method:")
%timeit -n10 relu_max()
print("max inplace method:")
%timeit -n10 relu_max_inplace()
Get the results:
baseline:
10 loops, best of 3: 425 ms per loop
multiplication method:
10 loops, best of 3: 596 ms per loop
max method:
10 loops, best of 3: 682 ms per loop
max inplace method:
10 loops, best of 3: 602 ms per loop
The in-place maximum method is only a bit faster than the maximum method, perhaps because it omits the variable assignment for 'out'. And it's still slower than the multiplication method.
And since you're implementing the ReLU function, you may have to save 'x' for backprop through ReLU. E.g.:
def relu_backward(dout, cache):
    x = cache
    dx = np.where(x > 0, dout, 0)
    return dx
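For completeness, a matching forward pass that saves the cache (a sketch; the function name and cache layout are my own convention, not from the answer):

import numpy as np

def relu_forward(x):
    # forward pass: compute the activation and keep x around for the backward pass
    out = x * (x > 0)
    cache = x
    return out, cache

x = np.random.random((4, 3)) - 0.5
out, cache = relu_forward(x)
dout = np.ones_like(out)          # stand-in for the upstream gradient
dx = relu_backward(dout, cache)   # dout where x > 0, zero elsewhere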
So I recommend using the multiplication method.
Your results show relu_mul is fastest, but you say relu_max_inplace is slightly faster? Also, why do you initialise the test matrix in each function and not just once at the beginning for each method? Your timings now include the time taken to create a matrix with 5000*5000 = 25,000,000 elements, roughly 200 MB in size if the default float64 is used. %timeit np.random.random((5000, 5000)) - 0.5 gives 273 ms ± 7.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each). That is over a third of the actual timings you posted. – Scant

relu_max_inplace is slightly faster than relu_max, but the most recommended method is relu_mul. – Thoroughpaced

I call np.random.random() inside each function intentionally, because if I don't, the relu_max_inplace method will seem to be extremely fast, like @Richard Möhn's result. @Richard Möhn's result shows relu_max_inplace vs relu_max as 38.4 ms vs 238 ms per loop. That's just because the in-place method will only be executed once. Initialising the matrix in each loop avoids this situation, so the comparison is fair. – Thoroughpaced

numpy doesn't have a built-in relu function, but you can define it yourself as follows:
def relu(x):
    return np.maximum(0, x)
for example:
arr = np.array([[-1,2,3],[1,2,3]])
ret = relu(arr)
print(ret) # print [[0 2 3] [1 2 3]]
If we have 3 parameters (t0, a0, a1) for ReLU, that is, we want to implement
if x > t0:
    x = x * a1
else:
    x = x * a0
We can use the following code:
X = X * (X > t0) * a1 + X * (X <= t0) * a0
X here is a matrix.
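Wrapped into a function with np.where, for example (a sketch; the function name and the sample values are mine, not from the answer above):

import numpy as np

def piecewise_relu(X, t0, a0, a1):
    # slope a1 above the threshold t0, slope a0 at or below it
    return np.where(X > t0, X * a1, X * a0)

X = np.array([[-2.0, -0.5], [0.5, 3.0]])
print(piecewise_relu(X, t0=0.0, a0=0.01, a1=1.0))   # e.g. [[-0.02  -0.005] [ 0.5  3. ]]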
For a single neuron
def relu(net):
    return max(0, net)
where net is the net activity at the neuron's input (net = dot(w, x)), and dot() is the dot product of w and x (the weight vector and the input vector respectively). dot() is a function defined in the numpy package in Python.
For neurons in a layer with net vector
def relu(net):
    return np.maximum(net, 0)
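Putting the two together for a whole layer (a sketch; the shapes and the use of a bias term are just an example):

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # weight matrix: 4 inputs -> 3 neurons
b = np.zeros(3)                   # bias vector
x = rng.standard_normal(4)        # input vector

net = np.dot(x, W) + b            # net activity of the layer, shape (3,)
out = np.maximum(net, 0)          # elementwise ReLU over the whole layer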
This is a more precise implementation:
def ReLU(x):
    return abs(x) * (x > 0)
abs is unnecessary given that you're stamping out all the negative components. – Pilloff
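A quick check (a minimal sketch) that the abs variant agrees with the plainer ones:

import numpy as np

x = np.random.random(1000) - 0.5
a = abs(x) * (x > 0)
b = x * (x > 0)
c = np.maximum(x, 0)
print(np.allclose(a, b) and np.allclose(b, c))   # True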