One standard approach is to add a small epsilon term to the denominator to prevent division by zero. In theory it is not needed, because dividing by zero makes no logical sense in the first place. In reality, machines are just calculators, and dividing by zero yields either NaN or +/-Inf.
In short, define your function like this:
    import numpy as np

    def z_norm(arr, epsilon=1e-100):
        return (arr - arr.mean()) / (arr.std() + epsilon)
This assumes a 1D array, but it would be easy to change to row-wise or column-wise calculation of a 2D array.
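As a rough sketch of that 2D variant, the version below takes an `axis` argument and uses `keepdims=True` so that the mean and standard deviation broadcast back against the input. The name `z_norm_2d` and the `axis` parameter are illustrative assumptions, not part of the function above.

```python
import numpy as np

def z_norm_2d(arr, axis=0, epsilon=1e-100):
    # Hypothetical 2D variant: axis=0 normalizes each column, axis=1 each row.
    arr = np.asarray(arr, dtype=np.float64)
    mean = arr.mean(axis=axis, keepdims=True)
    std = arr.std(axis=axis, keepdims=True)
    return (arr - mean) / (std + epsilon)

data = np.array([[1.0, 5.0],
                 [2.0, 5.0],
                 [3.0, 5.0]])

# The second column is constant (std == 0), so epsilon maps it to 0 instead of NaN.
col_wise = z_norm_2d(data, axis=0)
row_wise = z_norm_2d(data, axis=1)
```

Keeping the reduced dimension with `keepdims=True` avoids having to transpose or reshape before the subtraction and division.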
Epsilon is a small, intentional error added to the calculation so that it cannot produce NaN or Inf. Where you would otherwise get Inf, you will still end up with numbers that are really large, but they are finite, so later calculations will not propagate Inf and may still retain some meaning.
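To make the propagation point concrete, here is a small sketch that divides the same numerator by zero with and without epsilon. The variable names are only illustrative, and the warnings are silenced purely for the demonstration.

```python
import numpy as np

numerator = np.array([0.0, 1.0])

# Without epsilon: 0/0 gives NaN and 1/0 gives Inf (warnings silenced for the demo).
with np.errstate(divide="ignore", invalid="ignore"):
    raw = numerator / 0.0

# With epsilon: every entry stays finite, although 1/epsilon is very large.
guarded = numerator / (0.0 + 1e-100)

print(raw, raw.sum())          # the NaN poisons the sum
print(guarded, guarded.sum())  # finite values, finite sum
```

Any reduction or downstream arithmetic that touches `raw` inherits the NaN, while `guarded` stays usable even though one of its entries is huge.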
A value of 1e-100 (that is, 1/10^100) is incredibly small and will not change your result much. You can go down to 1e-300 or so if you want, but you risk underflowing after further calculation, since the smallest normal float64 value is roughly 2.2e-308. Be aware of the precision you use and the smallest value it can represent. I was using float64.
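If you are unsure what your dtype can represent, `np.finfo` reports the limits. The sketch below is only an illustration of the precision concern; the variable names are made up for the example.

```python
import numpy as np

# Smallest positive normal value for each precision.
print(np.finfo(np.float64).tiny)  # about 2.2e-308
print(np.finfo(np.float32).tiny)  # about 1.2e-38

eps = 1e-100
eps_as_f32 = np.float32(eps)     # too small for float32, underflows to 0.0
unchanged = (0.5 + eps) == 0.5   # epsilon is absorbed by any ordinary std
```

So 1e-100 is a safe choice for float64, but it gives no protection once it is stored as float32; a larger epsilon would be needed there.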
Update 2021-11-03: Adding test code. The objective of this epsilon is to minimize damage and remove the chance of random NaNs in your data pipeline. Setting epsilon to a positive value fixes the problem.
    for arr in [
        np.array([0,0]),
        np.array([1e-300,1e-300]),
        np.array([1,1]),
        np.array([1,2])
    ]:
        for epi in [1e-100,0,1e100]:
            stdev = arr.std()
            mean = arr.mean()
            result = z_norm(arr, epsilon=epi)
            print(f' z_norm(np.array({str(arr):<21}),{epi:<7}) ### stdev={stdev}; mean={mean:<6}; becomes --> {str(result):<19} (float-64) --> Truncate to 32 bits. =', result.astype(np.float32))
Output:
    z_norm(np.array([0 0]                ),1e-100 ) ### stdev=0.0; mean=0.0   ; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([0 0]                ),0      ) ### stdev=0.0; mean=0.0   ; becomes --> [nan nan]           (float-64) --> Truncate to 32 bits. = [nan nan]
    z_norm(np.array([0 0]                ),1e+100 ) ### stdev=0.0; mean=0.0   ; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([1.e-300 1.e-300]    ),1e-100 ) ### stdev=0.0; mean=1e-300; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([1.e-300 1.e-300]    ),0      ) ### stdev=0.0; mean=1e-300; becomes --> [nan nan]           (float-64) --> Truncate to 32 bits. = [nan nan]
    z_norm(np.array([1.e-300 1.e-300]    ),1e+100 ) ### stdev=0.0; mean=1e-300; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([1 1]                ),1e-100 ) ### stdev=0.0; mean=1.0   ; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([1 1]                ),0      ) ### stdev=0.0; mean=1.0   ; becomes --> [nan nan]           (float-64) --> Truncate to 32 bits. = [nan nan]
    z_norm(np.array([1 1]                ),1e+100 ) ### stdev=0.0; mean=1.0   ; becomes --> [0. 0.]             (float-64) --> Truncate to 32 bits. = [0. 0.]
    z_norm(np.array([1 2]                ),1e-100 ) ### stdev=0.5; mean=1.5   ; becomes --> [-1.  1.]           (float-64) --> Truncate to 32 bits. = [-1.  1.]
    z_norm(np.array([1 2]                ),0      ) ### stdev=0.5; mean=1.5   ; becomes --> [-1.  1.]           (float-64) --> Truncate to 32 bits. = [-1.  1.]
    z_norm(np.array([1 2]                ),1e+100 ) ### stdev=0.5; mean=1.5   ; becomes --> [-5.e-101  5.e-101] (float-64) --> Truncate to 32 bits. = [-0.  0.]