beginner with Python here. So I'm having trouble trying to calculate the resulting binary pairwise hammington distance matrix between the rows of an input matrix using only the numpy library. I'm supposed to avoid loops and use vectorization. If for instance I have something like:
[ 1, 0, 0, 1, 1, 0]
[ 1, 0, 0, 0, 0, 0]
[ 1, 1, 1, 1, 0, 0]
The matrix should be something like:
[ 0, 2, 3]
[ 2, 0, 3]
[ 3, 3, 0]
ie if the original matrix was A and the hammingdistance matrix is B. B[0,1] = hammingdistance (A[0] and A[1]). In this case the answer is 2 as they only have two different elements.
So for my code is something like this
def compute_HammingDistance(X):
hammingDistanceMatrix = np.zeros(shape = (len(X), len(X)))
hammingDistanceMatrix = np.count_nonzero ((X[:,:,None] != X[:,:,None].T))
return hammingDistanceMatrix
However it seems to just be returning a scalar value instead of the intended matrix. I know I'm probably doing something wrong with the array/vector broadcasting but I can't figure out how to fix it. I've tried using np.sum instead of np.count_nonzero but they all pretty much gave me something similar.