numpy cross-correlation - vectorizing

I have a large number of cross-correlations to calculate and I'm looking for the fastest way to do it. I'm assuming vectorizing the problem would help rather than doing it with loops

I have a 3D array labelled as electrode x timepoint x trial (shape: 64x256x913). I want to calculate the max cross-correlation of the timepoints for every pair of electrodes, for every trial.

Specifically: for every trial, I want to take each of the pair combination of electrodes and calculate the max cross-correlation value for every pair. That will result in 4096 (64*64) max cross-correlation values in a single row/vector. That will be done for every trial, stacking each of the rows/vectors on top of each other resulting in a final 2D array of shape 913*4096 containing max cross-correlation values

Thats a lot of computation but I want to try to find the fastest method to do it. I mocked up some proto-code using lists as containers that may help explain the problem a bit better. There may be some logic errors in there, but either way the code doesn't run on my computer because theres so much to calculate that python just freezes. Here is it:

#allData is a 64x256x913 array

all_experiment_trials = []
for trial in range(allData.shape[2]):
    all_trial_electrodes = []
    for electrode in range(allData.shape[0]):
        for other_electrode in range(allData.shape[0]):
            if electrode == other_electrode:
                pass
            else:
                single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial], "full"))
                all_trial_electrodes.append(single_xcorr)
    all_experiment_trials.append(all_trial_electrodes)

Obviously loops are really slow for this type of thing. Is there a vectorized solution to this using numpy arrays?

I've checked out things like correlate2d() and the like, but I don't think they really work in my case as I'm not multiplying 2 matrices together

def vectorized_approach(allData): # Get shape M,N,R = allData.shape # Valid mask based on condition: "if electrode == other_electrode" valid_mask = np.mod(np.arange(M*M),M+1)!=0 # Elementwise multiplications across all elements in axis=0, # and then summation along axis=1 out = np.einsum('ijkl,ijkl->lij',allData[:,None,:,:],allData[None,:,:,:]) # Use valid mask to skip columns and have the final output return out.reshape(R,-1)[:,valid_mask]

In [10]: allData = np.random.rand(20,80,200) In [11]: def org_approach(allData): ...: all_experiment_trials = [] ...: for trial in range(allData.shape[2]): ...: all_trial_electrodes = [] ...: for electrode in range(allData.shape[0]): ...: for other_electrode in range(allData.shape[0]): ...: if electrode == other_electrode: ...: pass ...: else: ...: single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial])) ...: all_trial_electrodes.append(single_xcorr) ...: all_experiment_trials.append(all_trial_electrodes) ...: return all_experiment_trials ...: In [12]: %timeit org_approach(allData) 1 loops, best of 3: 1.04 s per loop In [13]: %timeit vectorized_approach(allData) 100 loops, best of 3: 15.1 ms per loop In [14]: np.allclose(vectorized_approach(allData),np.asarray(org_approach(allData))) Out[14]: True

Recommended topics

Hot tags