I'm performing some MD simulations involving systems of millions of atoms.
I have written some code to generate a file which is just a listing of XYZ atom coordinates. Now I need to generate bonds between the atoms. If two atoms are within a certain distance of each other, that is considered a bond.
Example XYZ file:
1 0 0
2 0 0
7 0 0
10 0 0
9 0 0
So I've got five atoms. If my distance threshold is 2 units, then my bond listing will be:
1 2
3 5
4 5
(where the numbers correspond to the index of the coordinate in the XYZ file).
The naive approach to generate this list is just:
for i = 1:numAtoms
for j = i+1:numAtoms
if distance(atom[i], atom[j]) < 2
bonds.push [i, j]
However, this quickly hits an algorithmic limit and is slow even in highly optimized C for millions of atoms, at least for as frequently as I'll be doing this process.
The only experience I have with space-partitioning data structures is with kd-trees when I wrote a photon mapper once, so I don't really know what the best solution to this problem is. I'm sure there's probably something out there that's optimal for this though.
I should also mention that my simulation box is periodic, meaning that an atom at (0.5, 0, 0) will bond to an atom at (boxWidth - 0.5, 0, 0) with a distance threshold such as 2.