There are many things you could do to improve your program - both the algorithm and the code.
On the code side, one of the things that is REALLY slowing you down is that not only do you use a for loop (which is slow), but in the line
P = [P;P1];
you append elements to an array. Every time that happens, Matlab needs to find a new place to put the data, copying all the points in the process. This quickly becomes very slow. Preallocating the array with
P = zeros(1000000, 3);
keeping track of the number N of points you have found so far, and changing your calculation of distance to
D = pdist2(P1, P(1:N, :), 'euclidean');
would at least address that...
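Put together, that pattern looks something like the sketch below. The candidate generation here (uniform in the bounding cube, rejecting anything outside radius 25 - the radius I'm inferring from your box setup) is only a stand-in for however you generate your points, and the distance criterion is copied from your code as I understand it; the point is the preallocation and the counter N:

Nwanted = 1000000;
P = zeros(Nwanted, 3);            % preallocate once
N = 0;                            % number of accepted points so far
while N < Nwanted                 % or whatever stopping rule you prefer
    P1 = 50*rand(1, 3) - 25;      % illustrative candidate in the bounding cube
    r = sum(P1.^2);
    if r > 25^2, continue; end    % keep only candidates inside the sphere
    D = pdist2(P1, P(1:N, :), 'euclidean');
    if all(D > 0.146*r^(2/3))     % all([]) is true, so the very first point is accepted
        N = N + 1;
        P(N, :) = P1;             % write into the preallocated array - no copying
    end
end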
The other issue is that you check each new point against all previously found points - so when you have 100 points, every new candidate is checked against 100 points, at 1000 points against 1000, and the total work to place N points is already about N^2/2 comparisons. Add the fact that you get more and more "misses" as the density goes up, and the real cost heads toward O(N^3). An O(N^3) algorithm with N = 1E6 takes on the order of 1E18 cycles; even on a 4 GHz machine doing one comparison per clock cycle, that is about 2.5E8 seconds = roughly 8 years. You can try parallel processing, but that's just brute force - who wants that?
I recommend that you think about breaking the problem into smaller pieces (quite literally): for example, if you divide your sphere into little boxes that are about the size of your maximum distance, and for each box you keep track of what points are in it, then you only need to check against points in "this" box and its immediate neighbors - 27 boxes in all. If your boxes are 0.5 mm across, you would have 100x100x100 = 1M boxes. That seems like a lot, but now your computation time will be reduced drastically, as you will have (by the end of the algorithm) only 1 point on average per box... Of course with the distance criterion you are using, you will have more points near the center, but that's a detail.
The data structure you would need is a 100x100x100 cell array, where each cell holds the indices of the good points found so far in that cell. The problem with a cell array is that it doesn't lend itself to vectorization. If instead you have the memory, you could allocate a 4D array of 10x100x100x100 (padded to 10x102x102x102 below so the neighbor lookup never falls off the edge), assuming you will have no more than 10 points per cell (if you do, you will have to handle that separately; work with me here...), and use an index of -1 for slots not yet filled.
Then your check would be something like this:
% initializing:
P = zeros(1000000, 3);             % preallocated list of accepted points
k = 1;                             % row of P where the next accepted point goes
bigList = zeros(10,102,102,102)-1; % -1 = empty slot; 102 boxes per side so the +/-1 neighbor lookup never hits the edge
NPlist = zeros(102, 102, 102);     % number of valid points in each box
bottomcorner = [-25.5, -25.5, -25.5]; % boxes span from -25.5 to +25.5
cellSize = 0.5;
.
% in your loop:
P1 = [x, y, z];                    % candidate point, generated as before
cellCoords = ceil((P1 - bottomcorner)/cellSize); % box subscripts of this point, in 1..102
pointsSoFar = bigList(:, cellCoords(1)+(-1:1), cellCoords(2)+(-1:1), cellCoords(3)+(-1:1));
pointsToCheck = pointsSoFar(pointsSoFar > 0); % indices (into P) of points in the 27 nearby boxes - this is where the big gains come...
r = sum(P1.^2);
D = pdist2(P1, P(pointsToCheck, :), 'euclidean'); % euclidean distance to nearby points only
if all(D > 0.146*r^(2/3))          % all([]) is true, so this also accepts the very first point
P(k,:) = P1;
    cci = sub2ind([102 102 102], cellCoords(1), cellCoords(2), cellCoords(3)); % linear index of this box
    NPlist(cci) = NPlist(cci) + 1; % increasing number of points in this box
% you want to handle the case where this > 10!!!
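    % one way to handle that (a sketch, assuming memory allows): grow the slot
    % dimension whenever a box fills up, keeping -1 as the "empty" marker
    if NPlist(cci) > size(bigList, 1)
        bigList = cat(1, bigList, -ones(size(bigList)));
    end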
    bigList(NPlist(cci), cci) = k; % store this point's index in the box's next free slot
k=k+1;
end
....
I don't know if you can take it from here; if you can't, say so in the notes and I may have some time this weekend to code this up in more detail. There are ways to speed it up more with some vectorization, but it quickly becomes hard to manage.
I think that putting a larger number of points randomly in space, and then using the above for a giant vectorized culling, may be the way to go. But I recommend taking little steps first... if you can get the above to work at all well, you can then optimize further (array size, etc).
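If you go that route, generating the batch is the easy part. Here is a minimal sketch of one standard way to draw points uniformly inside a sphere of radius 25 (the count M is just illustrative, and the elementwise division/multiplication needs R2016b+ for implicit expansion):

M = 2000000;                          % draw more candidates than you expect to keep
dirs = randn(M, 3);                   % random directions...
dirs = dirs ./ sqrt(sum(dirs.^2, 2)); % ...normalized to unit length
radii = 25 * rand(M, 1).^(1/3);       % cube-root radii give uniform density in the sphere
C = radii .* dirs;                    % M candidate points inside the sphere

You would then push these candidates through the box check above - one at a time, or in vectorized chunks - and keep the survivors.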