I am generating a scatter plot of ~300k data points, and it is so overcrowded in some places that no structure is visible. So I had a thought!
I want the plot to draw contours for the densest parts and leave the less dense areas as scatter() data points.
So I was trying to compute a nearest-neighbour distance for each data point individually: where this distance hits a specific small value (dense), draw a contour and fill it, and where it hits a much larger value (less dense), just do the scatter...
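For reference, the nearest-neighbour step on its own does not have to be expensive. Below is a minimal sketch, assuming SciPy is available and using synthetic data in place of the real sample; the helper name `nearest_neighbour_distance` and the 80th-percentile cut are hypothetical choices for illustration only.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_neighbour_distance(x, y):
    """Return the distance from each point to its closest *other* point."""
    pts = np.column_stack([x, y])
    tree = cKDTree(pts)
    # k=2 because the closest match to every point is the point itself.
    dist, _ = tree.query(pts, k=2)
    return dist[:, 1]


# Synthetic stand-in for the real [X, Y] columns (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = rng.normal(size=5000)

d = nearest_neighbour_distance(x, y)

# Hypothetical threshold: call the 20% most isolated points "sparse".
threshold = np.percentile(d, 80)
sparse = d > threshold
```

The `sparse` mask could then select the points passed to scatter(), leaving the rest to be summarised by a density contour.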
I have been trying and failing for a few days now, and I am not sure that a conventional contour plot will work in this case.
I would supply code, but it is so messy that it would probably just confuse the issue. It is also so computationally intensive that it would probably crash my PC if it did work!
Thank you all in advance!
P.S. I have been searching and searching for an answer! Given all the results my searches turned up, I am convinced it is not even possible!
Edit: The idea is to see where some particular points lie within the structure of the 300k sample. Here is an example plot; my points are scattered in three different colours.
I will attempt to randomly sample 1000 datapoints from my data and upload it as a text file. Cheers Stackers. :)
Edit: Hey, here is some sample data: 1000 lines, just two columns [X, Y] (or [g-i, i] from the plot above), space-delimited. Thank you all!
the data
scatter(x, y, alpha=0.1) or some suitable small value. To do what you suggest, I would build a kernel density estimate (see scipy.stats.kde). – Gouache
plt.hexbin() but I do not think that they are as instantly clear as a contour plot. Nor is it as easy (for a viewer) to quantitatively determine the value of specific regions. Sorry for the misunderstanding. – Acanthoid
np.histogram2d to make an array of bin counts, then draw them as a contour plot instead? In terms of quantification you could normalize by bin size so that your values correspond to the density of points in each bin. You could also use KDE and plot the estimated probability density function of your data, although this has a slightly different meaning to your original plot. – Wilinski
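The np.histogram2d suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, and the bin count (`bins = 100`) and the cut-off `min_count = 20` are hypothetical values that would need tuning for the real sample. Bins holding at least `min_count` points are drawn as filled contours of density (counts divided by bin area), and points that fall in emptier bins are drawn individually with scatter().

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Synthetic stand-in for the ~300k [X, Y] sample (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=300_000)
y = rng.normal(size=300_000)

bins = 100       # hypothetical bin count per axis
min_count = 20   # hypothetical cut-off between "dense" and "sparse" bins

counts, xedges, yedges = np.histogram2d(x, y, bins=bins)

# Normalise by bin area so the values are points per unit area.
area = np.diff(xedges)[:, None] * np.diff(yedges)[None, :]
density = counts / area

# Bin centres, used as the contour grid coordinates.
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])

# Look up which bin each point falls in, then flag points in sparse bins.
ix = np.clip(np.searchsorted(xedges, x, side="right") - 1, 0, bins - 1)
iy = np.clip(np.searchsorted(yedges, y, side="right") - 1, 0, bins - 1)
sparse = counts[ix, iy] < min_count

fig, ax = plt.subplots()
ax.scatter(x[sparse], y[sparse], s=1, color="k")

# Contour levels start at the lowest "dense" density, so sparse regions
# are left unfilled and the scatter points stay visible there.
levels = np.linspace(density[counts >= min_count].min(), density.max(), 8)
# histogram2d indexes counts as [x_bin, y_bin]; contourf expects [y, x].
cs = ax.contourf(xc, yc, density.T, levels=levels, cmap="viridis")
fig.colorbar(cs, ax=ax, label="points per unit area")
```

The colour bar gives the quantitative reading that was asked about. A KDE (for example `scipy.stats.gaussian_kde`) evaluated on the same grid could replace `density`, with the caveat already noted that it shows an estimated probability density rather than raw counts.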