My objective is to create a simple density or barplot of a long dataframe which shows the relative frequency of nationalities in a course (MOOC). I just don't want all of the nationalities in there, just the top 10. I created this example df below + the ggplot2 code I use for plotting.
d=data.frame(course=sample(LETTERS[1:5], 500,replace=T),nationality=as.factor(sample(1:172,500,replace=T)))
mm <- ggplot(d, aes(x=nationality, colour=factor(course)))
mm + geom_bar() + theme_classic()
...but as said: I want a subset of the entire dataset based on frequency. The above shows all data.
PS. I added the ggplot2 code for context but also because maybe there is something within ggplot2 itself that would make this possible (I doubt it however).
EDIT 2014-12-11: The current answers use ddplyr or table methods to arrive at the desired subset, but I wonder if there is not a more direct way to achieve the same.. I will let it stay for now, see if there are other ways.
ggplot
. This Q&A, which is similar to your case, is one of many examples of that. Cheers. – Alburga