This is in response to a question asked on the r-help mailing list.
Here are lots of examples of how to find top values by group using sql
, so I imagine it's easy to convert that knowledge over using the R sqldf
package.
An example: when mtcars
is grouped by cyl
, here are the top three records for each distinct value of cyl
. Note that ties are excluded in this case, but it'd be nice to show some different ways to treat ties.
mpg cyl disp hp drat wt qsec vs am gear carb ranks
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 2.0
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 1.0
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 2.0
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 3.0
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 1.0
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 1.5
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 1.5
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 3.0
How to find the top or bottom (maximum or minimum) N records per group?
k
records per group, this question can help: https://mcmap.net/q/281732/-get-top-k-records-per-group-where-k-differs-by-group-in-r-data-table/1840471 – Nickey