I am trying to calculate a median value across a number of columns, however my data is a bit funky. It looks like the following example.
library(data.table)
dt <- data.table("ID" = c(1,2,3,4),"none" = c(0,5,5,3),
"ten" = c(3,2,5,4),"twenty" = c(0,2,3,1))
ID none ten twenty
1: 1 0 3 0
2: 2 5 2 2
3: 3 5 5 3
4: 4 3 4 1
In the table to column represents the number of occurrences of that value. I am wanting to calculate the median occurrence.
For example for ID = 1
median(c(10, 10, 10))
is the calculation I am wanting to create.
for ID = 2
median(c(0, 0, 0, 0, 0, 10, 10, 20, 20))
I have tried using rep()
and lapply()
with very limited success and am after some clear guidance on how this might be achieved. I understand for the likes of rep()
I would be having to hard code my value to be repeated (e.g. rep(0,2)
or rep(10,2)
) and this is what I expect. I am just struggling to create a list or vector with the repetitions from each column.
dt
in your example doesn't match the description of the problem. It's only a typo but confused me initially because all answers differ from your expected result. For ID = 2 the numbers for"none"
and"twenty"
are interchanged. With your code you will get for ID = 2median(0,0,0,0,0,10,10,20,20)
. – Trevortrevorr