I have a data frame in R. For each row, I would like to find which column has the highest value and store the name of that column. This is simple when there are only two columns to choose from (note that I have a filtering step that assigns NA to rows where both columns have a value of less than 0.1):
library(dplyr)
set.seed(6)
mat_simple <- matrix(rexp(200, rate=.1), ncol=2) %>%
  as.data.frame()
head(mat_simple)
V1 V2
1 2.125366 6.7798683
2 1.832349 8.9610534
3 6.149668 15.7777370
4 3.532614 0.2355711
5 21.110703 1.2927119
6 2.871455 16.7370847
mat_simple <- mat_simple %>%
  mutate(
    class = case_when(
      V1 < 0.1 & V2 < 0.1 ~ NA_character_,
      V1 > V2 ~ "V1",
      V2 > V1 ~ "V2"
    )
  )
head(mat_simple)
V1 V2 class
1 2.125366 6.7798683 V2
2 1.832349 8.9610534 V2
3 6.149668 15.7777370 V2
4 3.532614 0.2355711 V1
5 21.110703 1.2927119 V1
6 2.871455 16.7370847 V2
However, this approach doesn't scale well when there are more than two columns, since every pairwise comparison would have to be written out by hand. For example:
set.seed(6)
mat_hard <- matrix(rexp(200, rate=.1), ncol=5) %>%
  as.data.frame()
head(mat_hard)
V1 V2 V3 V4 V5
1 2.125366 26.427335 13.7289349 1.7513873 6.297978
2 1.832349 10.241441 5.3084648 0.3347235 29.247774
3 6.149668 5.689442 5.4546072 4.5035747 11.646721
4 3.532614 10.397464 6.5560545 4.4221171 1.713909
5 21.110703 9.928022 0.2284966 0.2101213 1.033498
6 2.871455 4.781357 3.3246585 15.8878010 4.004967
Is there a better solution for this, preferably using dplyr?
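To make the desired result concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes dplyr 1.1.0 or later (for `pick()`) and uses base R's `max.col()` to find the position of the largest value in each row. The helper name `value_cols` is my own, and the 0.1 threshold is carried over from the two-column example.

```r
library(dplyr)

set.seed(6)
mat_hard <- matrix(rexp(200, rate = .1), ncol = 5) %>%
  as.data.frame()

# Hypothetical helper: the columns to compare (assumed here to be V1..V5).
value_cols <- paste0("V", 1:5)

mat_hard <- mat_hard %>%
  mutate(
    # max.col() gives the index of the largest value in each row;
    # ties.method = "first" keeps the result deterministic on ties.
    class = value_cols[max.col(pick(all_of(value_cols)), ties.method = "first")],
    # Same filter as before: NA when no column reaches 0.1.
    class = if_else(
      rowSums(pick(all_of(value_cols)) >= 0.1) == 0,
      NA_character_,
      class
    )
  )

head(mat_hard)
```

Unlike the `case_when()` version, this picks the first column on an exact tie instead of returning NA, so the two are not strictly equivalent in that edge case.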