How can I find the 2 highest values for each row, then add them together?

Asked 30/11, 2020 at 14:12 Answered 30/11, 2020 at 15:3

In the table below, how can I find the two highest values in each row, then add these values together?

I have a copy of the attached table in RStudio. Is there a line of code I can use to add the two highest numbers on each row, so that I can apply to this to a much bigger dataset?

Table

Workhouse answered 30/11, 2020 at 14:12 Comment(1)

One approach would simply be to calculate the ranks per row nd then add rank 1 + rank 2. – Dimmick 30/11, 2020 at 14:17

You can do a rowwise calculation where you sort the values in the specified columns and sum the highest two:

library(dplyr)

df <- data.frame(Mon = c(12,15,42,43,56,73,23),
                 Tues = c(15,14,12,75,98,79,68),
                 Wed = c(13,42,35,64,35,95,56),
                 Thur = c(23,46,32,94,78,68,35),
                 Friday = c(25,23,64,35,27,54,32))



df %>% 
  rowwise() %>% 
  mutate(two_max = sum(sort(c(Mon, Tues, Wed, Thur, Friday), decreasing = TRUE)[1:2])) %>% 
  ungroup()

If you don't want to specify the column names manually you can also select all numeric columns at once:


df %>% 
  rowwise() %>% 
  mutate(two_max = sum(sort(c_across(where(is.numeric)), decreasing = TRUE)[1:2])) %>% 
  ungroup()

Both strategies give the result:


# A tibble: 7 x 6
    Mon  Tues   Wed  Thur Friday two_max
  <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>
1    12    15    13    23     25      48
2    15    14    42    46     23      88
3    42    12    35    32     64     106
4    43    75    64    94     35     169
5    56    98    35    78     27     176
6    73    79    95    68     54     174
7    23    68    56    35     32     124

Ellison answered 30/11, 2020 at 14:25 Comment(0)

If you prefer one-liner, take that one

df <- data.frame(Mon = c(12,15,42,43,56,73,23),
                 Tues = c(15,14,12,75,98,79,68),
                 Wed = c(13,42,35,64,35,95,56),
                 Thur = c(23,46,32,94,78,68,35),
                 Friday = c(25,23,64,35,27,54,32))

sol <- apply(df, 1, function(x) sum(max(x), max(x[-which(x == max(x))[1]])))

Note that there are slightly more efficient ways. Lmk if you wanna hear them

Repeated answered 30/11, 2020 at 15:3 Comment(0)

Using apply with mtcars as an example:

top2 <- x <- apply(mtcars, 1, function(x) sort(x, decreasing = TRUE)[1:2])
top2 <- matrix(top2, ncol = 2, byrow = TRUE)
addem <- rowSums(top2)
top2plus <- cbind(mtcars, addem)
head(top2plus, 5)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb addem
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   270
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   270
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   201
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   368
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   535

Condon answered 30/11, 2020 at 14:33 Comment(0)

One way to do this would be to reshape the dataframe to have tidy data (i.e. the days as a variable, the values as another), and use the dplyr verbs to process the data into a summary:

df <- data.frame(ID = LETTERS[1:7],
                 Mon = c(12, 15, 42, 43, 56, 73, 23),
                 Tues = c(15, 14, 12, 75, 98, 79, 68),
                 Wed = c(13, 42, 35, 64, 35, 95, 56),
                 Thur = c(23, 46, 32, 94, 78, 68, 35),
                 Friday = c(25, 23, 64, 35, 27, 54, 32))

library(dplyr)
library(tidyr)
df %>% 
  pivot_longer(-ID) %>%         # reshape
  group_by(ID) %>%              # change scope to each ID
  slice_max(value, n = 2) %>%   # keep two maximums
  summarise(max2 = sum(value))  # sum them
#> # A tibble: 7 x 2
#>   ID     max2
#>   <chr> <dbl>
#> 1 A        48
#> 2 B        88
#> 3 C       106
#> 4 D       169
#> 5 E       176
#> 6 F       174
#> 7 G       124

^{Created on 2020-12-01 by the reprex package (v0.3.0)}

Convincing answered 30/11, 2020 at 14:37 Comment(0)

Recommended topics

Hot tags