how to move up the values within each group in R
Asked Answered
Z

4

6

I need to shift valid values to the top the of dataframe withing each id. Here is an example dataset:

df <- data.frame(id = c(1,1,1,2,2,2,3,3,3,3),
                 itemid = c(1,2,3,1,2,3,1,2,3,4),
                 values = c(1,NA,0,NA,NA,0,1,NA,0,NA))
    
df
   id itemid values
1   1      1      1
2   1      2     NA
3   1      3      0
4   2      1     NA
5   2      2     NA
6   2      3      0
7   3      1      1
8   3      2     NA
9   3      3      0
10  3      4     NA

excluding the id column, when there is a missing value in values column, I want to shift all values aligned to the top for each id.

How can I get this desired dataset below?

df1
   id itemid values
1   1      1      1
2   1      2      0
3   1      3     NA
4   2      1      0
5   2      2     NA
6   2      3     NA
7   3      1      1
8   3      2      0
9   3      3     NA
10  3      4     NA
Zondra answered 30/11, 2022 at 18:48 Comment(0)
J
8

Using tidyverse you can arrange by whether values is missing or not (which will put those at the bottom).

library(tidyverse)

df %>%
  arrange(id, is.na(values))

Output

      id itemid values
   <dbl>  <dbl>  <dbl>
 1     1      1      1
 2     1      3      0
 3     1      2     NA
 4     2      3      0
 5     2      1     NA
 6     2      2     NA
 7     3      1      1
 8     3      3      0
 9     3      2     NA
10     3      4     NA

Or, if you wish to retain the same order for itemid and other columns, you can use mutate to specifically order columns of interest (like values). Other answers provide good solutions, such as @Santiago and @ThomasIsCoding. If you have multiple columns of interest to move NA to the bottom per group, you can also try:

df %>%
  group_by(id) %>%
  mutate(across(.cols = values, ~values[order(is.na(.))]))

where the .cols argument would contain the columns to transform and reorder independently.

Output

      id itemid values
   <dbl>  <dbl>  <dbl>
 1     1      1      1
 2     1      2      0
 3     1      3     NA
 4     2      1      0
 5     2      2     NA
 6     2      3     NA
 7     3      1      1
 8     3      2      0
 9     3      3     NA
10     3      4     NA
Jacks answered 30/11, 2022 at 18:55 Comment(2)
@RitchieSacramento I was just looking at this! Thanks for the comment - corrected.Jacks
But doing this the order of itemid is lost. Compare with the desired output in the question.Draftee
G
3

We can try ave + order

> transform(df,  values = ave(values, id, FUN = function(x) x[order(is.na(x))]))
   id itemid values
1   1      1      1
2   1      2      0
3   1      3     NA
4   2      1      0
5   2      2     NA
6   2      3     NA
7   3      1      1
8   3      2      0
9   3      3     NA
10  3      4     NA
Gorgonzola answered 30/11, 2022 at 19:24 Comment(0)
D
2

With data.table:

library(data.table)

setDT(df)[, values := values[order(is.na(values))], id][]
#>     id itemid values
#>  1:  1      1      1
#>  2:  1      2      0
#>  3:  1      3     NA
#>  4:  2      1      0
#>  5:  2      2     NA
#>  6:  2      3     NA
#>  7:  3      1      1
#>  8:  3      2      0
#>  9:  3      3     NA
#> 10:  3      4     NA
Dania answered 30/11, 2022 at 19:27 Comment(0)
D
2

I'd define a function that does what you want and then group by id:

completed_first <- function(x) {
  completed <- x[!is.na(x)]
  length(completed) <- length(x)
  completed
}

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(
    values = completed_first(values)
  ) %>%
  ungroup()
# # A tibble: 10 × 3
#       id itemid values
#    <dbl>  <dbl>  <dbl>
#  1     1      1      1
#  2     1      2      0
#  3     1      3     NA
#  4     2      1      0
#  5     2      2     NA
#  6     2      3     NA
#  7     3      1      1
#  8     3      2      0
#  9     3      3     NA
# 10     3      4     NA

(This method preserves the order of itemid.)


Or building upon ThomasIsCoding's answer:

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(
    values = values[order(is.na(values))]
  ) %>%
  ungroup()
# # A tibble: 10 × 3
#       id itemid values
#    <dbl>  <dbl>  <dbl>
#  1     1      1      1
#  2     1      2      0
#  3     1      3     NA
#  4     2      1      0
#  5     2      2     NA
#  6     2      3     NA
#  7     3      1      1
#  8     3      2      0
#  9     3      3     NA
# 10     3      4     NA
Draftee answered 1/12, 2022 at 2:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.