How to pivot_longer a set of multiple columns? and How to go back from that long format to original wide?
Asked Answered
R

1

3

If I have the following data:

D = tibble::tribble(
  ~firm, ~ind, ~var1_1, ~var1_2, ~op2_1, ~op2_2,
  "A",     1,     10,     11,     11,     12,
  "A",     2,     12,     13,     13,     14,
  "B",     1,     14,     15,     15,     16,
  "B",     2,     16,     17,     17,     18,
  "C",     1,     18,     19,     19,     20,
  "C",     2,     20,     21,     21,     22,
)

How can I pivot_longer() var1 and var2 having "_*" as year indicator?

I mean, I would like have something like this:

D %>%
  pivot_longer(var1_1:op2_2,
  names_to = c(".value", "year"),
  names_pattern = "(.*)_(.*)",
  values_to = c("var1, var2")
  )
# A tibble: 12 x 5
   firm    ind year   var1  op2
   <chr> <dbl> <chr> <dbl> <dbl>
 1 A         1 1        10    11
 2 A         1 2        11    12
 3 A         2 1        12    13
 4 A         2 2        13    14
 5 B         1 1        14    15
 6 B         1 2        15    16
 7 B         2 1        16    17
 8 B         2 2        17    18
 9 C         1 1        18    19
10 C         1 2        19    20
11 C         2 1        20    21
12 C         2 2        21    22

I'm achieving the desired result using the code above. However in my real case I'm dealing with more than 30 variables and 10 years. Then, using values_to isn't practical and clean. I'd like the code read first part of variable name as the desired new variable name. Since initially all columns to be pivoted are structured like "varname_year".

Besides, once I get the new data format into long, I might need to go back to wide-format keeping the initial data structure.

Riant answered 16/1, 2020 at 21:51 Comment(0)
R
7

We can use one of the select_helpers

library(dplyr)
library(tidyr)
library(stringr)
Dlong <- D %>%
          pivot_longer(cols = starts_with('var'), 
             names_to = c(".value", "year"), names_sep = "_")

From the 'long' format, change to 'wide' with pivot_wider

Dlong %>%
    pivot_wider(names_from = ind, values_from = str_c("var", 1:2))
Revisory answered 16/1, 2020 at 21:54 Comment(3)
What if all columns starts with a different name?Riant
@Riant Many options are there, either matches or just use a numeric indexRevisory
Upvote because you are awesome.Garda

© 2022 - 2024 — McMap. All rights reserved.