Asked 17/8, 2017 at 11:51 Answered 17/8, 2017 at 12:19

Solved r list transpose list-manipulation

I have a list structure representing a table, which is handed to me like this:

> l = list(list(1, 4), list(2, 5), list(3, 6))
> str(l)
List of 3
 $ :List of 2
  ..$ : num 1
  ..$ : num 4
 $ :List of 2
  ..$ : num 2
  ..$ : num 5
 $ :List of 2
  ..$ : num 3
  ..$ : num 6

And I'd like to convert it to this:

> lt = list(x = c(1, 2, 3), y = c(4, 5, 6))
> str(lt)
List of 2
 $ x: num [1:3] 1 2 3
 $ y: num [1:3] 4 5 6

I've written a simple function that does this using Reduce, but I feel like there must be a smarter way to do it.

Any help appreciated. Thanks!

Benchmarks

Thanks all! Much appreciated. Benchmarked the answers and picked the fastest for a larger test case:

f1 = function(l) {
  k <- length(unlist(l)) / length(l) 
  lapply(seq_len(k), function(i) sapply(l, "[[", i))
}

f2 = function(l) {
  n <- length(l[[1]])
  split(unlist(l, use.names = FALSE), paste0("x", seq_len(n)))
}

f3 = function(l) {
  split(do.call(cbind, lapply(l, unlist)), seq(unique(lengths(l))))
}

f4 = function(l) {
  # needs library(purrr) (or library(tidyverse)) loaded for the %>% pipe
  l %>%
    purrr::transpose() %>%
    purrr::map(unlist)
}

f5 = function(l) {
  # bind lists together into a matrix (of lists)
  temp <- Reduce(rbind, l)
  # split unlisted values using indices of columns
  split(unlist(temp), col(temp))
}

f6 = function(l) {
  data.table::transpose(lapply(l, unlist))
}

microbenchmark::microbenchmark(
  lapply     = f1(l),
  split_seq  = f2(l),
  unique     = f3(l),
  tidy       = f4(l),
  Reduce     = f5(l),
  dt         = f6(l),
  times      = 10000
)

Unit: microseconds
      expr     min       lq     mean   median       uq      max neval
    lapply 165.057 179.6160 199.9383 186.2460 195.0005 4983.883 10000
 split_seq  85.655  94.6820 107.5544  98.5725 104.1175 4609.378 10000
    unique 144.908 159.6365 182.2863 165.9625 174.7485 3905.093 10000
      tidy  99.547 122.8340 141.9482 129.3565 138.3005 8545.215 10000
    Reduce 172.039 190.2235 216.3554 196.8965 206.8545 3652.939 10000
        dt  98.072 106.6200 120.0749 110.0985 116.0950 3353.926 10000
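
Before reading too much into the timings, it may be worth confirming that the candidates actually return the same values. A minimal sketch, assuming only the base R versions (f1, f2, f5) and the small input from the question rather than the larger benchmark case; names are dropped because the functions label their output differently:

```r
# Small input from the question; the benchmark itself used a larger one.
l <- list(list(1, 4), list(2, 5), list(3, 6))

f1 <- function(l) {
  k <- length(unlist(l)) / length(l)
  lapply(seq_len(k), function(i) sapply(l, "[[", i))
}
f2 <- function(l) {
  n <- length(l[[1]])
  split(unlist(l, use.names = FALSE), paste0("x", seq_len(n)))
}
f5 <- function(l) {
  temp <- Reduce(rbind, l)
  split(unlist(temp), col(temp))
}

# Drop names so that only the values are compared.
results <- lapply(list(f1, f2, f5), function(f) unname(f(l)))

# TRUE only if every result is identical to the first one.
all_agree <- all(vapply(results, identical, logical(1), results[[1]]))
all_agree
```

The purrr and data.table versions could be added to the same list if those packages are installed.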

Servomechanical asked 17/8, 2017 at 11:51 Comment(3)

What's the logic behind wanted output? List of two vectors or list of vectors with three items? – Conductive 17/8, 2017 at 11:53

List of two vectors but generalisable to n vectors – Servomechanical 17/8, 2017 at 11:56

so you can have x, y, z vectors at the end? – Fdic 17/8, 2017 at 11:57

For the specific example, you can use this pretty simple approach:

split(unlist(l), c("x", "y"))
#$x
#[1] 1 2 3
#
#$y
#[1] 4 5 6

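unlist(l) flattens the list row by row into 1 4 2 5 3 6, and split recycles the c("x", "y") grouping vector along it, so alternating values land in x and y.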
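
To see the recycling step in isolation, here is a small sketch that spells out the grouping vector by hand. The names values and groups are just illustrative, and rep_len is used only to make explicit what split does implicitly:

```r
l <- list(list(1, 4), list(2, 5), list(3, 6))

# Flattened row by row: first sub-list, then second, then third.
values <- unlist(l)

# The full-length grouping vector that recycling c("x", "y") amounts to.
groups <- rep_len(c("x", "y"), length(values))

values
groups

# Same result as split(unlist(l), c("x", "y")).
res <- split(values, groups)
res
```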

To generalize this to "n" elements in each list, you can use:

l = list(list(1, 4, 5), list(2, 5, 5), list(3, 6, 5)) # larger test case

split(unlist(l, use.names = FALSE), paste0("x", seq_len(length(l[[1L]]))))
# $x1
# [1] 1 2 3
# 
# $x2
# [1] 4 5 6
# 
# $x3
# [1] 5 5 5

This assumes that all top-level elements of l have the same length, as in your example.
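
If that assumption might not hold, a guard can make the failure explicit instead of letting recycling misalign the values. This is only a sketch: transpose_flat is a hypothetical helper name, and it further assumes every leaf is a single atomic value. It also groups by integer position rather than by the "x1", "x2", ... labels, because character labels are sorted alphabetically by split, so "x10" would come before "x2" once there are ten or more columns.

```r
# transpose_flat() is a hypothetical helper, not part of the answer above.
transpose_flat <- function(l) {
  n <- unique(lengths(l))
  # Stop early if the sub-lists differ in length (or l is empty).
  stopifnot(length(n) == 1L)
  # Integer group ids keep the columns in positional order.
  groups <- rep_len(seq_len(n), n * length(l))
  res <- split(unlist(l, use.names = FALSE), groups)
  # Attach the labels afterwards so they cannot affect the ordering.
  setNames(res, paste0("x", seq_len(n)))
}

l3 <- list(list(1, 4, 5), list(2, 5, 5), list(3, 6, 5))
out3 <- transpose_flat(l3)
out3

# A ragged input is rejected rather than silently recycled.
ragged <- try(transpose_flat(list(list(1, 2), list(3))), silent = TRUE)
inherits(ragged, "try-error")
```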

Eggcup answered 17/8, 2017 at 12:13 Comment(0)

Here is one idea that unlists each sub-list, binds the results as columns, and splits by row index:

split(do.call(cbind, lapply(l, unlist)), seq(unique(lengths(l))))

which gives:

$`1`
[1] 1 2 3

$`2`
[1] 4 5 6

Fdic answered 17/8, 2017 at 12:0 Comment(1)

Thank you - this is slightly slower than the accepted answer which is why I picked the other one :) – Servomechanical 17/8, 2017 at 13:1

We can use transpose from purrr (loaded with the tidyverse) and then unlist each element with map:

library(tidyverse)
r1 <- l %>% 
        transpose %>%
        map(unlist)
identical(r1, unname(lt))
#[1] TRUE

Gelatinate answered 17/8, 2017 at 11:58 Comment(1)

something similar: data.table::transpose(lapply(x, unlist)). – Kra 17/8, 2017 at 12:0

A second base R method, using Reduce and split in two lines, is:

# bind lists together into a matrix (of lists)
temp <- Reduce(rbind, l)
# split unlisted values using indices of columns
split(unlist(temp), col(temp))

$`1`
[1] 1 2 3

$`2`
[1] 4 5 6

This assumes that each list item has the same number of elements. You can add names in the second line if desired with setNames:

setNames(split(unlist(temp), col(temp)), c("x", "y"))
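
For anyone curious what temp looks like, here is a short sketch inspecting it. The do.call(rbind, l) line is a swapped-in alternative to Reduce that binds all rows in a single call; it is not part of the answer, just shown for comparison:

```r
l <- list(list(1, 4), list(2, 5), list(3, 6))

# A list-matrix: one row per sub-list, one column per position.
temp <- Reduce(rbind, l)
dim(temp)

# Column index of every cell; this is the grouping vector for split.
col(temp)

res <- split(unlist(temp), col(temp))

# Alternative construction of the same list-matrix in one call.
temp2 <- do.call(rbind, l)
res2 <- split(unlist(temp2), col(temp2))

identical(res, res2)
```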

Downstairs answered 17/8, 2017 at 12:2 Comment(0)

The sapply call extracts the i-th element of each component of l, creating a numeric vector, and lapply applies it over 1:2 (since there are k = 2 elements in each component of l).

If you know that k is 2, the first line could be replaced with k <- 2. Also note that the first line divides by max(length(l), 1) rather than length(l), to avoid dividing by 0 when l is a zero-length list.

The code below gives the output shown in the question; however, the subject refers to nested lists, and if we wanted a list of lists rather than a list of numeric vectors, we could replace sapply with lapply.

k <- length(unlist(l)) / max(length(l) , 1)
lapply(seq_len(k), function(i) sapply(l, "[[", i))

giving:

[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6
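
As a sketch of the list-of-lists variant mentioned above, here is the same code with sapply replaced by lapply. The vapply version after it is a swapped-in, stricter alternative that is not from the answer: it assumes every extracted element is a single numeric value and errors otherwise.

```r
l <- list(list(1, 4), list(2, 5), list(3, 6))
k <- length(unlist(l)) / max(length(l), 1)

# List of lists: the inner lapply keeps each extracted value as a list element.
nested <- lapply(seq_len(k), function(i) lapply(l, "[[", i))
str(nested)

# List of numeric vectors, with the element type checked by vapply.
flat <- lapply(seq_len(k), function(i) vapply(l, "[[", numeric(1), i))
flat
```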

Gabon answered 17/8, 2017 at 12:19 Comment(0)
