How do I make a matrix from a list of vectors in R?

7

119

Goal: from a list of vectors of equal length, create a matrix where each vector becomes a row.

Example:

> a <- list()
> for (i in 1:10) a[[i]] <- c(i,1:5)
> a
[[1]]
[1] 1 1 2 3 4 5

[[2]]
[1] 2 1 2 3 4 5

[[3]]
[1] 3 1 2 3 4 5

[[4]]
[1] 4 1 2 3 4 5

[[5]]
[1] 5 1 2 3 4 5

[[6]]
[1] 6 1 2 3 4 5

[[7]]
[1] 7 1 2 3 4 5

[[8]]
[1] 8 1 2 3 4 5

[[9]]
[1] 9 1 2 3 4 5

[[10]]
[1] 10  1  2  3  4  5

I want:

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5 
Yeomanry answered 25/8, 2009 at 18:2 Comment(0)
Y
140

One option is to use do.call():

 > do.call(rbind, a)
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
Yeomanry answered 25/8, 2009 at 18:4 Comment(2)
So the difference between this and a plain rbind() call is that do.call() passes each list item as a separate argument, is that right? That is, do.call(rbind, a) is equivalent to rbind(a[[1]], a[[2]], ..., a[[10]])?Homing
do.call() is great for this purpose; I wish it were better "documented" in the introductory materials.Vagina
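
To make the equivalence asked about in the first comment concrete, here is a small sketch. It rebuilds the question's list with lapply() instead of the for loop; the names m_call, m_manual and m_wrong are only illustrative.

```r
# Same list as in the question: ten integer vectors of length 6
a <- lapply(1:10, function(i) c(i, 1:5))

# do.call() spreads the list elements out as separate arguments ...
m_call <- do.call(rbind, a)

# ... so it matches writing every element out by hand
m_manual <- rbind(a[[1]], a[[2]], a[[3]], a[[4]], a[[5]],
                  a[[6]], a[[7]], a[[8]], a[[9]], a[[10]])
identical(m_call, m_manual)
#> [1] TRUE

# Passing the list itself gives rbind() a single argument,
# which yields a 1 x 10 list-matrix rather than a 10 x 6 matrix
m_wrong <- rbind(a)
dim(m_wrong)
#> [1]  1 10
```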
C
21

simplify2array is a base function that is fairly intuitive. However, since R's default is to fill in data by columns first, you will need to transpose the output. (sapply uses simplify2array, as documented in help(sapply).)

> t(simplify2array(a))
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
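
A quick way to see why the transpose is needed: simplify2array() puts each vector in its own column. The sketch below rebuilds the question's list with lapply(); the variable names s and m are only illustrative.

```r
a <- lapply(1:10, function(i) c(i, 1:5))

# Each list element becomes one column: 6 rows, 10 columns
s <- simplify2array(a)
dim(s)
#> [1]  6 10

# sapply() with a do-nothing function goes through the same simplification
identical(s, sapply(a, identity))
#> [1] TRUE

# Transposing turns the columns into the rows the question asks for
m <- t(s)
identical(m, do.call(rbind, a))
#> [1] TRUE
```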
Calash answered 28/10, 2014 at 21:44 Comment(0)
C
18

The built-in matrix function has the nice option of filling in data by row (byrow=TRUE). Combining that with unlist() on your source list gives you a matrix. We also need to specify the number of rows so that matrix() knows how to break up the unlisted data. That is:

> matrix(unlist(a), byrow=TRUE, nrow=length(a) )
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
Calash answered 28/10, 2014 at 21:36 Comment(2)
Or fill a matrix by columns and then transpose: t( matrix( unlist(a), ncol=length(a) ) ).Calash
This can be 7x faster than the do.call(rbind, ...) approach for matrices with many rows, but it sometimes won't warn if the vectors don't have the same length.Divisor
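
To illustrate the missing-warning caveat from the comment above: matrix() only complains when the total number of values does not fit the requested number of rows, so some unequal-length inputs slip through silently. The helper rows_to_matrix() below is a hypothetical name for a guarded wrapper, not something from base R; treat it as a sketch.

```r
# Lengths 2 and 4 add up to 6, which divides evenly into 2 rows,
# so matrix() gives no warning and the rows are silently misaligned
b <- list(1:2, 1:4)
m <- matrix(unlist(b), byrow = TRUE, nrow = length(b))
m
#>      [,1] [,2] [,3]
#> [1,]    1    2    1
#> [2,]    2    3    4

# Hypothetical guard: refuse lists whose elements differ in length
rows_to_matrix <- function(x) {
  n <- unique(lengths(x))
  if (length(n) != 1L) stop("list elements differ in length")
  matrix(unlist(x), byrow = TRUE, nrow = length(x))
}

ok <- rows_to_matrix(list(1:3, 4:6))             # equal lengths: fine
bad <- try(rows_to_matrix(b), silent = TRUE)     # unequal lengths: error
```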
C
12

Not straightforward, but it works:

> t(sapply(a, unlist))
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    2    3    4    5
 [2,]    2    1    2    3    4    5
 [3,]    3    1    2    3    4    5
 [4,]    4    1    2    3    4    5
 [5,]    5    1    2    3    4    5
 [6,]    6    1    2    3    4    5
 [7,]    7    1    2    3    4    5
 [8,]    8    1    2    3    4    5
 [9,]    9    1    2    3    4    5
[10,]   10    1    2    3    4    5
Chenault answered 26/8, 2009 at 6:30 Comment(1)
With rjson results, colMeans works only for this method! Thank you!Amazonite
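
A guess at what the comment above is describing, assuming the parsed JSON arrives as a list of lists rather than a list of atomic vectors (the parsed object below is made up to mimic that shape, it is not real rjson output). In that case do.call(rbind, ...) builds a list-matrix, which colMeans() rejects, while unlist() inside sapply() flattens each row to a numeric vector first.

```r
# Stand-in for parsed JSON rows: each row is itself a list
parsed <- list(list(1, 2, 3), list(4, 5, 6))

# rbind() on lists keeps list cells, so the result is not numeric
m_list <- do.call(rbind, parsed)
is.list(m_list)
#> [1] TRUE
res <- try(colMeans(m_list), silent = TRUE)   # fails: x must be numeric

# unlist() flattens each row, giving an ordinary numeric matrix
m_num <- t(sapply(parsed, unlist))
colMeans(m_num)
#> [1] 2.5 3.5 4.5
```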
G
8
t(sapply(a, '[', 1:max(sapply(a, length))))

where a is a list. This also works for rows of unequal length: indexing past the end of a vector returns NA, so shorter rows are padded with NA.
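
A small sketch of the one-liner above on a list with unequal lengths (the example list and the names n and m are made up for illustration):

```r
a <- list(c(1, 1), c(2, 1, 2), c(3, 1, 2, 3))

# Length of the longest vector; lengths(a) is shorthand for sapply(a, length)
n <- max(lengths(a))

# `[` with indices 1..n returns NA for positions past the end of a vector
m <- t(sapply(a, `[`, seq_len(n)))
m
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    1   NA   NA
#> [2,]    2    1    2   NA
#> [3,]    3    1    2    3
```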

Grindery answered 26/1, 2014 at 15:53 Comment(0)
U
3
> library(plyr)
> as.matrix(ldply(a))
      V1 V2 V3 V4 V5 V6
 [1,]  1  1  2  3  4  5
 [2,]  2  1  2  3  4  5
 [3,]  3  1  2  3  4  5
 [4,]  4  1  2  3  4  5
 [5,]  5  1  2  3  4  5
 [6,]  6  1  2  3  4  5
 [7,]  7  1  2  3  4  5
 [8,]  8  1  2  3  4  5
 [9,]  9  1  2  3  4  5
[10,] 10  1  2  3  4  5
Underfoot answered 9/9, 2009 at 21:7 Comment(3)
This will simply not work if the rows don't have the same length, while do.call(rbind,...) still works.Dael
Any clues on how to make it work for unequal row sizes, with NA for the missing row data?Grindery
@Dael Actually, do.call(rbind,...) does not work for unequal-length vectors, unless you really intend to have the vector reused when filling in the row at the end. See Arihant's response for a way that fills in with NA values at the end instead.Calash
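
The recycling behaviour discussed in these comments can be checked directly. This is a sketch on a made-up two-element list; the `length<-` padding at the end is one base-R option among several, not the approach from any particular answer.

```r
b <- list(1:2, 1:3)

# rbind() recycles the short vector to fill its row, with a warning
msg <- tryCatch(do.call(rbind, b), warning = function(w) conditionMessage(w))
m <- suppressWarnings(do.call(rbind, b))
m[1, ]
#> [1] 1 2 1

# If the short length divides the long one, recycling is silent
m2 <- do.call(rbind, list(1:2, 1:4))
m2[1, ]
#> [1] 1 2 1 2

# Padding with NA instead: extend every vector to the maximum length first
padded <- t(sapply(b, `length<-`, max(lengths(b))))
padded[1, ]
#> [1]  1  2 NA
```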
D
-1

data.table::transpose(a) can be a useful tool here if your list elements have unequal sizes or you actually want a data.frame instead.

It efficiently turns a length-n list of length-up-to-p vectors into a length-p list of length-n vectors, padding the missing elements with a value of your choice.

# For list of vectors of unequal size if you want to pad instead of recycle
a <- sapply(1:6, function(i) c(i, seq_len(i)))
a
#> [[1]]
#> [1] 1 1
#> 
#> [[2]]
#> [1] 2 1 2
#> 
#> [[3]]
#> [1] 3 1 2 3
#> 
#> [[4]]
#> [1] 4 1 2 3 4
#> 
#> [[5]]
#> [1] 5 1 2 3 4 5
#> 
#> [[6]]
#> [1] 6 1 2 3 4 5 6

matrix(unlist(data.table::transpose(a)), nrow=length(a))
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,]    1    1   NA   NA   NA   NA   NA
#> [2,]    2    1    2   NA   NA   NA   NA
#> [3,]    3    1    2    3   NA   NA   NA
#> [4,]    4    1    2    3    4   NA   NA
#> [5,]    5    1    2    3    4    5   NA
#> [6,]    6    1    2    3    4    5    6
# neat if you want a data.frame instead
data.table::setDF(data.table::as.data.table(data.table::transpose(a)))[]
#>   V1 V2 V3 V4 V5 V6 V7
#> 1  1  1 NA NA NA NA NA
#> 2  2  1  2 NA NA NA NA
#> 3  3  1  2  3 NA NA NA
#> 4  4  1  2  3  4 NA NA
#> 5  5  1  2  3  4  5 NA
#> 6  6  1  2  3  4  5  6

It is almost as fast as the matrix(unlist(...), byrow=TRUE) solution and much faster than the t(sapply(...)) approach, which also works for unequal lengths.

a <- sapply(1:6, function(i) c(i, seq_len(i)))
a
bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length))))
)
#> # A tibble: 2 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))   6.87µs   8.68µs
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 33.29µs  42.14µs
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>



# small list, equal sizes
a <- sapply(1:6, function(i) c(i, seq_len(5)), simplify = FALSE)
a
#> [[1]]
#> [1] 1 1 2 3 4 5
#> 
#> [[2]]
#> [1] 2 1 2 3 4 5
#> 
#> [[3]]
#> [1] 3 1 2 3 4 5
#> 
#> [[4]]
#> [1] 4 1 2 3 4 5
#> 
#> [[5]]
#> [1] 5 1 2 3 4 5
#> 
#> [[6]]
#> [1] 6 1 2 3 4 5

bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length)))),
  do.call(rbind, a),
  matrix(unlist(a), byrow=TRUE, nrow=length(a) )
)
#> # A tibble: 4 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))   7.03µs   9.06µs
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 32.99µs  36.18µs
#> 3 do.call(rbind, a)                                            2.92µs   3.47µs
#> 4 matrix(unlist(a), byrow = TRUE, nrow = length(a))            2.77µs   3.07µs
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>


# large list, equal sizes
a <- sapply(seq_len(100000), function(i) c(i, seq_len(5)), simplify = FALSE)

bench::mark(
  matrix(unlist(data.table::transpose(a)), nrow=length(a)),
  t(sapply(a, '[', 1:max(sapply(a, length)))),
  do.call(rbind, a),
  matrix(unlist(a), byrow=TRUE, nrow=length(a) )
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 × 6
#>   expression                                                      min   median
#>   <bch:expr>                                                 <bch:tm> <bch:tm>
#> 1 matrix(unlist(data.table::transpose(a)), nrow = length(a))  11.62ms  12.54ms
#> 2 t(sapply(a, "[", 1:max(sapply(a, length))))                 94.56ms 101.09ms
#> 3 do.call(rbind, a)                                           59.02ms  70.49ms
#> 4 matrix(unlist(a), byrow = TRUE, nrow = length(a))            7.02ms   7.82ms
#> # ℹ 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>
Divisor answered 11/2 at 9:38 Comment(0)
