data.table avoid recycling

I'm constructing a data.table from two (or more) input vectors with different lengths:

x <- c(1,2,3,4)
y <- c(8,9)

dt <- data.table(x = x, y = y)

I need the shorter vector(s) to be padded with NA rather than having their values recycled, so that the result is a data.table like this:

   x  y
1: 1  8
2: 2  9
3: 3 NA
4: 4 NA

Is there a way to achieve this without explicitly filling the shorter vector(s) with NA before passing them to the data.table() constructor?

Thanks!

Mazzola answered 18/3, 2018 at 9:36

Indexing a vector past its end returns NA, so you can use out-of-range indices:

library("data.table")

x <- c(1,2,3,4)
y <- c(8,9)
n <- max(length(x), length(y))

dt <- data.table(x = x[1:n], y = y[1:n])
# > dt
#    x  y
# 1: 1  8
# 2: 2  9
# 3: 3 NA
# 4: 4 NA

Alternatively, set both vectors to the common length by assigning to length(), which pads the shorter one with NA (as @Roland recommended in a comment):

length(y) <- length(x) <- max(length(x), length(y))
dt <- data.table(x, y)
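
Since the question mentions two or more vectors, the indexing idea above can be wrapped in a small function. This is only a sketch: pad_to_dt is a hypothetical name, not part of data.table, and it assumes every argument is a named atomic vector.

```r
library(data.table)

# Hypothetical helper (not a data.table function): pad every input vector
# with NA up to the length of the longest one, then build the data.table.
# Arguments are assumed to be named so that the columns get names.
pad_to_dt <- function(...) {
  cols <- list(...)
  n <- max(lengths(cols))
  # seq_len(n) is used instead of 1:n because 1:0 would yield c(1, 0)
  as.data.table(lapply(cols, function(v) v[seq_len(n)]))
}

dt <- pad_to_dt(x = c(1, 2, 3, 4), y = c(8, 9), z = c("a", "b", "c"))
```

Here dt should have four rows, with y padded by two NA values and z by one. The NA keeps the type of each vector, so z stays a character column.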
Bingaman answered 18/3, 2018 at 9:49 Comment(1)
Comment by Buttons: length(y) <- length(x)

One option is cbind.fill() from the rowr package:

library(rowr)
setNames(cbind.fill(x, y, fill = NA), c("x", "y"))

Or place the vectors in a list and pad each element with NA up to the length of the longest element:

library(data.table)
lst <- list(x = x, y = y)
as.data.table(lapply(lst, `length<-`, max(lengths(lst))))
#   x  y
#1: 1  8
#2: 2  9
#3: 3 NA
#4: 4 NA
Soldier answered 18/3, 2018 at 9:42 Comment(1)
Comment by Auerbach: How would you go about this if your list was not a list of vectors but a list of data.frames with an identical number of columns but a varying number of rows each?
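
One possible way to approach the data.frame case raised in the comment is to apply the same out-of-range trick to rows. The sketch below is an assumption about what is wanted, not something from the answers: the example list dfs and its column names are made up, and each element is simply padded with all-NA rows up to the largest row count.

```r
library(data.table)

# Assumed example input: same columns, different numbers of rows
dfs <- list(
  a = data.frame(u = 1:3, v = c("p", "q", "r")),
  b = data.frame(u = 4:5, v = c("s", "t"))
)

# Largest row count across the list
n <- max(vapply(dfs, nrow, integer(1)))

# Integer row indices beyond nrow give all-NA rows in a data.table,
# so indexing with seq_len(n) pads each table to n rows
padded <- lapply(dfs, function(d) as.data.table(d)[seq_len(n)])
```

After this, every element of padded has n rows, and the shorter tables end in rows of NA. How the padded tables are then combined (side by side or stacked) depends on the result you are after.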

The "out of range indices" answer provided by jogo can be extended cleanly to in-place assignment using .N:

x <- c(1,2,3,4)
y <- c(8,9)
n <- max(length(x), length(y))
dt <- data.table(x = x[1:n], y = y[1:n])

z <- c(6,7)
dt[, z := z[1:.N]]
#    x  y  z
# 1: 1  8  6
# 2: 2  9  7
# 3: 3 NA NA
# 4: 4 NA NA
Henryhenryetta answered 15/6, 2019 at 17:19
