How to cbind or rbind vectors of different lengths without repeating the elements of the shorter vectors?
P

7

72
cbind(1:2, 1:10)  
     [,1] [,2]  
  [1,]    1    1  
  [2,]    2    2  
  [3,]    1    3  
  [4,]    2    4  
  [5,]    1    5  
  [6,]    2    6  
  [7,]    1    7  
  [8,]    2    8  
  [9,]    1    9  
 [10,]    2   10  

I want output like the following:

      [,1] [,2]
 [1,]    1    1
 [2,]    2    2
 [3,]         3
 [4,]         4
 [5,]         5
 [6,]         6
 [7,]         7
 [8,]         8
 [9,]         9
[10,]        10
Pentosan answered 13/9, 2010 at 10:5 Comment(2)
Yup, this is called recycling and is one of R's base concepts. What other behavior do you want? - Carliecarlile
This also seems to have been addressed over time: here, here and here - Selima
S
90

The trick is to make all your inputs the same length.

x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n                      
length(y) <- n

If you want your output to be a matrix, then cbind works, but you get additional NA values to pad out the rectangle.

cbind(x, y)
       x  y
 [1,]  1  1
 [2,]  2  2
 [3,] NA  3
 [4,] NA  4
 [5,] NA  5
 [6,] NA  6
 [7,] NA  7
 [8,] NA  8
 [9,] NA  9
[10,] NA 10
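
The same length-assignment trick carries over to rbind and to more than two vectors. A minimal sketch, assuming the inputs are plain vectors collected in a list; pad_all is a hypothetical helper name, not a base R function:

```r
# Hypothetical helper: pad every vector in a list with NA up to the longest length
pad_all <- function(vectors) {
  n <- max(lengths(vectors))
  lapply(vectors, function(v) {
    length(v) <- n  # extending the length appends NA
    v
  })
}

padded <- pad_all(list(x = 1:2, y = 1:10, z = 1:5))
by_row <- do.call(rbind, padded)  # one row per vector (3 x 10)
by_col <- do.call(cbind, padded)  # one column per vector (10 x 3)
```

Because every element of padded has the same length, neither rbind nor cbind has anything to recycle.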

To get rid of the NAs, the output must be a list.

Map(function(...) 
   {
      ans <- c(...)
      ans[!is.na(ans)]
   }, as.list(x), as.list(y)
)
[[1]]
[1] 1 1

[[2]]
[1] 2 2

[[3]]
[1] 3

[[4]]
[1] 4

[[5]]
[1] 5

[[6]]
[1] 6

[[7]]
[1] 7

[[8]]
[1] 8

[[9]]
[1] 9

[[10]]
[1] 10

EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.

Slub answered 13/9, 2010 at 10:28 Comment(6)
You could also do r[which(!is.na(r))] assuming that r is a row of the matrix. - Salish
length(x) <- n thanks, that was exactly what I was looking for - Lidless
If you are just looking to write the file, you can replace the NA with blank by doing x[is.na(x)]<-"" - Smart
for some reason when I do cbind(x,y) I get repetitions... how do you add NA instead? - Zavras
@Zavras If the lengths of all the inputs are the same then there is nothing to repeat. Did you change the lengths of the vectors like it says in my answer? - Slub
@RichieCotton No I didn't, thanks :) For some reason I just thought you were calculating lengths. - Zavras
M
29

I came across a similar problem and would like to suggest an additional solution that I hope some may find useful. It is fairly straightforward and uses the cbind.na function from the qpcR package.

Example

x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)

Results

> head(dta)
      x y
[1,]  1 1
[2,]  2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6

Side comments

Following the OP's original example, the column names can easily be removed:

colnames(dta) <- NULL

This produces the layout the OP asked for, with NA in place of the blanks:

> head(dta)
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]   NA    3
[4,]   NA    4
[5,]   NA    5
[6,]   NA    6
Milden answered 18/4, 2016 at 11:13 Comment(3)
This solution is particularly elegant when working with more than two data sets. Using do.call(qpcR:::cbind.na, ...) makes it easy to cbind a list of data.frames of arbitrary length. - Bamboo
cbind.na does not exist (anymore?) in the package qpcR - Mickens
@Mickens it is an internal function, hence the use of ::: instead of ::. It is worth noting that internal package functions should be called with caution. From ?`:::`: "It is typically a design mistake to use ::: in your code since the corresponding object has probably been kept internal for a good reason." - Cocker
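
Since cbind.na is internal and may not be present in every qpcR version, one cautious pattern is to check for it before calling it and fall back to base R otherwise. This is only a sketch; has_cbind_na is a name made up for illustration, and the fallback simply reuses the length-padding approach from the accepted answer:

```r
x <- 1:2
y <- 1:10

# Use the internal qpcR function only if the package is installed and the
# function is still present in its namespace (assumption: it may be absent).
has_cbind_na <- requireNamespace("qpcR", quietly = TRUE) &&
  exists("cbind.na", envir = asNamespace("qpcR"), inherits = FALSE)

if (has_cbind_na) {
  dta <- get("cbind.na", envir = asNamespace("qpcR"))(x, y)
} else {
  # Base R fallback: pad both vectors with NA to the same length
  n <- max(length(x), length(y))
  length(x) <- n
  length(y) <- n
  dta <- cbind(x, y)
}
```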
T
8

I would like to propose an alternative solution that uses the cbind.fill function from the rowr package.

> rowr::cbind.fill(1:2, 1:10, fill = NA)

   object object
1       1      1
2       2      2
3      NA      3
4      NA      4
5      NA      5
6      NA      6
7      NA      7
8      NA      8
9      NA      9
10     NA     10

Or alternatively, to match the OP's desired output:

> rowr::cbind.fill(1:2, 1:10, fill = '')

   object object
1       1      1
2       2      2
3              3
4              4
5              5
6              6
7              7
8              8
9              9
10            10
Trotter answered 8/11, 2019 at 22:14 Comment(1)
rowr was removed from CRAN - Cinnabar
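
With rowr unavailable, the fill behaviour for plain vectors can be approximated in base R. The sketch below is not the rowr implementation: cbind_fill is a hypothetical name, it only handles vectors, and it returns a matrix rather than a data frame.

```r
# Hypothetical base R stand-in for rowr::cbind.fill (vectors only)
cbind_fill <- function(..., fill = NA) {
  cols <- list(...)
  n <- max(lengths(cols))
  # Append the fill value to each vector until it reaches the longest length
  padded <- lapply(cols, function(v) c(v, rep(fill, n - length(v))))
  do.call(cbind, padded)
}

m <- cbind_fill(1:2, 1:10)             # integer matrix padded with NA
s <- cbind_fill(1:2, 1:10, fill = "")  # character matrix padded with ""
```

Note that filling with "" coerces the whole matrix to character, so this form is mainly useful for display or for writing to a file.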
L
4

Given that some of the solutions above rely on packages that are no longer available, here is a helper function that uses only tidyverse packages (dplyr, plus purrr for map() and map_int()).

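library(dplyr)  # bind_cols() and the pipe
library(purrr)  # map() and map_int()

bind_cols_fill <- function(df_list) {

  max_rows <- map_int(df_list, nrow) %>% max()

  map(df_list, function(df) {
    if (nrow(df) == max_rows) return(df)
    # Add the missing rows by filling the first column with NA;
    # add_row() sets any remaining columns to NA as well.
    first <- names(df)[1] %>% sym()
    df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
  }) %>% bind_cols()
}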

Note that this takes a list of data frames, so it is slightly cumbersome if you only want to combine two vectors:

x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y)))
Lilly answered 2/3, 2021 at 15:21 Comment(1)
After a long time searching, this was the only response that worked. RowR is not working anymore! - Piave
S
3

Helper function...

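bind.pad <- function(l, side = "r", len = max(sapply(l, length)))
{
  if (side %in% c("b", "r")) {
    # Pad at the end: extending the length appends NA
    out <- sapply(l, `length<-`, value = len)
  } else {
    # Pad at the start: reverse, extend, then reverse back
    out <- sapply(l, function(v) rev(`length<-`(rev(v), len)))
  }
  # "r" and "l" put each input in a row instead of a column
  if (side %in% c("r", "l")) out <- t(out)
  out
}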

Examples:

> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]    1   NA   NA   NA    1
[3,]    2    1   NA    1    2
[4,]    3    2    1    2    3

[[2]]
     [,1] [,2] [,3] [,4]
[1,]   NA    1    2    3
[2,]   NA   NA    1    2
[3,]   NA   NA   NA    1
[4,]   NA   NA    1    2
[5,]   NA    1    2    3

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    2    2   NA    2    2
[3,]    3   NA   NA   NA    3
[4,]   NA   NA   NA   NA   NA

[[4]]
     [,1] [,2] [,3] [,4]
[1,]    1    2    3   NA
[2,]    1    2   NA   NA
[3,]    1   NA   NA   NA
[4,]    1    2   NA   NA
[5,]    1    2    3   NA
Saith answered 8/4, 2016 at 20:52 Comment(0)
D
0

Another solution with no dependencies:

my_bind <- function(x, y) {
  # Pad the shorter vector with NA so both have the same length
  if (length(x) > length(y)) {
    y <- c(y, rep(NA, length(x) - length(y)))
  } else if (length(x) < length(y)) {
    x <- c(x, rep(NA, length(y) - length(x)))
  }
  cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])
Depend answered 9/4, 2021 at 11:30 Comment(0)
C
0

Using other people's ideas here and there, below is my own cbind.fill that:

  • outputs a data frame
  • works with vectors, data frames and matrices alike
  • keeps the classes of data frame variables
  • uses only base functions
  • gives you the option of giving custom names to the columns of the output data frame
  • makes me proud
cbind.fill = function(...,names=NA) {
  xlist = list(...)
  y= Reduce(
    function(a,b) {
      if(is.vector(a)) na = length(a)
      if(is.data.frame(a)|is.matrix(a)) na = nrow(a)
      if(is.vector(b)) nb = length(b)
      if(is.data.frame(b)|is.matrix(b)) nb = nrow(b)
      subset(
        merge(
          cbind(cbindfill.id = 1:na, a),
          cbind(cbindfill.id = 1:nb, b),
          all = TRUE,by="cbindfill.id"
        ),
        select = -cbindfill.id
      )}
    ,xlist)
  if(!is.na(names[1])) colnames(y) <- names
  return(y)
  }

Long story short, it creates the NAs using the merge function, and it works around merge's limit of two inputs by using the Reduce function.

Here is an example to test it:

x <- 1:2
y <- 1:5
z <- data.frame(my=letters[1:4],your=as.integer(5:8),his=as.factor(12:15))

> cbind.fill(x,y,z)
   a b   my your  his
1  1 1    a    5   12
2  2 2    b    6   13
3 NA 3    c    7   14
4 NA 4    d    8   15
5 NA 5 <NA>   NA <NA>

> str(cbind.fill(x,y,z))
'data.frame':   5 obs. of  5 variables:
 $ a   : int  1 2 NA NA NA
 $ b   : int  1 2 3 4 5
 $ my  : chr  "a" "b" "c" "d" ...
 $ your: int  5 6 7 8 NA
 $ his : Factor w/ 4 levels "12","13","14",..: 1 2 3 4 NA
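
To see where the NAs come from, the merge step can be looked at in isolation. This is a minimal sketch with two made-up data frames; the id column plays the role that cbindfill.id plays inside the function:

```r
# Two inputs of different length, each given a row-number key
a <- data.frame(id = 1:2, a = 1:2)
b <- data.frame(id = 1:5, b = 1:5)

# all = TRUE keeps ids that appear in only one input; the columns from the
# other input are filled with NA for those rows
merged <- merge(a, b, by = "id", all = TRUE)
merged$id <- NULL  # drop the helper key, as subset(select = ...) does above
```

Reduce then repeats this pairwise merge across all the inputs, carrying the result forward each time.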
Child answered 8/8, 2023 at 7:40 Comment(0)
