Add Columns to an empty data frame in R
I have searched extensively but not found an answer to this question on Stack Overflow.

Let's say I have a data frame a.

I define:

a <- NULL
a <- as.data.frame(a)

If I then try to add a column to this data frame like so:

a$col1 <- c(1,2,3)

I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "a", value = c(1, 2, 3)) : 
    replacement has 3 rows, data has 0

Why is the row dimension fixed but the column is not?

How do I change the number of rows in a data frame?

If I do this (inputting the data into a list first and then converting to a df), it works fine:

a <- NULL
a$col1 <- c(1,2,3)
a <- as.data.frame(a)
Meltage answered 31/10, 2014 at 22:4 Comment(0)
V
12

The row dimension is not fixed, but a data.frame is stored as a list of vectors that are constrained to have the same length. You cannot add col1 to a because col1 has three values (rows) and a has zero, which breaks that constraint. By default, R does not auto-vivify rows when you add a column that is longer than the data.frame. The second example works because col1 is the only vector in the list when it is converted, so the data.frame is initialized with three rows.
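
To see the constraint at work, here is a minimal base R sketch. The object names `empty`, `filled`, and `result` are illustrative and not from the question.

```r
# A data frame is a list of columns that must all have the same length.
empty <- data.frame()
stopifnot(is.list(empty), nrow(empty) == 0, ncol(empty) == 0)

# Assigning a 3-element column to a 0-row data frame violates the
# constraint, so R raises an error instead of growing the rows.
result <- tryCatch(
  {
    empty$col1 <- c(1, 2, 3)
    "ok"
  },
  error = function(e) "error"
)
print(result)

# Building the data frame from the column itself sets the row count to 3.
filled <- data.frame(col1 = c(1, 2, 3))
print(dim(filled))
```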

If you want to automatically have the data.frame expand, you can use the following function:

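cbind.all <- function(...) {
    # Coerce every argument (vector, matrix, or data.frame) to a matrix
    nm <- lapply(list(...), as.matrix)
    # The target row count is that of the tallest input
    n <- max(sapply(nm, nrow))
    # Pad each input with NA rows up to n, then bind the columns
    do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}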

This pads the shorter inputs with NA. You would call it as cbind.all(df, a). Note that the result is a matrix, so wrap it in as.data.frame() if you need a data.frame.
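
As a possible alternative to going through matrices, the same padding idea can be sketched with base R's `length<-`, which extends a vector with NA. The helper name `pad_to_data_frame` is hypothetical and not part of the answer above.

```r
# Hypothetical helper: pad every vector to the longest length, then combine.
pad_to_data_frame <- function(cols) {
  n <- max(lengths(cols))
  padded <- lapply(cols, function(x) {
    length(x) <- n  # extending a vector fills the new slots with NA
    x
  })
  as.data.frame(padded)
}

df <- pad_to_data_frame(list(col1 = c(1, 2, 3), col2 = c(10, 20)))
print(df)
```

Unlike the matrix-based version, this keeps each column's own type, since the columns are never coerced to a common matrix type.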

Valgus answered 1/11, 2014 at 0:14 Comment(2)
I guess this is close to what I have been doing already. I thought there would be a smarter solution than this. I do realize that I was using a vector as opposed to adding it to a data frame. Also, I probably didn't describe what I meant well when I used the word "fixed": the row dimension cannot be changed as easily as the column dimension. Meltage
Also, it is the same answer as this: #7962767 Meltage
Q
4

If you have an empty data frame, called df for example, another quite simple solution in my opinion is the following:

df[1, ] <- NA                 # add a temporary row of NA values
df[, 'new_column'] <- NA      # add the new column, here called 'new_column'
df <- df[0, , drop = FALSE]   # drop the temporary row; drop = FALSE keeps a one-column result a data frame

I hope this may help.
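
If the goal is only to give an empty data frame a column, a shorter route that should also work is to assign a zero-length vector, which matches the zero rows already present. This is a sketch under that assumption; the column type `character` is an arbitrary choice for illustration.

```r
# Start from a data frame with no rows and no columns.
df <- data.frame()

# A zero-length vector has as many elements as df has rows (zero),
# so the equal-length constraint is satisfied.
df$new_column <- character(0)

print(dim(df))
print(names(df))
```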

Quicken answered 6/12, 2022 at 14:17 Comment(0)
C
2

You could also do something like this, where I read data from multiple files, grab the column I want from each, and store it in the data frame. I check whether the data frame is still empty and, if so, start it from the new column instead of binding, which avoids the error about a mismatched number of rows:

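readCounts <- data.frame()

# `files` is assumed to be a named character vector of file paths
for (f in names(files)) {
    d <- read.table(files[[f]], header = TRUE, as.is = TRUE)
    d2 <- round(data.frame(d$NumReads))
    colnames(d2) <- f
    if (ncol(readCounts) == 0) {
        # First file: start the data frame from this column
        readCounts <- d2
        rownames(readCounts) <- d$Name
    } else {
        readCounts <- cbind(readCounts, d2)
    }
}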
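
A variation that avoids the emptiness check is to collect the columns in a list first and convert once at the end, much like the working example in the question. The sketch below uses small in-memory tables (`tables`, with made-up values) in place of the files, and assumes every table lists the same names in the same order.

```r
# Simulated inputs standing in for the per-file tables (hypothetical data).
tables <- list(
  sampleA = data.frame(Name = c("g1", "g2"), NumReads = c(1.4, 2.6)),
  sampleB = data.frame(Name = c("g1", "g2"), NumReads = c(3.2, 4.9))
)

# Pull the rounded column from each table; list names become column names.
cols <- lapply(tables, function(d) round(d$NumReads))

# Convert once, so the row count is set by the columns themselves.
readCounts <- as.data.frame(cols)
rownames(readCounts) <- tables[[1]]$Name
print(readCounts)
```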
Clancy answered 28/11, 2017 at 23:30 Comment(0)
