Convert data.frame column to a vector?
J

12

222

I have a dataframe such as:

a1 = c(1, 2, 3, 4, 5)
a2 = c(6, 7, 8, 9, 10)
a3 = c(11, 12, 13, 14, 15)
aframe = data.frame(a1, a2, a3)

I tried the following to convert one of the columns to a vector, but it doesn't work:

avector <- as.vector(aframe['a2'])
class(avector) 
[1] "data.frame"

This is the only solution I could come up with, but I'm assuming there has to be a better way to do this:

class(aframe['a2']) 
[1] "data.frame"
avector = c()
for(atmp in aframe['a2']) { avector <- atmp }
class(avector)
[1] "numeric"

Note: My vocabulary above may be off, so please correct me if so. I'm still learning the world of R. Additionally, any explanation of what's going on here is appreciated (i.e. relating to Python or some other language would help!)

Jules answered 15/8, 2011 at 20:8 Comment(1)
As you're seeing in the answers, a close reading of ?'[.data.frame' will take you very far. - Elene
E
266

I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.

A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2] which returns a vector, not a sublist.

So try running this sequence and maybe things will be clearer:

avector <- as.vector(aframe['a2'])
class(avector) 

avector <- aframe[['a2']]
class(avector)

avector <- aframe[,2]
class(avector)
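
To make the "a data frame is a list" point concrete, here is a small sketch built on a frame like the one in the question. The names sub_frame, by_name and by_position are only illustrative. What as.vector() returns for a data frame appears to depend on the R version (older versions hand the data frame back unchanged, newer ones may return a plain list), so the sketch only checks that the result is not an atomic vector.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

is.list(aframe)                  # TRUE: each column is one element of a list

sub_frame   <- aframe["a2"]      # single bracket: still a one-column data frame
by_name     <- aframe[["a2"]]    # double bracket: the column itself
by_position <- aframe[, 2]       # matrix-style indexing: also the column itself

is.data.frame(sub_frame)         # TRUE
is.atomic(by_name)               # TRUE
identical(by_name, by_position)  # TRUE

# as.vector() does not reach inside the list, so the result is still not atomic
is.atomic(as.vector(sub_frame))  # FALSE
```

This is also a plausible answer to the question of why as.vector() seems to do nothing: a list already counts as a vector in R, so there is nothing for it to convert.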
Elene answered 15/8, 2011 at 20:19 Comment(5)
+1 This is useful. I had gotten used to using aframe[,"a2"] because of the ability to use this with both data frames and matrices & seem to get the same results - a vector. - Cybil
[..., drop = F] will always return a data frame. - Polysemy
This is particularly good to know because the df$x syntax returns a vector. I used this syntax for a long time, but when I had to start using df['name'] or df[n] to retrieve columns, I hit problems when I tried to send them to functions that expected vectors. Using df[[n]] or df[['x']] cleared things right up. - Tonita
Why does as.vector seem to silently have no effect? Shouldn't this either return a vector or conspicuously fail? - Scrooge
aframe[['a2']] is very useful with sf objects because aframe[,"a2"] will return two columns because the geometry column is included. - Catalase
G
83

There's now an easy way to do this using dplyr.

dplyr::pull(aframe, a2)
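
A slightly fuller sketch of pull(), assuming the dplyr package is installed (the guard simply skips the example if it is not). It shows selecting the column by unquoted name, by position, and by a negative position counted from the right; the variable names are illustrative.

```r
# Assumes the dplyr package is installed; the guard skips the example otherwise.
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

if (requireNamespace("dplyr", quietly = TRUE)) {
  by_name     <- dplyr::pull(aframe, a2)  # unquoted column name
  by_position <- dplyr::pull(aframe, 2)   # position counted from the left
  last_column <- dplyr::pull(aframe, -1)  # negative positions count from the right

  print(class(by_name))
  print(identical(by_name, by_position))
}
```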
Gradus answered 8/1, 2018 at 17:21 Comment(0)
J
38

You could use $ extraction:

class(aframe$a1)
[1] "numeric"

or the double square bracket:

class(aframe[["a1"]])
[1] "numeric"
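
One difference between the two forms that may matter: on a data frame, $ accepts an unambiguous prefix of a column name, while [[ requires the exact name by default. A minimal sketch, using a made-up data frame called scores:

```r
# Hypothetical data frame with a long column name, for illustration only.
scores <- data.frame(total_points = c(10, 20, 30))

scores$total              # partial match: returns the total_points column
scores[["total"]]         # exact match required: returns NULL
scores[["total_points"]]  # returns the column
```

If silent partial matching is a concern, [[ is the more predictable choice.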
Jeanajeanbaptiste answered 15/8, 2011 at 20:20 Comment(0)
I
22

You do not need as.vector(), but you do need correct indexing: avector <- aframe[ , "a2"]

The one other thing to be aware of is the drop=FALSE option to [:

R> aframe <- data.frame(a1=1:5, a2=6:10, a3=11:15)
R> aframe
  a1 a2 a3
1  1  6 11
2  2  7 12
3  3  8 13
4  4  9 14
5  5 10 15
R> avector <- aframe[, "a2"]
R> avector
[1]  6  7  8  9 10
R> avector <- aframe[, "a2", drop=FALSE]
R> avector
  a2
1  6
2  7
3  8
4  9
5 10
R> 
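
For the case where the number of selected columns is not known in advance, a sketch like the following keeps the result a data frame either way. select_columns() is a hypothetical helper name, not a base R function.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

# select_columns() is a hypothetical helper, not part of base R.
# drop = FALSE keeps the result a data frame even when one column is chosen.
select_columns <- function(data, columns) {
  data[, columns, drop = FALSE]
}

one  <- select_columns(aframe, "a2")
many <- select_columns(aframe, c("a1", "a3"))

is.data.frame(one)   # TRUE
ncol(one)            # 1
is.data.frame(many)  # TRUE

# Without drop = FALSE the single-column case collapses to a vector
is.data.frame(aframe[, "a2"])  # FALSE
```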
Inge answered 15/8, 2011 at 20:19 Comment(2)
+1: The reminder of drop=FALSE is useful - this helps me in cases where I may select N columns from a data.frame, in those cases where N=1. - Cybil
I use this when I can't foresee the number of columns selected and in case one column comes up, the result still gets passed as a data.frame with n columns. A vector may throw a monkey wrench into the functions down the line. - Hettiehetty
A
19

You can try something like this:

as.vector(unlist(aframe$a2))
Atmosphere answered 5/10, 2018 at 3:48 Comment(2)
This is good if you want to compare two columns using identical. - Uprear
This is also helpful if you don't know the column name ahead of time, i.e. as.vector(unlist(aframe[,1])). - Wriest
A
14

Another advantage of using the '[[' operator is that it works with both data.frame and data.table. So if a function has to work with both data.frame and data.table, and you want to extract a column from the input as a vector, then

data[["column_name"]] 

is best.
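
A minimal sketch of such a function. column_as_vector() is a hypothetical name, and only the data.frame case is exercised here; the data.table behaviour is the claim made above and is not tested in this example.

```r
# column_as_vector() is a hypothetical helper name.
# `column` can be a name held in a variable, so nothing is hardcoded.
column_as_vector <- function(data, column) {
  stopifnot(column %in% names(data))
  data[[column]]
}

aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)
wanted <- "a2"
column_as_vector(aframe, wanted)
```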

Adora answered 14/9, 2016 at 7:26 Comment(1)
Simple! Thank you! - Stony
P
11
as.vector(unlist(aframe['a2']))
Perorate answered 19/10, 2019 at 12:57 Comment(0)
B
6
a1 = c(1, 2, 3, 4, 5)
a2 = c(6, 7, 8, 9, 10)
a3 = c(11, 12, 13, 14, 15)
aframe = data.frame(a1, a2, a3)
avector <- as.vector(aframe['a2'])

avector <- unlist(avector)
# this returns a named vector of class "numeric" (typeof "double"), not "integer"
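
A short sketch of what unlist() produces here, assuming the same aframe as above. The point is that unlist() on its own attaches element names, which use.names = FALSE (or a wrapping as.vector(), as in other answers) removes. The names named and plain are illustrative.

```r
aframe <- data.frame(a1 = c(1, 2, 3, 4, 5),
                     a2 = c(6, 7, 8, 9, 10),
                     a3 = c(11, 12, 13, 14, 15))

named <- unlist(aframe["a2"])
names(named)    # "a21" "a22" "a23" "a24" "a25"
typeof(named)   # "double", because a2 was built from c(6, 7, ...)

plain <- unlist(aframe["a2"], use.names = FALSE)
names(plain)    # NULL
identical(plain, aframe[["a2"]])  # TRUE
```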
Bield answered 2/7, 2017 at 13:52 Comment(0)
D
5

If you just use the extract operator in its matrix-style form, it will work. When a single column is selected this way, [ defaults to drop=TRUE, which is what you want here. See ?'[.data.frame' for more details.

> a1 = c(1, 2, 3, 4, 5)
> a2 = c(6, 7, 8, 9, 10)
> a3 = c(11, 12, 13, 14, 15)
> aframe = data.frame(a1, a2, a3)
> aframe[,'a2']
[1]  6  7  8  9 10
> class(aframe[,'a2'])
[1] "numeric"
Dismount answered 15/8, 2011 at 20:20 Comment(0)
G
2

I use lists to filter dataframes by whether or not they have a value %in% a list.

I had been manually creating lists by exporting a 1 column dataframe to Excel where I would add " ", around each element, before pasting into R: list <- c("el1", "el2", ...) which was usually followed by FilteredData <- subset(Data, Column %in% list).

After searching stackoverflow and not finding an intuitive way to convert a 1 column dataframe into a list, I am now posting my first ever stackoverflow contribution:

# assuming you have a 1 column dataframe called "df"
list <- c()
for(i in 1:nrow(df)){
  list <- append(list, df[i,1])
}
View(list)
# The result is not a dataframe; it is an atomic vector of values (not an R list, despite the name)
# You can filter a dataframe using subset([Data], [Column] %in% list)
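
The same result can probably be had without a loop, since [[ returns the whole column at once. A minimal sketch, using made-up data (df, Data and keep are illustrative names, and the values are invented):

```r
# Hypothetical one-column data frame and data to filter, for illustration only.
df   <- data.frame(id = c("el1", "el3"), stringsAsFactors = FALSE)
Data <- data.frame(Column = c("el1", "el2", "el3"),
                   Value  = c(10, 20, 30),
                   stringsAsFactors = FALSE)

keep <- df[[1]]   # the whole column at once, no loop needed
is.vector(keep)   # TRUE: an atomic character vector, not an R list

FilteredData <- subset(Data, Column %in% keep)
FilteredData$Value
```

Naming the result keep rather than list also avoids masking the base function list().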
Gamopetalous answered 3/12, 2018 at 18:17 Comment(0)
J
2

We can also convert data.frame columns generically to a simple vector. as.vector is not enough, because a one-column selection such as aframe[2] is still a data frame (a list), so we also have to pull out its first (and only) element:

df_column_object <- aframe[2]
simple_column <- df_column_object[[1]]

All the solutions suggested so far require hardcoding column titles. This makes them non-generic (imagine applying this to function arguments).

Alternatively, you could of course read the column names from the data frame first and then insert them into the code from the other solutions.
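
A sketch of that alternative: the column names are read with names() and passed to [[ as a variable, so nothing is typed in by hand. The loop body (printing a sum) is only a placeholder for whatever needs the vector.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

# Read the column names from the data frame instead of typing them in
for (column_name in names(aframe)) {
  column <- aframe[[column_name]]
  cat(column_name, ": sum =", sum(column), "\n")
}

# The same idea by position, e.g. the last column whatever it is called
last_column <- aframe[[ncol(aframe)]]
```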

Jedjedd answered 10/4, 2020 at 2:15 Comment(0)
S
0

Another option is to use as.matrix with as.vector. This works for one column, and also if you want to convert all columns to one vector. Here is a reproducible example that first converts one column to a vector and then converts the complete dataframe to one vector:

a1 = c(1, 2, 3, 4, 5)
a2 = c(6, 7, 8, 9, 10)
a3 = c(11, 12, 13, 14, 15)
aframe = data.frame(a1, a2, a3)

# Convert one column to vector
avector <- as.vector(as.matrix(aframe[,"a2"]))
class(avector)
#> [1] "numeric"
avector
#> [1]  6  7  8  9 10

# Convert all columns to one vector
avector <- as.vector(as.matrix(aframe))
class(avector)
#> [1] "numeric"
avector
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
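
One caveat worth checking before relying on this: a matrix holds a single type, so a data frame with mixed column types is converted to character on the way through as.matrix. A minimal sketch with a made-up data frame called mixed:

```r
# Hypothetical data frame mixing numeric and character columns.
mixed <- data.frame(id = c(1, 2), label = c("x", "y"), stringsAsFactors = FALSE)

flattened <- as.vector(as.matrix(mixed))
class(flattened)      # "character": the numbers were converted to text

# Extracting one column directly keeps its original type
class(mixed[["id"]])  # "numeric"
```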

Created on 2022-08-27 with reprex v2.0.2

Sika answered 27/8, 2022 at 9:15 Comment(0)
