Levels function returning NULL
R

3

8

I'm hoping this is an easy fix. Whenever I run levels(df), I get NULL as output. This isn't specific to my data frame; it happens with any data set I use. I suspect there may be an issue with one of my packages. Has anyone run into this or know of a fix? Thanks

Rapper answered 7/2, 2018 at 1:11 Comment(2)
levels tells you the levels of a factor. It does not directly apply to data frames. - Tangier
levels looks at the levels attribute of an object. A data.frame doesn't necessarily have one, hence the NULL output. - Island
R
8

levels() is only meaningful on a factor vector, not on a whole data frame.

Example below:

> df <- data.frame(a = factor(c('a','b','c'), levels = c('a','b','c','d','e')),
+                  b = factor(c('a','b','c')), 
+                  c = factor(c('a','a','c')))
> levels(df)
NULL

To see the levels of every column in your data frame, you can use lapply:

> lapply(df, levels)
$a
[1] "a" "b" "c" "d" "e"

$b
[1] "a" "b" "c"

$c
[1] "a" "c"

If you want the levels of a specific column, you can specify that instead:

> levels(df[, 2])
[1] "a" "b" "c"
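
If you are not sure which columns are factors, a quick check before calling levels() can help. The sketch below rebuilds the df from the example above and adds a hypothetical numeric column `n` (not part of the original example) to show what a non-factor column returns.

```r
# Same df as in the example above, plus a hypothetical numeric column `n`
df <- data.frame(a = factor(c('a','b','c'), levels = c('a','b','c','d','e')),
                 b = factor(c('a','b','c')),
                 c = factor(c('a','a','c')),
                 n = c(1.5, 2.5, 3.5))

# Which columns are factors?
is_fac <- sapply(df, is.factor)
print(is_fac)

# Levels of a single column; these three forms select the same vector
levels(df$b)
levels(df[["b"]])
levels(df[, "b"])

# A non-factor column has no levels attribute, so levels() returns NULL
levels(df$n)

# Restrict lapply() to the factor columns only
fac_levels <- lapply(df[is_fac], levels)
print(fac_levels)
```

Selecting the factor columns first keeps NULL entries for numeric or character columns out of the result.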

EDIT: To answer the question below on why apply(df, 2, levels) returns NULL.

Note the following from the documentation for apply():

In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array.

You can see this behavior when you check the class of each column inside apply(), and when you try a few other functions:

> apply(df, 2, levels)
NULL
> apply(df, 2, class)
          a           b           c 
"character" "character" "character" 
> apply(df, 2, function(i) levels(i))
NULL
> apply(df, 2, function(i) levels(factor(i)))
$`a`
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c
[1] "a" "c"

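Note that even though we can force apply() to treat the columns as factors, we lose the prior ordering/levels that were set for df when it was originally created (see column `a`, which no longer has the levels "d" and "e"). This is because apply() converts the data frame to a character matrix before calling the function, so each column arrives as a plain character vector.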
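
The coercion can be seen directly, since apply() turns a data frame into a matrix with as.matrix() before looping over it. A minimal sketch, using the same df as above:

```r
df <- data.frame(a = factor(c('a','b','c'), levels = c('a','b','c','d','e')),
                 b = factor(c('a','b','c')),
                 c = factor(c('a','a','c')))

# For a data frame of factors, as.matrix() gives a character matrix
m <- as.matrix(df)
class(m[, "a"])     # "character"
levels(m[, "a"])    # NULL, the factor attributes are gone

# lapply() hands each column over unchanged, so the original levels survive
via_lapply <- lapply(df, levels)

# Rebuilding the factor from characters only recovers the values actually present
via_apply <- apply(df, 2, function(i) levels(factor(i)))

via_lapply$a   # includes the unused levels "d" and "e"
via_apply$a    # only "a" "b" "c"
```

For per-column work on a data frame, lapply() or sapply() is usually the safer choice because the columns keep their original classes.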

Relentless answered 7/2, 2018 at 1:16 Comment(1)
Any idea why apply(df,2,levels) also returns NULL? - Dingo
M
18

When creating the data frame, pass stringsAsFactors = TRUE:

e.g. dataFrame <- read.csv(file.choose(), stringsAsFactors = TRUE)

This makes R treat string columns as factors. Hope it helps.
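
As a sketch of the difference this argument makes, the example below reads a small inline CSV (hypothetical data, passed through the `text` argument instead of a file) with and without stringsAsFactors = TRUE. Since R 4.0.0 the default is stringsAsFactors = FALSE, so string columns arrive as character vectors unless you ask for factors.

```r
# Hypothetical CSV content, used in place of a file on disk
csv_text <- "name,group
x,ctrl
y,trt
z,ctrl"

chr_df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
fac_df <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(chr_df$group)   # "character"
levels(chr_df$group)  # NULL

class(fac_df$group)   # "factor"
levels(fac_df$group)  # "ctrl" "trt"
```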

Mornay answered 1/7, 2020 at 6:22 Comment(1)
This fixed it for me. - Stockman
C
0

Depending on how the data frame was read, its columns may not have been stored as factors, which is what levels() expects. A simple fix is to convert the column first:

levels(as.factor(df$COLUMN_of_choice))
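
If several columns were read as character, the same conversion can be applied to all of them at once. A minimal sketch, assuming a hypothetical data frame with columns `id`, `colour` and `size` (none of these names come from the question):

```r
# Hypothetical data; string columns are deliberately kept as character
df <- data.frame(id = 1:4,
                 colour = c("red", "blue", "red", "green"),
                 size = c("S", "M", "M", "L"),
                 stringsAsFactors = FALSE)

levels(df$colour)              # NULL: a character column
levels(as.factor(df$colour))   # "blue" "green" "red"

# Convert every character column to a factor, leaving other columns untouched
chr_cols <- sapply(df, is.character)
df[chr_cols] <- lapply(df[chr_cols], factor)

# Set an explicit level order where alphabetical order is not meaningful
df$size <- factor(df$size, levels = c("S", "M", "L"))
levels(df$size)
```

Note that as.factor() on its own does not change df; the result has to be assigned back to the column for later calls to levels(df$colour) to work.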
Countrified answered 7/2 at 11:37 Comment(0)
