R cor.test : "not enough finite observations"

Asked 9/7, 2014 at 16:33 Answered 7/4, 2023 at 18:22

I'm currently trying to write an R function that computes the cor.test correlation of a specified column with all the numeric columns of a data frame. Here's my code:

#function returning only numeric columns
only_num <- function(dataframe)
{
  nums <- sapply(dataframe, is.numeric)
  dataframe[ , nums]
}

#function returning a one-variable function computing the cor.test correlation of the variable
#with the specified column

function_generator <- function(column)
  {
    function(x)
    {
      cor.test(x, column, na.action = na.omit)
    } 
  }

data_analysis <- function(dataframe, column)
  {
  DF <- only_num(dataframe)

  fonction_corr <- function_generator(column)

  sapply(DF, fonction_corr)

  }

data_analysis(40, 6, m, DF$Morphine)

When I call "data_analysis" on the last line, I get the following error:

"Error in cor.test.default(x, column, na.action=na.omit) : not enough finite observations"

What could it mean? What should I change? I'm kind of stuck...

Thanks.

Clément

Behl answered 9/7, 2014 at 16:33 Comment(0)

"Not enough finite observations" is an error raised by cor.test under certain circumstances. If you take a look at the cor.test.default source code, you'll see:

OK <- complete.cases(x, y)
x <- x[OK]
y <- y[OK]
n <- length(x)

cor.test removes NA values from your vectors, keeping only the pairs where both x and y are present [...]

if (method == "pearson") {
    if (n < 3L) 
        stop("not enough finite observations")

[...]

else {
    if (n < 2)
        stop("not enough finite observations")

If your vectors do not contain enough complete (non-NA) pairs, that is fewer than 3 for Pearson or fewer than 2 for the other methods, the function stops with this error.
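To see the check in action, here is a minimal sketch with made-up vectors (the values of `x` and `y` are purely illustrative, not taken from the question). Only two pairs have both values present, so the default Pearson test should refuse to run:

```r
# Hypothetical vectors: only the first and last pairs are complete.
x <- c(1.2, NA, 3.4, NA, 5.6)
y <- c(2.1, 0.5, NA, NA, 4.3)

# Number of complete (non-NA) pairs, the same count cor.test relies on.
n_complete <- sum(complete.cases(x, y))

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    cor.test(x, y)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(n_complete)
print(msg)
```

Counting complete pairs with complete.cases before calling cor.test is a cheap way to find out which columns will fail.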

Make sure every column in your data frame shares enough non-NA values with the reference column before you call cor.test on it.
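One possible way to do that, as a sketch only: the helper name `enough_pairs`, its `min_pairs` argument and the example data are all assumptions made for illustration, not part of the original code.

```r
# Hypothetical helper: keep only the numeric columns that share at least
# `min_pairs` complete (non-NA) pairs with the reference vector `column`.
enough_pairs <- function(dataframe, column, min_pairs = 3) {
  is_num <- vapply(dataframe, is.numeric, logical(1))
  nums <- dataframe[, is_num, drop = FALSE]
  n_pairs <- vapply(nums, function(x) sum(complete.cases(x, column)), integer(1))
  nums[, n_pairs >= min_pairs, drop = FALSE]
}

# Made-up data: column b has only two usable values.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(NA, NA, 1, NA, 2),
  ref = c(2.0, 1.5, 3.1, 4.8, 5.2)
)

# Column b is dropped, so cor.test only sees columns it can handle.
usable <- enough_pairs(df, df$ref)
results <- lapply(usable, function(x) cor.test(x, df$ref))
print(names(results))
```

The threshold of 3 matches the Pearson branch shown above; for the other methods, 2 would be enough.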

I hope this will be useful.

Behl answered 10/7, 2014 at 13:28 Comment(0)

I ran into this problem when I added a color argument to the aes call within ggplot, e.g.:

ggplot(df, aes(x=x_var, y=y_var, color = my_group))

I believe what happened is that stat_cor then computes the correlation on each group separately, and at least one group had too few points. If you just want the color without changing the correlation, add it to the point layer instead, e.g.:

geom_point(aes(color = my_group))
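If the correlation really is computed per group, as I suspect, a quick base-R check can show which groups are too small. This is only a sketch with made-up data; the column names follow the ggplot call above and the values are invented:

```r
# Made-up data: group "c" has only two points.
df <- data.frame(
  x_var = c(1, 2, 3, 4, 5, 6, 7, 8),
  y_var = c(2, 1, 4, 3, 6, 5, 8, 7),
  my_group = c("a", "a", "a", "b", "b", "b", "c", "c")
)

# Count complete (x, y) pairs within each group.
ok <- complete.cases(df$x_var, df$y_var)
pairs_per_group <- tapply(ok, df$my_group, sum)

# Groups with fewer than 3 pairs are the ones a per-group Pearson test rejects.
too_small <- names(pairs_per_group)[pairs_per_group < 3]
print(too_small)
```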

Hiawatha answered 7/4, 2023 at 18:22 Comment(0)

I can't see what 'm' or 'DF$Morphine' are, so I created a data frame with numeric and non-numeric columns.

# generate some data
set.seed(321)
mydf <- data.frame(A = rnorm(100), 
                   B = rexp(100, 1), 
                   C = runif(100), 
                   D = sample(letters, size=100, replace=TRUE))

I kept your functions as written, but called data_analysis differently. It expects a data frame as the first argument and a numeric vector as the second:

data_analysis(dataframe=mydf, column=mydf$C)

When I run this, I get cor.test outputs for each column in the data frame.

            A                                      B                                     
statistic   -0.4153108                             -0.4669693                            
parameter   98                                     98                                    
p.value     0.6788223                              0.6415584                             
estimate    -0.04191585                            -0.04711863                           
null.value  0                                      0                                     
alternative "two.sided"                            "two.sided"                           
method      "Pearson's product-moment correlation" "Pearson's product-moment correlation"
data.name   "x and column"                         "x and column"                        
conf.int    Numeric,2                              Numeric,2                             
            C                                     
statistic   Inf                                   
parameter   98                                    
p.value     0                                     
estimate    1                                     
null.value  0                                     
alternative "two.sided"                           
method      "Pearson's product-moment correlation"
data.name   "x and column"                        
conf.int    Numeric,2

Kava answered 9/7, 2014 at 18:25 Comment(1)

I've realised it works on a "classic" data set. However, my personal data frame contains A LOT of NA values... And that's the reason why cor.test won't work: it returns an error and stops when too many values are missing. – Mouthful 10/7, 2014 at 12:8
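For a data frame with many NA values, one option is to let the loop continue and record NA for the columns that fail. The following is a sketch under that assumption; `safe_cor` is a hypothetical name and the data are invented:

```r
# Sketch of a tolerant variant: return NA instead of stopping when
# cor.test fails, for example because of too few complete pairs.
safe_cor <- function(x, column) {
  tryCatch(
    cor.test(x, column)$estimate[["cor"]],
    error = function(e) NA_real_
  )
}

# Made-up data: column b has only two usable values.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(NA, NA, 1, NA, 2),
  ref = c(2.0, 1.5, 3.1, 4.8, 5.2)
)

# One correlation estimate per column; b comes back as NA.
estimates <- vapply(df[c("a", "b")], safe_cor, numeric(1), column = df$ref)
print(estimates)
```

Note that tryCatch here swallows every error, not just this one, so it is worth checking the NA results by hand.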
