Accumulating tailored ggpairs() plot objects into a list object

I am trying to create a list object that contains GGally plots. These plots are each created with two datasets, the main dataset and a subset of the main dataset to be plotted again in orange. In the MWE below, three plots are created, each comparing two columns from the mtcars data and each containing a different number of subset points to be plotted in orange:

Plot_1: mpg and cyl, 1 orange overlaid point

Plot_2: mpg and disp, 10 orange overlaid points

Plot_3: mpg and hp, 20 orange overlaid points

library(GGally)
library(ggplot2)

data = mtcars
data$ID = rownames(mtcars)
data = data[, c(12,1:11)]

my_fn <- function(data, mapping, ...){
  xChar = as.character(mapping$x)
  yChar = as.character(mapping$y)
  x = data[,c(xChar)]
  y = data[,c(yChar)]
  p <- ggplot(data, aes(x=x, y=y)) + geom_point() + geom_point(data = colorData, aes_string(x=xChar, y=yChar), inherit.aes = FALSE, color = "orange")
  p
}

ret=list()
colorVec = c(1, 10, 20)
k=1
for (j in c(3:5)){
  datSel <- cbind(ID=data$ID, data[,c(2, j)])
  datSel$ID = as.character(datSel$ID)
  colorData <- datSel[sample(1:nrow(data), colorVec[k]),]
  p <- ggpairs(datSel[,-1], lower = list(continuous = my_fn), upper = list(continuous = wrap("cor", size = 4))) + theme_gray()
  ret[[paste0("Plot_",k)]] <- p
  k=k+1
}

However, when I run this code and build the ret list, only the last plot in the list renders. The first two fail because one of the columns in the data cannot be found:

> ret[["Plot_1"]]
Error in FUN(X[[i]], ...) : object 'cyl' not found

> ret[["Plot_2"]]
Error in FUN(X[[i]], ...) : object 'disp' not found

> ret[["Plot_3"]]
Correctly plotted

What might be a painless way to fix this problem? Thank you in advance for sharing advice.

EDIT:

Adding session info for reproducibility:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.2.1 GGally_1.3.2 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15       reshape_0.8.7      grid_3.4.3         plyr_1.8.4         gtable_0.2.0      
 [6] magrittr_1.5       scales_0.5.0       pillar_1.2.1       stringi_1.1.6      rlang_0.2.0       
[11] reshape2_1.4.3     lazyeval_0.2.1     labeling_0.3       RColorBrewer_1.1-2 tools_3.4.3       
[16] stringr_1.3.0      munsell_0.4.3      yaml_2.1.17        compiler_3.4.3     colorspace_1.3-2  
[21] tibble_1.4.2
Olecranon answered 24/3, 2018 at 22:26 Comment(9)
What happens if you use ret[[j]] <- p in your code instead of what you currently have? Then try ret[[1]], ret[[2]], etc., to access the first, second, etc., plot. You could assign names to your ret object with the names() command after you finish storing all the plots in it. - Siple
Thanks @IsabellaGhement. This approach seems to lead to the same results when I do ret[[3]], ret[[4]], and ret[[5]] (as j is between 3 and 5). - Olecranon
I think that is because your j runs from 3 to 5. So maybe declare your ret as ret <- vector("list", 3) before you enter the for loop and then, inside the loop, do something like this: ret[[j - 2]] <- p. Then ret[[1]] should store the plot for j = 3, ret[[2]] should store the plot for j = 4, etc. - Siple
@IsabellaGhement Thank you, but this also leads to the same problem: when I do ret[[1]], I get an error that object 'cyl' is not found; when I do ret[[2]], I get an error that object 'disp' is not found; and when I do ret[[3]], the plot is output successfully. - Olecranon
Your code is not reproducible. When I ran it, I got this error: Error in make_ggmatrix_plot_obj(wrapp(sub_type, funcArgName = sub_type_name), : variables: "x" have non standard format: "~mpg". Please rename the columns or make a new column. - Subduct
Thanks for letting me know, @Tung. I just closed R and reran it, and it seems to work for me. Do you think it may be due to different R and package versions? I added my session info as an edit to my original post in case this causes the differences. - Olecranon
Not working for me after restarting the R session. I'm using the latest development version, ggplot2_2.2.1.9000, with GGally_1.3.2 on Microsoft R Open 3.4.3. - Subduct
You seem to have problems in the plot creation, not plot storage. You may want to use print(p) right before storing the plot and also use options(warn=2) before running your code to force R to stop if it encounters a warning. - Siple
@Tung, thank you for letting me know. I wonder if it is because we are using different ggplot2 versions or different operating systems (I am using Mac). @IsabellaGhement, the results were still the same. I wonder if you are also getting the same error message as Tung? Thanks again. - Olecranon

A possible solution, if I understood your question correctly:

library(GGally)
data = mtcars
data$ID = rownames(mtcars)
data = data[, c(12,1:11)]

# Load tidyverse
library(tidyverse)

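# Number of orange points per plot (same colorVec as in the question)
colorVec <- c(1, 10, 20)

# Create a data frame with one row per plot: the variable and its number of orange points
var_list <- data.frame(var = names(data)[3:5], 
                       color = colorVec)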

# Function for sampling orange points
my_color_fn <- function(data, color_nb) {
  sample(1:nrow(data), color_nb)
}

# Create a list with one data frame per variable, including a color column
# (apply() passes each row as a character vector, hence as.integer())
data_list <- apply(var_list, 1, 
                   function(x) 
                     data %>% 
                      select(ID, mpg, as.character(x[["var"]])) %>% 
                      mutate(color = "black") %>% 
                      mutate(color = replace(color, my_color_fn(., as.integer(x[["color"]])), "orange")))

# Update my_fn function
my_fn <- function(data, mapping, ...){
  # Reuse the x/y mapping supplied by ggpairs() instead of extracting column vectors
  p <- ggplot(data, mapping) + 
    geom_point(aes(color = color)) + 
    scale_color_manual("", values = c("black" = "black",
                                      "orange" = "orange"))
  p
}

# Create a function to get ggpairs for each subset
my_fn2 <- function(data)
{
  p <- ggpairs(data %>% select(- ID), 1:2, 
               lower = list(continuous = my_fn), 
               upper = list(continuous = wrap("cor", size = 4)))
  return(p)
}

# Get plot for each list element
ret <- lapply(data_list, function(x) my_fn2(x))

ret[[1]]
ret[[2]]
ret[[3]]

[Images: plot_1, plot_2, plot_3]
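
For comparison, here is a hedged sketch of a different route that keeps the question's loop. The errors in the question are consistent with my_fn looking up colorData in the global environment only when a plot is printed, by which time the loop has overwritten it with the last subset. Binding each subset inside a closure should avoid that. Note that make_lower_fn is a hypothetical helper name (not part of GGally), set.seed(1) is added only to make the sampling repeatable, and the panel function reuses the mapping passed by ggpairs() rather than calling as.character(mapping$x), which seems to behave differently across ggplot2 versions.

```r
library(GGally)
library(ggplot2)

data <- mtcars
data$ID <- rownames(mtcars)

# Hypothetical helper: returns a panel function that carries its own subset
make_lower_fn <- function(color_data) {
  force(color_data)  # evaluate the argument now, not when the plot is printed
  function(data, mapping, ...) {
    ggplot(data, mapping) +
      geom_point() +
      # the orange layer inherits x and y from `mapping`
      geom_point(data = color_data, colour = "orange")
  }
}

set.seed(1)  # assumption: only for repeatable sampling
colorVec <- c(1, 10, 20)
vars <- c("cyl", "disp", "hp")

ret <- list()
for (k in seq_along(vars)) {
  datSel <- data[, c("mpg", vars[k])]
  colorData <- datSel[sample(nrow(datSel), colorVec[k]), ]
  ret[[paste0("Plot_", k)]] <- ggpairs(
    datSel,
    lower = list(continuous = make_lower_fn(colorData)),
    upper = list(continuous = wrap("cor", size = 4))
  )
}
```

With this version each element of ret holds a panel function with its own copy of the subset, so ret[["Plot_1"]] and ret[["Plot_2"]] no longer depend on whatever colorData holds after the loop ends.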

Ali answered 27/3, 2018 at 13:30 Comment(0)
