Is there any way to read out the color-index of cells from excel files with R?
While I can set the cell color with packages like XLConnect
or XLSX
, I have found no way to extract the color-information from existing workbooks.
Is there any way to read out the color-index of cells from excel files with R?
While I can set the cell color with packages like XLConnect
or XLSX
, I have found no way to extract the color-information from existing workbooks.
R-Bloggers provided a function that will do the job for you. I am including the answer here for future reference.
Read the excel file using xlsx
package:
library(xlsx)
wb <- loadWorkbook("test.xlsx")
sheet1 <- getSheets(wb)[[1]]
# get all rows
rows <- getRows(sheet1)
cells <- getCells(rows)
This part extracts the information that later will be used for getting background color (or other style information) of the cells:
styles <- sapply(cells, getCellStyle) #This will get the styles
This is the function
that identifies/extracts the cell background color:
cellColor <- function(style)
{
fg <- style$getFillForegroundXSSFColor()
rgb <- tryCatch(fg$getRgb(), error = function(e) NULL)
rgb <- paste(rgb, collapse = "")
return(rgb)
}
error
will handle the cells with no background color.
Using sapply
you can get the background color for all of the cells:
sapply(styles, cellColor)
You can also categorize/identify them by knowing the RGb codes:
mycolor <- list(green = "00ff00", red = "ff0000")
m <- match(sapply(styles, cellColor), mycolor)
labs <-names(mycolor)[m]
You can read more and learn how to apply it at R-bloggers
You can get the RGB codes from RapidTables.com
getCellStyle
for that. You need to dig through the documentation. –
Sheerlegs Old question but maybe it can help someone in the future.
There is a strange behavior in the POI (java) library (at least on my computer). It is not getting the colors correctly. The code provided in the @M--'s answer works well when the color is a basic color (indexed color), but does not work when the color is, for example, in grayscale. To get around you can use the following code using the getTint ()
function. Tint is a number between -1 (dark) and 1 (light), and combining it with the RGB (getRgb ()
) function, you can completely recover the color.
cell_color <- function(style){
fg <- style$getFillForegroundXSSFColor()
hex <- tryCatch(fg$getRgb(), error = function(e) NULL)
hex <- paste0("#", paste(hex, collapse = ""))
tint <- tryCatch(fg$getTint(), error = function(e) NULL)
if(!is.null(tint) & !is.null(hex)){ # Tint varies between -1 (dark) and 1 (light)
rgb_col <- col2rgb(col = hex)
if(tint < 0) rgb_col <- (1-abs(tint))*rgb_col
if(tint > 0) rgb_col <- rgb_col + (255-rgb_col)*tint
hex <- rgb(red = rgb_col[1, 1],
green = rgb_col[2, 1],
blue = rgb_col[3, 1],
maxColorValue = 255)
}
return(hex)
}
Some references to help:
https://poi.apache.org/apidocs/dev/org/apache/poi/hssf/usermodel/HSSFExtendedColor.html#getTint--
© 2022 - 2024 — McMap. All rights reserved.
readxl
version of the question: #53210186 – Emmerie