How to ignore hidden data when importing from Excel
Asked Answered
S

1

3

I have a collection of excel files which I am importing into R.

The files contain hidden data which I would like to disregard -- eg, simply not importing it, or importing it with a flag indicating it was hidden so that I can then drop it.

The files contain two types of hidden data:

  • Complete sheets are hidden
  • Specific Rows within a sheet are hidden.

Is there a way to identify when data in excel is hidden?

Right now I am using the gdata package, but am happy to use XLConnect or other package

Sample Code:

library(gdata)
xlsfile <- "test.xls"

# grab all the sheet names. 
# This is giving me both hidden & non-hidden sheets.  I would like only the latter
sheets <- sheetNames(xlsfile)

# read in the xls file, by sheet 
xlData <- 
  lapply(sheets, function(s) 
      read.xls(xlsfile, sheet=s, stringsAsFactors = FALSE))

if needed, I can create a dummy xls file and post it.

Silda answered 2/10, 2013 at 15:18 Comment(4)
My commiseration for being stuck in this circle of hell. I think it might be easier to do this from the other end, i.e., write a VBA script that exports non-hidden data as CSV. But I cannot offer more specific advice, as I would reject these files on principle.Immaterialism
You probably already know this but you can check if visible before copying data: if Sheets(ws).Visible = false then copy...Detrition
@PortlandRunner Is the visible flag something that is (or could be) exported?Silda
Thanks @Roland. If only rejecting the files were an option, but the data inside is critical. Going the VBA route might end up being the most directSilda
N
2

XLConnect has a nice function called isSheetHidden which does what you want. Assuming Sheet2 is hidden:

library(XLConnect)
xlsfile <- "Book1.xls"
wb <- loadWorkbook(xlsfile, create = TRUE)
isSheetHidden(wb, "Sheet1") # FALSE
isSheetHidden(wb, "Sheet2") # TRUE

In gdata you would have to write your own function that calls the underlying perl package to access the sheet property, but it is possible.

Neurasthenia answered 2/10, 2013 at 16:41 Comment(4)
Thats great, thanks! now if only they had something similar for rows...?Silda
Yeah, isHidden doesn't work for rows. So, XLConnect wraps HFFS (Horrible Spreadsheet Format), which is a nice Java library. The XLConnect wrapper is posted to Github. The API for HFFS seems to indicate that they have a getHidden for rows. It may not be that hard to slightly edit the wrapper so you expose the getHidden function. You might even ask the author to do it for you :)Neurasthenia
You can do the row import with resepct to being hidden using rcom rcom.univie.ac.atTravertine
@Travertine Can you provide an example (perhaps as an answer?)Neurasthenia

© 2022 - 2024 — McMap. All rights reserved.