Convert Number Stored as Text in Excel File Using R
Asked Answered
C

1

7

I would like to output data from R to an excel file, and have been using the openxlsx package.

writeDataTable is the function I have been using. The problem is that I want the entries of one of the columns to be a mixture of numbers and text. As a column of a dataframe in R must have all the same type of entries, everything gets converted to character entries.

When I open the excel file that is outputted I get the little green triangle in the corner of the number cells, telling me that they are a number stored as text and giving me the option to convert to number.

I'm wondering is it possible to get R to convert this for me in the Workbook Object that is created before saving the file.

I've spent a lot of time googling the problem, looking for different libraries that might help, but haven't been able to find anything so far. I have an example below. Note that this isn't the exact data frame I am using, but is for demonstrative purposes.

df = data.frame(A = c('Dog', '5', '7.04'), B = c('Cat', '12', '1.23'))
wb = createWorkbook()
addWorksheet(wb, "Sheet2")
writeDataTable(wb, "Sheet2", df)
output_file = "C:\\Users\\johndoe\\documents\\excel_file.xlsx"
saveWorkbook(wb, output_file)

Any help would be much appreciated.

Corsage answered 29/8, 2019 at 17:44 Comment(4)
I use library(xlsx) \ write.xlsx(mydata, "c:/mydata.xlsx").Disruptive
@TJ87, I don't think that addresses the mixed-class column problem.Gayl
One workaround could be by creating a template.xlsx with two tabs. One with Data tab where you writeData into and another tab say Formula that has excel formulas to Data tab, e.g. cell A2 =Data!A2. Not ideal (but will get rid of green triangles) as you will first need the formulas to be copied down in template.xlsx and also file size will double as you are keeping two tabs of same dataFlintlock
Thanks MKa, this could definitely be a viable workaround. Would it be possible to copy and then paste values within R, so that the Data tab could then be deleted? And would it then be possible to format part of the Formula tab as a table (the data I'm dealing with would have a variable number of rows, so it wouldn't be possible to have the table pre-formatted). In any case, I think I will try r2evans method below, but always good to have as many solutions as possible.Crapulous
G
3

Here's a way, but it will be painfully slow.

### unchanged
library(openxlsx)
df = data.frame(A = c('Dog', '5', '7.04'), B = c('Cat', '12', '1.23'))
wb = createWorkbook()
addWorksheet(wb, "Sheet2")
writeDataTable(wb, "Sheet2", df)

### this is the new part
for (cn in seq_len(ncol(df))) {
  for (rn in seq_len(nrow(df))) {
    if (!is.numeric(df[rn,cn]) && !is.na(val <- as.numeric(as.character(df[rn,cn])))) {
      writeData(wb, "Sheet2", val, startCol = cn, startRow = 1L + rn)
    }
  }
}

### unchanged
saveWorkbook(wb, output_file)

The catch is that this is writing one cell at a time.

Ways to improve this:

  • If you have more numbers than strings, you might want to reverse this a little by converting the relevant columns with as.numeric (producing a lot of NAs), and then one-by-one over-write a cell's NA value with the previous string.

  • You can look for runs of cells to replace (perhaps using rle within a column), which will write clump-by-clump instead of cell-by-cell.

Gayl answered 29/8, 2019 at 18:23 Comment(2)
Thanks r2evans. I will definitely try this. There aren't actually many non-numeric values so I might try the reversed way you suggested. Never thought of overwriting values that had already been written for some reason. Obviously would prefer not to use a for loop but I couldn't find a way so this seems like a reasonable solution. Seems strange that you wouldn't be able to change the data type after it has been written to the excel object.Crapulous
Yeah, I was trying to think of a way to push a matrix with "transparency" (i.e., "do not put in a value when NA"), but I don't think that logic exists. If openxlsx uses an internal (C++) for loop, then it might actually be feasible to do that, perhaps you can request it. (Unfortunately, the last commit appears to be Aug 2018 ... not promising.)Gayl

© 2022 - 2024 — McMap. All rights reserved.