"OLE2 signature is invalid" when trying to parse XLS file
Asked Answered
P

2

6

I'm trying to upload and parse an .XLS file. I've tried with roo-xls and Spreadsheet, but with both, I'm getting this error:

Ole::Storage::FormatError in UploadController#upload
OLE2 signature is invalid

I found a few resources about this error, and the only answer ever given is to re-save the document as .XLS, because although the original is labeled as .XLS, it actually isn't.

Unfortunately this isn't really an option, because I have users uploading files, so it's important that it work without requiring a re-save.

For the record, I tried re-saving the file and now it works, but I'm at a loss as to why it works, because the file's format before and after the re-save looks to be the exact same. This is what was listed under "type of file" in Excel, before and after:

Microsoft Excel 97-2003 Worksheet (.xls)"

And this is what is listed under "Type" in Libreoffice, before and after:

Microsoft Excel Worksheet (application/vnd.ms-excel)

What's going on?

Also, here's my simple upload code:

form

<%= form_tag(upload_path, multipart: true) do %>
  <%= file_field_tag :file %>
<% end %>

controller

file = params[:file].path

#Roo Attempt
doc = Roo::Excel.new(file)

#Spreadsheet Attempt
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
doc = Spreadsheet.open(file).worksheets
Pawn answered 17/9, 2018 at 4:50 Comment(1)
Did anything help?Findley
F
2

The issue is that roo reads only:

  • Excel 2007 - 2013 formats (xlsx, xlsm)
  • LibreOffice / OpenOffice.org formats (ods)
  • CSV

For xls only you need to use roo-xls gem.

If you need any details further, I would need a copy of the excel file.

you should have:

require 'roo'
require 'roo-xls'

Then it will work.

Findley answered 20/9, 2018 at 9:25 Comment(7)
My mistake, I'm using roo-xls but I neglected to say that in the question. Unfortunately the excel file is proprietary, but I'll try to recreate the error with a non-proprietary file.Pawn
@JoeMorano please do so, nobody has crystal ball :).Findley
@JoeMorano Also it would help to have minimal working example - stackoverflow.com/help/mcveFindley
@JoeMorano do you have both gems in the require? Only the roo-xls gem is not enough!Findley
I do have both gems required. I've been trying to create another .xls file that causes the same error, but I can't...the original file was generated by third-party POS software. I can't share it in its present form because it's not my intellectual property, but if I make any changes and save it, it no longer causes the error. If you had access to it, how would you analyze it? Would you check the metadata or binary or something like that?Pawn
@JoeMorano I know it can be hard to reproduce the error (could you do an extraction with some dummy data, from some testing platform?). First I would check if the file is actually a correctly created *.xls file (perhaps a *.csv file named *.xls?). From your post it appears that it is not. I would view it with a text file editor to see what is actually inside. Then I would try to extract meta-data via some extractor like - meta-extractor.sourceforge.net to see if there are any discrepancies.Findley
@JoeMorano I have found out that there is even a ruby way to read the meta data using yomu- github.com/yomurb/yomu. You may try that.Findley
A
1

Here's a suggestion. Might not be pretty, but should work, assuming the differences are the same for any file with the problem.

  1. Manually perform a binary compare of the file before and after it's 're-saved'. Note the differences.
  2. Read the uploaded file from the user. See if it's missing the noted difference from earlier, and modify the file accordingly.
  3. Send the 'modified' file to roo-xls
Astrobiology answered 25/9, 2018 at 5:0 Comment(2)
How can I view the binary of a file?Pawn
I would use BeyondCompare to compare the two filesAstrobiology

© 2022 - 2024 — McMap. All rights reserved.