How to process old Excel .xls files using POI?

2

25

I switched from JXL to POI because POI has more features. However, I am not able to process .xls files that were generated in the old format. I am getting this error:

org.apache.poi.hssf.OldExcelFormatException: The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. POI only supports BIFF8 format (from Excel versions 97/2000/XP/2003)

I am now thinking of using both JXL and POI, depending on the .xls version: JXL for old-format files and POI for newer ones. Is this a good solution? Are there any alternatives?

Giovannagiovanni answered 11/3, 2013 at 10:37 Comment(3)
Is that, in fact, an Excel 5.0/7.0 file?Insignificant
Yes I validated that it is an Excel 5/7 file (Office 95)Giovannagiovanni
Using a single API would definitely be better, as it would reduce the complexity a lot. But these two are the most mature APIs for reading Excel, so in my opinion this is the best way of doing it.Eagle
F
16

For old Excel format files, you have the following alternatives:

  1. HSSF, the POI implementation of the Excel '97(-2007) file format.
    • If you just want to extract the textual content, then you can use OldExcelExtractor which will pull only the text and numbers from the file.
    • If you need the values from specific cells, then you'll need to take an approach a bit like OldExcelExtractor: process the file at the record level, and check the co-ordinates on OldStringRecord, NumberRecord, OldFormulaRecord and friends.
  2. Like you already mentioned, JXL can handle some cases too.
  3. Use a JDBC/ODBC driver. It is not as flexible as HSSF but for some old formats it is the only way to extract the information.
Foursquare answered 14/12, 2015 at 19:59 Comment(4)
The link referenced by the text "JDBC/ODBC" does not appear to point to any relevant content.Maldonado
@Maldonado Thanks, I updated the link to a new url, it seems that the previous page was removed :(Foursquare
Hi, thanks for your answer. Is there a way to detect whether an Excel file is in BIFF5 format?Hierodule
@esprittn You can check the BOF (Beginning of File) record. See page 43 from here: download.microsoft.com/download/0/B/E/… for more details.Foursquare
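As a rough illustration of the BOF check mentioned in the comment above, here is a minimal sketch in plain Java. It assumes you already have the first bytes of the workbook stream: for BIFF5 and BIFF8 that is the "Book" or "Workbook" stream inside the OLE2 container (extracting it is not shown here), while BIFF2 to BIFF4 files have no container, so the raw file bytes are the stream. The class and method names are hypothetical and not part of POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not a POI class): reads the BIFF version from a BOF record. */
final class BiffSniffer {

    private BiffSniffer() {}

    /**
     * @param stream the first bytes of the workbook stream (assumed to start
     *               with the BOF record)
     * @return "BIFF2", "BIFF3", "BIFF4", "BIFF5", "BIFF8" or "UNKNOWN"
     */
    static String detect(byte[] stream) {
        if (stream == null || stream.length < 4) {
            return "UNKNOWN";
        }
        // BIFF records are little-endian: 2-byte record id, 2-byte length, then data.
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int recordId = buf.getShort(0) & 0xFFFF;
        switch (recordId) {
            case 0x0009: return "BIFF2";
            case 0x0209: return "BIFF3";
            case 0x0409: return "BIFF4";
            case 0x0809:
                // BIFF5 and BIFF8 share the record id; the version field
                // (first 2 bytes of the record data) tells them apart.
                if (stream.length < 6) {
                    return "UNKNOWN";
                }
                int version = buf.getShort(4) & 0xFFFF;
                if (version == 0x0500) return "BIFF5";
                if (version == 0x0600) return "BIFF8";
                return "UNKNOWN";
            default:
                return "UNKNOWN";
        }
    }
}
```

With a result like this you could route BIFF8 files to POI's HSSFWorkbook and send anything older to JXL or OldExcelExtractor.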
-5

As far as I know, you can use this code to read Excel files in the .xls format:

FileInputStream in = new FileInputStream(new File("filename.xls"));
Workbook wb = new HSSFWorkbook(in);

To read the newer Excel format (.xlsx, 2007 and up):

FileInputStream in = new FileInputStream(new File("filename.xlsx"));
Workbook wb = new XSSFWorkbook(in);

External jar files that you will need:

 1. poi-3.9
 2. dom4j-1.6.1
 3. xmlbeans-2.5.0

If your work only requires .xls files, then the poi jar alone will suffice. You need the other jars to work with the newer versions of Excel.
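Because the two formats need different workbook classes, one option is to look at the file's first bytes rather than trusting the extension. The following is only a sketch in plain Java with hypothetical names (nothing here is a POI API). It relies on two well-known signatures: an OLE2 container, which holds binary .xls files from Excel 5.0 onward, starts with D0 CF 11 E0 A1 B1 1A E1, and a ZIP archive, which is what an .xlsx file is, starts with 50 4B 03 04. The file-reading method assumes Java 9 or later for readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses the container type from the header bytes. */
final class ContainerSniffer {

    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private ContainerSniffer() {}

    /** Returns "OLE2" (binary .xls), "ZIP" (possibly .xlsx) or "UNKNOWN". */
    static String sniffBytes(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "OLE2";
        if (startsWith(header, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static String sniffFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[OLE2_MAGIC.length];
            int read = in.readNBytes(header, 0, header.length); // Java 9+
            return sniffBytes(Arrays.copyOf(header, read));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Note that an "OLE2" result does not distinguish BIFF5 from BIFF8, so HSSFWorkbook can still throw OldExcelFormatException for such a file, and a "ZIP" result only says the file is some ZIP archive, not necessarily a spreadsheet.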

Aitchbone answered 4/3, 2014 at 4:5 Comment(1)
I think you are referring to moving from the old binary .xls format to the newer XML-based .xlsx format, but the question is about very old binary .xls formats that POI can't read. POI can read newer .xls files, so this has nothing to do with the format being binary or XML-based; it is just the older .xls files that POI does not seem to support.Infract
