I am using POI 3.9 to read data from XLSX files. Now I face an issue: POI doesn't support XLSB files, and I need to read data from XLSB programmatically. Does anybody know how to do this? Appreciated.
Using POI (3.16 or later, which added the XSSFBReader classes used here) you can read XLSB into a DB, a structure (XML, ...), a list of contents, etc.
The following code converts an XLSB file into lists of rows and comments, plus a map for extra info.
You can customize the code according to your needs.
Many more examples can be found at the link; thanks to the authors of POI.
// Main class
package excel;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.xssf.binary.XSSFBSharedStringsTable;
import org.apache.poi.xssf.binary.XSSFBSheetHandler;
import org.apache.poi.xssf.binary.XSSFBStylesTable;
import org.apache.poi.xssf.eventusermodel.XSSFBReader;
import org.xml.sax.SAXException;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class Excel {

    public static void main(String[] args) {
        String xlsbFileName = "C:\\Users\\full path to .xlsb file";
        callXLToList(xlsbFileName);
    }

    static void callXLToList(String xlsbFileName) {
        // Open the package read-only; try-with-resources closes it afterwards
        try (OPCPackage pkg = OPCPackage.open(xlsbFileName, PackageAccess.READ)) {
            XSSFBReader r = new XSSFBReader(pkg);
            XSSFBSharedStringsTable sst = new XSSFBSharedStringsTable(pkg);
            XSSFBStylesTable xssfbStylesTable = r.getXSSFBStylesTable();
            XSSFBReader.SheetIterator it = (XSSFBReader.SheetIterator) r.getSheetsData();

            List<XLSB2Lists> workBookAsList = new ArrayList<>();
            int sheetNr = 1;
            while (it.hasNext()) {
                // The sheet stream is closed at the end of each iteration
                try (InputStream is = it.next()) {
                    String name = it.getSheetName();
                    System.out.println("Begin parsing sheet " + sheetNr + ": " + name);
                    XLSB2Lists testSheetHandler = new XLSB2Lists();
                    testSheetHandler.startSheet(name);
                    XSSFBSheetHandler sheetHandler = new XSSFBSheetHandler(is,
                            xssfbStylesTable,
                            it.getXSSFBSheetComments(),
                            sst, testSheetHandler,
                            new DataFormatter(),
                            false);
                    sheetHandler.parse();
                    testSheetHandler.endSheet();
                    System.out.println("End parsing sheet " + sheetNr + ": " + name);
                    sheetNr++;
                    // Add parsed sheet to workbook list
                    workBookAsList.add(testSheetHandler);
                }
            }

            // For every sheet in the workbook
            System.out.println("\nShort Report:");
            for (XLSB2Lists sheet : workBookAsList) {
                // sheet content
                System.out.println("Size of content: " + sheet.getSheetContentAsList().size());
                // sheet comments
                System.out.println("Size of comment: " + sheet.getSheetCommentAsList().size());
                // sheet extra info
                System.out.println("Extra info.: " + sheet.getMapOfInfo().toString());
            }
        } catch (IOException | OpenXML4JException | SAXException e) {
            // TODO: handle the exception as appropriate for your application
            // (InvalidFormatException is a subclass of OpenXML4JException)
            e.printStackTrace();
        }
    }
}
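// Parsing class
package excel;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.usermodel.XSSFComment;

/**
 * Collects the cell values and comments of one sheet into lists of rows,
 * plus a map of extra info (sheet name, header/footer, comment authors).
 *
 * @author Dominique
 */
public class XLSB2Lists implements XSSFSheetXMLHandler.SheetContentsHandler {

    private final List<List<String>> sheetAsList = new ArrayList<>();
    private List<String> rowAsList;
    private final List<List<String>> sheetCommentAsList = new ArrayList<>();
    private List<String> rowCommentAsList;
    private final Map<String, String> propertyMap = new HashMap<>();

    public void startSheet(String sheetName) {
        propertyMap.put("sheetName", sheetName);
    }

    // Called by the main class after parsing. Declared here without @Override
    // so the class compiles whether or not the POI version in use declares
    // endSheet() on the handler interface.
    public void endSheet() {
        // nothing to do
    }

    @Override
    public void startRow(int rowNum) {
        rowAsList = new ArrayList<>();
        rowCommentAsList = new ArrayList<>();
    }

    @Override
    public void endRow(int rowNum) {
        // Rows without cells may be skipped by the parser. Pad with empty rows
        // so that list index == row number; otherwise add(rowNum, ...) throws
        // IndexOutOfBoundsException when rowNum is past the end of the list.
        while (sheetAsList.size() < rowNum) {
            sheetAsList.add(new ArrayList<String>());
            sheetCommentAsList.add(new ArrayList<String>());
        }
        sheetAsList.add(rowNum, rowAsList);
        sheetCommentAsList.add(rowNum, rowCommentAsList);
    }

    @Override
    public void cell(String cellReference, String formattedValue, XSSFComment comment) {
        formattedValue = (formattedValue == null) ? "" : formattedValue;
        rowAsList.add(formattedValue);
        if (comment == null) {
            rowCommentAsList.add("");
        } else {
            propertyMap.put("comment author at " + comment.getRow() + ":" + cellReference, comment.getAuthor());
            rowCommentAsList.add(comment.getString().toString().trim());
        }
    }

    @Override
    public void headerFooter(String text, boolean isHeader, String tagName) {
        if (isHeader) {
            propertyMap.put("header tag", tagName);
            propertyMap.put("header text", text);
        } else { // footer: use separate keys so the header entries are not overwritten
            propertyMap.put("footer tag", tagName);
            propertyMap.put("footer text", text);
        }
    }

    public List<List<String>> getSheetContentAsList() {
        return sheetAsList;
    }

    public List<List<String>> getSheetCommentAsList() {
        return sheetCommentAsList;
    }

    public Map<String, String> getMapOfInfo() {
        return propertyMap;
    }
}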
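As one possible customization, the list of rows collected above can be processed with plain Java. The following is only a minimal sketch that renders such a list as CSV text, assuming the rows are available as a List<List<String>>. The class RowsToCsv and its methods are hypothetical helpers, not part of POI, and the quoting follows the usual CSV convention of doubling embedded quotes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of POI: renders a list of rows as CSV text.
public class RowsToCsv {

    // Quote a field if it contains a comma, a double quote or a line break.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Join the cells of every row with commas; each row ends with a newline.
    static String toCsv(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(escape(row.get(i)));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in practice the rows would come from the sheet handler.
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("id", "name"));
        rows.add(Arrays.asList("1", "Smith, John"));
        System.out.print(toCsv(rows));
    }
}
```

Empty rows in the input simply produce empty lines, so the line number in the CSV output still matches the row number in the sheet.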
Apache POI added support for streaming reading of XLSB (no write support) in 3.16. Apache Tika 1.15 now supports extraction from XLSB.
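Because XLSB goes through the streaming reader while XLSX can use the regular classes, code that accepts both may want to tell the two apart before choosing a reader. Both formats are ZIP packages, so one rough way is to look for the binary workbook part. This is only a sketch using the JDK: the class WorkbookKind is a hypothetical helper, and it assumes the workbook part sits at the usual location xl/workbook.bin, which a package is not strictly required to use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of POI.
public class WorkbookKind {

    // True if the ZIP package contains the binary workbook part that XLSB
    // files normally use (assumption: default part name xl/workbook.bin).
    static boolean looksLikeXlsb(Path file) throws IOException {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            return zip.getEntry("xl/workbook.bin") != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny stand-in package so the sketch runs without a real workbook.
        Path tmp = Files.createTempFile("sample", ".xlsb");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(tmp))) {
            out.putNextEntry(new ZipEntry("xl/workbook.bin"));
            out.closeEntry();
        }
        System.out.println("XLSB-like package: " + looksLikeXlsb(tmp));
        Files.delete(tmp);
    }
}
```

Checking the package content rather than the file extension also catches files that were renamed.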
In Perl, the Win32::OLE module can convert XLSB to XLSX. The downside: you have to have MS Excel installed. Here's some sample code based on what I used...
use File::Spec::Functions qw/rel2abs/;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Excel';
use Win32::OLE::Variant;
Win32::OLE->Option( Warn => 3 );
my $xlsb = 'C:\Users\wohlfarj\Documents\File.xlsb';
# This block uses an already open instance of Excel, or starts a new one if it isn't already open.
my $excel;
eval { $excel = Win32::OLE->GetActiveObject('Excel.Application') };
die 'MS Excel not installed' if $@;
unless (defined $excel) {
    $excel = Win32::OLE->new( 'Excel.Application', 'Quit' )
        or die 'Cannot start MS Excel';
}
# After all of the setup, converting the file is painless.
my $xlsx = rel2abs( $xlsb );
$xlsx =~ s/\.xlsb$/\.xlsx/i;
my $workbook = $excel->Workbooks->Open( {FileName => rel2abs( $xlsb )} );
$workbook->SaveAs( {FileFormat => xlOpenXMLWorkbook, Filename => $xlsx} );
$workbook->Close( {SaveChanges => xlDoNotSaveChanges} );
From here, the Spreadsheet::XLSX module reads the XLSX copy just fine.
POI devs apparently don't have plans on supporting XLSB: http://mail-archives.apache.org/mod_mbox/poi-dev/201401.mbox/%3Calpine.DEB.2.02.1401250721280.31868%40urchin.earth.li%3E
It would be rather a lot of work, as you'd both need to update records to cope with the longer/different format, then redo all the marshalling stuff to handle the very different way it does that. Thus far, no-one has wanted to put in all that work for the very marginal benefit.
There appears to be a JavaScript library for reading XLSB, which you could use to export the data as JSON and then read it from Java.
This is just a workaround.
You can convert the XLSB files to XLSX and then use POI to extract the data.
Have you tried that? I know it's not a proper answer, but I hope it helps. :)