Encoding problem in JExcel

3

14

I am loading an Excel file in a GAE/Java application with JExcel.

The HTML form to upload the file looks like this:

<form id="" action="/save" method="post" enctype="multipart/form-data" accept-charset="ISO-8859-1">
    <input name="file" type="file" value="load"/>
    <input type="submit" value="load excel"/>
</form>

On the server I have:

ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iterator = upload.getItemIterator(request);
while (iterator.hasNext()) {
    FileItemStream item = iterator.next();
    InputStream stream = item.openStream();
    if (!item.isFormField()) {
        //if it's not a form field it's a file

        Workbook workbook = Workbook.getWorkbook(stream);
        ...
        String name = sheet.getCell(COL_NUMBER, row).getContents();
    }
}

The problem is that if a cell contains something like 'city ó', the variable name read on the server is 'city ?'. The character is being decoded with the wrong encoding.

I've tried changing accept-charset="ISO-8859-1" (setting it to UTF-8, or removing it), but with no success.

Can anyone tell me how to solve this problem?

Thanks

Chemism answered 18/4, 2011 at 10:46 Comment(3)
How are you determining the name on the server? I assume the encoding when uploading is ignored (I'd imagine it's binary data). Have you checked the actual string chars? It's possible the string is correct, but you are just printing it somewhere where the character cannot be displayed correctly. - Inlaid
@Inlaid I'm saving it in the datastore. I noticed the encoding problem because I have another method that downloads the data from the datastore, and the downloaded data was not correct. I debugged it and saw that the String variable name already holds the wrongly decoded data, so it is then saved wrong, and so on. The problem starts when calling the getContents() method on the cell. - Chemism
You might want to investigate how POI determines the file character encoding. From what I've seen, the default file encoding for GAE is "ascii". If POI is depending on the default file encoding, you could have problems. - Inlaid
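One way to follow the suggestion of checking the actual string chars is to print the code points of the value instead of the value itself, so that the console or log encoding cannot hide or introduce the problem. The sketch below is only an illustration; the class name CodePointDump and the method dump are hypothetical helpers, not part of JExcel or POI.

```java
final class CodePointDump {
    // Render each code point of the string as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // "city " followed by o-acute, written with an escape to keep the source ASCII-only.
        System.out.println(dump("city \u00f3"));
        // prints: U+0063 U+0069 U+0074 U+0079 U+0020 U+00F3
    }
}
```

If the last code point of the value returned by getContents() is U+00F3, the string is fine and only the display is wrong. If it is U+003F (a literal question mark) or U+FFFD (the replacement character), the text was already damaged while it was being decoded.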
35

OK, I got it working by doing this:

WorkbookSettings ws = new WorkbookSettings();
ws.setEncoding("Cp1252");
Workbook workbook = Workbook.getWorkbook(stream, ws);
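
A likely explanation, assuming the cell text is stored in the file as single-byte characters and that the platform default encoding on GAE is ASCII (as suggested in the comments): the byte for 'ó' has no meaning in ASCII, so it is replaced during decoding. The following stdlib-only sketch reproduces that effect without JExcel; the class EncodingDemo and its decode method are hypothetical names used just for this illustration.

```java
import java.nio.charset.Charset;

final class EncodingDemo {
    // Decode raw bytes with the named charset; undecodable bytes become U+FFFD.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "city " plus o-acute, encoded as Cp1252: the last byte is 0xF3.
        byte[] raw = "city \u00f3".getBytes(Charset.forName("Cp1252"));

        String asAscii = decode(raw, "US-ASCII");
        String asCp1252 = decode(raw, "Cp1252");

        System.out.println((int) asAscii.charAt(5));  // prints 65533 (U+FFFD, replacement character)
        System.out.println((int) asCp1252.charAt(5)); // prints 243 (U+00F3, o-acute)
    }
}
```

This also fits the observation that changing accept-charset on the form had no effect: the file is uploaded as binary data, and the decoding happens later, when the workbook is read.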
Chemism answered 18/4, 2011 at 12:12 Comment(0)
3

WorkbookSettings will also look for the system property jxl.encoding.

If you don't have easy access to the WorkbookSettings instance (for example, when the workbook is opened for you by Drools' ExcelParser), you might find this preferable.
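
A minimal sketch of setting that property from code, assuming it is set before any WorkbookSettings object is created. The class JxlEncodingProperty and the method configure are hypothetical; only the property name jxl.encoding and the value Cp1252 come from this thread. The block deliberately uses only the standard library, so it does not itself open a workbook.

```java
import java.nio.charset.Charset;

final class JxlEncodingProperty {
    // Property name as given in the answer above.
    static final String PROPERTY = "jxl.encoding";

    // Validate the charset name, then publish it as a JVM-wide system property.
    static void configure(String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName);
        }
        System.setProperty(PROPERTY, charsetName);
    }

    public static void main(String[] args) {
        configure("Cp1252");
        System.out.println(System.getProperty(PROPERTY)); // prints Cp1252
    }
}
```

On a regular JVM the same value can be passed at startup with -Djxl.encoding=Cp1252. Because a system property applies to the whole JVM, it affects every workbook read in that process, which may or may not be what you want.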

Collincolline answered 24/1, 2012 at 11:2 Comment(0)
-2

First up, make sure you're using a recent version of POI (something like 3.7 or 3.8 beta 2). Very old versions of POI did have encoding problems, but as long as you're on a new one then that shouldn't be your issue.

Next, on your local machine, run something like org.apache.poi.hssf.extractor.ExcelExtractor against the file. This will let you confirm that POI is handling the encoding correctly. Run it with

java -classpath poi-3.8-beta2.jar org.apache.poi.hssf.extractor.ExcelExtractor --show-sheet-names Y -i MyExcel.xls

Assuming that works fine, then you know your issue is within Google App Engine.

Marigolde answered 18/4, 2011 at 11:40 Comment(1)
Sorry about that. I have several projects, and this one is not using Apache POI; it's using JExcel. - Chemism
