Trying to read Japanese CSV file in Java

I am trying to read a CSV file with Japanese content, which is downloaded and extracted programmatically.

Code to read the CSV

    String splitBy = ",";
    // Decode the file explicitly as Shift-JIS rather than with the platform default charset.
    BufferedReader br = new BufferedReader(new InputStreamReader(
            new FileInputStream(pathOfExcel + "\\KEN_ALL1.CSV"), "SHIFT-JIS"));
    String line;
    while ((line = br.readLine()) != null) {
        List<Object> excelList = new ArrayList<Object>();
        String[] splitCells = line.split(splitBy);
        // Copy the first nine columns, stripping the double quotes around each value.
        for (int i = 0; i < 9; i++) {
            excelList.add(splitCells[i].replace("\"", ""));
        }
        returnList.add(excelList); // returnList is declared elsewhere
    }
    br.close();

I have tried both UTF-8 and SHIFT-JIS as shown in the following code.

br = new BufferedReader(new InputStreamReader(new FileInputStream(pathOfExcel + "\\KEN_ALL1.CSV"),"UTF-8"));

With either UTF-8 or SHIFT-JIS, the statement excelList.add(splitCells[3].replace("\"", "")); produces the output shown below, whereas the expected value is ホッカイドウ.

UTF-8 - ί¶²ÄÞ³

Shift-JIS - テ篠ッツカツイテ�楪ウ

Paquin answered 5/12, 2015 at 18:16 Comment(1)
Any question with this title needs to be upvoted :D (comment by Romeyn)

The file KEN_ALL1.CSV is the file provided by JAPAN POST Co.,Ltd., right? https://www.post.japanpost.jp/zipcode/dl/kogaki-zip.html

I could read the file correctly with your program, so I think the program itself is fine.

(Screenshot: result of your program on my Eclipse console)
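For reference, a self-contained variant of the same reading loop might look like the following. It is a sketch under assumptions, not the asker's exact code: the class name KenAllReader, the method readRows, and the column-count parameter are invented for illustration. It uses Files.newBufferedReader, which, unlike InputStreamReader, throws an exception on bytes that are invalid in the chosen charset instead of silently substituting replacement characters, so a wrong charset guess tends to fail loudly. Like the original, it splits on every comma and therefore assumes that no field contains a comma.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class KenAllReader {

    // Reads the first `columns` cells of every line, stripping double quotes.
    // Lines with fewer cells than `columns` are skipped.
    public static List<List<String>> readRows(Path csv, Charset charset, int columns)
            throws IOException {
        List<List<String>> rows = new ArrayList<>();
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader br = Files.newBufferedReader(csv, charset)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] cells = line.split(",");
                if (cells.length < columns) {
                    continue;
                }
                List<String> row = new ArrayList<>();
                for (int i = 0; i < columns; i++) {
                    row.add(cells[i].replace("\"", ""));
                }
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java KenAllReader <path-to-csv>");
            return;
        }
        // Nine columns, as in the question's code.
        List<List<String>> rows = readRows(Paths.get(args[0]), Charset.forName("Shift_JIS"), 9);
        for (List<String> row : rows) {
            System.out.println(row.get(3));
        }
    }
}
```

If this throws a MalformedInputException with Shift_JIS, that suggests the file on disk is not actually Shift-JIS encoded.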

I think the problem may be in your file. Can you open the CSV file in a text editor that shows the file's character encoding (e.g. Notepad++)? Is the content displayed correctly, and is the character encoding really Shift-JIS, like this?

(Screenshot: KEN_ALL1.CSV shown in Notepad++)
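If no such editor is at hand, a similar check can be sketched in Java by decoding the raw bytes strictly and seeing which charset accepts them. This is a minimal sketch, not a definitive encoding detector: the class name EncodingCheck and the helper decodesCleanly are made up for illustration, and the file path is taken from the first command-line argument. A clean UTF-8 decode is fairly strong evidence that the file really is UTF-8; a clean Shift_JIS decode is weaker evidence, because many byte sequences happen to be valid Shift_JIS.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodingCheck {

    // Returns true if every byte sequence in `data` is valid in `charset`.
    // REPORT makes the decoder throw instead of inserting replacement characters.
    public static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java EncodingCheck <path-to-csv>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("valid UTF-8:     " + decodesCleanly(data, StandardCharsets.UTF_8));
        System.out.println("valid Shift_JIS: " + decodesCleanly(data, Charset.forName("Shift_JIS")));
    }
}
```

If the file turns out to be valid UTF-8 but still shows garbled text when read as UTF-8, it may have been converted with the wrong charset somewhere in the download or extraction step.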

Singh answered 29/10, 2019 at 15:53 Comment(0)
