I have a 37 column CSV file that I am parsing in Java with Apache Commons CSV 1.2. My setup code is as follows:
//initialize FileReader object
FileReader fileReader = new FileReader(file);
//intialize CSVFormat object
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
//initialize CSVParser object
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
//Get a list of CSV file records
List<CSVRecord> csvRecords = csvFileParser.getRecords();
// process accordingly
My problem is that when I copy the CSV to be processed to my target directory and run my parsing program, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Index for header 'Title' is 7 but CSVRecord only has 6 values!
at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:110)
at launcher.QualysImport.createQualysRecords(Unknown Source)
at launcher.QualysImport.importQualysRecords(Unknown Source)
at launcher.Main.main(Unknown Source)
However, if I copy the file to my target directory, open and save it, then try the program again, it works. Opening and saving the CSV adds back the commas needed at the end so my program won't compain about not having enough headers to read.
For context, here is a sample line of before/after saving:
Before (failing): "data","data","data","data"
After (working): "data","data",,,,"data",,,"data",,,,,,
So my question: why does the CSV format change when I open and save it? I'm not changing any values or encoding, and the behavior is the same for MS-DOS or regular .csv format when saving. Also, I'm using Excel to copy/open/save in my testing.
Is there some encoding or format setting I need to be using? Can I solve this programmatically?
Thanks in advance!
EDIT #1:
For additional context, when I first view an empty line in the original file, it just has the new line ^M character like this:
^M
After opening in Excel and saving, it looks like this with all 37 of my empty fields:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,^M
Is this a Windows encoding discrepancy?