Regarding Java String Manipulation

After the split command, items[1] holds the string "MO""RET". I then call replaceAll on it, which removes all the double quotes, but I want the result to be MO"RET. In the CSV file I am processing, double quotes within the contents of a text field are doubled (example: "This account is a ""large"" one"). So I want to keep one of the two quotes when a quote is repeated in the middle of the string, and drop the enclosing quotes at the ends if they are present. How can I do it?

String items[] = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
// items[1] now holds "MO""RET"
String recordType = items[1].replaceAll("\"","");

After this, recordType is MORET, but I want it to be MO"RET.

Quaggy answered 11/2, 2010 at 2:48 Comment(2)
Less than one hour ago you posted a very similar question #2242258 which you haven't responded to, down or upvoted, or accepted. If you don't give back to the site, people will stop giving to you.Risotto
@Mark Byers: oh, how I wish that were true.Duley

Don't use regex to split a CSV line. This is asking for trouble ;) Just parse it character-by-character. Here's an example:

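import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public static List<List<String>> parseCsv(InputStream input, char separator) throws IOException {
    BufferedReader reader = null;
    List<List<String>> csv = new ArrayList<List<String>>();
    // Quote the separator so that characters such as '|' or '.' are matched literally in the regex.
    String trailingSeparator = Pattern.quote(String.valueOf(separator)) + "$";
    try {
        reader = new BufferedReader(new InputStreamReader(input, "UTF-8"));
        for (String record; (record = reader.readLine()) != null;) {
            boolean quoted = false; // true while inside a quoted field
            StringBuilder fieldBuilder = new StringBuilder();
            List<String> fields = new ArrayList<String>();
            for (int i = 0; i < record.length(); i++) {
                char c = record.charAt(i);
                fieldBuilder.append(c);
                if (c == '"') {
                    quoted = !quoted;
                }
                // A field ends at an unquoted separator or at the end of the record.
                if ((!quoted && c == separator) || i + 1 == record.length()) {
                    // Strip the trailing separator and the surrounding quotes, then unescape doubled quotes.
                    fields.add(fieldBuilder.toString().replaceAll(trailingSeparator, "")
                        .replaceAll("^\"|\"$", "").replace("\"\"", "\"").trim());
                    fieldBuilder = new StringBuilder();
                }
                // A record that ends with a separator has one more, empty, field.
                if (c == separator && i + 1 == record.length()) {
                    fields.add("");
                }
            }
            csv.add(fields);
        }
    } finally {
        if (reader != null) try { reader.close(); } catch (IOException logOrIgnore) {}
    }
    return csv;
}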

Yes, there's a little regex involved, but it only trims off the trailing separator and the surrounding quotes of a single field.

You can however also grab any 3rd party Java CSV API.
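
For what it's worth, here is a minimal sketch of a variation on the character-by-character idea that unescapes doubled quotes during the scan itself, so no regex is needed afterwards. The class name CsvLineSketch and the method parseLine are made up for illustration. It handles a single record only, and it assumes RFC 4180 style input in which every quote inside a quoted field is doubled; a lone unescaped quote in the middle of a field is treated as a closing or opening quote and dropped.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineSketch {

    /** Splits one CSV record into fields, unescaping doubled quotes inside quoted fields. */
    public static List<String> parseLine(String record, char separator) {
        List<String> fields = new ArrayList<String>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false; // true while inside a quoted field
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (quoted) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote becomes one literal quote
                        i++;               // skip the second quote of the pair
                    } else {
                        quoted = false;    // closing quote, not part of the value
                    }
                } else {
                    field.append(c);       // separators inside quotes are plain text
                }
            } else if (c == '"') {
                quoted = true;             // opening quote, not part of the value
            } else if (c == separator) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field; empty if the record ends with a separator
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("1,\"MO\"\"RET\",x", ',')); // prints [1, MO"RET, x]
        System.out.println(parseLine("A,B,,", ',').size());      // prints 4
    }
}
```

Unlike the method above, this sketch returns a single empty field for an empty record and does not trim whitespace.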

Oedema answered 11/2, 2010 at 2:57 Comment(4)
Thanks a lot. What if my string has the value "TEST"REPLA"? If there is a single double quote in the middle of the string, how can I delete the first and last quotes and keep the middle one? I want the output to be TEST"REPLA. Example 2: for "EXAM"PLE"2IN" I want the output to be EXAM"PLE"2IN. Only the first and last quotes need to be deleted.Quaggy
The posted code example already does that (assuming that your CSV file adheres to RFC 4180 as outlined here: rfc-editor.org/rfc/rfc4180.txt ).Oedema
I used your code. Great! Hmm... there is one small problem: for the line A,B,, in a file exported from a spreadsheet I expected ["A","B","",""], but I got ["A","B",""].Neve
@Paul: Oh, I overlooked that edge case. I updated the answer.Oedema

How about:

String recordType = items[1].replaceAll( "\"\"", "\"" );
Reconstructionism answered 11/2, 2010 at 2:55 Comment(2)
Thanks a lot. What if my string has the value "TEST"REPLA"? If there is a single double quote in the middle of the string, how can I delete the first and last quotes and keep the middle one? I want the output to be TEST"REPLA. Example 2: for "EXAM"PLE"2IN" I want the output to be EXAM"PLE"2IN. Only the first and last quotes need to be deleted.Quaggy
It's difficult to do this with a regex and still cover cases such as a starting quote with no ending quote, and the regex gets really complicated. At that point you are better off parsing the whole line. If you only want the specific start/end quote case, just check for it with charAt() and do a substring. It will be faster than a regex anyway.Reconstructionism
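
The charAt() and substring approach mentioned above could look roughly like this. It is only a sketch: the class name OuterQuotes and the method stripOuterQuotes are hypothetical, and it removes exactly one enclosing pair of quotes, leaving everything in between untouched.

```java
public class OuterQuotes {

    /** Removes one leading and one trailing double quote, but only if both are present. */
    public static String stripOuterQuotes(String s) {
        if (s.length() >= 2 && s.charAt(0) == '"' && s.charAt(s.length() - 1) == '"') {
            return s.substring(1, s.length() - 1);
        }
        return s; // no enclosing pair, so leave the value alone
    }

    public static void main(String[] args) {
        // Single quote in the middle is kept.
        System.out.println(stripOuterQuotes("\"TEST\"REPLA\"")); // prints TEST"REPLA
        // For doubled quotes, collapse the pair after stripping the outer ones.
        System.out.println(stripOuterQuotes("\"MO\"\"RET\"").replace("\"\"", "\"")); // prints MO"RET
    }
}
```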

I suggest using replace instead of replaceAll, because replaceAll treats its first argument as a regular expression.

The requirement is to replace two consecutive quotes with one quote:

String recordType = items[1].replace( "\"\"", "\"" );

To see the difference between replace and replaceAll, run the code below:

recordType = items[1].replace( "$$", "$" );
recordType = items[1].replaceAll( "$$", "$" );
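
Note that the replaceAll line above is not a drop-in equivalent: in a regex, $ is the end-of-input anchor, and in the replacement string $ introduces a group reference, so that call throws an exception instead of replacing anything. Here is a small sketch of the difference, assuming a made-up input string and a hypothetical class name ReplaceDemo. Pattern.quote and Matcher.quoteReplacement are the standard library calls for making replaceAll treat both arguments literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceDemo {

    // Literal replacement: no regex involved.
    public static String viaReplace(String s) {
        return s.replace("$$", "$");
    }

    // Same effect with replaceAll, with both the pattern and the replacement escaped.
    public static String viaReplaceAll(String s) {
        return s.replaceAll(Pattern.quote("$$"), Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        String s = "price: $$5"; // made-up sample input
        System.out.println(viaReplace(s));    // prints price: $5
        System.out.println(viaReplaceAll(s)); // prints price: $5
        try {
            s.replaceAll("$$", "$"); // unescaped, as in the snippet above
        } catch (RuntimeException e) {
            System.out.println("unescaped replaceAll failed: " + e);
        }
    }
}
```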
Hornswoggle answered 11/2, 2010 at 3:22 Comment(1)
Thanks a lot. What if my string has the value "TEST"REPLA"? If there is a single double quote in the middle of the string, how can I delete the first and last quotes and keep the middle one? I want the output to be TEST"REPLA. Example 2: for "EXAM"PLE"2IN" I want the output to be EXAM"PLE"2IN. Only the first and last quotes need to be deleted.Quaggy

Here you can use a regular expression.

recordType = items[1].replaceAll( "\\B\"", "" ); 
recordType = recordType.replaceAll( "\"\\B", "" ); 

The first statement removes the quotes that are not preceded by a word character (the opening quote and the second quote of a doubled pair). The second statement removes the quotes that are not followed by a word character (the closing quote).
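
A quick way to check this approach against sample fields is sketched below; the class name BoundaryQuotes and the inputs are assumptions of mine. It works for the strings in the question, but because it depends on the quotes touching letters or digits, a doubled quote next to a space is lost entirely.

```java
public class BoundaryQuotes {

    /** Applies the two \B-based replacements from the statements above, in the same order. */
    public static String strip(String s) {
        return s.replaceAll("\\B\"", "").replaceAll("\"\\B", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("\"MO\"\"RET\""));       // prints MO"RET
        System.out.println(strip("\"TEST\"REPLA\""));     // prints TEST"REPLA
        // Quotes next to spaces are all removed, although a CSV reader would give: a "b" c
        System.out.println(strip("\"a \"\"b\"\" c\""));   // prints a b c
    }
}
```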

Hornswoggle answered 11/2, 2010 at 7:8 Comment(0)
