Java String parsing - {k1=v1,k2=v2,...}

7

7

I have the following string which will probably contain ~100 entries:

String foo = "{k1=v1,k2=v2,...}";

and am looking to write the following function:

String getValue(String key){
    // return the value associated with this key
}

I would like to do this without using any parsing library. Any ideas for something speedy?

Tumble answered 29/10, 2009 at 15:53 Comment(5)
By parsing library, do you mean regex, or no third-party library?Jeepers
Can v1, v2... contain '=' or ','?Peterus
Let's suppose values do not contain '=' or ','. Just no 3rd party libs.Tumble
This is really really close to JSON. Why not use that?Fallacious
(Also: no 3rd party libs in Java? Madness.)Fallacious
12

If you know your string will always look like this, try something like:

import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

Map<String, String> map = new HashMap<String, String>();

public void parse(String foo) {
  String foo2 = foo.substring(1, foo.length() - 1);  // hack off braces
  StringTokenizer st = new StringTokenizer(foo2, ",");
  while (st.hasMoreTokens()) {
    String thisToken = st.nextToken();
    StringTokenizer st2 = new StringTokenizer(thisToken, "=");

    // first token is the key, second is the value
    map.put(st2.nextToken(), st2.nextToken());
  }
}

String getValue(String key) {
  return map.get(key);  // null if the key is absent
}

Warning: I didn't actually try this; there might be minor syntax errors but the logic should be sound. Note that I also did exactly zero error checking, so you might want to make what I did more robust.

Elegy answered 29/10, 2009 at 16:1 Comment(1)
A shortcut would be using ",={}" . No hacking off braces or a second tokenizer needed :)Entrain
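
The single-tokenizer shortcut suggested in the comment above might look roughly like this. It is only a sketch: the class name SingleTokenizerParser is made up for illustration, and it assumes keys and values are non-empty and never contain any of the characters , = { }. An empty value (for example "k1=,k2=v2") would shift the key/value pairing, so this is not a robust parser.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical class name, used only for this illustration.
class SingleTokenizerParser {

    // Assumes non-empty keys and values that contain none of , = { }
    static Map<String, String> parse(String foo) {
        Map<String, String> map = new HashMap<String, String>();
        // Each character of ",={}" is a delimiter, so the tokens simply
        // alternate: key, value, key, value, ...
        StringTokenizer st = new StringTokenizer(foo, ",={}");
        while (st.hasMoreTokens()) {
            String key = st.nextToken();
            if (!st.hasMoreTokens()) {
                break; // dangling key without a value: ignore it
            }
            map.put(key, st.nextToken());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = parse("{k1=v1,k2=v2,k3=v3}");
        System.out.println(map.get("k2")); // prints v2
    }
}
```

Parsing once into a map and then calling map.get(key) keeps each lookup cheap, which matters if getValue is called many times on the same string.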
4

The speediest, but ugliest answer I can think of is parsing it character by character using a state machine. It's very fast, but very specific and quite complex. The way I see it, you could have several states:

  • Parsing Key
  • Parsing Value
  • Ready

Example:

final int READY = 0, KEY = 1, VALUE = 2;  // parser states
int length = foo.length();
int state = READY;
for (int i=0; i<length; ++i) {
   char c = foo.charAt(i);  // the character examined in this step
   switch (state) {
      case READY:
        //Skip commas and braces
        //Transition to the KEY state if you find a letter
        break;
      case KEY:
        //Read until you hit a = then transition to the value state
        //append each letter to a StringBuilder and track the name
        //Store the name when you transition to the value state
        break;
      case VALUE:
        //Read until you hit a , (or the closing brace) then transition to the ready state
        //Remember to save the built-key and built-value somewhere
        break;
   }
}

Alternatively, you can write this much more quickly using StringTokenizers (which are fast) or regexes (which are slower). But overall, character-by-character parsing is most likely the fastest approach at run time.
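
For reference, here is one way the three states described above could be filled in. Treat it as a sketch rather than a tuned implementation: the class name StateMachineParser and the enum are my own choices, it assumes the {k1=v1,k2=v2,...} format with no '=', ',' or '}' inside keys or values, and it makes no claim about measured speed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class StateMachineParser {

    private enum State { READY, KEY, VALUE }

    static Map<String, String> parse(String foo) {
        Map<String, String> map = new HashMap<String, String>();
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        State state = State.READY;
        for (int i = 0; i < foo.length(); i++) {
            char c = foo.charAt(i);
            switch (state) {
                case READY:
                    // Skip braces, commas and whitespace; anything else starts a key.
                    if (c != '{' && c != '}' && c != ',' && !Character.isWhitespace(c)) {
                        key.append(c);
                        state = State.KEY;
                    }
                    break;
                case KEY:
                    // '=' ends the key; every other character belongs to it.
                    if (c == '=') {
                        state = State.VALUE;
                    } else {
                        key.append(c);
                    }
                    break;
                case VALUE:
                    // ',' or '}' ends the value; store the pair and reset the buffers.
                    if (c == ',' || c == '}') {
                        map.put(key.toString().trim(), value.toString().trim());
                        key.setLength(0);
                        value.setLength(0);
                        state = State.READY;
                    } else {
                        value.append(c);
                    }
                    break;
            }
        }
        // Note: a final pair is only stored if the closing '}' is present.
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse("{k1=v1, k2=v2}").get("k2")); // prints v2
    }
}
```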

Underhill answered 29/10, 2009 at 16:1 Comment(2)
For raw speed, use the char array to avoid synchronization. Well, that's an old-timer reflex since modern JVMs coarsen the locks :-)Gormless
Oh, good call. I actually completely forgot to drop in how to actually access the characters...Underhill
2

If the string has many entries you might be better off parsing manually without a StringTokenizer to save some memory (in case you have to parse thousands of these strings, it's worth the extra code):


public static Map<String, String> parse(String s) {
    HashMap<String, String> map = new HashMap<String, String>();
    s = s.substring(1, s.length() - 1).trim(); //get rid of the braces
    int kpos = 0; //the starting position of the key
    int eqpos = s.indexOf('='); //the position of the key/value separator
    boolean more = eqpos > 0;
    while (more) {
        int cmpos = s.indexOf(',', eqpos + 1); //position of the entry separator
        String key = s.substring(kpos, eqpos).trim();
        if (cmpos > 0) {
            map.put(key, s.substring(eqpos + 1, cmpos).trim());
            eqpos = s.indexOf('=', cmpos + 1);
            more = eqpos > 0;
            if (more) {
                kpos = cmpos + 1;
            }
        } else {
            map.put(key, s.substring(eqpos + 1).trim());
            more = false;
        }
    }
    return map;
}

I tested this code with these strings and it works fine:

{k1=v1}

{k1=v1, k2 = v2, k3= v3,k4 =v4}

{k1= v1,}

Depose answered 29/10, 2009 at 16:12 Comment(0)
0

Written without testing:

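String result = null;
int i = foo.indexOf(key + "=");
// Skip matches that are only the tail of a longer key (e.g. "xk1=" when looking for "k1")
while (i > 0 && foo.charAt(i - 1) != '{' && foo.charAt(i - 1) != ',') {
    i = foo.indexOf(key + "=", i + 1);
}
if (i > 0) {
    int j = foo.indexOf(',', i);
    if (j == -1) j = foo.length() - 1;  // last entry: stop before the closing '}'
    result = foo.substring(i + key.length() + 1, j);
}
return result;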

Yes, it's ugly :-)

Gormless answered 29/10, 2009 at 16:3 Comment(0)
0

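Well, assuming no '=' or ',' in values, the simplest (if shabby) method is:

int i = foo.indexOf(key + '=');
if (i == -1) return null;                       // key not present
int start = i + key.length() + 1;               // first character of the value
int end = foo.indexOf(',', start);              // substring's end index is exclusive, so no "- 1"
if (end == -1) end = foo.indexOf('}', start);   // last entry ends at the closing brace
return (end != -1) ? foo.substring(start, end) : null;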

Yeah, not recommended :)

Peterus answered 29/10, 2009 at 16:8 Comment(2)
Don't think I'll be using this one, but interesting answer!Tumble
Oh, I know it's not a good way :) I just wanted to point out that this is a fast method. But some users were faster than me and posted similar solutions first. I don't see good solutions in the other answers either, and a proper solution would mean using an AST parser or something similar.Peterus
0

Adding code to check for the existence of key in foo is left as an exercise to the reader :-)

String foo = "{k1=v1,k2=v2,...}";

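String getValue(String key){
    int offset = foo.indexOf(key+'=') + key.length() + 1; // index just past "key="
    int end = foo.indexOf(',', offset);
    if (end == -1) end = foo.indexOf('}', offset); // last entry has no trailing comma
    return foo.substring(offset, end);
}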
Entrain answered 29/10, 2009 at 16:59 Comment(0)
0

Please find my solution:

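import java.util.HashMap;
import java.util.Map;
import com.google.common.base.Strings; // assumption: Strings.isNullOrEmpty is Guava's

public class KeyValueParser {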

    private final String line;
    private final String divToken;
    private final String eqToken;
    private Map<String, String> map = new HashMap<String, String>();

    // user_uid=224620; pass=e10adc3949ba59abbe56e057f20f883e;
    public KeyValueParser(String line, String divToken, String eqToken) {
        this.line = line;
        this.divToken = divToken;
        this.eqToken = eqToken;
        proccess();
    }

    public void proccess() {
        if (Strings.isNullOrEmpty(line) || Strings.isNullOrEmpty(divToken) || Strings.isNullOrEmpty(eqToken)) {
            return;
        }
        for (String div : line.split(divToken)) {
            if (Strings.isNullOrEmpty(div)) {
                continue;
            }
            String[] split = div.split(eqToken);
            if (split.length != 2) {
                continue;
            }
            String key = split[0];
            String value = split[1];
            if (Strings.isNullOrEmpty(key)) {
                continue;
            }
            map.put(key.trim(), value.trim());
        }

    }

    public String getValue(String key) {
        return map.get(key);
    }
}

Usage

KeyValueParser line = new KeyValueParser("user_uid=224620; pass=e10adc3949ba59abbe56e057f20f883e;", ";", "=");
String userUID = line.getValue("user_uid");
Powder answered 25/10, 2012 at 19:51 Comment(0)
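
Since the question asked for no third-party libraries, here is a rough JDK-only variant of the same split-based idea. It is a sketch with made-up names (PlainKeyValueParser, parse), not the answer's own code. One point worth knowing: String.split treats its argument as a regular expression, so a separator such as "|" or "." needs Pattern.quote to be matched literally. Unlike the class above, this version splits each entry on the first separator only, so an empty value is kept as an empty string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical JDK-only variant; names are illustrative.
class PlainKeyValueParser {

    static Map<String, String> parse(String line, String divToken, String eqToken) {
        Map<String, String> map = new HashMap<String, String>();
        if (line == null || line.isEmpty()
                || divToken == null || divToken.isEmpty()
                || eqToken == null || eqToken.isEmpty()) {
            return map; // nothing to parse
        }
        // Pattern.quote makes split match the separators literally, not as regexes.
        for (String entry : line.split(Pattern.quote(divToken))) {
            // Limit 2: split on the first separator only.
            String[] parts = entry.split(Pattern.quote(eqToken), 2);
            if (parts.length != 2) {
                continue; // no separator in this entry
            }
            String key = parts[0].trim();
            if (key.isEmpty()) {
                continue;
            }
            map.put(key, parts[1].trim());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = parse("user_uid=224620; pass=secret;", ";", "=");
        System.out.println(map.get("user_uid")); // prints 224620
    }
}
```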
