Parsing Strings with JavaCC

I'm trying to find a good way to parse string literals with JavaCC without the lexer mistakenly matching their contents as another token. These strings should be able to contain spaces, letters, and digits.

My identifier and number tokens are as follows:

<IDENTIFIER: (["a"-"z", "A"-"Z"])+>
<NUMBER: (["0"-"9"])+>

My current string token is:

<STRING: "\"" (<IDENTIFIER> | <NUMBER> | " ")+ "\"">

Ideally, I want to only save the stuff that's inside of the quotes. I have a separate file in which I do the actual saving of variables and values. Should I remove the quotes in there?
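On the quote question, one option is to strip the delimiters at the point where the token is consumed, so the storage code only ever sees the contents. Below is a minimal sketch in plain Java, assuming the matched image always starts and ends with a double quote (which the STRING token above would guarantee). `StringImages` and `stripQuotes` are hypothetical names, not part of JavaCC.

```java
/** Hypothetical helper; not part of JavaCC or the original grammar. */
final class StringImages {
    private StringImages() {}

    /**
     * Returns the text between the surrounding double quotes of a matched
     * string image. Images that are not quoted are returned unchanged.
     */
    static String stripQuotes(String image) {
        boolean quoted = image.length() >= 2
                && image.charAt(0) == '"'
                && image.charAt(image.length() - 1) == '"';
        return quoted ? image.substring(1, image.length() - 1) : image;
    }
}
```

In a grammar action this could be called as `File.saveVariable(variable.image, StringImages.stripQuotes(message.image));` with `message` bound to `<STRING>`. Doing it in the parser keeps the saving code unaware of the surface syntax, though stripping inside the separate file would work just as well.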

I originally had a method in the parser file like this:

variable=<IDENTIFIER> <ASSIGN> <QUOTE> message=<IDENTIFIER> <QUOTE>
{File.saveVariable(variable.image, message.image);}

But, as you might guess, this didn't allow for spaces—or numbers for that matter. For identifiers such as variable names, I only want to allow letters.

So, I'd just like to get some advice on how I could go about capturing string literals. In particular, I'd like to make strings such as:

" hello", "hello ", " hello " and "\nhello", "hello\n", "\nhello\n"

valid in my syntax.

TOKEN: { <QUOTE: "\""> : STRING_STATE }

<STRING_STATE> MORE: { "\\" : ESC_STATE }

<STRING_STATE> TOKEN:
{
    <ENDQUOTE: <QUOTE>> : DEFAULT
|   <CHAR: ~["\"", "\\"]>
}

<ESC_STATE> TOKEN:
{
    <CNTRL_ESC: ["\"", "\\", "/", "b", "f", "n", "r", "t"]> : STRING_STATE
}

/**
 * Match a quoted string.
 */
String string() :
{
    StringBuilder builder = new StringBuilder();
}
{
    <QUOTE> ( getChar(builder) )* <ENDQUOTE>
    {
        return builder.toString();
    }
}

/**
 * Match char inside quoted string.
 */
void getChar(StringBuilder builder):
{
    Token t;
}
{
    ( t = <CHAR> | t = <CNTRL_ESC> )
    {
        if (t.image.length() < 2) {
            // CHAR
            builder.append(t.image.charAt(0));
        } else if (t.image.length() < 6) {
            // ESC
            char c = t.image.charAt(1);
            switch (c) {
                case 'b': builder.append((char) 8); break;
                case 'f': builder.append((char) 12); break;
                case 'n': builder.append((char) 10); break;
                case 'r': builder.append((char) 13); break;
                case 't': builder.append((char) 9); break;
                default: builder.append(c);
            }
        }
    }
}
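The escape mapping in getChar can also be tried outside the generated parser. The sketch below applies the same mapping to the raw text between the quotes, in plain Java. `Escapes` and `unescape` are hypothetical names, and throwing on an unknown escape or a trailing backslash is an assumption meant to mirror the lexer rejecting such input.

```java
/** Hypothetical stand-alone version of the escape mapping; not generated by JavaCC. */
final class Escapes {
    private Escapes() {}

    /** Decodes the backslash escapes for quote, backslash, slash, b, f, n, r and t. */
    static String unescape(String raw) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\') {
                builder.append(c);          // ordinary character, like <CHAR>
                continue;
            }
            if (++i == raw.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = raw.charAt(i);         // character after the backslash
            switch (e) {
                case 'b': builder.append('\b'); break;
                case 'f': builder.append('\f'); break;
                case 'n': builder.append('\n'); break;
                case 'r': builder.append('\r'); break;
                case 't': builder.append('\t'); break;
                case '"': case '\\': case '/': builder.append(e); break;
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + e);
            }
        }
        return builder.toString();
    }
}
```

With this mapping, the six characters `\nhello\n` as typed in the source text (backslash, n, and so on) decode to "hello" with a real newline on each side, and leading or trailing spaces pass through untouched.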
