Probably the main thing that's throwing it off is that \s matches both horizontal and vertical space. To match just horizontal space, use \h, and to match just vertical space, \v.
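To make the difference concrete, here is a minimal sketch that tests a newline and a tab against each of the three character classes. The variable names are only for illustration and do not come from the question:

```raku
# Illustrative strings, not taken from the original data.
my $newline = "\n";
my $tab     = "\t";

say so $newline ~~ /^ \s $ /;   # True:  \s matches any whitespace
say so $newline ~~ /^ \h $ /;   # False: \h matches horizontal whitespace only
say so $newline ~~ /^ \v $ /;   # True:  \v matches vertical whitespace only
say so $tab     ~~ /^ \h $ /;   # True
say so $tab     ~~ /^ \v $ /;   # False
```

This is why a pattern built on \s can silently run past the end of a line.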
One small recommendation I'd make is to avoid including the newlines in the token. You might also want to use the separator modifiers % or %% on your quantifiers, as they're designed for handling this type of work:
    grammar Parser {
        token TOP       {
                          <headerRow>     \n
                          <valueRow>+ %%  \n
                        }
        token headerRow { <.ws>* %% <header> }
        token valueRow  { <.ws>* %% <value>  }
        token header    { \S+ }
        token value     { \S+ }
        token ws        { \h* }
    }
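As a side note on the two modifiers, here is a small sketch of how % and %% differ. The sample strings are hypothetical and use a comma as the separator to keep things visible:

```raku
# Hypothetical sample strings to show the difference between % and %%.
my $plain    = "a,bb,ccc";
my $trailing = "a,bb,ccc,";

say so $plain    ~~ /^ [ \w+ ]+ %  ',' $ /;   # True
say so $trailing ~~ /^ [ \w+ ]+ %  ',' $ /;   # False: % does not allow a trailing separator
say so $trailing ~~ /^ [ \w+ ]+ %% ',' $ /;   # True:  %% allows an optional trailing separator
```

The same modifier is what lets TOP in the Parser grammar treat \n as the separator between value rows.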
The result of Parser.parse($dat) with this grammar is the following:
「ID Name Email
1 test [email protected]
321 stan [email protected]
」
headerRow => 「ID Name Email」
header => 「ID」
header => 「Name」
header => 「Email」
valueRow => 「 1 test [email protected]」
value => 「1」
value => 「test」
value => 「[email protected]」
valueRow => 「 321 stan [email protected]」
value => 「321」
value => 「stan」
value => 「[email protected]」
valueRow => 「」
This shows us that the grammar has successfully parsed everything. However, let's focus on the second part of your question: that you want the result to be available in a variable. To do that, you'll need to supply an actions class, which is very simple for this project. You just make a class whose method names match the token names of your grammar (although very simple ones, like value/header, that don't require special processing besides stringification, can be ignored). There are more creative/compact ways to handle the processing, but I'll go with a fairly rudimentary approach for illustration. Here's our class:
    class ParserActions {
        method headerRow ($/) { ... }
        method valueRow  ($/) { ... }
        method TOP       ($/) { ... }
    }
Each method has the signature ($/), which is the regex match variable. So now, let's ask what information we want from each token. In the header row, we want each of the header values, in order. So:
    method headerRow ($/) {
        # Stringify each <header> match object
        my @headers = $<header>.map: *.Str;
        make @headers;
    }
Any token with a quantifier on it will be treated as a Positional, so we could also access each individual header match with $<header>[0], $<header>[1], etc. But those are match objects, so we just quickly stringify them. The make function attaches the data we've created to the match, so that other action methods can retrieve it later.
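A stand-alone sketch of these two points, positional access to quantified captures and the make/made pair, might look like the following. The Words grammar and WordsActions class are hypothetical names invented for this illustration and are not part of the parser above:

```raku
# A tiny stand-alone grammar; Words and WordsActions are hypothetical
# names used only for this illustration.
grammar Words {
    token TOP  { <word>+ % ' ' }
    token word { \w+ }
}

class WordsActions {
    # Stringify each <word> match and attach the result to the TOP match.
    method TOP ($/) { make $<word>.map(*.Str).Array }
}

my $match = Words.parse('foo bar baz', :actions(WordsActions));

say $match<word>[1].^name;   # Match: quantified captures are a list of Match objects
say $match<word>[1].Str;     # bar
say $match.made.join('|');   # foo|bar|baz
```

Whatever is passed to make inside an action method is what .made returns on the corresponding match afterwards.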
Our value row will look identical, because the $<value> tokens are what we care about.
    method valueRow ($/) {
        # Stringify each <value> match object
        my @values = $<value>.map: *.Str;
        make @values;
    }
When we get to the last method, we will want to create the array of hashes.
    method TOP ($/) {
        my @entries;
        my @headers = $<headerRow>.made;
        my @rows    = $<valueRow>.map: *.made;
        for @rows -> @values {
            my %entry = flat @headers Z @values;
            @entries.push: %entry;
        }
        make @entries;
    }
Here you can see how we access the stuff we processed in headerRow() and valueRow(): you use the .made method. Because there are multiple valueRows, we need to do a map to get each of their made values (this is a situation where I tend to write my grammar to have simply <header><data> in TOP, and define the data as being multiple rows, but this is simple enough that it's not too bad).
Now that we have the headers and rows in two arrays, it's simply a matter of making them an array of hashes, which we do in the for loop. The flat @x Z @y just interleaves the elements, and the hash assignment Does What We Mean, but there are other ways to get the array of hashes you want.
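For comparison, here is a hedged sketch of three ways to pair up headers and values. The data is illustrative only, and the email address is a placeholder rather than a value from the original input:

```raku
# Illustrative data; the address is a placeholder, not from the original input.
my @headers = 'ID', 'Name', 'Email';
my @values  = '1', 'test', 'test@example.com';

# 1. Interleave keys and values, as in the TOP method.
my %interleaved = flat @headers Z @values;

# 2. Zip with the pair constructor to build key => value pairs directly.
my %paired = @headers Z=> @values;

# 3. Assign through a hash slice.
my %sliced;
%sliced{@headers} = @values;

say %interleaved<Name>;   # test
say %paired<Email>;       # test@example.com
say %sliced<ID>;          # 1
```

All three produce the same keys and values here; which one reads best is mostly a matter of taste.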
Once you're done, you just make it, and then it will be available in the made of the parse:
    say Parser.parse($dat, :actions(ParserActions)).made;
-> [{Email => [email protected], ID => 1, Name => test} {Email => [email protected], ID => 321, Name => stan} {}]
It's fairly common to wrap this in a sub, like
    sub parse-tsv($tsv) {
        return Parser.parse($tsv, :actions(ParserActions)).made;
    }
That way you can just say
    my @entries = parse-tsv($dat);
    say @entries[0]<Name>;  # test
    say @entries[1]<Email>; # [email protected]
Nil. It's pretty barren as far as feedback goes, right? For debugging, download commaide if you haven't already, and/or see "How can error reporting in grammars be improved?". You got Nil cuz your pattern assumed backtracking semantics. See my answer about that. I recommend you eschew backtracking. See @user0721090601's answer about that. For sheer practicality and speed, see JJ's answer. Also, "Introductory general answer to 'I want to parse X with Raku. Can anyone help?'". – Arctic