ocamlyacc parse error: what token?
P

3

16

I'm using ocamlyacc and ocamllex. I have an error production in my grammar that signals a custom exception. So far, I can get it to report the error position:

| error { raise (Parse_failure (string_of_position (symbol_start_pos ()))) }

But I also want to know which token was read. There must be a way; does anyone know?

Thanks.

Phylactery answered 19/12, 2009 at 14:56 Comment(0)
H
16

Tokens are generated by the lexer, so when an error occurs you can ask the lexer buffer for the token it matched most recently:

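  let parse_buf_exn lexbuf =
    try
      T.input T.rule lexbuf
    with exn ->
      begin
        (* position of the lexer at the moment the parser failed *)
        let curr = lexbuf.Lexing.lex_curr_p in
        let line = curr.Lexing.pos_lnum in
        (* column = absolute offset minus offset of the start of the line *)
        let cnum = curr.Lexing.pos_cnum - curr.Lexing.pos_bol in
        (* text of the last token matched by the lexer *)
        let tok = Lexing.lexeme lexbuf in
        (* rest of the input, collected by a project-specific lexer rule *)
        let tail = Sql_lexer.ruleTail "" lexbuf in
        raise (Error (exn,(line,cnum,tok,tail)))
      end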

Lexing.lexeme lexbuf is what you need. The other parts are not necessary but are useful. ruleTail concatenates all remaining tokens into a string so the user can easily locate the error position. Note that lexbuf.Lexing.lex_curr_p only contains correct line information if the lexer updates it. (source)
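
To illustrate the last point about keeping lex_curr_p up to date: the standard library provides Lexing.new_line, which a lexer typically calls from its newline rule. Below is a minimal sketch, not taken from the answer, that simulates what such a rule does. The helper name line_and_column and the sample input are made up for illustration.

```ocaml
(* Read the current line and column out of a lexer buffer. *)
let line_and_column (lexbuf : Lexing.lexbuf) : int * int =
  let p = lexbuf.Lexing.lex_curr_p in
  (p.Lexing.pos_lnum, p.Lexing.pos_cnum - p.Lexing.pos_bol)

let () =
  let lexbuf = Lexing.from_string "select 1\nfrm" in
  (* Pretend the lexer has just consumed the first line including its
     newline (9 characters). A generated lexer advances pos_cnum itself. *)
  lexbuf.Lexing.lex_curr_p <-
    { lexbuf.Lexing.lex_curr_p with Lexing.pos_cnum = 9 };
  (* This is the call a newline rule would make, for example
     '\n' { Lexing.new_line lexbuf; token lexbuf } in the .mll file. *)
  Lexing.new_line lexbuf;
  let line, col = line_and_column lexbuf in
  Printf.printf "line %d, column %d\n" line col
```

Without the new_line call the line number stays at 1 and the column keeps growing across lines, which is the usual symptom of a lexer that does not track positions.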

Houseraising answered 21/12, 2009 at 9:5 Comment(4)
Great answer. I have one question, though: why do we have to use lexbuf.Lexing.lex_curr_p instead of lexbuf.lex_curr_p? - Arduous
Because lex_curr_p belongs to the Lexing module. Either open it or wait until OCaml gets wiser and understands unqualified record field references. - Houseraising
Where do you find Sql_lexer and Error? - Autry
Could someone please state explicitly where this function should be pasted? - Autry
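
On the last two comments: Sql_lexer and Error are not part of the standard library, so they presumably come from the answerer's own project. A minimal self-contained sketch of the same idea follows, assuming you define the exception yourself and put the wrapper in the ordinary .ml file that calls the generated parser (not in the .mly grammar). The names Error and parse_with_context are hypothetical, and the tail component is dropped because it needs a project-specific lexer rule.

```ocaml
(* Hypothetical exception: the original exception plus (line, column, token). *)
exception Error of exn * (int * int * string)

(* [parse] stands in for a generated entry point already applied to its lexer
   rule, e.g. (Parser.main Lexer.token); both of those names are assumptions. *)
let parse_with_context (parse : Lexing.lexbuf -> 'a) (lexbuf : Lexing.lexbuf) : 'a =
  try parse lexbuf
  with exn ->
    let curr = lexbuf.Lexing.lex_curr_p in
    let line = curr.Lexing.pos_lnum in
    let col = curr.Lexing.pos_cnum - curr.Lexing.pos_bol in
    (* Text of the token the lexer matched last, i.e. the one the parser
       was looking at when it gave up. *)
    let tok = Lexing.lexeme lexbuf in
    raise (Error (exn, (line, col, tok)))
```

A caller would then write something like parse_with_context (Parser.main Lexer.token) (Lexing.from_channel stdin) and match on Error to print the token. As for the qualified field names: since OCaml 4.01 the compiler can resolve lexbuf.lex_curr_p on its own when the type of lexbuf is known, for instance from an annotation like the one above, so the Lexing. prefix is kept here only to match the answer.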
P
22

The best way to debug your ocamlyacc parser is to set the OCAMLRUNPARAM environment variable to include the character p. This makes the parser print every state it goes through and each shift/reduce action it performs.

If you are using bash, you can do this with the following command:

$ export OCAMLRUNPARAM='p'
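
The same trace can also be switched on from inside the program with Parsing.set_trace, which returns the previous setting. Here is a small sketch, assuming you only want the trace around one parse call. The helper with_parser_trace is a hypothetical name, not a library function.

```ocaml
(* Run [f] with the ocamlyacc parser trace switched on, then restore the
   previous setting even if [f] raises. *)
let with_parser_trace (f : unit -> 'a) : 'a =
  let previous = Parsing.set_trace true in
  match f () with
  | result ->
    ignore (Parsing.set_trace previous);
    result
  | exception exn ->
    ignore (Parsing.set_trace previous);
    raise exn
```

For example, with_parser_trace (fun () -> Parser.main Lexer.token lexbuf), where Parser.main and Lexer.token are placeholders for your own generated modules.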
Pastoralize answered 29/7, 2010 at 7:2 Comment(0)
D
2

I think that, similar to yacc, the tokens are stored in variables corresponding to the symbols in your grammar rule. Here since there is one symbol (error), you may be able to simply output $1 using printf, etc.

Edit: responding to comment.

Why do you use an error terminal? I'm reading an ocamlyacc tutorial that says a special error-handling routine is called when a parse error happens. Like so:

3.1.5. The Error Reporting Routine

When the parser function detects a syntax error, it calls a function named parse_error with the string "syntax error" as argument. The default parse_error function does nothing and returns, thus initiating error recovery (see Error Recovery). The user can define a customized parse_error function in the header section of the grammar file such as:

let parse_error s = (* Called by the parser function on error *)
  print_endline s;
  flush stdout

Well, looks like you only get "syntax error" with that function though. Stay tuned for more info.

Dunlin answered 19/12, 2009 at 15:42 Comment(2)
Unfortunately, that doesn't work: File "parser.mly", line 372: $1 refers to terminal `error', which has no argument - Phylactery
Can you show me the code for the entire function? I may be able to offer more insight then. - Dunlin
