How to handle errors during parsing in F#

I'm using the fslex/fsyacc utilities for my F# lexer and parser. If the input text has incorrect syntax, I need to know where the error occurs.

In the lexer, it is possible to detect an incorrect lexeme (token) and throw an exception when an invalid symbol or word is used:

rule token = parse
          ...      
  | integer   { INT (Int32.Parse(lexeme lexbuf)) }
  | "*="      { failwith "Incorrect symbol" }
  | eof       { EOF }
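
The failing action can also carry the position of the bad lexeme. This is only a minimal sketch: it assumes the `lexbuf` value that fslex passes to every action exposes `StartPos` with the same `Line` and `Column` members that are read from `EndPos` further down, and that the lexer's newline rule keeps the line counter up to date. The message text is illustrative.

```fsharp
rule token = parse
          ...
  | "*="      { (* Assumption: StartPos marks where the rejected lexeme begins *)
                let pos = lexbuf.StartPos
                failwithf "Incorrect symbol '%s' at line %d, column %d"
                          (lexeme lexbuf) pos.Line pos.Column }
```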

My question is more about the parser (fsyacc): the input text has valid tokens and was successfully tokenized by the lexer, but an error happens during parsing (for example, the tokens are in the wrong order, or a token required by a rule is missing).

I know that catching the exception gives the position (line and column) where parsing failed:

try
   Parser.start Lexer.token lexbuf
with e ->
   let pos = lexbuf.EndPos
   let line = pos.Line
   let column = pos.Column
   let message = e.Message  // "parse error"
    ... 

But is it possible (and if so, how) to also determine the AST class for which parsing failed?

For example, is it possible to write something like the following in my parser.fsy file:

Expression1: 
   | INT         { Int $1 }
     ...
   | _           { failwith "Error with parsing in Expression1"}
Fowkes answered 9/3, 2011 at 10:47 Comment(0)

Just skipping the "_" should lead to a shift/reduce conflict. For a small set of tokens, you could list them all. For a larger set of tokens, it is more problematic.

The F# compiler does something similar by defining rules that match prefixes of earlier rules and setting an error state:

atomicPattern:
  ...
  | LPAREN parenPatternBody RPAREN 
      {  let m = (lhs(parseState)) in SynPat.Paren($2 m,m) } 
  | LPAREN parenPatternBody recover 
      { reportParseErrorAt (rhs parseState 1) (FSComp.SR.parsUnmatchedParen()); $2 (rhs2 parseState 1 2) }
  | LPAREN error RPAREN 
      { (* silent recovery *) SynPat.Wild (lhs(parseState)) }
  | LPAREN recover 
      {  reportParseErrorAt (rhs parseState 1) (FSComp.SR.parsUnmatchedParen()); SynPat.Wild (lhs(parseState))}  

recover: 
   | error { true }  
   | EOF { false }

You can see the whole file in the repository.

More info on error handling in ocamlyacc/fsyacc can be found in the OCaml manual (Part III → Lexer and parser generators → Error handling).
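
Applied to the grammar in the question, the same idea might look like the sketch below. It assumes that the reserved `error` token behaves in fsyacc as it does in ocamlyacc (on a syntax error the parser discards states until it reaches one that can shift `error`). It reuses the `Expression1` rule and the `Int` constructor from the question, and the message text is only illustrative.

```fsharp
Expression1:
   | INT         { Int $1 }
     ...
   /* Assumption: taken when no other alternative of Expression1 can continue */
   | error       { failwith "Error with parsing in Expression1" }
```

The exception raised in the action should then reach the same `try ... with` around `Parser.start`, where `lexbuf.EndPos` still gives the line and column. Note that the parser picks the nearest enclosing rule on its stack that offers an `error` alternative, so the message names that rule, which is not necessarily the one you had in mind.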

Samothrace answered 9/3, 2011 at 12:32 Comment(2)
Thank you for your answer, it helps a lot. I'm trying to add the solution to my code. - Fowkes
Both solutions (keys in rules with blank tokens, and the example with recover) work for me, so thank you again. I'm marking the answer as accepted. - Fowkes
