How to control error handling and synchronization in ANTLR 4 / C#

I'm using ANTLR 4 with the C# target. Here is a subset of my grammar:

/*
 * Parser Rules
 */
text : term+  EOF;
term : a1 a2 a3;
a1: ....
...
...

I want to accept valid data blocks as terms. When an error occurs, I want to skip ahead to the next valid term and print the whole text that caused the error, so that the user can analyze it manually.

How do I synchronize the input to the next valid term, and how do I get the ignored text?

Sparklesparkler answered 31/8, 2013 at 16:4 Comment(0)

You will need to create your own implementation of IAntlrErrorStrategy for this, and then set the Parser.ErrorHandler property to an instance of your error strategy. The documentation for the Java versions of the ANTLRErrorStrategy interface and default implementation DefaultErrorStrategy may provide useful information for implementing an error strategy, but I must warn you going in that creating a custom error strategy is an advanced feature with limited documentation. It's expected that the implementer is already an expert in ANTLR 4's implementation of the Adaptive LL(*) parsing algorithm (we're talking researcher-level understanding).

Felloe answered 31/8, 2013 at 23:18 Comment(4)
Thanks, Mr. Harwell. (This is the second time you have answered one of my two similar questions, so thank you very much. But in both cases the answer was very general, and I still cannot find a solution.) If I just wanted to get the text that is ignored when an error occurs, how could I find it? - Sparklesparkler
There is no simple solution here that is not "very general". Each time I've needed something special regarding error handling, I rolled my own custom solution that was specific to the particular language I was parsing and the application I was developing. - Felloe
The DefaultErrorStrategy documentation says that the Recover method consumes tokens until resynchronization. Can we collect the consumed tokens? (I'm trying to extend DefaultErrorStrategy.) - Sparklesparkler
Thank you, Mr. Harwell. I found some useful information and put it in an answer; I hope you can advise me if there is a better solution. - Sparklesparkler

For the first question (how to synchronize the input to the next valid term), I found some useful information that led me to an acceptable solution.

ANTLR generates the following code for the text rule of the grammar above:

public TextContext text() {
    TextContext _localctx = new TextContext(_ctx, State);
    EnterRule(_localctx, 0, RULE_text);
    int _la;
    try {
        EnterOuterAlt(_localctx, 1);
        State = 49;
        _errHandler.Sync(this);
        _la = _input.La(1);
        do {
            State = 48; term();
            State = 51;
            _errHandler.Sync(this);
            _la = _input.La(1);
        } while ( _la==KEYWORD );
        State = 53; Match(EOF);
    }
    catch (RecognitionException re) {
        _localctx.exception = re;
        _errHandler.ReportError(this, re);
        _errHandler.Recover(this, re);
    }
    finally {
        ExitRule();
    }
    return _localctx;
}

The call _errHandler.Sync(this); makes the parser advance through the input stream in an attempt to find the next valid term (a consequence of the "term+" component). To stop the parser from synchronizing anywhere except in the text rule, around the term+ loop, I extended the DefaultErrorStrategy class as follows:

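public class MyErrorStrategy : Antlr4.Runtime.DefaultErrorStrategy
{
    // The constructor name must match the class name.
    public MyErrorStrategy() : base()
    { }

    // Resynchronize only while the parser is inside the "text" rule;
    // in every other rule Sync does nothing.
    public override void Sync(Antlr4.Runtime.Parser recognizer)
    {
        if (recognizer.Context is Dict.TextAnalyzer.DictionaryParser.TextContext)
            base.Sync(recognizer);
    }
}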

then provided it to the parser:

parser.ErrorHandler = new MyErrorStrategy();
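
For the second question (how to get the ignored text), one possible approach is sketched below. It is not something confirmed in this thread. It assumes that DefaultErrorStrategy consumes skipped tokens through its ConsumeUntil(Parser, IntervalSet) method, as the Java documentation describes, and that the C# runtime exposes that method as an overridable member. The class name CollectingErrorStrategy and the SkippedText field are hypothetical, and member names or casing may differ between versions of the C# runtime.

```csharp
using System.Collections.Generic;
using Antlr4.Runtime;
using Antlr4.Runtime.Misc;

// Hypothetical helper, not part of the ANTLR runtime.
public class CollectingErrorStrategy : DefaultErrorStrategy
{
    // One entry per run of tokens discarded during resynchronization.
    public readonly List<string> SkippedText = new List<string>();

    // Assumption: ConsumeUntil is declared "protected internal virtual" in the
    // runtime, so an override in another assembly is declared "protected".
    protected override void ConsumeUntil(Parser recognizer, IntervalSet set)
    {
        // Index of the first token that may be thrown away.
        int first = recognizer.CurrentToken.TokenIndex;

        base.ConsumeUntil(recognizer, set);

        // Index of the token the parser resumes on.
        int next = recognizer.CurrentToken.TokenIndex;

        if (next > first)
        {
            // Recover the original text of the skipped token range.
            ITokenStream tokens = (ITokenStream)recognizer.InputStream;
            SkippedText.Add(tokens.GetText(Interval.Of(first, next - 1)));
        }
    }
}
```

If this works with your runtime version, install it with parser.ErrorHandler = new CollectingErrorStrategy(); and read SkippedText after calling parser.text(). The same ConsumeUntil override could be added to MyErrorStrategy to combine both behaviors. Tokens removed by single-token deletion are consumed directly by the parser rather than through ConsumeUntil, so this sketch would not record them.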
Sparklesparkler answered 2/9, 2013 at 0:19 Comment(0)
