Is there a way to pass contextual information to parsers?
Asked Answered
P

2

8

I am parsing a small declarative language where in a scope you can have variables declared (with a type), and then later on, just like in most other languages, the name (without the type) is used.

The declaration of variable would look like this:

?varname
?varname1 ?varname2 - type1
?varname3 ?varname4 ?varname5 - type2

If the type is omitted, the default type should be object, like in the first case. So for that I have a specific parser which returns a list of my own domain object called LiftedTerm (you can just assume its a tuple with the name of the variable and the type of the variable, in reality there is some more stuff in it but irrelevant for this problem):

def typed_list_variables : Parser[List[LiftedTerm]]= typed_variables.+ ^^ { case list => list.flatten.map(variable =>
        LiftedTerm(variable._1, variable._2 match {
          case "object" => ObjectType
          case _ => TermType(variable._2)
        })) }

def typed_variables = ((variable+) ~ (("-" ~> primitive_type)?)) ^^ {
    case variables ~ primitive_type => 
         for (variable <- variables) yield variable -> primitive_type.getOrElse("object")
}

def variable = """\?[a-zA-Z][a-zA-Z0-9_-]*""".r
def primitive_type = """[a-zA-Z][a-zA-Z0-9_-]*""".r

All this works perfectly fine.

Now further down in the same 'scope' I have to parse the parts where there is a reference to these variables. The variable obviously won't be declared again in full. So, in the above example, places where ?varname1 is used won't include type1. However, when I parse the rest of the input I wish to get the reference of the right LiftedTerm object, rather than just a string.

I have some recursive structures in place, so I don't wish to do this mapping at the top level parser. I don't wish to make a 'global mapping' of these either in my RegexParsers object because most of these are scoped and only relevant for a small piece of the input.

Is there a way of passing contextual information to a parser? Ideally I pass the list of LiftedTerm (or better still a map from the variable names String -> LiftedTerm) into the recursive parser calls.

(Apologies if this is something obvious, I am still new to Scala and even newer to parser combinators).

Pathan answered 2/1, 2014 at 14:38 Comment(6)
I believe scala.util.parsing.ast.Binders would have let you do something along these lines, but it has been deprecated for a while now and is on its way to extinction.Robustious
@Robustious Thanks. So what's the standard way of achieving this? I presume its a common thing to need when parsing something which refers to something else which was defined earlier.Pathan
I don't know about standard, sorry. My personal take on this is that needing binders (or an ad-hoc type system) is a clear sign to move to a multiple-pass processor; parse, analyze names, analyze types. That will be drastically easier to maintain. That's just my take, though.Robustious
OK thanks for your input. All examples I find with Scala Parser - Combinators are practically the same... simple instances of mathematical expression parsers or simple structures. :(Pathan
My question here is probably related. Unfortunately the answers aren't terribly satisfying.Feola
@TravisBrown Yep, somewhat related. I think I have to put my untyped variables in a temporary structure without types, and then find their respective type declaration when combining them together into my domain object. Not impressively clean but its the only solution I can think of.Pathan
D
1

AFAIK, scala's combinator parser library is limited to contex-free grammars. Hence, your usecase is not supported.

The proper way to go would be to extend scala.util.parsing.combinator.Parsers and provide a custom Parser class which carries your context around. Than you need to define all the combinators to also deal with the context.

edit: As has been pointed out below, parsers have a method into and flatMap, therefore, when you have a parser that yields your context, you can combine it with another parser that requires a context in monadic style.

Deutsch answered 16/2, 2014 at 21:46 Comment(3)
With combinators flatMap/into, it is clearly not limited to context-free grammars.Rundgren
Where do you see those combinators? I cannot find them here: www.scala-lang.org/api/2.10.2/index.html#scala.util.parsing.combinator.ParserDeutsch
scala-lang.org/api/current/…Rundgren
U
1

"Is there a way of passing contextual information to a parser?" Yes. For instance here is a toy implementation of the lambda calculus that erases variable names.

case class Lam(bod: LambdaExp) extends LambdaExp
case class Var(v: Int) extends LambdaExp {
  require(v >= 0)
}
case class App(f: LambdaExp, a: LambdaExp) extends LambdaExp

sealed trait LambdaExp

and can be parsed with

class LambdaParser extends RegexParsers {

  def varName: Parser[String] = """[a-z|A-Z]+""".r

  def variable(ctx: List[String]): Parser[Var] = varName ^? ({ case name if ctx.contains(name) => Var(ctx.indexOf(name)) }, { name => s"no var in scope with name $name" })

  def scope(v: String, ctx: List[String]): Parser[LambdaExp] = (("." ~> exp(v :: ctx))) ^^ { case bod => bod }

  def lam(ctx: List[String]): Parser[Lam] = (("λ" ~> varName) >> { v => scope(v, ctx) }) ^^ { scope => Lam(scope) }

  def app(ctx: List[String]): Parser[App] = (exp(ctx) ~ exp(ctx)) ^^ { case f ~ a => App(f, a) }

  def exp(ctx: List[String]): Parser[LambdaExp] = ("(" ~> exp(ctx) <~ ")") | ("(" ~> app(ctx) <~ ")") |
    variable(ctx) | lam(ctx)

}

Note the context information being passed through the parse.

Unswear answered 16/3, 2018 at 18:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.