Pyparsing: parsing the contents of PHP function comment blocks using nested parsers

AKA "Add sub-nodes constructed from the results of a Parser.parseAction to the parent parse tree"

I'm trying to parse PHP files using pyparsing (which rules, IMHO), where the function definitions have been annotated with JavaDoc-style annotations. I want to store type information in a form that can be used to generate client stub code.

For example:

/**
*  @vo{$user=UserAccount}
*/
public function blah($user){ ......

Now, I've been able to write a parser; it's super easy using pyparsing. But pyparsing comes with a built-in javaStyleComment expression, which I wanted to reuse. So I parsed the code and then tried to attach a parse action that strips out the comment decoration, runs a sub-parser (sorry, not certain of the terminology) on what is left, and attaches the result to the parent parse tree.

I can't figure out how to do it. The code is below. Incidentally, I could easily write my own javaStyleComment, but I'm wondering whether, in general, it is possible to chain parse results.

Again, sorry if my question is not succinct; I'm only a novice at this stuff.

#@PydevCodeAnalysisIgnore
from pyparsing import delimitedList,Literal,Keyword,Regex,ZeroOrMore,Suppress,Optional,QuotedString,Word,hexnums,alphas,\
    dblQuotedString,FollowedBy, sglQuotedString,oneOf,Group
import pyparsing

digits = "0123456789"
colon = Literal(':')
semi = Literal(';')
period = Literal('.')
comma = Literal(',')
# Note: despite the names, lparen/rparen match braces and lbracket/rbracket match parentheses.
lparen = Literal('{')
rparen = Literal('}')
lbracket = Literal('(')
rbracket = Literal(')')
number = Word(digits)
hexint = Word(hexnums,exact=2)
text = Word(alphas)

php = Literal("<?php") + Literal("echo") + Literal("?>")
print(php.parseString("""<?php echo ?>"""))

funcPerm = oneOf("public private protected")

print(funcPerm.parseString("""public"""))
print(funcPerm.parseString("""private"""))
print(funcPerm.parseString("""protected"""))

stdParam = Regex(r"\$[a-z][a-zA-Z0-9]*")
print(stdParam.parseString("""$dog"""))

dblQuotedString.setParseAction(lambda t:t[0][1:-1])
sglQuotedString.setParseAction(lambda t:t[0][1:-1])
defaultParam = Group(stdParam + Literal("=") + ( dblQuotedString | sglQuotedString | number))
print(defaultParam.parseString(""" $dave = 'dog' """))

param = ( defaultParam | stdParam )
print(param.parseString("""$dave"""))

#print(param.parseString("""dave"""))
print(param.parseString(""" $dave = 'dog' """))
print(param.parseString(""" $dave = "dog" """))

csl = Optional(param  + ZeroOrMore( Suppress( "," ) + param))
print(csl.parseString("""$dog,$cat,$moose     """))
print(csl.parseString("""$dog,$cat,$moose = "denny"     """))
print(csl.parseString(""""""))
#
funcIdent = Regex(r"[a-z][_a-zA-Z0-9]*")
funcIdent.parseString("farb_asdfdfsDDDDDDD")
#
funcStart = Group(funcPerm + Literal("function") + funcIdent)
print(funcStart.parseString("private function dave"))
#
#
litWordlit = Literal("(") +  csl + Literal(")")
print(litWordlit.parseString("""( )"""))

funcDef = funcStart + Literal("(") + Group(csl)  + Literal(")")
#funcDef.Name = "FUNCTION"
#funcDef.ParseAction = lambda t: (("found %s") % t)
print(funcDef.parseString("""private function doggy($bow,$sddfs)"""))

funcDefPopulated = funcStart + Literal("(") + Group(csl)  + Literal(")") + Group(Literal("{")  +  ZeroOrMore(pyparsing.CharsNotIn("}"))  +Literal("}"))
#funcDef.Name = "FUNCTION"
#funcDef.ParseAction = lambda t: (("found %s") % t)
print(funcDefPopulated.parseString("""private function doggy($bow,$sddfs){ $dog="dave" }"""))

#" @vo{$bow=BowVo}"
docAnnotations = ZeroOrMore( Group( Literal("@") + text + Suppress(lparen) + param + Literal("=") + text  + Suppress(rparen ) ))
print(docAnnotations.parseString(""" @vo{$bow=BowVo}"""))

def extractDoco(s,l,t):
    """ Helper parse action for parsing the content of a comment block
    """
    ret = t[0]
    ret = ret.replace('/**','')
    ret = ret.replace('*\n','')
    ret = ret.replace('*\n','\n')
    ret = ret.replace('*/','')
    t = docAnnotations.parseString(ret)
    return  t

phpCustomComment = pyparsing.javaStyleComment

#Can't figure out what to do here. Help !!!!!
phpCustomComment.addParseAction(extractDoco)

commentedFuncDef  =  phpCustomComment + funcDefPopulated
print(commentedFuncDef.parseString(
                                   """
                                   /**
                                   * @vo{$bow=BowVo}
                                   * @vo{$sddfs=UserAccount}
                                   */
                                   private function doggy($bow,$sddfs){ $dog="dave" }"""
                                   ))



#example = open("./example.php","r")
#funcDef.parseFile(example)
#f4.parseString("""private function dave ( $bow )""")
#funcDef = funcPerm + Keyword("function") + funcName + Literal("(")  +  csl  + Literal(")")  
#print funcDef.parseString(""" private function doggy($bow)""")

=== Update

I've discovered that ParseResults has an insert method, which allows you to augment the parse tree, but I still can't figure out how to do it dynamically.

For example:

>>> title = oneOf("Mr Miss Sir Dr Madame")
>>> aname = title + Group(Word(alphas) + Word(alphas))
>>> res = aname.parseString("Mr Dave Young")
>>> res
(['Mr', (['Dave', 'Young'], {})], {})
>>> res.insert(3, 3)
>>> res
(['Mr', (['Dave', 'Young'], {}), 3], {})
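
For reference, the same experiment can be written as a standalone Python 3 script. This is only a minimal sketch, assuming a pyparsing release that still provides the camelCase names (parseString, asList, addParseAction). The second half is a guess at what "dynamically" could mean here: a parse action that mutates the tokens in place while parsing. The name add_marker is hypothetical.

```python
from pyparsing import Group, Word, alphas, oneOf

title = oneOf("Mr Miss Sir Dr Madame")
aname = title + Group(Word(alphas) + Word(alphas))

# Static case: augment the results after parsing has finished.
res = aname.parseString("Mr Dave Young")
res.insert(2, 3)  # list-style insert: put the value 3 at index 2
print(res.asList())  # ['Mr', ['Dave', 'Young'], 3]


def add_marker(tokens):
    # Hypothetical parse action: modify the tokens in place during parsing.
    # Returning None keeps the (now modified) tokens.
    tokens.insert(len(tokens), "checked")


# Attach the action to a copy so that aname itself is left unchanged.
marked = aname.copy().addParseAction(add_marker)
res2 = marked.parseString("Mr Dave Young")
print(res2.asList())  # ['Mr', ['Dave', 'Young'], 'checked']
```

A parse action may also return a new token list instead of mutating the one it receives; in that case the returned tokens replace the original ones.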
Dray answered 22/2, 2012 at 17:27 Comment(4)
Perhaps I should have titled this "PyParsing, add sub-nodes to parse tree". - Dray
If you feel like it, you can edit your question and change the title. - Mongo
You may want to check out how PHPUnit parses the comment blocks, as it allows for a limited number of signatures to be set. For example: /** @expectedException myException **/. More info at phpunit.de/manual/3.6/en/… - Mendelism
Thanks Mike, will check it out. - Dray

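Firstly, I'm in love: pyparsing has to be one of the nicest libraries I've ever used. Secondly, the solution was really, really easy.

Here's how I fixed it. The annotation grammar now skips the leading * on each comment line and suppresses the @vo marker, so the sub-parser called from the parse action actually matches the annotations, and the parse action returns its result: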

docAnnotations = ZeroOrMore( Group( ZeroOrMore(Suppress("*")) +   Suppress(Literal("@")) + Suppress(Literal("vo")) + Suppress(lparen) + param + Literal("=") + text  + Suppress(rparen ) ))
print(docAnnotations.parseString(""" @vo{$bow=BowVo}"""))

def extractDoco(t):
    """ Helper parse action for parsing the content of a comment block
    """
    ret = t[0]
    ret = ret.replace('/**','')   # opening delimiter
    ret = ret.replace('*\n','')   # a bare '*' at the end of a line
    ret = ret.replace('*/','')    # closing delimiter
    print(ret)
    # The returned ParseResults replace the raw comment token in the parent parse tree
    return docAnnotations.parseString(ret)

phpCustomComment = pyparsing.javaStyleComment
phpCustomComment.addParseAction(extractDoco)   # as in the question
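
The chain of str.replace calls in extractDoco could also be factored into a small standalone helper. The sketch below is one possible way to do that with the standard re module; it is not part of the original fix, and the name strip_comment_decoration is hypothetical. It removes the opening and closing delimiters and the leading * of each line, so the annotation grammar would not need to skip asterisks itself.

```python
import re


def strip_comment_decoration(comment):
    """Return the body of a /* ... */ block without delimiters or leading '*'."""
    body = re.sub(r"^\s*/\*+", "", comment)   # opening "/*" or "/**"
    body = re.sub(r"\*+/\s*$", "", body)      # closing "*/"
    # Drop the decorative "*" (and one following space) at the start of each line.
    lines = [re.sub(r"^\s*\*+ ?", "", line) for line in body.splitlines()]
    return "\n".join(line.rstrip() for line in lines).strip()


example = "/**\n * @vo{$bow=BowVo}\n * @vo{$sddfs=UserAccount}\n */"
print(strip_comment_decoration(example))
```

The cleaned text could then be handed to docAnnotations.parseString in place of the manually stripped string.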

The last section:

print(commentedFuncDef.parseString(
                                   """
                                   /**
                                   * @vo{$bow=BowVo}
                                   * @vo{$sddfs=UserAccount}
                                   */
                                   private function doggyWithCustomComment($bow,$sddfs){ $dog="dave" }"""
                                   ))

The Result:

[['$bow', '=', 'BowVo'], ['$sddfs', '=', 'UserAccount'], ['private', 'function', 'doggyWithCustomComment'], '(', ['$bow', '$sddfs'], ')', ['{', ' $dog="dave" ', '}']]
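
For readers on Python 3, the same idea can be condensed into a self-contained script. This is a minimal sketch under some assumptions, not the code from the question: the grammar is cut down to the @vo annotation and the function head, the names parse_annotations, doc_comment and commented_function are made up for illustration, and the comment expression is copied first so that the parse action is not attached to the shared pyparsing.javaStyleComment object.

```python
import pyparsing as pp

# Sub-grammar for the comment body: "@vo{$name=Type}", optionally preceded
# by the "*" that decorates each line of a doc block.
variable = pp.Regex(r"\$[a-z][a-zA-Z0-9]*")
type_name = pp.Word(pp.alphas)
annotation = pp.Group(
    pp.ZeroOrMore(pp.Suppress("*"))
    + pp.Suppress("@vo{")
    + variable
    + pp.Suppress("=")
    + type_name
    + pp.Suppress("}")
)
annotations = pp.ZeroOrMore(annotation)


def parse_annotations(tokens):
    """Parse action: replace the raw comment token with parsed annotations."""
    comment = tokens[0]
    if not comment.startswith("/*"):
        return []  # a "//" line comment carries no annotations in this sketch
    # Slice off "/*" and "*/"; any remaining "*" is skipped by the grammar.
    return annotations.parseString(comment[2:-2])


# Work on a copy so the shared pyparsing.javaStyleComment stays untouched.
doc_comment = pp.javaStyleComment.copy().setParseAction(parse_annotations)

function_head = (
    pp.oneOf("public private protected")
    + pp.Suppress("function")
    + pp.Word(pp.alphas, pp.alphanums + "_")
)
commented_function = doc_comment + function_head

source = """
/**
 * @vo{$bow=BowVo}
 * @vo{$sddfs=UserAccount}
 */
private function doggy($bow,$sddfs){ $dog="dave" }"""

result = commented_function.parseString(source)
print(result.asList())
# [['$bow', 'BowVo'], ['$sddfs', 'UserAccount'], 'private', 'doggy']
```

The value returned by the parse action replaces the raw comment token, which is why the annotation groups appear at the front of the parent result.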
Dray answered 22/2, 2012 at 19:25 Comment(0)
