Example for ast.NodeTransformer that mutates an equation

Asked 17/6, 2011 at 11:48 Answered 21/8, 2011 at 11:32

python abstract-syntax-tree genetic-programming

This is a continuation of my last question. I want to parse an equation and work on the ast I get. What I want to do is basically randomly scramble it so I get a new equation, that has to be also a valid function. This is to be used in a genetic algorithm.

Here is where I start:

class Py2do(ast.NodeTransformer):
def __init__(self):
  self.tree=[]
def generic_visit(self, node):
    print type(node).__name__
    self.tree.append(type(node).__name__)
    ast.NodeVisitor.generic_visit(self, node)
    depth=3
    s = node.__dict__.items()
    s = "    ".join("%s %r" % x for x in sorted(node.__dict__.items()))
    print( "%s%s\t%s" % (depth, str(type(node)), s) )
    for x in ast.iter_child_nodes(node):
      print (x, depth)

def visit_Name(self, node):
    # print 'Name :', node.id
    pass

def visit_Num(self, node):
    print 'Num :', node.__dict__['n']

def visit_Str(self, node):
    print "Str :", node.s

def visit_Print(self, node):
    print "Print :"
    ast.NodeVisitor.generic_visit(self, node)

def visit_Assign(self, node):
    print "Assign :"
    ast.NodeVisitor.generic_visit(self, node)

def visit_Expr(self, node):
    print "Expr :"
    ast.NodeVisitor.generic_visit(self, node)





if __name__ == '__main__':
    node = ast.parse("res= e**(((-0.5*one)*((delta_w*one/delta*one)**2)))")
    import ast_pretty
    print ast.dump(node)
    pprintAst(node)
    v = Py2do()
    v.visit(node)
    print v.tree

What I want to get out is something like this :

res= e**(delta*((one/delta_w*one)**2)))

or another valid random equation of some sort. This will be used in a Fortran program, so it would be nice if the resulting equation can also be transferred into Fortran. Please comment your code and provide a test sample/unit test.

Nullifidian answered 17/6, 2011 at 11:48 Comment(7)

I think you should mark expression nodes with their proper type, e.g. associative, binary, monary, etc. Then you can scramble the tree based on the type of operators to maintain coherency. – Cesta 8/7, 2011 at 12:45

it is not that easy to access the nodes of the tree. if i had that that would be a big step forward. – Nullifidian 8/7, 2011 at 13:34

But you've already done a "Visitor" pattern there... you could clone the tree and then mutate it accordingly. – Cesta 8/7, 2011 at 16:38

ok that would mean set up a secondary data structure and then mutate it. that would not be that hard i think. but what about the conversion back? i need a valid expression in the end... – Nullifidian 11/7, 2011 at 6:42

You could do an expression generator on your new data structure and then you can 'ast.parse()' it or even 'compile()' it for later runnning on the python interpreter. You could even have an 'expresion' object than can fit the expressions according to the language you want to generate. – Cesta 11/7, 2011 at 13:19

how would i set up an expression generator? – Nullifidian 11/7, 2011 at 13:44

You just traverse the generated tree on depth-first order; on the leafs print the value/variable; when returning from a leaf on a binary-op put the operator between each sibling; when returning from a node to a group-op (parentheses) you wrap the expression on them. Similar to the way Lambda-calculus works ;) – Cesta 11/7, 2011 at 20:1

So the input and the output are Fortran code? And you want to use arbitrary Fortran expressions/statements? (Including array slices, ...?) Fortran is a pretty complex language; reading it requires pretty much a full parser.

Perhaps you want to use an program transformation tool that can already manipulate Fortran directly. Such a tool would read the Fortran code, build an AST, let you "randomize" it using a set of randomly chosen transformations, and then regenerate valid Fortran code.

Our DMS Software Reengineering Toolkit with its Fortran front end could be directly used for this.

EDIT Aug 26 2011: OP confirms he wants to "evolve" (transform) real Fortran code. It is worth noting that building a real Fortran parser (like building parsers for any other real language) is pretty hard; it took us months and our tools are really good at defining parsers (we've done some 40 languages and a variety of dialects using DMS). It is probably not a good idea for him to build his own real Fortran parser, at least not if he wants to get on with his life or his actual task.

It might be possible for OP to constrain the Fortran code to a very restricted subset, and build a parser for that.

Podiatry answered 21/8, 2011 at 11:32 Comment(2)

actuallz, yes that is what i want to do in the end. but at first i am working on just transforming an expression. is there any way to test your software? – Nullifidian 26/8, 2011 at 8:26

DMS's front end can parse full f77/f95 and transform it. But DMS is pretty complex (because manipulating code is very complex) and takes some effort to learn. So we don't offer a just-download-and-play trial because people that think they will make vast progress in an afternoon are invariably disappointed. What they can do in month or two can be pretty spectacular. Normally we go through a series of confidence building steps to convince both parties that a longer experiment makes sense. The first step is a direct discussion with you; I suggest you contact us via our web site. – Podiatry 26/8, 2011 at 13:42

-1

What are you trying to do? Looking for the right permutation of an equation might be easy but time consuming (n! possibilities), but generating new ones and optimize those using a genetic algorithm is imho impossible, because it`s not an optimization problem... For example x^0.00 and x^0.01 are fundamental different. Also, you can not optimize for the right operator, that just won't work. Sorry.

Although, the situation isn't that bad. Looking for the right function is an extremely common task. I am now assuming that you do not know the function, but you know a couple of points from measurements (you would have needed that to calculate the fitness in your genetic algorithm anyway, didn't you?). You now can use Lagrange to get a polynomial which passes those given points. There are two good examples in the middle of the wikipedia article, and lagrange is quite easy to implement (<10 lines of code I guess). Also note that you have the ability to improve the accuracy of the polynomial just by adding more reference points.

Armourer answered 24/6, 2011 at 12:54 Comment(2)

I think he's asking to generate equations based on an input, then those equations will be input into a genetic algorithm system which is separate/already complete. – Pagepageant 25/6, 2011 at 23:27

actually i want to generate just random equations, just valid code. I know kinda like the equation shoul look like and want the algorithm just to try out a lot of possibilities within the genetic framework i already have. – Nullifidian 26/6, 2011 at 6:32

Recommended topics

Hot tags