compiler construction - depth first traversal of a tree using generators in python -
at moment, i'm programming compiler small subset of python in python. managed construct syntax tree got problems coding tree traversal (which essential generating code). i'll first go along showing data structures:
class abstractsyntaxtree(object): def __init__(self, startsymbol): self.root = node(startsymbol, val=none) self.nodes = [self.root] def addnode(self, name, val, parentid): parent = self.nodes[parentid] self.nodes.append(node(name=name, val=val, parent=parent)) return len(self.nodes)-1 def getlastid(self): return len(self.nodes)-1 def __iter__(self): node in self.root: yield node
this node definition:
class node: def __init__(self, name, val, parent=none): self.name = name self.val = val self.parent = parent self.children = [] if parent: parent.children.append(self) def __iter__(self): yield self child in self.children: node in child: yield node
my parser recursive descent parser, every grammar symbol function calling other grammar symbols. program
start symbol.
def program(self, indentlvl=0): parent = self.syntree.getlastid() if self.smellandconsume(tok.eof, parentid=parent): return self.smellandconsume(tok.newl, parentid=parent) self.syntree.addnode(name=var.statement, val=none, parentid=parent) self.statement() while self.accept , not self.isconsumable(tok.eof): self.consume(tok.newl, parentid=parent) self.syntree.addnode(name=var.statement, val=none, parentid=parent) self.statement() self.consume(tok.eof, parentid=parent)
now curious, if after succesful parsing able print nodes in syntax tree iterating depth first using __iter__
generator defined in node
, abstractsyntaxtree
.
def test_tree_traversal(): node in minipygrammar.syntree: print(node)
doesn't print nodes! when debugged code, realized, root node doesnt't have children in children list, although call addnode
root nodes id. have clue what's happening here?
if need more information or more code snippets, feel free ask.
edit: found solution (although still find odd what's happening here.) code behaves expected:
def test_tree_traversal(code): grammar = grammar() grammar.parse(code) node in grammar.syntree: print(node) def execute_tests(): name, code in programs.items(): parse_test(name, code) test_tree_traversal(code)
before had global grammar object , execute_tests call parse on grammar, after ran test_tree_traversal, accesses grammar-object syntree. strangely, in between calls, garbage collection deleted of nodes in ast. why suppose it's garbage collection? because behaviour nondeterministic.
edit: error-prone code: notice difference instantiate new grammar object before executing test. grammar has method ´parse´ returns true if program syntactically correct , constructs ast accessible via grammar.syntree.
minipygrammar = grammar() def parse_test( programname: str, programcode: str): success = minipygrammar.parse(programcode) if success: print('{} valid minipyprogram :)'.format(programname)) else: print('{} not valid minipyprogram'.format(programname)) print(minipygrammar.syntree) def tree_traversal(code): node in minipygrammar.syntree: print(node) def execute_tests(): name, code in programs.items(): parse_test(name, code) tree_traversal(code) if __name__ == '__main__': execute_tests()
rather trying iterate on tree, recommend using visitor pattern instead. approach allows modularize abstract syntax tree traversal.
note using approach, have create specfic classes each node type in tree. example, have operator
class operator nodes, functioncall
class function call nodes, etc.
here simple example of vistor pattern ast should started. ast consists of operator
nodes operators, , number
nodes numbers:
class node: pass class operator(node): def __init__(self, op, left, right): self.op = op self.left = left self.right = right class number(node): def __init__(self, value): self.value = value class astwalker: def visit(self, node): name = 'visit_' + node.__class__.__name__ vistor = getattr(self, name, self.missing) vistor(node) def missing(self, node): name = node.__class__.__name__ raise exception('no visitor method node type: ' + name) def visit_operator(self, node): print('operator:', node.op) self.visit(node.left) self.visit(node.right) def visit_number(self, node): print('number:', node.value) # ast represents expression: # # (1 * 5) - (3 / 5) # ast = operator(op='-', left=operator(op='*', left=number(1), right=number(5) ), right=operator(op='/', left=number(3), right=number(5) ) ) walker = astwalker() walker.visit(ast)
the output of above code is:
operator: - operator: * number: 1 number: 5 operator: / number: 3 number: 5
the interesting part of above code astwalker
class. it's here implement pattern. here's quick rundown.
the visit
method meat of above code. it's magic happens. keep long story short, visit
takes 1 argument node
. either operator
node or number
. takes name of node's class using node.__class__.__name__
. can see, perpended name visit
since visitor method's each node in tree - visit_operator
, visit_number
have visit.
lastly in self.visit
, use getattr
correct visitor method class. if node number
getattr
return visit_number
method. same applies operator
. visitor method called , node
passed in.
if find somehow node
object passed in doesn't have visitor method, return self.missing
, call it. self.missing
simple reports node object encountered didn't have visitor method.
as said above, each visitor method takes 1 argument node
. current node visiting. in example above, print attributes of each node
. modified generate bytecode.
Comments
Post a Comment