I'm using PLY as my lexer. My token specifications are the following:
t_WHILE = r'while'
t_THEN = r'then'
t_ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
t_NUMBER = r'\d+'
t_LESSEQUAL = r'<='
t_ASSIGN = r'='
t_ignore = r' \t'
When I try to parse the following string:
"while n <= 0 then h = 1"
It gives the following output:
LexToken(ID,'while',1,0)
LexToken(ID,'n',1,6)
LexToken(LESSEQUAL,'<=',1,8)
LexToken(NUMBER,'0',1,11)
LexToken(ID,'hen',1,14) ------> PROBLEM!
LexToken(ID,'h',1,18)
LexToken(ASSIGN,'=',1,20)
LexToken(NUMBER,'1',1,22)
It doesn't recognize the token THEN, instead it takes "hen" as an identifier.
Any ideas?
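r' \t' is a raw string. My guess is that the \t is not treated as an escape sequence in it, so the string holds a space, a backslash, and the letter t, and t itself ends up among the ignored characters. It should work if you remove the initial r => t_ignore = ' \t'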
– Domash
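To make the comment above concrete, here is a minimal sketch in plain Python (no PLY needed) that compares the two string literals. The helper `next_word` is hypothetical and only imitates the idea of skipping ignored characters before a token is matched; it is not PLY's actual implementation.

```python
raw_ignore = r' \t'    # as written in the question: raw string literal
plain_ignore = ' \t'   # suggested fix: ordinary string literal

# The raw string holds three characters: space, backslash, letter t.
print(list(raw_ignore))    # [' ', '\\', 't']
# The ordinary string holds two characters: space and a real tab.
print(list(plain_ignore))  # [' ', '\t']


def next_word(text, pos, ignore):
    """Hypothetical helper: skip characters found in `ignore`,
    then return the run of non-space characters that follows."""
    while pos < len(text) and text[pos] in ignore:
        pos += 1
    end = pos
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[pos:end]


data = "while n <= 0 then h = 1"
start = 12  # index of the space just before "then"

print(next_word(data, start, raw_ignore))    # hen
print(next_word(data, start, plain_ignore))  # then
```

With the raw string, the leading `t` of `then` is skipped along with the space, which matches the `LexToken(ID,'hen',1,14)` line in the output above.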