Why don't you just use regular expressions to recognize the comments? The whole point of lex/flex is to save you from having to write lexical scanners by hand. The code you present should work (if you put the pattern /* at the beginning of the line), but it's a bit ugly, and it is not obvious that it will work.
Your question says that you want to skip comments, but the code you provide uses putchar() to print the comment, except for the /* at the beginning. Which is it that you want to do? If you want to echo the comments, you can use an ECHO action instead of doing nothing.
Here are the regular expressions:
Single-line comment
This one is easy because in lex/flex, the . pattern won't match a newline. So the following will match from // up to (but not including) the end of the line, and then do nothing.
"//".* { /* DO NOTHING */ }
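If you want to convince yourself of where that match stops, here is a minimal sketch in plain C using POSIX regcomp()/regexec() rather than flex itself. The helper name line_comment_length is hypothetical. Note one difference between the two engines: in POSIX regexes . matches a newline unless REG_NEWLINE is set, so the flag is needed to mimic flex.

```c
#include <regex.h>

/* Hypothetical helper: returns the length of the // comment at the very
 * start of s, or -1 if s does not start with one. */
int line_comment_length(const char *s)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    /* REG_NEWLINE keeps "." from matching '\n', which is what flex does. */
    if (regcomp(&re, "//.*", REG_EXTENDED | REG_NEWLINE) != 0)
        return -1;

    /* regexec() searches the whole string, so insist on a match at offset 0
     * to imitate flex matching at the current scan position. */
    if (regexec(&re, s, 1, &m, 0) == 0 && m.rm_so == 0)
        len = (int)(m.rm_eo - m.rm_so);

    regfree(&re);
    return len;
}
```

For the input "// hi\nint x;" this returns 5: the match covers "// hi" and leaves the newline for the next rule.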
Multiline comment
This is a bit trickier, and the fact that * is a regular expression operator as well as a key part of the comment marker makes the following regex a bit hard to read. I use [*] as a pattern which recognizes the character *; in flex/lex, you can use "*" instead. Use whichever you find more readable. Essentially, the regular expression matches sequences of characters, each ending with a run of one or more *, until it finds one where the next character is a /. In other words, it has the same logic as your C code.
[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/] { /* DO NOTHING */ }
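Since the pattern is hard to read, it can help to try it on a few strings outside of flex. The following is only a sketch using POSIX regcomp()/regexec(), which accepts the same pattern text; the helper name comment_length and the leading ^ (added so the match must start at the beginning of the input, as a flex match starts at the current scan position) are my additions, not part of the flex rule.

```c
#include <regex.h>

/* Hypothetical helper: returns the length of the comment at the start of s,
 * or -1 if s does not begin with a complete, terminated comment. */
int comment_length(const char *s)
{
    /* Same pattern as the flex rule, with ^ prepended to anchor it. */
    static const char *pattern = "^[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]";
    regex_t re;
    regmatch_t m;
    int len = -1;

    /* REG_NEWLINE is deliberately not set: [^*] has to be able to match
     * newlines, just as it does in flex. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, s, 1, &m, 0) == 0)
        len = (int)(m.rm_eo - m.rm_so);

    regfree(&re);
    return len;
}
```

With this, "/* a */ x" gives 7 (the match stops at the first closing marker), "/**/" gives 4, and both "/*/" and a comment with no closing marker give -1.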
The above requires the terminating */; an unterminated comment will force the lexer to back up to the beginning of the comment and accept some other token, usually a / division operator. That's likely not what you want, but it's not easy to recover from an unterminated comment since there's no really good way to know where the comment should have ended. Consequently, I recommend adding an error rule (fatal_error here is a placeholder for whatever error-reporting function your scanner provides; it is not something flex defines for you):
[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/] { /* DO NOTHING */ }
[/][*] { fatal_error("Unterminated comment"); }
Because flex always prefers the longest match, the second rule only fires when the first one cannot match, that is, when the closing */ is missing.