How can I rewrite the programs, so that I don't have to call `flex` but only call `bison` and `cc`?

I already have a calculator program based on bison and flex which takes input from command line arguments.

Now, how can I rewrite the programs so that I don't have to call flex, but only bison and cc, during the build process? (Achieve something similar to https://unix.stackexchange.com/questions/499190/where-is-the-official-documentation-debian-package-iproute-doc#comment919875_499225).

$ ./fb1-5 '1+3'
= 4

Makefile:

fb1-5:  fb1-5.l fb1-5.y
    bison -d fb1-5.y
    flex fb1-5.l
    cc -o $@ fb1-5.tab.c lex.yy.c -lfl

fb1-5.y

/* simplest version of calculator */

%{
#  include <stdio.h>
%}

/* declare tokens */
%token NUMBER
%token ADD SUB MUL DIV ABS
%token OP CP

%%

calclist: /* nothing */
 | calclist exp { printf("= %d\n> ", $2); }
 ;

exp: factor
 | exp ADD exp { $$ = $1 + $3; }
 | exp SUB factor { $$ = $1 - $3; }
 | exp ABS factor { $$ = $1 | $3; }
 ;

factor: term
 | factor MUL term { $$ = $1 * $3; }
 | factor DIV term { $$ = $1 / $3; }
 ;

term: NUMBER
 | ABS term { $$ = $2 >= 0? $2 : - $2; }
 | OP exp CP { $$ = $2; }
 ;
%%
int main(int argc, char** argv)
{
  // printf("> ");
  if(argc > 1) {
    if(argv[1]){
      yy_scan_string(argv[1]);
    }
  }

  yyparse();
  return 0;
}

yyerror(char *s)
{
  fprintf(stderr, "error: %s\n", s);
}

fb1-5.l:

/* recognize tokens for the calculator and print them out */

%{
# include "fb1-5.tab.h"
%}

%%
"+" { return ADD; }
"-" { return SUB; }
"*" { return MUL; }
"/" { return DIV; }
"|"     { return ABS; }
"("     { return OP; }
")"     { return CP; }
[0-9]+  { yylval = atoi(yytext); return NUMBER; }

"//".*  
[ \t]   { /* ignore white space */ }
.   { yyerror("Mystery character %c\n", *yytext); }
%%

Update:

I tried to follow the advice in the reply; see the modified code below. In main(), why is yyerror() called before printf("argv[%d]: %s ", n, argv[n])? Isn't yyerror() called only by yyparse(), and isn't yyparse() called only after that printf() in main()?

$ ./fb1-5  2*4
2*4error: �
= 8

fb1-5.y:

/* simplest version of calculator */

%{
#  include <stdio.h>
  FILE * fin;
  int yylex (void);
  void yyerror(char *s);  
  %}

/* declare tokens */
%token NUMBER
%token ADD SUB MUL DIV ABS
%token OP CP

%%

calclist: /* nothing */
 | calclist exp { printf("= %d\n", $2); }
 ;

exp: factor
 | exp ADD exp { $$ = $1 + $3; }
 | exp SUB factor { $$ = $1 - $3; }
 | exp ABS factor { $$ = $1 | $3; }
 ;

factor: term
 | factor MUL term { $$ = $1 * $3; }
 | factor DIV term { $$ = $1 / $3; }
 ;

term: NUMBER
 | ABS term { $$ = $2 >= 0? $2 : - $2; }
 | OP exp CP { $$ = $2; }
 ;
%%




/* The lexical analyzer returns a double floating point
   number on the stack and the token NUM, or the numeric code
   of the character read if not a number.  It skips all blanks
   and tabs, and returns 0 for end-of-input.  */

#include <ctype.h>
#include <string.h>

int yylex (void)
{
  char c;

/* Skip white space.  */
  while ((c = getc(fin)) == ' ' || c == '\t'){
    continue;
  }

  // printf("%s", &c);

  /* Process numbers.  */
  if (c == '.' || isdigit (c))
    {
      ungetc(c, fin);
      fscanf (fin, "%d", &yylval);
      return NUMBER;
    }

  /* Process addition.  */
  if (c == '+')
    {
      return ADD;
    }

  /* Process sub.  */
  if (c == '-')
    {
      return SUB;
    }

  /* Process mult.  */
  if (c == '*')
    {
      return MUL;
    }

  /* Process division.  */
  if (c == '/')
    {
      return DIV;
    }

  /* Process absolute.  */
  if (c == '|')
    {
      return ABS;
    }

   /* Process left paren.  */
   if (c == '(')
    {
      return OP;
    }

  /* Process right paren.  */
  if (c == ')')
    {
      return CP;
    }

  /* Return a single char.  */
  yyerror(&c);
  return c;
}


int main(int argc, char** argv)
{
  // evaluate each command line arg as an arithmetic expression
  int n=1;
  while (n < argc) {
    if(argv[n]){
      // yy_scan_string(argv[n]);
      // fin = stdin;
      fin = fmemopen(argv[n], strlen (argv[n]), "r");
      printf("%s ",argv[n]);
      fflush(stdout);
      yyparse();
    }
    n++;
  }

  return 0;
}

void yyerror(char *s)
{
  fprintf(stderr, "error: %s\n", s);
}
Competence answered 8/2, 2019 at 21:47 Comment(2)
flex reads the input. bison doesn't: it just calls flex. You don't have to do anything to bison to take the input from the command line. The grammar has nothing to do with it either. Removing flex doesn't accomplish your objective. Your question doesn't make sense. - Appetency
If what you really want is to read from a string (like a command line arg) instead of a file, you can just use flex's yy_scan_string function to read from a string instead of a file... - Flashover

There is a basic implementation of a lexical scanner in the examples section of the bison manual. (Slightly less basic versions are later in the manual.)

That won't help you directly because it is based on fscanf, which means that it works on an input stream. Most C libraries contain functions which let you treat a character string as a FILE* (see, for example, the POSIX standard fmemopen). Failing that, you'd have to replace the getc and scanf calls with string-based alternatives, which means you will need to keep track of a buffer and input pointer somewhere. strtoul (or strtod) will prove useful because the second argument helps you keep track of how much of the string was used by the number.
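
As a rough illustration of the string-based alternative described above, here is a minimal sketch of a hand-written yylex that walks a character string directly and uses strtoul's end pointer to advance past a number. The token enum and yylval are stand-ins for the declarations bison would generate for the question's grammar, and scan_string and input_cursor are hypothetical names invented for this example; none of this is taken from the bison manual.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins (assumption): in the real program these come from the
   bison-generated parser for the %token declarations in fb1-5.y. */
enum { NUMBER = 258, ADD, SUB, MUL, DIV, ABS, OP, CP };
int yylval;

/* Hypothetical scanner state: the next unread character of the input. */
static const char *input_cursor = "";

/* Hypothetical helper: point the scanner at a NUL-terminated string. */
void scan_string(const char *text)
{
    assert(text != NULL);
    input_cursor = text;
}

/* Return the next token, or 0 at end of input (what bison expects). */
int yylex(void)
{
    /* Skip blanks and tabs. */
    while (*input_cursor == ' ' || *input_cursor == '\t')
        input_cursor++;

    if (*input_cursor == '\0')
        return 0;

    if (isdigit((unsigned char)*input_cursor)) {
        char *end;
        /* strtoul stores in 'end' the address just past the number. */
        yylval = (int)strtoul(input_cursor, &end, 10);
        input_cursor = end;
        return NUMBER;
    }

    char c = *input_cursor++;
    switch (c) {
    case '+': return ADD;
    case '-': return SUB;
    case '*': return MUL;
    case '/': return DIV;
    case '|': return ABS;
    case '(': return OP;
    case ')': return CP;
    default:  return (unsigned char)c;  /* unknown: let the parser reject it */
    }
}
```

In the question's program, main() would call scan_string(argv[n]) before each yyparse() call, and the token values would come from the generated parser rather than from the enum above.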

Bettor answered 8/2, 2019 at 22:37 Comment(14)
Thanks. Could you show me the code after the change? I am still reading bison's manual. - Competence
@tim: I think you would learn a lot more if you at least tried to write the code yourself. It's really not that complicated. - Bettor
Thanks. See my update. I can't figure out how bison uses yylex(), and my modification doesn't work. - Competence
Why are you doing all those ungetc calls? You don't unget a token you have handled... Bison calls yylex when it needs a token. yylex returns a token. That's it. It's not more complicated than that. - Bettor
Also, using single character tokens as with the example in the bison manual does make everything simpler. But you can do it your way, too. - Bettor
I am stuck because I can't figure out how bison uses yylex(). I just want to see how the program can be converted to not use flex, to help me understand that. - Competence
@tim: what do you not understand about "bison calls yylex() when it needs a token. yylex() returns a token."? - Bettor
After removing all ungetc calls, it still doesn't work. There are more problems than you mentioned. Could you just show the correct conversion to not using flex? - Competence
@tim: you need the ungetc before the scanf. Do you see why? That should be sufficient to make it work. - Bettor
Looking at it more closely, it would be good to add if (c == EOF) return 0; but I think it will work without that. Read the "debugging your parser" section of the bison manual to see how to enable bison's trace facility, which might help you debug your parser. - Bettor
Thanks. See my update. When is yyerror() called? Why is it called before printf("argv[%d]: %s ", n, argv[n]) in main()? - Competence
@tim: it isn't. Your printf in main doesn't output a newline so its output is buffered. The fprintf to stderr happens immediately because stderr is unbuffered. There are lots of SO posts about this aspect of the C library (nothing to do with bison or flex). - Bettor
Thanks. Updated my code. (1) What is that mysterious character in error: � in the output, and why is it there? (2) Where should I place the call to yyerror()? I want it to be called only if a character is not recognized, but it is called when I don't expect it. - Competence
@tim: (updated comment). See my comment about handling EOF. Since you don't handle it, the EOF triggers a call to yyerror. Note that the argument to yyerror is supposed to be a NUL-terminated character array, and that a single char is not NUL-terminated. That results in Undefined Behaviour, which might result in a segfault, printing garbage (as you see), or any other random outcome. This has nothing to do with bison or flex either. Also: c should be an int, not a char; getc returns an int because EOF isn't a valid char. See man getc. - Bettor
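
The fixes suggested in the comments above can be collected into one version of the stream-based lexer. What follows is only a sketch under stated assumptions, not the asker's final code: the token enum and yylval stand in for the declarations bison would generate, yyerror is given a const char * parameter for convenience, and the "mystery character" message is made up for this example. It reads c as an int, returns 0 on EOF, keeps the ungetc before fscanf, and passes yyerror a NUL-terminated string.

```c
#define _POSIX_C_SOURCE 200809L  /* asks for POSIX declarations such as fmemopen */
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins (assumption): in the real program these come from the
   bison-generated parser for the %token declarations in fb1-5.y. */
enum { NUMBER = 258, ADD, SUB, MUL, DIV, ABS, OP, CP };
int yylval;

FILE *fin;  /* input stream; the caller sets it, e.g. with fmemopen */

void yyerror(const char *s)
{
    fprintf(stderr, "error: %s\n", s);
}

int yylex(void)
{
    int c;  /* int, not char: getc returns EOF outside the char range */

    assert(fin != NULL);

    /* Skip blanks and tabs. */
    while ((c = getc(fin)) == ' ' || c == '\t')
        continue;

    if (c == EOF)
        return 0;  /* 0 tells the parser the input has ended */

    if (isdigit(c)) {
        /* Push the digit back so fscanf sees the whole number. */
        ungetc(c, fin);
        if (fscanf(fin, "%d", &yylval) != 1)
            yylval = 0;  /* not expected: a digit was just pushed back */
        return NUMBER;
    }

    switch (c) {
    case '+': return ADD;
    case '-': return SUB;
    case '*': return MUL;
    case '/': return DIV;
    case '|': return ABS;
    case '(': return OP;
    case ')': return CP;
    }

    /* Build a NUL-terminated message instead of passing &c. */
    char msg[32];
    snprintf(msg, sizeof msg, "mystery character %c", c);
    yyerror(msg);
    return c;
}
```

With this shape, main() can keep the fmemopen(argv[n], strlen(argv[n]), "r") call from the question, ideally checking the result for NULL and calling fclose(fin) after each yyparse().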
