Where to free memory in Bison/Flex?
Asked Answered
C

3

5

I'm using Bison & Flex for 1 month more or less, so I'm sorry if I don't see something obvious (but I don't think it is).

I have a problem about freeing memory with Flex Bison. Here is what my code looks like:

 parser.l

 {DATE}      { yylval.str= strdup(yytext);
             pair<string,string> newpair = make_pair("DATE",yytext);
             myvector.push_back(newpair);
              return TOKEN_DATE ;}

This is one of the example of my .l file. I copy the value of yytext into yylval.str. Then I create a new pair with that content (key/value, actually), then I return the token date for bison. My parser .y is not more than yyparse; and when something is caught, it just prints it.

I tried to run valgrind on this and I have multiple errors concerning strdup. I know this function uses malloc, but I have no idea WHERE and WHEN to use FREE.

I probably guess it's in the .y file, but where ?

 test:
      TOKEN_DATE                 { cout << $1 << endl; // here ? and what to free ?}

I don't really get all of this, I would really appreciate a simple and clear explanation.

Thanks in advance,


EDIT:

I tried several things like:

 test:
      TOKEN_DATE TOKEN_TOTO TOKEN_BLABLA { cout << $1 << endl; free($1); free($2);}
    | TOKEN_DATE test { cout << $1 << endl, free($1); }

It seems to compile and execute good, but valgrind still says to me that there is a problem with the malloc contained in the strdup function. But I cannot write free(yylval.str) inside the flex file, otherwise, bison will not be aware of the value (if I understood correctly, I tried it doesn't work). I really have no idea on how to remove this leaking problem.

Convergence answered 20/3, 2014 at 13:48 Comment(0)
S
7

You need to free the copied string once you no longer need it. In your rather simple case you could free($1) after printing it out, but it is often the case that the parser inserts the copied string into some datastructure, in which case that datastructure becomes the owner of the malloc'd storage, and the call to free will be performed in a destructor.

It's not really different from any other resource management issue; you need to always be clear who is the owner of an allocated resource because the owner has the responsibility for releasing the resource when it is no longer needed.

What's going on internally is that bison maintains a stack of semantic values, each of which has the type YYSTYPE (i.e. "semantic type"), which is also the type of yylval. When a token is shifted onto the stack, bison copies yylval onto the top of the stack. Before executing an action corresponding to a production, bison arranges for the semantic values of each terminal and non-terminal in the production to be known as $1, $2, etc. (This is not a copy; the various $x symbols are replaced with a reference to a location on the bison stack.)

Non-terminals also have semantic values, because each action stores a value into the pseudo-variable $$. (If the action doesn't do this, the value of $$ is unpredictable, but it still exists.) After the action finishes, bison removes the $1, $2... values from the top of the stack, and then copies the pseudo-variable $$ onto the top of the stack. It does not do anything to the values which were popped, so if they need to be freed or otherwise destructed, the action must do this itself.

Because semantic values are copied naively, the semantic type should not contain any C++ object which is not trivially copyable.

If you use the %union declaration, then the semantic type YYSTYPE is a union object, and you need to tell bison which union tag applies to each terminal and non-terminal. In that case, $$ and all the $n have the correct .tag appended to them automatically, and the actions become somewhat more type-safe.

Spiros answered 20/3, 2014 at 14:23 Comment(3)
Are you talking about the "%union" when you say datastructure ? Plus I tried to do this : ` TOKEN_DATE TOKEN_X TOKEN_Y { cout << $1 << $2 << $3 << endl; free($$) } ` it compiles, but it doesnt change anything. Why ?Convergence
@Convergence No, I mean some datastructure, like an AST (likely a tree or a graph) independent of semantic types. free($$) is definitely wrong because $$ doesn't have a value at that point. It only has a value when you assign it (or if there is no action, in which case @blackmesa: flex inserts the default action $$ = $1. In your modified rule, you have three char*s; if you want to free them (which you do, since you're not saving them anywhere), you need to free all of them: free($1); free($2); free($3);Spiros
@blackmesa: Tried to explain what's going on internally. I'm pretty sure all that is in the bison manual as well.Spiros
S
2

From the flex manual:

21.3 A Note About yytext And Memory

When flex finds a match, yytext points to the first character of the match in the input buffer. The string itself is part of the input buffer, and is NOT allocated separately. The value of yytext will be overwritten the next time yylex() is called. In short, the value of yytext is only valid from within the matched rule's action.

Therefore shouldn't make_pair be the following?

pair<string,string> newpair = make_pair("DATE",yylval.str);

if so, then the string should be freed when cleaning up the pairs memory.

Superheat answered 18/3, 2015 at 10:48 Comment(0)
A
1

The easy answer: it depends. You should free memory when you are done using it, whenever that may be.

The more helpful answer: try using as few allocations of memory as possible. If you never malloc any memory, you never have to free any memory. In the example you give, you pattern match on a date. Typically dates hold to a format and have some upper bound in length; for example, format 'yyyy/mm/dd' has 10 characters in it. If you can expect this, instead of creating a new string to hold the date, you can simply put a statically sized character array into yylval, such as char date[10];.

Augsburg answered 20/3, 2014 at 14:19 Comment(3)
I only gave a sample of my code, I have more than 50 patterns actually, so I take everything as a char*. The problem here is that I have to copy the value of yytext, this create allocation, but I have no idea of what or where to free this "yylval.str" (char* str). I mean, usually I know how to deal with memory, but the "bison-flex" thing still looks a bit blur to me.Convergence
Free it when you no longer need it. In the example you gave, you'd just free($1); after you print it if that is all you're going to use the value for.Augsburg
Does TOKEN_BLABLA need to be freed?Augsburg

© 2022 - 2024 — McMap. All rights reserved.