How can I parse a C string (char *) with flex/bison?
W

5

8

In my programming project I want to parse command-line arguments using flex/bison. My program is called like this:

./prog -a "(1, 2, 3)(4, 5)(6, 7, 8)" filename

Is it possible to parse this string using flex/bison without writing it to a file and parsing that file?

Windham answered 25/11, 2010 at 17:43 Comment(3)
I would think that writing a simple state machine would be easier and cleaner than using Flex or Bison for this. - Geo
If you think you need flex and bison, how complex is this grammar anyway? And I have to agree with James: for just parsing comma-separated lists of integers with optional whitespace and parentheses, C would be best. - Etruria
So far I have only used flex and bison together. On closer inspection, it makes more sense to use only flex. - Windham
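For readers curious what the hand-written approach suggested in these comments might look like, here is a minimal sketch in plain C. The function name `parse_groups` and its signature are hypothetical (not from the thread), and the grammar is assumed to be exactly the one in the question: parenthesised, comma-separated integers with optional whitespace.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, not from the thread: parses "(1, 2, 3)(4, 5)" style
 * input. Stores at most max values in out, writes the number of groups to
 * *ngroups, and returns the number of integers read, or -1 on a syntax
 * error (or if more than max integers are present). */
static int parse_groups(const char *s, int out[], int max, int *ngroups)
{
    int count = 0;
    *ngroups = 0;

    for (;;) {
        while (isspace((unsigned char)*s)) s++;
        if (*s == '\0') return count;        /* clean end of input */
        if (*s != '(') return -1;            /* every group starts with '(' */
        s++;

        for (;;) {
            char *end;
            long value = strtol(s, &end, 10);   /* skips leading whitespace */
            if (end == s || count == max) return -1;
            out[count++] = (int)value;
            s = end;
            while (isspace((unsigned char)*s)) s++;
            if (*s == ',') { s++; continue; }   /* another number follows */
            if (*s == ')') { s++; break; }      /* group is complete */
            return -1;
        }
        (*ngroups)++;
    }
}
```

Calling it on `argv[2]` from the question's command line should yield 8 integers in 3 groups. Group boundaries are only counted here, not stored; a real program would probably record where each group starts.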
N
6

See the related question "String input to flex lexer".

Niigata answered 25/11, 2010 at 18:0 Comment(2)
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - Camilacamile
Or maybe just close the question as a duplicate? - Acaleph
H
3

I think you can achieve this (I did something similar) by using fmemopen to create a stream from a char * and then making the scanner read from that stream instead of stdin.

Something like this (I am writing from memory, so it may not be fully functional, but it should be close):

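#include <stdio.h>
#include <string.h>

extern FILE *yyin;   /* flex reads from yyin, which defaults to stdin */

char args[] = "(1,2,3)(4,5)(6,7,8)";
FILE *in = fmemopen(args, strlen(args), "r");   /* POSIX.1-2008 */
FILE *oldin = yyin;

/* Assigning to stdin itself is not portable, so redirect yyin instead. */
yyin = in;

// do parsing

yyin = oldin;
fclose(in);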
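To see what fmemopen does independently of flex, the following sketch reads a string through an ordinary FILE * stream. The helper name `sum_ints_from_stream` is made up for illustration, and the code assumes a POSIX.1-2008 system where fmemopen is available.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fmemopen() in <stdio.h> */
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens text as a read-only memory stream and sums
 * every unsigned integer found in it. Returns -1 if the stream cannot be
 * opened (some implementations reject a zero-length buffer). */
static long sum_ints_from_stream(char *text)
{
    FILE *in = fmemopen(text, strlen(text), "r");
    long sum = 0;
    int c;

    if (in == NULL) return -1;

    while ((c = fgetc(in)) != EOF) {
        if (isdigit(c)) {
            int value;
            ungetc(c, in);                    /* let fscanf see the digit */
            if (fscanf(in, "%d", &value) == 1) sum += value;
        }
    }
    fclose(in);
    return sum;
}
```

Any code that consumes a FILE *, including a flex scanner, can read from such a stream in the same way it would read from a file on disk.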
Headwaiter answered 25/11, 2010 at 17:58 Comment(0)
B
2

Here is a complete flex example.

%%

<<EOF>> return 0;

.   return 1;

%%

int yywrap()
{
    return (1);
}

int main(int argc, const char* const argv[])
{
    YY_BUFFER_STATE bufferState = yy_scan_string("abcdef");

    // This is a flex source. For yacc/bison use yyparse() here ...    
    int token;
    do {
        token = yylex();
    } while (token != 0);

    // Do not forget to tell flex to clean up after itself. Lest
    // ye leak memory.
    yy_delete_buffer(bufferState);

    return (EXIT_SUCCESS);
}
Bethezel answered 28/3, 2016 at 0:51 Comment(0)
U
0

Another example. This one redefines the YY_INPUT macro so that the scanner reads from a string:

%{
int myinput (char *buf, int buflen);
char *string;
int offset;
#define YY_INPUT(buf, result, buflen) (result = myinput(buf, buflen));
%}
%%

[0-9]+             {printf("a number! %s\n", yytext);}

.                  ;
%%

int main () {
    string = "(1, 2, 3)(4, 5)(6, 7, 8)";
    yylex();
}

int myinput (char *buf, int buflen) {
    int i;
    for (i = 0; i < buflen; i++) {
        buf[i] = string[offset + i];
        if (!buf[i]) {
            break;
        }
    }
    offset += i;
    return i;
}
Unimpeachable answered 30/5, 2016 at 18:51 Comment(0)
S
-1

The answer is "Yes". See the O'Reilly publication called "lex & yacc", 2nd Edition by Doug Brown, John Levine, Tony Mason. Refer to Chapter 6, the section "Input from Strings".

I also just noticed that there are some good instructions in the section "Input from Strings", Chapter 5 of "flex and bison", by John Levine. Look out for routines yy_scan_bytes(char *bytes, int len), yy_scan_string("string"), and yy_scan_buffer(char *base, yy_size_t size). I have not scanned from strings myself, but will be trying it soon.

Scuta answered 7/8, 2015 at 13:52 Comment(1)
"go buy a book" isn't an answer. - Knowledgeable
