Writing `eval()` in C
Asked Answered
B

5

19

I've been trying to make an eval function in C for a while.

At the moment, my idea is to make a hash String -> function pointer with all the standard library C functions, and all the functions that I make, that way I could handle function invocations (on already defined functions).

However, defining functions with strings (i.e, calling eval("int fun(){return 1;}")) is still a problem, I don't know how I could handle this on runtime, does anyone have any idea?

Variable definitions don't seem too much of a problem, as I could just use another hash var_name -> pointer and use that pointer whenever the variable is required.

By the way, I don't care about performance, I want to get this to work.

Bergess answered 23/8, 2016 at 3:3 Comment(18)
Oh this is way too much for a question here. You basically are looking to write something like this: https://mcmap.net/q/12886/-is-there-an-interpreter-for-c-closed/1116364Carthy
One generally do this by either using a language suitable for embedding. Or by creating ones own language including parser and interpreter. You can't dynamically evaluate C, it's a statically compiled language. There are embeddable interpreters that accept a C-like language though, you might want to look into that?Swedish
And if you need dynamic evaluation maybe you are looking at it the wrong way? Either with the design of with the choice of language?Swedish
@DanielJour Mind elaborating on why this is too much for a question? I'll look into those interpreters, thanks.Bergess
@Bergess You need parser, some way to store the results of that parser (abstract syntax tree = AST), some way to "execute" that AST or parts thereof, so some form of a "virtual machine". You need to take care of correct scoping, correct semantics with respect to sequence points, correct representation of types. That's just too much to explain/reference for one answer. I suggest this to get a grasp on what it's like to build a language interpreter: buildyourownlisp.comCarthy
Stackoverflow is generally for relatively small, concrete and specific problems. CA compiler error, help finding out why some piece of code crashes, etc. This is a very broad question that could possibly involve everything from compiler and parser theory to how interpreters and byte-code work. Please read the sections named "What topics can I ask about here?" and "What types of questions should I avoid asking?" from the help pages.Swedish
I'm voting to reopen this thing because I know the answer and it isn't "you can't". The answer does not involving writing a compiler or even parsing C nor does it involve bytecode. It's rather simple really.Bellinzona
@Daniel Jour While I agree that writing a full fledged eval might be too much for a question here, I did ask a much more concrete question: How can I parse function definitions (without checking many of the "should be necessary" validations)? A working example with functions like int f(int a){ return a+1;} would've sufficed.Bergess
@Bellinzona Thanks for you contribution, I'm very interested.Bergess
@Bellinzona I'm looking forward to an answer then, voted to reopen, too.Carthy
@Bergess Maybe Joshua can show an approach that's "small enough" for the site. Parsing such a function definition alone is a complex thing, at least when done correctly. See lysator.liu.se/c/ANSI-C-grammar-y.html for an example.Carthy
@Bellinzona Can you give us an overview in a comment? If it's good I'll vote to reopen too.Swedish
My plan is to invoke the compiler on the string to compile a dynamic library and load the resulting library. It's a lot easier than trying to parse C.Bellinzona
@Bellinzona It's a workaround IMO, not a solution. It also have some problems that are hard to overcome to make it generic (like what header files needs to be included in the generated source file). It's also quite complex for beginners and definitely non-portable. But it might be enough for the needs of the OP so I'll vote to reopen. It is better than creating an executable and then use system to get the result, which is a really bad workaround.Swedish
@joshua: ok, voted to reopen. I'm curious to see your solution to invoking a compiled function without parsing its prototype.Sevenfold
XY problem? Why do you need this? Dynamic languages embeddable in C are a dime a dozen, why do you want to grow your own and make it resemble C?Laureen
@Bellinzona Also how you handle cases like eval("void set_x(){x=1;}") which accesses a global variable in the main program, or eval("void callme(int x){func(x);}") which calls a function in the main program. Or how you handle two eval calls where the second one modifies state established by the first call, such as through a shared variable. (Clearly that is in scope, seeing as OP has a hash of variable names.)Vigil
@RaymondChen: Good question. I had considered such things in preparing my answer last night but I don't like multi-paragraph comments.Bellinzona
H
10

A couple of weeks back, I wanted to do something similar and this is the first question that I stumbled upon, hence answering here, now that I have some hold on this :) I am surprised nobody mentioned tcc (specifically libtcc) which lets you compile code from a string and invoke the function thus defined. For e.g.:

int (*sqr)(int) = NULL;
TCCState *S = tcc_new();

tcc_set_output_type(S, TCC_OUTPUT_MEMORY);
tcc_compile_string(S, "int squarer(int x) { return x*x; }");
tcc_relocate(S, TCC_RELOCATE_AUTO);
sqr = tcc_get_symbol(S, "func");

printf("%d", sqr(2));
tcc_delete(S);

(Error handling omitted for brevity). Beyond this basic example, if one wants to use the variables of the host program within the dynamic function, a little more work is needed. If I had a variable int N; and I wanted to use it, I would need 2 things: In the code string:

 ... "extern int N;"

Tell tcc:

tcc_add_symbol(S, "N", &N);

Similarly, there are APIs to inject Macros, open entire libraries etc. HTH.

Handlebar answered 2/4, 2019 at 11:9 Comment(0)
B
9

Trying to parse C is a real pain in the behind; but we already know how to parse C; invoke the C compiler! Here we compile the eval code into a dynamic library and load it.

You may run into problems where your dynamic code can't find other functions or variables in your own code; the easy solution for that is to compile your whole program except for main() as a library and link the dynamic code library against it. You can avoid the -fpic penalty by setting your library load address only a few K above your main load address. On Linux, unresolved symbols in a library can be resolved by the executable if not stripped, and glibc depends on this functionality; however, compiler optimizations get in the way sometimes so the total library method may be necessary.

Sample code below is for Linux. This can be adopted to other Unix including Mac OSX with minor work. Attempting on Windows is possible, but harder as you have no guarantee of a C compiler unless you're willing to ship one; and on Windows there's the obnoxious rule about multiple C runtimes so you must build with the same one you ship, and therefore must also build with the same compiler you ship. Also, you must use the total library technique here or symbols in your main program just won't resolve in the library (PE file format can't express the necessary).

This sample code provides no way for the eval() code to save state; if you need this you should do so either by variables in the main program or (preferred) passing in a state structure by address.

If you are trying to do this in an embedded environment, don't. This is a bad idea in the embedded world.

In answer to rici's comment; I have never seen a case where the argument types and return type of an eval() block were not statically determined from the surrounding code; besides else how would you be able to call it? Example code below could be cut up extracting the shared part so the per-type part is only a couple of lines; exercise is left for the reader.

If you don't have a specific reason to want dynamic C; try embedded LUA instead with a well-defined interface.

/* gcc -o dload dload.c -ldl */

#include <dlfcn.h>
#include <stdio.h>

typedef void (*fevalvd)(int arg);

/* We need one of these per function signature */
/* Disclaimer: does not support currying; attempting to return functions -> undefined behavior */
/* The function to be called must be named fctn or this does not work. */
void evalvd(const char *function, int arg)
{
        char buf1[50];
        char buf2[50];
        char buf3[100];
        void *ctr;
        fevalvd fc;
        snprintf(buf1, 50, "/tmp/dl%d.c", getpid());
        snprintf(buf2, 50, "/tmp/libdl%d.so", getpid());
        FILE *f = fopen(buf1, "w");
        if (!f) { fprintf (stderr, "can't open temp file\n"); }
        fprintf(f, "%s", function);
        fclose(f);
        snprintf(buf3, 100, "gcc -shared -fpic -o %s %s", buf2, buf1);
        if (system(buf3)) { unlink(buf1); return ; /* oops */ }

        ctr = dlopen(buf2, RTLD_NOW | RTLD_LOCAL);
        if (!ctr) { fprintf(stderr, "can't open\n"); unlink(buf1); unlink(buf2); return ; }
        fc = (fevalvd)dlsym(ctr, "fctn");
        if (fc) {
                fc(arg);
        } else {
                fprintf(stderr, "Can't find fctn in dynamic code\n");
        }
        dlclose(ctr);
        unlink(buf2);
        unlink(buf1);
}

int main(int argc, char **argv)
{
        evalvd("#include <stdio.h>\nvoid fctn(int a) { printf(\"%d\\n\", a); }\n", 10);
}
Bellinzona answered 23/8, 2016 at 15:16 Comment(7)
As noted in the comments, this does seem like a roundabout way, and requires many more permissions (creating files, running gcc, etc) on the system than a real eval (dynamically parsing the code, only requiring (potentially a lot of) memory).Azo
I just tested your example, your eval example calls the function fctn with the value 10, but nowhere in the string there's an invocation, this is surely not the intended behaviour.Azo
The 10 is the second argument to the evalvd function; it is indeed the intended behavior.Bellinzona
Let me rephrase myself, it is the indented behavior by you, not in any regular eval function, why are you invoking a function that's just being defined in that code?Azo
@YoTengoUnLCD: Because in C you cannot have an expression in top-level code; therefore the top-level code must be wrapped in a function by the code-builder implied to exist by the question; and it takes arguments because closures are probably not possible without compiling the surrounding code with a custom compiler. Also note that OP knows this and even his example wraps it in a function.Bellinzona
In my case (macOS), #include <stdlib.h> and #include <unistd.h> are required to compile. It works perfect for me, tyFortnightly
Meaning of library dl in gccCupule
U
5

It's possible, but a pain to do. You need to write parser that takes the text as input and generates a syntax tree; then you need to simplify constructs (eg. converting loops into goto statements and simplify expressions into single-static assignments that have only 1 operation). Then you need to match all of the patterns in your syntax tree with sequences of instructions on the target machine that perform the same tasks. Finally, you need to select the registers to use for each of those instructions, spilling them onto the stack if necessary.

In short, writing an implementation for eval in C is possible, but a huge amount of work that requires a lot of expertise and knowledge in several fields of computer science. The complexity of writing a compiler is the precise reason why most programming languages are either interpreted or use a virtual machine with a custom bytecode. Tools like clang and llvm make this a lot easier, but those are written in C++, not C.

Unswerving answered 23/8, 2016 at 6:9 Comment(2)
There's still the issue of calling the function. You might need to look at the Foreign Function Interface library (libffi or on Github libffi). Or there might be another way to do that work.Overstudy
@JonathanLeffler I considered mentioning libffi, but I figured he could probably read between the lines that I was saying that what he's trying to do is just completely impractical. He would be far better off using Lua or Python like everyone else.Unswerving
G
0

Bearing in mind some restrictions, using OpenCL can be a possible way to implement eval in C/C++. Once your OpenCL implementation provides ability to compile kernels and execute them, not matter where on CPU or GPU (or some other 'accelerator' device), that means you can generate kernel code strings at your C/C++ application runtime, compile them and enqueue to execute. Also OpenCL APIs provide an ability to lookup for kernel compilation, linking and execution errors. So, please take a look onto OpenCL.

Gyrfalcon answered 2/12, 2018 at 13:2 Comment(0)
W
0

If you only need an eval() function that get mathematical formulas like: "sin(pi)/e/sqrt(24)" This formulas is set from console, you only need an recursive algorithm in C that analyze the code. I have this program in no more that 417 lines. The program not compile, only is an interpreter in real time. Have constants too like C, e, pc, UA. If you need I can share it. You can compile with: gcc fx.c -lm -o fx (This is a prototype only, take with your own responsibility)

    //fx.c version 1.0
    // Calculadora que admite fórmulas matemáticas:
    //__next={coments, debug, no pointer, idiom=es}
    //1 parsec  3.086 × 10^16 metros
    #include <stdio.h>
    
    //#include <conio.h>
    #include <math.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <ctype.h>
    #include <libintl.h>
    #include <locale.h>
    
    /* Tipos definidos: */
    typedef char MaxCalcCad[256];
    typedef char str80[81];
    typedef long double Ldoble;
    typedef int bool;
    
    int i,p,ruptura;
    str80 formula="";
    Ldoble resultado=0L;
    bool error;
    bool _fact=0;
    
    /* Prototipos de las funciones: */
    char *ch2ptr(char ch);
    char *copy(char *cadena, int index, int count);
    Ldoble fact(Ldoble f);
    Ldoble calcular_formula(int *p, MaxCalcCad cad, bool *error);
    bool eval(MaxCalcCad formula, Ldoble *valor, int *ruptura);
    char sigp(MaxCalcCad formula,int *p);
    Ldoble expr(MaxCalcCad formula,int *p,char *c);
    Ldoble exprsimp(MaxCalcCad formula,char *c,int *p);
    Ldoble termino(MaxCalcCad formula,int *p,char *c);
    Ldoble fact_s(MaxCalcCad formula,char *c,int *p);
    Ldoble fct(MaxCalcCad formula,char *c,int *p);
    Ldoble proc_as_num(MaxCalcCad formula,char *c,int *p);
    Ldoble proc_as_new_exp(MaxCalcCad formula,char *c, int *p);
    Ldoble proc_like_stdfunc(MaxCalcCad formula,char *c,int *p);
    
    
    int main(void){
        /*
        idioma esEs
        */
    setlocale(LC_ALL, "es_Es");
    int cont,opcion, SALIR=0;
    Ldoble a,b,c,d,x;
    resultado=0L;
    //textcolor(1);
    //textbackground(0);
    //clrscr();
    printf("\n");
    error=0;
    cont=0;SALIR=0;
    fgets(formula,80,stdin);
    formula[strcspn(formula, "\r\n")] = 0;
    
    //printf("El resultado de la formula=%s es:\n",formula);
    resultado=calcular_formula(&p,formula,&error);
    if(error){
        //gotoxy(10,5);
        printf("\n");
        //textcolor(3);
        printf("\nError!\n");
        printf("%s\n",formula);
        for(i=1;i<=p;++i)
            printf(" ");
            printf("^\n\r");
            //textcolor(1);
        }
        else
            printf("\n%10.19Lf\n",resultado);

    return 0;
}

char *ch2ptr(char ch){
    char *cp=" ";
    cp[0]=ch;
    return (cp);
}

char *copy(char *cadena, int index, int count)
{
    char *cp;
    index;
    if((cp=strdup(cadena+index))==NULL){
        //gotoxy(10,25);
        printf("\n\n");
        printf("Error en asignación de memoria. _copy()");
        exit(1);
    }
    cp[count]='\0';
    return (cp);
}

Ldoble calcular_formula(int *p, MaxCalcCad cad, bool *error){
    static Ldoble r;
    ruptura=0;
    //static int i,ruptura;
    //static char c;
    *error=eval(cad, &r, p);
    return (r);
}

bool eval(MaxCalcCad formula, Ldoble *valor, int *ruptura){
    static int p,i;
    register int l;
    static char c;
    MaxCalcCad aux="";
    l=strlen(formula);
    for(i=0; i<l;i++)
        formula[i]=toupper(formula[i]);
    if(formula[0]=='.'){
        strcat(aux,ch2ptr('0'));
        strcat(aux,formula);
        strcpy(formula,aux);
    }
    l=strlen(formula);
    if(formula[0]=='+'){
        for(i=0;i<l;i++)
            formula[i]=formula[i+1];
        formula[l]='\0';
    }
    p=-1;
    c=sigp(formula,&p);
    *valor = expr(formula,&p,&c);
    if(c=='\r')
        error=0;
    else
        error=1;
    *ruptura=p;
    return (error);
}
char sigp(MaxCalcCad formula,int *p){
    char c;
    do{
        ++*p;
        if(*p<strlen(formula))
            c=formula[*p];
        else
            c='\r';
    }while (c==' ');
    return c;
}
Ldoble expr(MaxCalcCad formula,int *p,char *c){
    Ldoble e;
    char operador;

    e=exprsimp(formula,c,p);
    while ((*c=='+')||(*c=='-')){
        operador=*c;
        *c=sigp(formula,p);
        switch(operador){
            case '+':e=e+exprsimp(formula,c,p); break;
            case '-':e=e-exprsimp(formula,c,p); break;
        }
    }
    return (e);
}
Ldoble exprsimp(MaxCalcCad formula,char *c,int *p){
    Ldoble s;
    char operador;
    s=termino(formula,p,c);
    while((*c=='*')||(*c=='/')){
        operador=*c;
        *c=sigp(formula,p);
        switch(operador){
            case '*': s=s*termino(formula,p,c);break;
            case '/': s=s/termino(formula,p,c);break;
        }
    }
    return (s);
}
Ldoble termino(MaxCalcCad formula,int *p,char *c){
    Ldoble t;
    unsigned long int nt;
    t=fact_s(formula,c,p);
    while(*c=='^'){
        *c=sigp(formula,p);
        t=powl(t,fact_s(formula,c,p));
    }
    return (t);
}
Ldoble fact_s(MaxCalcCad formula,char *c,int *p){
    Ldoble f;
    if(*c=='-'){
        *c=sigp(formula,p);
        if(_fact){
            --*p;
            ruptura=*p;
            return((Ldoble)-1);
        }
        //*c=sigp(formula,p);
        return(-fct(formula,c,p));
    }
    return(fct(formula,c,p));
}

Ldoble fct(MaxCalcCad formula,char *c,int *p){
    //char fn[20];
    //int l,inicio;
    Ldoble f;
    if ((isdigit(*c))||(*c=='.'))
        f=proc_as_num(formula,c,p);
    else if (*c=='(')
        f=proc_as_new_exp(formula,c,p);
    else if(*c=='-')
        f=fact_s(formula,c,p);
    else f=proc_like_stdfunc(formula,c,p);
    return (f);
}
Ldoble proc_as_num(MaxCalcCad formula,char *c,int *p){
    int l,inicio;
    bool _error=0;
    Ldoble f;
    char ch;

    inicio=*p;
    do{
        if((*c=='.')&&_fact)
            return((Ldoble)-1);
        *c=sigp(formula,p);
    }while((isdigit(*c)));//||(*c=='.'));
    if(*c=='.'){
        if(_fact) _error=1;
        l=*p;
        do{
            *c=sigp(formula,p);
        }while((isdigit(*c)));//||(*c=='.'));
        f=atof(copy(formula,l,*p-l));
        if((f!=(Ldoble)0)&&_error){
            *p=l;
            return(f);
        }
    }
    if(*c=='E'){
        //*c=sigp(formula,p);
        ch=sigp(formula,p);
        l=*p;
        do{
            *c=sigp(formula,p);
        }while(isdigit(*c));//||*c=='.');
        if((ch=='-')&&(++l==*p)&&(*c=='\r')){
            ruptura=--*p;
            *c=ch;
            //return(f);
        }
    }
    if(*c=='.'){
        ruptura=*p;
        return(f);
    }

    f=(atof(copy(formula,inicio,*p-inicio)));
    if(*c=='!'){
        f=fact(f);
        *c=sigp(formula,p);
    }
    else if(*c=='²'){
        f=f*f;
        *c=sigp(formula,p);
    }

    return(f);
}
Ldoble proc_as_new_exp(MaxCalcCad formula,char *c, int *p){
    Ldoble f;
    *c=sigp(formula,p);
    f=expr(formula,p,c);
    if(*c==')'){
        *c=sigp(formula,p);
        if(*c=='²'){
            f*=f;
            *c=sigp(formula,p);
        }
        else if(*c=='!'){
            f=fact(f);
            *c=sigp(formula,p);
        }
    }
    else
        ruptura = *p;
    return (f);
}
Ldoble fact(Ldoble f){
    //int n;
    //n=(int)f;
    if(f>0)
        return(f*fact(f-1));
    return((Ldoble)1);
}
Ldoble proc_like_stdfunc(MaxCalcCad formula,char *c,int *p){
    Ldoble f;
    if(!strcmp(copy(formula,*p,3),"ABS")){
        *p=*p+2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=fabsl(f);
        }
    else if(!strcmp(copy(formula,*p,4),"SQRT")){
        *p+=3;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=sqrt(f);
    }
    else if(!strcmp(copy(formula,*p,1),"¿")){
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=sqrt(f);
    }
    else if(!strcmp(copy(formula,*p,3),"SQR")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f*=f;
    }
    else if(!strcmp(copy(formula,*p,3),"SIN")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=sin(f);
    }
    else if(!strcmp(copy(formula,*p,3),"SEN")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=sin(f);
    }
    else if(!strcmp(copy(formula,*p,3),"COS")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=cos(f);
    }
    else if(!strcmp(copy(formula,*p,3),"TAN")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=tanl(f);
    }
    else if(!strcmp(copy(formula,*p,2),"TG")){
        *p+=1;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=tanl(f);
    }
    else if(!strcmp(copy(formula,*p,6),"ARCTAN")){
        *p+=5;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=atanl(f);
    }
    else if(!strcmp(copy(formula,*p,2),"LN")){
        ++*p;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=logl(f);
    }
    else if(!strcmp(copy(formula,*p,3),"LOG")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
        f=log(f);
    }
    else if(!strcmp(copy(formula,*p,2),"PI")){
        ++*p;
        *c=sigp(formula,p);
        //f=fct(formula,c,p);
        f=(Ldoble) 3.1415926535897932384626433832795029;
    }
    else if(!strcmp(copy(formula,*p,2),"UA")){
        ++*p;
        *c=sigp(formula,p);
        f=(Ldoble) 149597870700L;
    }
    else if(!strcmp(copy(formula,*p,2),"PC")){
        ++*p;
        *c=sigp(formula,p);
        f=(Ldoble) 30860000000000000L;
    }   
    else if(!strcmp(copy(formula,*p,1),"C")){
        //++*p;
        *c=sigp(formula,p);
        f=(Ldoble) 299792458L;
    }
    else if(!strcmp(copy(formula,*p,3),"EXP")){
        *p+=2;
        *c=sigp(formula,p);
        f=fct(formula,c,p);
// verificar que expl(f) no haga overflow: if((f=expl(f))==_LHUGE_VAL))
        f=expl(f);
    }   
    else if(!strcmp(copy(formula,*p,1),"E")){
       //++*p; no se incrementa porque se sumaría 0 al valor del puntero
       *c=sigp(formula,p);
       f=(Ldoble) 2.7182818284590452353602874713526625L;
    }
    else if(!strcmp(copy(formula,*p,4),"FACT")){
        *p+=3;
        *c=sigp(formula,p);
        _fact=1;
        f=fct(formula,c,p);
        _fact=0;
        if(f<0){
            return(f);
         }
        f=fact(f);
    }
    //else
    ruptura=*p;
return(f);
}
Wilmerwilmette answered 17/2 at 5:21 Comment(3)
It is great to have you on this community! The question was about evaluating general C code with an emphasis on function and variable definitions, your answer does not address that so it is not really answering the question.Weatherglass
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From ReviewTowe
My apologies for my response. Given the examples of mathematical formulas, I only gave a solution so as not to have to compile the infinite compositions that there could be. That's why I gave the idea of a general purpose calculator in case it helps anyone, nothing more. Thank you.Wilmerwilmette

© 2022 - 2024 — McMap. All rights reserved.