How to Write a Boolean Expression Evaluator in C?
Asked Answered
C

9

17

Suppose I have a string such as this in a text file:

(((var1 AND var2 AND var3) OR var4) AND ((var5 OR var6) AND var7))

After parsing this into the C program and the vars are handled and set correctly it will end up looking something like this:

(((1 AND 0 AND 0) OR 1) AND ((0 OR 1) AND 1))

Are there any useful libraries out there for evaluating expressions that are represented as one string like this? I was thinking I could just call a Perl program with the string as an argument that would be able to return the result easily, but I wasn't sure if there was a library in C that did this, or if there are any known algorithms for solving such expressions?

What I'm actually looking for is something that would spit out an answer to this expression (maybe parse was a bad word) i.e. 1 or 0.

In a nutshell, it's a file containing a bunch of random expressions (already known to be in the right format) that need to be evaluated to either 0 or 1. (The example above evaluates to 1 because it results in (1 AND 1)).

Croydon answered 23/9, 2009 at 13:11 Comment(6)
Will expressions always be unambiguously parenthesized like that, so you don't have to deal with precedence? Or is X OR Y AND Z allowed?Pemba
It can be represented anyway, i.e. there will not always be parenthesis, anythings allowed.Croydon
Evaluating the above expression needs a ~60 lines long program in pure C. Using a library may overkill.Musky
Snarky, not very help full suggestions: #929063 and #1385311. Actual help on the general problem: https://mcmap.net/q/63539/-learning-to-write-a-compiler-closedViridissa
@noɥʇʎԀʎzɐɹƆ Do you look for a parser that build directly a fast evaluator in x86 asm ? Or a fast parser and evaluator that can be compiled on x86 ?Camillacamille
@Camillacamille The parser and evaluator should compile the expression into x86 asm.Tomokotomorrow
M
8

I tried to write the most compact C code for this bool expression evaluation problem. Here is my final code:

EDIT: deleted

Here is the added negation handling:

EDIT: test code added

char *eval( char *expr, int *res ){
  enum { LEFT, OP1, MID, OP2, RIGHT } state = LEFT;
  enum { AND, OR } op;
  int mid=0, tmp=0, NEG=0;

  for( ; ; expr++, state++, NEG=0 ){
    for( ;; expr++ )
         if( *expr == '!'     ) NEG = !NEG;
    else if( *expr != ' '     ) break;

         if( *expr == '0'     ){ tmp  =  NEG; }
    else if( *expr == '1'     ){ tmp  = !NEG; }
    else if( *expr == 'A'     ){ op   = AND; expr+=2; }
    else if( *expr == '&'     ){ op   = AND; expr+=1; }
    else if( *expr == 'O'     ){ op   = OR;  expr+=1; }
    else if( *expr == '|'     ){ op   = OR;  expr+=1; }
    else if( *expr == '('     ){ expr = eval( expr+1, &tmp ); if(NEG) tmp=!tmp; }
    else if( *expr == '\0' ||
             *expr == ')'     ){ if(state == OP2) *res |= mid; return expr; }

         if( state == LEFT               ){ *res  = tmp;               }
    else if( state == MID   && op == OR  ){  mid  = tmp;               }
    else if( state == MID   && op == AND ){ *res &= tmp; state = LEFT; }
    else if( state == OP2   && op == OR  ){ *res |= mid; state = OP1;  }
    else if( state == RIGHT              ){  mid &= tmp; state = MID;  }
  }
}

Testing:

#include <stdio.h> 

void test( char *expr, int exprval ){
  int result;
  eval( expr, &result );
  printf("expr: '%s' result: %i  %s\n",expr,result,result==exprval?"OK":"FAILED");
}
#define TEST(x)   test( #x, x ) 

#define AND       && 
#define OR        || 

int main(void){
  TEST( ((( 1 AND 0 AND 0) OR 1) AND ((0 OR 1) AND 1)) );
  TEST( !(0 OR (1 AND 0)) OR !1 AND 0 );
}
Musky answered 24/9, 2009 at 16:59 Comment(4)
what about negation?, i can't get it to work with the ! character allowed :(), also needs to do things like "(!(0 or 1) and !1)" negation allowed on the values as well as the groups of values.Croydon
I got it to work for negating the 0's or 1's but I can't get it to work with negation at the start of parenthesis.Croydon
Actually this doesn't work completely: things like (!(0 or (1 and 0)) or !1 and 0) will fail. however it works for what I need it for where everything in a () group has the same operator.Croydon
I tested and it works well for me. I modified the code to handle '&&' and '||', and the C compiler gives the same result as my eval() function.Musky
S
8

You can embed lua in your program and then invoke it's interpreter to evaluate the expression.

Supererogate answered 23/9, 2009 at 13:23 Comment(0)
M
8

I tried to write the most compact C code for this bool expression evaluation problem. Here is my final code:

EDIT: deleted

Here is the added negation handling:

EDIT: test code added

char *eval( char *expr, int *res ){
  enum { LEFT, OP1, MID, OP2, RIGHT } state = LEFT;
  enum { AND, OR } op;
  int mid=0, tmp=0, NEG=0;

  for( ; ; expr++, state++, NEG=0 ){
    for( ;; expr++ )
         if( *expr == '!'     ) NEG = !NEG;
    else if( *expr != ' '     ) break;

         if( *expr == '0'     ){ tmp  =  NEG; }
    else if( *expr == '1'     ){ tmp  = !NEG; }
    else if( *expr == 'A'     ){ op   = AND; expr+=2; }
    else if( *expr == '&'     ){ op   = AND; expr+=1; }
    else if( *expr == 'O'     ){ op   = OR;  expr+=1; }
    else if( *expr == '|'     ){ op   = OR;  expr+=1; }
    else if( *expr == '('     ){ expr = eval( expr+1, &tmp ); if(NEG) tmp=!tmp; }
    else if( *expr == '\0' ||
             *expr == ')'     ){ if(state == OP2) *res |= mid; return expr; }

         if( state == LEFT               ){ *res  = tmp;               }
    else if( state == MID   && op == OR  ){  mid  = tmp;               }
    else if( state == MID   && op == AND ){ *res &= tmp; state = LEFT; }
    else if( state == OP2   && op == OR  ){ *res |= mid; state = OP1;  }
    else if( state == RIGHT              ){  mid &= tmp; state = MID;  }
  }
}

Testing:

#include <stdio.h> 

void test( char *expr, int exprval ){
  int result;
  eval( expr, &result );
  printf("expr: '%s' result: %i  %s\n",expr,result,result==exprval?"OK":"FAILED");
}
#define TEST(x)   test( #x, x ) 

#define AND       && 
#define OR        || 

int main(void){
  TEST( ((( 1 AND 0 AND 0) OR 1) AND ((0 OR 1) AND 1)) );
  TEST( !(0 OR (1 AND 0)) OR !1 AND 0 );
}
Musky answered 24/9, 2009 at 16:59 Comment(4)
what about negation?, i can't get it to work with the ! character allowed :(), also needs to do things like "(!(0 or 1) and !1)" negation allowed on the values as well as the groups of values.Croydon
I got it to work for negating the 0's or 1's but I can't get it to work with negation at the start of parenthesis.Croydon
Actually this doesn't work completely: things like (!(0 or (1 and 0)) or !1 and 0) will fail. however it works for what I need it for where everything in a () group has the same operator.Croydon
I tested and it works well for me. I modified the code to handle '&&' and '||', and the C compiler gives the same result as my eval() function.Musky
B
5

It's easy enough to roll your own recursive descent parser for simple expressions like these.

Blackface answered 23/9, 2009 at 13:16 Comment(0)
D
5

I has similar program around that implement recursive-decent parser so I brush it up and here it is.

 #include <stdio.h>
 #include <stdlib.h>

int doOR(int pOprd1, int pOprd2) { if (pOprd1 == -1) return pOprd2; return pOprd1 || pOprd2; } int doAND(int pOprd1, int pOprd2) { if (pOprd1 == -1) return pOprd2; return pOprd1 && pOprd2; } int doProcess(char pOpert, int pOprd1, int pOprd2) { if (pOpert == '0') return pOprd2; if (pOpert == 'O') return doOR (pOprd1, pOprd2); if (pOpert == 'A') return doAND(pOprd1, pOprd2); puts("Unknown Operator!!!"); exit(-1); } int* doParse(char pStr, int pStart) { char C; int i = pStart; int Value = -1; char Operator = '0'; for(; (C = pStr[i]) != 0; i++) { if (C == '0') { Value = doProcess(Operator, Value, 0); continue; } if (C == '1') { Value = doProcess(Operator, Value, 1); continue; } if (C == ' ') continue; if (C == ')') { int aReturn; aReturn = malloc(2*sizeof aReturn); aReturn[0] = Value; aReturn[1] = i + 1; return aReturn; } if (C == '(') { int * aResult = doParse(pStr, i + 1); Value = doProcess(Operator, Value, aResult[0]); i = aResult[1]; if (pStr[i] == 0) break; continue; } if ((C == 'A') && ((pStr[i + 1] == 'N') && (pStr[i + 2] == 'D'))) { if ((Operator == '0') || (Operator == 'A')) { Operator = 'A'; i += 2; continue; } else { puts("Mix Operators are not allowed (AND)!!!"); exit(-1); } } if ((C == 'O') && (pStr[i + 1] == 'R')) { if ((Operator == '0') || (Operator == 'O')) { Operator = 'O'; i += 1; continue; } else { puts("Mix Operators are not allowed (OR)!!!"); exit(-1); } } printf("Unknown character: '%c (\"%s\"[%d])'!!!", C, pStr, i); exit(-1); } int* aReturn; aReturn = malloc(2*sizeof aReturn); aReturn[0] = Value; aReturn[1] = i; return aReturn; }

And this is a test code:

int main(void) {
    char* aExpr   = "1";
    int*  aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "0";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "1 AND 0";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "1 AND 1";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "0 OR 0 OR 0";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "1 OR 0 OR 0";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "1 OR 1 OR 0";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "(1 OR 0)";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "(0 OR 0)";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    aExpr   = "((( 1 AND 0 AND 0) OR 1) AND ((0 OR 1) AND 1))";
    aResult = doParse(aExpr, 0);
    printf("%s = %d\n", aExpr, ((int*)aResult)[0]);
    free(aResult);
    puts("DONE!!!");
    return EXIT_SUCCESS;
}

This is fun :-D.

Drais answered 23/9, 2009 at 14:51 Comment(0)
C
4

@noɥʇʎԀʎzɐɹƆ is looking for a canonical parser and evaluator that can be build on a x86 and that generate x86 asm JIT evaluator for any boolean expression.

Here is a ANSI C99 basic parser and evaluator that build an expression tree with parse_tree and evaluate the parsed tree with eval_expr. This one avoid most of the usual pointer arithmetic and should be easily translated in many langages.

Error management is set to the basic minimum (missing or unexpected expression, wrong expr like NOT 1 NOT 0). It should not leak memory in case of wrong expression at parse, but not really tested in that way.

The eval_leaf could be replaced with one kind of libc strtol(), but it seems here better to do a basic int parsing delimited by an offset to avoid string allocations, as we deal with simple int normally.

The compile_expr method generates an x86 asm evaluator for the expression and the exec_expr_JIT method executes the resulting x86 asm evaluator. The exec_expr_JIT actually uses mmap and MapViewOfFile which is Linux/BSD/MacOS and Windows only. A more generic way is in study.

EDIT: Added bitwise NOT operator (BNOT) as the ~ in expression.

To test it, simply build it with an ANSI C99 compiler and launch it with the expected expression to parse and evaluate as first argument :

$ cc -o parser parser.c
$ ./parser '(NOT0AND(1OR(1 AND 0))AND(1OR(0OR0)))AND1ANDNOT!~0'
<<<<[NOT]0>[AND]<1[OR]<1 [AND] 0>>>[AND]<1[OR]<0[OR]0>>>[AND]1>[AND]<[NOT]<[NOT]<[BNOT]0>>>
 => 1
Compiled: 88 [B8 00 00 00 00 F7 D0 25 01 00 00 00 50 B8 01 00 00 00 50 B8 01 00 00 00 25 00 00 00 00 59 09 C1 59 21 C1 50 B8 01 00 00 00 50 B8 00 00 00 00 0D 00 00 00 00 59 09 C1 59 21 C1 25 01 00 00 00 50 B8 00 00 00 00 F7 D0 F7 D0 25 01 00 00 00 F7 D0 25 01 00 00 00 59 21 C1]
Res JIT : 1

It will first display the parsed expression in flat tree form with <leaf> and [operand], and then the result of its evaluation with => <result>. Next, it will display the x86 asm codeof the evaluator build for the parsed expression with its size in bytes, followed by the result of the execution of this evaluator.

The parser and evaluator source :

#include <stdio.h>
#include <malloc.h>
#include <string.h>
#include <stdint.h>
#ifdef _MSC_VER
#include <windows.h>
#include <memoryapi.h>
#else
#include <sys/mman.h>
#endif
enum e_operand { NONE, AND, OR, NOT, BNOT };
struct s_expr {
  struct s_expr_leaf {
    char* value;
    int offset;
    struct s_expr* expr;
  } *left;
  enum e_operand operand;
  struct s_expr_leaf* right;
};
struct s_expr* build_expr() {
  struct s_expr* lExpr = (struct s_expr*)malloc(sizeof(struct s_expr));
  if (lExpr == NULL) return(NULL);
  lExpr->left = NULL;
  lExpr->operand = NONE;
  lExpr->right = NULL;
  return(lExpr);
}
struct s_expr_leaf* build_leaf(struct s_expr* pExpr) {
  struct s_expr_leaf* lLeaf = (struct s_expr_leaf*)malloc(sizeof(struct s_expr_leaf));
  if (lLeaf == NULL) return(NULL);
  lLeaf->value = NULL;
  lLeaf->offset = 0;
  lLeaf->expr = pExpr;
  return(lLeaf);
}
struct s_expr* left_expr(struct s_expr** pExpr) {
  struct s_expr* lExprParent = build_expr();
  if (lExprParent == NULL) {
    perror("Can't allocate enough memory...");
    return(NULL);
  }
  lExprParent->left = build_leaf(*pExpr);
  if (lExprParent->left == NULL) {
    perror("Can't allocate enough memory...");
    free(lExprParent);
    return(NULL);
  }
  *(pExpr) = lExprParent;
  return lExprParent;
}
int free_expr(struct s_expr*);
struct s_expr* parse_tree(char** pExpr) {
  if (pExpr == NULL || *pExpr == NULL) return(NULL);
  struct s_expr* lExpr = build_expr();
  if (lExpr == NULL) {
    perror("Can't allocate enough memory...");
    return(NULL);
  }
  struct s_expr_leaf** lSide = &lExpr->left;
  while (**pExpr != 0) {
    switch (**pExpr & 0xDF) {
    case 8: // (
      (*pExpr)++;
      if (*lSide == NULL) {
        *lSide = build_leaf(NULL);
      }
      if (*lSide != NULL) (*lSide)->expr = parse_tree(pExpr);
      if (*lSide == NULL || (*lSide)->expr == NULL) {
        perror("Can't allocate enough memory...");
        free_expr(lExpr);
        return(NULL);
      }
      break;
    case 9: // )
      return(lExpr);
    case 'N': // NOT?
    case 1:
      if (**pExpr == '!' || (((*pExpr)[1] & 0xDF) == 'O' && ((*pExpr)[2] & 0xDF) == 'T')) {
        if (lExpr->operand != NONE) {
          if (lExpr->right != NULL) {
            printf("Wrong expression\n");
            free_expr(lExpr);
            return(NULL);
          }
          lExpr->right = build_leaf(parse_tree(pExpr));
          if (lExpr->right == NULL || lExpr->right->expr == NULL) {
            perror("Can't allocate enough memory...");
            free_expr(lExpr);
            return(NULL);
          }
          return(lExpr);
        }
        lExpr->operand = NOT;
        if (**pExpr != '!') (*pExpr) += 2;
        lSide = &lExpr->right;
      }
      break;
    case '^': // Bitwise NOT ?
      if (**pExpr == '~') {
        if (lExpr->operand != NONE) {
          if (lExpr->right != NULL) {
            printf("Wrong expression\n");
            free_expr(lExpr);
            return(NULL);
          }
          lExpr->right = build_leaf(parse_tree(pExpr));
          if (lExpr->right == NULL || lExpr->right->expr == NULL) {
            perror("Can't allocate enough memory...");
            free_expr(lExpr);
            return(NULL);
          }
          return(lExpr);
        }
        lExpr->operand = BNOT;
        lSide = &lExpr->right;
      }
      break;
    case 'A': // AND?
      if (((*pExpr)[1] & 0xDF) == 'N' && ((*pExpr)[2] & 0xDF) == 'D') {
        if (lExpr->operand != NONE) {
          if (left_expr(&lExpr) == NULL) {
            free_expr(lExpr);
            return(NULL);
          }
        }
        lExpr->operand = AND;
        (*pExpr) += 2;
        lSide = &lExpr->right;
      }
      break;
    case 'O': // OR?
      if (((*pExpr)[1] & 0xDF) == 'R') {
        if (lExpr->operand != NONE) {
          if (left_expr(&lExpr) == NULL) {
            free_expr(lExpr);
            return(NULL);
          }
        }
        lExpr->operand = OR;
        (*pExpr)++;
        lSide = &lExpr->right;
      }
      break;
    default:
      if (*lSide == NULL) {
        *lSide = build_leaf(NULL);
        if (*lSide == NULL) {
          perror("Can't allocate enough memory...");
          free_expr(lExpr);
          return(NULL);
        }
        (*lSide)->value = *pExpr;
      }
      if ((*lSide)->value == NULL) (*lSide)->value = *pExpr;
      (*lSide)->offset++;
    };
    (*pExpr)++;
  };
  return(lExpr);
}
int free_expr(struct s_expr* pExpr) {
  int lFlag = 0;
  if (pExpr == NULL) return(0);
  if (pExpr->left != NULL) {
    lFlag = free_expr(pExpr->left->expr);
    free(pExpr->left);
  }
  if (pExpr->right != NULL) {
    lFlag = free_expr(pExpr->right->expr);
    free(pExpr->right);
  }
  free(pExpr);
  return(lFlag);
}
int display_expr(struct s_expr* pExpr) {
  if (pExpr == NULL) return 0;
  if (pExpr->left != NULL) {
    if (pExpr->left->expr != NULL) {
      printf("<");
      display_expr(pExpr->left->expr);
      printf(">");
    } else {
      if (pExpr->left->value != NULL) printf("%.*s", pExpr->left->offset, pExpr->left->value);
    }
  }
  switch (pExpr->operand) {
  case NONE:
    break;
  case AND:
    printf("[AND]");
    break;
  case OR:
    printf("[OR]");
    break;
  case NOT:
    printf("[NOT]");
    break;
  case BNOT:
    printf("[BNOT]");
    break;
  };
  if (pExpr->right != NULL) {
    if (pExpr->right->expr != NULL) {
      printf("<");
      display_expr(pExpr->right->expr);
      printf(">");
    } else {
      if (pExpr->right->value != NULL) printf("%.*s", pExpr->right->offset, pExpr->right->value);
    }
  }
  return(0);
}
int eval_leaf(struct s_expr_leaf* pValue, int* pRes) {
  char* lValue;
  int lLimit = 0;
  int lStart = -1;
  int lSign = 1;
  if (pRes == NULL) return(1);
  if (pValue == NULL) return(1);
  lValue = pValue->value;
  lLimit = pValue->offset;
  if (lValue == NULL) return(1);
  *pRes = 0;
  while (lLimit > 0 && *lValue == ' ') { lValue++; lLimit--; }
  if (lLimit > 0 && (*lValue == '-' || *lValue == '+')) {
    if (*lValue == '-') lSign = -1;
    lLimit--;
    lValue++;
  }
  while (lLimit > 0 && *lValue != 0) {
    if (*lValue >= 0x30 && *lValue <= 0x39) {
      if (lStart == -1) lStart = lLimit;
    } else {
      break;
    }
    lLimit--;
    lValue++;
  }
  if (lStart > 0) {
    lStart -= lLimit;
    lValue--;
    lLimit = 1;
    while (lStart > 0) {
      *pRes += ((*lValue & 0xF) * lLimit);
      lLimit *= 10;
      lStart--;
      lValue--;
    };
  } else {
    printf("Expr or value missing ...\n");
    return(2);
  }
  *pRes *= lSign;
  return(0);
}
int eval_expr(struct s_expr* pExpr, int* pRes) {
  int lResLeft = 0;
  int lResRight = 0;
  enum e_operand lOperand = NONE;
  if (pRes == NULL) return(1);
  *pRes = 0;
  if (pExpr == NULL) return(1);
  if (pExpr->left != NULL) {
    if (pExpr->left->expr != NULL) {
      if (pExpr->left->value != NULL) {
        printf("Unexpected left value... %.*s\n", pExpr->left->offset, pExpr->left->value);
        return(2);
      }
      switch (eval_expr(pExpr->left->expr, &lResLeft)) {
      case 2:
        display_expr(pExpr); printf("\n");
        return(1);
      case 1:
        return(1);
      };
    } else {
      if (pExpr->left->value != NULL) {
        switch (eval_leaf(pExpr->left, &lResLeft)) {
        case 2:
          display_expr(pExpr); printf("\n");
          return(1);
        case 1:
          return(1);
        };
      }
    }
  }
  if (pExpr->right != NULL) {
    if (pExpr->right->expr != NULL) {
      if (pExpr->right->value != NULL) {
        printf("Unexpected right value... %.*s\n", pExpr->right->offset, pExpr->right->value);
        return(2);
      }
      switch (eval_expr(pExpr->right->expr, &lResRight)) {
      case 2:
        display_expr(pExpr); printf("\n");
        return(1);
      case 1:
        return(1);
      };
    } else {
      if (pExpr->right->value != NULL) {
        switch (eval_leaf(pExpr->right, &lResRight)) {
        case 2:
          display_expr(pExpr); printf("\n");
          return(1);
        case 1:
          return(1);
        };
      }
    }
  }
  switch (pExpr->operand) {
  case NONE:
    if (pExpr->left == NULL) {
      printf("Expr or value missing ...\n");
      return(2);
    }
    *pRes = lResLeft;
    break;
  case AND:
    if (pExpr->left == NULL || pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      return(2);
    }
    *pRes = (lResLeft & lResRight);
    break;
  case OR:
    if (pExpr->left == NULL || pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      return(2);
    }
    *pRes = (lResLeft | lResRight);
    break;
  case NOT:
    if (pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      return(2);
    }
    *pRes = !lResRight;
    break;
  case BNOT:
    if (pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      return(2);
    }
    *pRes = ~lResRight;
    break;
  };
  return(0);
}
// Expr compilation
struct s_expr_JIT {
  unsigned char* JIT;
  int size;
  int nbargs;
};
struct s_expr_JIT* build_expr_JIT() {
  struct s_expr_JIT* lExprJIT = (struct s_expr_JIT*)malloc(sizeof(struct s_expr_JIT));
  if (lExprJIT == NULL) {
    perror("Can't allocate enough memory...");
    return(NULL);
  }
  lExprJIT->JIT = NULL;
  lExprJIT->size = 0;
  lExprJIT->nbargs = 0;
  return(lExprJIT);
}
int free_expr_JIT(struct s_expr_JIT* pExpr) {
  if (pExpr == NULL) return(1);
  if (pExpr->JIT != NULL) free(pExpr->JIT);
  free(pExpr);
  return(0);
}
int set_expr_JIT(struct s_expr_JIT* pExpr, unsigned char* pOpCodes, int pNbOpCodes, int* pValue, unsigned char* pOpCodesNext, int pNbOpCodesNext) {
  unsigned char* lOffset;
  int lSizeref;
  if (pExpr == NULL) return(1);
  if (pExpr->JIT != NULL) {
    lSizeref = pExpr->size;
    pExpr->size += pNbOpCodes;
    pExpr->size += pNbOpCodesNext;
    if (pValue != NULL) pExpr->size += sizeof(int32_t);
    lOffset = (unsigned char*)realloc(pExpr->JIT, pExpr->size);
    if (lOffset == NULL) {
      perror("Can't allocate enough memory...");
      return(1);
    }
    pExpr->JIT = lOffset;
    lOffset = &pExpr->JIT[lSizeref];
  } else {
    pExpr->size = pNbOpCodes;
    pExpr->size += pNbOpCodesNext;
    if (pValue != NULL) pExpr->size += sizeof(int32_t);
    pExpr->JIT = (unsigned char*)malloc(pExpr->size);
    if (pExpr->JIT == NULL) {
      perror("Can't allocate enough memory...");
      return(1);
    }
    lOffset = pExpr->JIT;
  }
  if (pOpCodes != NULL) {
    if (memcpy(lOffset, pOpCodes, pNbOpCodes) == NULL) return(1);
    lOffset += pNbOpCodes;
  }
  if (pValue != NULL) {
    *((int32_t*)lOffset) = (int32_t)*pValue; // Keep endianness
    lOffset += sizeof(int32_t);
  }
  if (pOpCodesNext != NULL) {
    if (memcpy(lOffset, pOpCodesNext, pNbOpCodesNext) == NULL) return(1);
    lOffset += pNbOpCodesNext;
  }
  return(0);
}
int merge_expr_JIT(struct s_expr_JIT* pExpr, unsigned char* pOpCodes, int pNbOpCodes, struct s_expr_JIT* pSrc, unsigned char* pOpCodesMerge, int pNbOpCodesMerge) {
  unsigned char* lOffset;
  int lSizeref;
  if (pExpr == NULL) return(1);
  if (pExpr->JIT != NULL) {
    lSizeref = pExpr->size;
    pExpr->size += pNbOpCodes;
    pExpr->size += pNbOpCodesMerge;
    if (pSrc != NULL) pExpr->size += pSrc->size;
    lOffset = (unsigned char*)realloc(pExpr->JIT, pExpr->size);
    if (lOffset == NULL) {
      perror("Can't allocate enough memory...");
      return(1);
    }
    pExpr->JIT = lOffset;
    lOffset = &pExpr->JIT[lSizeref];
  } else {
    pExpr->size = pNbOpCodes;
    pExpr->size += pNbOpCodesMerge;
    if (pSrc != NULL) pExpr->size += pSrc->size;
    pExpr->JIT = (unsigned char*)malloc(pExpr->size);
    if (pExpr->JIT == NULL) {
      perror("Can't allocate enough memory...");
      return(1);
    }
    lOffset = pExpr->JIT;
  }
  if (pOpCodes != NULL) {
    if (memcpy(lOffset, pOpCodes, pNbOpCodes) == NULL) return(1);
    lOffset += pNbOpCodes;
  }
  if (pSrc != NULL) {
    if (memcpy(lOffset, pSrc->JIT, pSrc->size) == NULL) return(1);
    lOffset += pSrc->size;
  }
  if (pOpCodesMerge != NULL) {
    if (memcpy(lOffset, pOpCodesMerge, pNbOpCodesMerge) == NULL) return(1);
    lOffset += pNbOpCodesMerge;
  }
  return(0);
}
int compile_expr(struct s_expr* pExpr, struct s_expr_JIT** pRes) {
  int lResLeftValue = 0;
  int lResRightValue = 0;
  if (pExpr == NULL) return(1);
  if (pRes == NULL) return(1);
  struct s_expr_JIT* lResLeft = NULL;
  struct s_expr_JIT* lResRight = NULL;
  enum e_operand lOperand = NONE;
  *pRes = build_expr_JIT();
  if (*pRes == NULL) {
    return(1);
  }
  if (pExpr->left != NULL) {
    if (pExpr->left->expr != NULL) {
      if (pExpr->left->value != NULL) {
        printf("Unexpected left value... %.*s\n", pExpr->left->offset, pExpr->left->value);
        free_expr_JIT(*pRes);
        return(2);
      }
      switch (compile_expr(pExpr->left->expr, &lResLeft)) {
      case 2:
        display_expr(pExpr); printf("\n");
        free_expr_JIT(*pRes);
        return(1);
      case 1:
        free_expr_JIT(*pRes);
        return(1);
      };
    } else {
      if (pExpr->left->value != NULL) {
        switch (eval_leaf(pExpr->left, &lResLeftValue)) {
        case 2:
          display_expr(pExpr); printf("\n");
          free_expr_JIT(*pRes);
          return(1);
        case 1:
          free_expr_JIT(*pRes);
          return(1);
        };
      }
    }
  }
  if (pExpr->right != NULL) {
    if (pExpr->right->expr != NULL) {
      if (pExpr->right->value != NULL) {
        printf("Unexpected right value... %.*s\n", pExpr->right->offset, pExpr->right->value);
        free_expr_JIT(lResLeft); free_expr_JIT(*pRes);
        return(2);
      }
      switch (compile_expr(pExpr->right->expr, &lResRight)) {
      case 2:
        display_expr(pExpr); printf("\n");
        free_expr_JIT(lResLeft); free_expr_JIT(*pRes);
        return(1);
      case 1:
        free_expr_JIT(lResLeft); free_expr_JIT(*pRes);
        return(1);
      };
    } else {
      if (pExpr->right->value != NULL) {
        switch (eval_leaf(pExpr->right, &lResRightValue)) {
        case 2:
          display_expr(pExpr); printf("\n");
          free_expr_JIT(lResLeft); free_expr_JIT(*pRes);
          return(1);
        case 1:
          free_expr_JIT(lResLeft); free_expr_JIT(*pRes);
          return(1);
        };
      }
    }
  }
  switch (pExpr->operand) {
  case NONE:
    if (pExpr->left == NULL) {
      printf("Expr or value missing ...\n");
      free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
      return(2);
    }
    if (lResLeft != NULL) {
      if (merge_expr_JIT(*pRes, NULL, 0, lResLeft, NULL, 0) != 0) {
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0xB8 }, 1, & lResLeftValue, NULL, 0) != 0) { // MOVL <int32 value>, %EAX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    break;
  case AND:
    if (pExpr->left == NULL || pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
      return(2);
    }
    if (lResLeft != NULL) {
      if (merge_expr_JIT(*pRes, NULL, 0, lResLeft, NULL, 0) != 0) {
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0xB8 }, 1, & lResLeftValue, NULL, 0) != 0) { // MOVL <int32 value>, %EAX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    if (lResRight != NULL) { // 0x91 XCHG %EAX,%ECX
      if (merge_expr_JIT(*pRes, (unsigned char[]) { 0x50 }, 1, lResRight, (unsigned char[]) { 0x59, 0x21, 0xC1 }, 3) != 0) {  // PUSH %EAX POP %ECX AND %EAX,%ECX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0x25 }, 1, & lResRightValue, NULL, 0) != 0) { // AND %EAX, <int32 value>
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    break;
  case OR:
    if (pExpr->left == NULL || pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
      return(2);
    }
    if (lResLeft != NULL) {
      if (merge_expr_JIT(*pRes, NULL, 0, lResLeft, NULL, 0) != 0) {
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0xB8 }, 1, & lResLeftValue, NULL, 0) != 0) { // MOVL <int32 value>, %EAX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    if (lResRight != NULL) { // 0x91 XCHG %EAX,%ECX
      if (merge_expr_JIT(*pRes, (unsigned char[]) { 0x50 }, 1, lResRight, (unsigned char[]) { 0x59, 0x09, 0xC1 }, 3) != 0) {  // PUSH %EAX POP %ECX OR %EAX,%ECX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0x0D }, 1, & lResRightValue, NULL, 0) != 0) { // OR %EAX, <int32 value>
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    break;
  case NOT:
    if (pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
      return(2);
    }
    if (lResRight != NULL) {
      if (merge_expr_JIT(*pRes, NULL, 0, lResRight, (unsigned char[]) { 0xF7, 0xD0, 0x25, 0x01, 0x00, 0x00, 0x00 }, 7) != 0) {  // NOT %EAX AND %EAX,1
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0xB8 }, 1, & lResRightValue, (unsigned char[]) { 0xF7, 0xD0, 0x25, 0x01, 0x00, 0x00, 0x00 }, 7) != 0) { // MOV %EAX, <int32 value> NOT %EAX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    break;
  case BNOT:
    if (pExpr->right == NULL) {
      printf("Expr or value missing ...\n");
      free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
      return(2);
    }
    if (lResRight != NULL) {
      if (merge_expr_JIT(*pRes, NULL, 0, lResRight, (unsigned char[]) { 0xF7, 0xD0 }, 2) != 0) {  // NOT %EAX 
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    } else {
      if (set_expr_JIT(*pRes, (unsigned char[]) { 0xB8 }, 1, & lResRightValue, (unsigned char[]) { 0xF7, 0xD0 }, 2) != 0) { // MOV %EAX, <int32 value> NOT %EAX
        free_expr_JIT(lResLeft); free_expr_JIT(lResRight); free_expr_JIT(*pRes);
        return(1);
      }
    }
    break;
  };
  if (lResLeft != NULL) free_expr_JIT(lResLeft);
  if (lResRight != NULL) free_expr_JIT(lResRight);
  return(0);
}
int dump_expr_JIT(struct s_expr_JIT* pExpr) {
  unsigned char* lOffset;
  int lSize;
  if (pExpr != NULL) {
    lOffset = pExpr->JIT;
    lSize = pExpr->size;
    while (lSize > 0) {
      printf("%02X", lOffset[0]);
      lOffset++;
      lSize--;
      if (lSize > 0) printf(" ");
    }
  }
  return(0);
}
int exec_expr_JIT(struct s_expr_JIT* pExpr, int* pRes) {
  unsigned char* lOffset;
  int (*lJit)();
  if (pRes == NULL) return(1);
  if (pExpr != NULL) {
#ifdef _MSC_VER
    HANDLE lHandle = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_EXECUTE_READWRITE | SEC_COMMIT, 0, pExpr->size + 10, NULL);
    if (lHandle == NULL) {
      perror("Mapping failed...");
      return 1;
    }
    lOffset = (unsigned char*)MapViewOfFile(lHandle, FILE_MAP_ALL_ACCESS | FILE_MAP_EXECUTE, 0, 0, pExpr->size + 10);
    if (lOffset == NULL) {
      perror("Mapping failed...");
      return 1;
    }
#else
    lOffset = (unsigned char*)mmap(NULL, pExpr->size + 10, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    if (lOffset == MAP_FAILED) {
      perror("Mapping failed...");
      return 1;
    }
#endif
    lOffset[0] = 0x55; // PUSH %EBP
    lOffset[1] = 0x48; // MOV %ESP, %EBP
    lOffset[2] = 0x89;
    lOffset[3] = 0xE5;
    lOffset[4] = 0x9C; // PUSHFD
    lOffset[5] = 0x51; // PUSH %ECX
    if (memcpy(&lOffset[6], pExpr->JIT, pExpr->size) == NULL) {
#ifdef _MSC_VER
      if (!UnmapViewOfFile(lOffset)) {
        perror("Unmapping failed...");
        return 1;
      }
      CloseHandle(lHandle);
#else
      if (munmap(lOffset, pExpr->size + 10) != 0) {
        perror("Unmapping failed...");
        return 1;
      }
#endif
      return(1);
    }
    lOffset[pExpr->size + 6] = 0x59; // POP %ECX
    lOffset[pExpr->size + 7] = 0x9D; // POPF
    lOffset[pExpr->size + 8] = 0xC9; // LEAVE
    lOffset[pExpr->size + 9] = 0xC3; // RETF

    lJit = (int (*)())lOffset;
    *pRes = lJit();
#ifdef _MSC_VER
    if (!UnmapViewOfFile(lOffset)) {
      perror("Unmapping failed...");
      return 1;
    }
    CloseHandle(lHandle);
#else
    if (munmap(lOffset, pExpr->size + 10) != 0) {
      perror("Unmapping failed...");
      return 1;
    }
#endif
  }
  return(0);
}
int parse(char* pExpr, int* pRes) {
  // Decomposing pExpr in tree form
  struct s_expr* lRoot = parse_tree(&pExpr);
  int lRes = 0;
  if (lRoot == NULL) return(1);
  display_expr(lRoot);
  printf("\n");
  if (eval_expr(lRoot, &lRes) == 0) {
    printf(" => %d\n", lRes);
  } else {
    printf(" Error in expr...\n");
  }
  struct s_expr_JIT* lCompile;
  if (compile_expr(lRoot, &lCompile) == 0) {
    printf("Compiled: %d [", lCompile->size);
    dump_expr_JIT(lCompile);
    printf("]\n");
    if (exec_expr_JIT(lCompile, &lRes) == 0) {
      printf("Res JIT : %d\n", lRes);
    } else {
      printf("Failed to exec JIT..\n");
    }
    free_expr_JIT(lCompile);
  }
  free_expr(lRoot);
  return(0);
}
int main(int nbargs, char* args[]) {
  int lRes = 0;
  if (nbargs > 0) parse(args[1], &lRes);
  return(0);
}
Camillacamille answered 25/4, 2021 at 20:22 Comment(18)
@Programmer It was mainly for the fun. 25 years ago i had designed a generic install script langage and wrote the related script interpreter. This question bring many memories for me.Camillacamille
Okay, have fun with them ! XDPandolfi
Thanks, I will look at using this. But I challenge you to write a x86 JIT that optimizes boolean expressions.Tomokotomorrow
@noɥʇʎԀʎzɐɹƆ I'll review the answer to include that tomorrow.Camillacamille
@noɥʇʎԀʎzɐɹƆ Just added the x86 asm evaluator, was fun. Work still in progress because the exec_expr_JIT require actually mmap which is Linux only. With a variable size asm code, i could go with "copy/exec on stack" and compile with no stack protection to have it work with *NIX and Windows, but i'm looking for something better.Camillacamille
@noɥʇʎԀʎzɐɹƆ Added windows with MapViewOfFile, should compile on x86 Linux/BSD/MacOS and Windows. I'm still looking for a way less OS dependent for asm code execution.Camillacamille
@noɥʇʎԀʎzɐɹƆ A better way for portability would be to use a ready-to-use JIT asm library like this one, are you ok with adding an edit using one for JIT compile&execute ?Camillacamille
@Camillacamille I'm ok with you editing it to use a JIT library! ThanksTomokotomorrow
Does this code process booleans packed into a number? Imagine storing 1 boolean for each bit of an unsigned integer. Can it shift and mask? Or would it be more efficient to unpack into separate bytes and pass it into your code?Tomokotomorrow
@noɥʇʎԀʎzɐɹƆ This code applies bitwise operators on signed int32 values provided in decimal radix (see eval_leaf), except for the NOT operator which is a pure boolean operator. Thus you can apply the same expression on 32 bits at once. For convenience, eval_leaf can be reviewed to eval binary ints instead of decimal. We may add too a bitwise NOT operator if useful. FYI, with asmjit, i've converted C code in C++ to avoid a C wrapper for asmjit C++ API.Camillacamille
@Camillacamille int32 values are perefect. I do need a bitwise NOT, actually.Tomokotomorrow
@noɥʇʎԀʎzɐɹƆ The C++ (C11) version with asmjit works well with gcc-8 but the size of the resulting code will exceed SO 30K limit. If anyone is interested i'll post it as another answer. Actually, the C++ version compile on VS2019 but fails to link with the VS asmjit.lib.Camillacamille
@Camillacamille Very interesting. I need the NOT operator.Tomokotomorrow
@noɥʇʎԀʎzɐɹƆ The bitwise NOT ?Camillacamille
@Camillacamille Yes, bitwise not.Tomokotomorrow
@noɥʇʎԀʎzɐɹƆ I'll add that tomorrow in the morning. If you need that now, you can change the '!' C boolean with '~' in the code, the NOT operator will then be a bitwise NOT.Camillacamille
@noɥʇʎԀʎzɐɹƆ Bitwise NOT operator added with the ~ operator in expression.Camillacamille
@noɥʇʎԀʎzɐɹƆ Added the C++ with a JIT asm library here.Camillacamille
B
3

I believe Lex and Yacc are still the best tools for simple parsing tasks like this.

Buschi answered 23/9, 2009 at 13:23 Comment(0)
K
2

A while ago, I wrote a complete C expression evaluator (i.e. evaluated expressions written using C syntax) for a command line processor and scripting language on an embedded system. I used this description of the algorithm as a starting point. You could use the accompanying code directly, but I did not like the implementation, and wrote my own from the algorithm description. It needed some work to support all C operators, function calls, and variables, but is a clear explanation and therefore a good starting point, especially if you don't need that level of completeness.

The basic principle is that expression evaluation is easier for a computer using a stack and 'Reverse Polish Notation', so the algorithm converts a in-fix notation expression with associated order of precedence and parentheses to RPN, and then evaluates it by popping operands, performing operations, and pushing results, until there are no operations left and one value left on the stack.

Kyrakyriako answered 23/9, 2009 at 18:54 Comment(0)
C
2

As an annex to this answer, here is the C++ version of the parser using a JIT asm library (asmjit) to ehance portability. Credit to @Kamiccolo for the JIT library idea. This code requires C11 C++ compiler and the asmjit library.

Tested (build compile and run) with GCC/G++-8 and VS2019 (x86 build for the code and for the asmjit library).

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <iostream>
#include <asmjit/x86.h>
using namespace std;
//
// namespace for the expression parser
//
namespace parser {
  enum e_operand { NONE, AND, OR, NOT, BNOT };
  //
  // expression class
  //
  class expression {
  public:
    //
    // leaf class
    //
    class expression_leaf {
    public:
      char* value;
      int offset;
      expression* expr;
      expression_leaf(expression* pExpr) {
        this->value = NULL;
        this->offset = 0;
        this->expr = pExpr;
      }
      int eval_leaf(int* pRes) {
        char* lValue;
        int lLimit = 0;
        int lStart = -1;
        int lSign = 1;
        if (pRes == NULL) return(1);
        lValue = this->value;
        lLimit = this->offset;
        if (lValue == NULL) return(1);
        *pRes = 0;
        while (lLimit > 0 && *lValue == ' ') { lValue++; lLimit--; }
        if (lLimit > 0 && (*lValue == '-' || *lValue == '+')) {
          if (*lValue == '-') lSign = -1;
          lLimit--;
          lValue++;
        }
        while (lLimit > 0 && *lValue != 0) {
          if (*lValue >= 0x30 && *lValue <= 0x39) {
            if (lStart == -1) lStart = lLimit;
          } else {
            break;
          }
          lLimit--;
          lValue++;
        }
        if (lStart > 0) {
          lStart -= lLimit;
          lValue--;
          lLimit = 1;
          while (lStart > 0) {
            *pRes += ((*lValue & 0xF) * lLimit);
            lLimit *= 10;
            lStart--;
            lValue--;
          };
        } else {
          cerr << "eval_leaf: Expr or value missing ...\n";
          return(2);
        }
        *pRes *= lSign;
        return(0);
      }
    };
  private:
    expression_leaf* left;
    enum e_operand operand;
    expression_leaf* right;
  public:
    //
    // Main constructor
    //
    expression() {
      this->left = (expression_leaf*)NULL;
      this->operand = NONE;
      this->right = (expression_leaf*)NULL;
    }
    //
    // expression fabrics
    //
    static expression* build_expression(char* pExpr) {
      // Decomposing pExpr in tree form
      return(expression::parse_tree(&pExpr));
    }
    static expression* left_expr(expression** pExpr) {
      expression* lExprParent = new expression();
      if (lExprParent == NULL) {
        perror("Can't allocate enough memory...");
        return(NULL);
      }
      lExprParent->left = new expression_leaf(*pExpr);
      if (lExprParent->left == NULL) {
        perror("Can't allocate enough memory...");
        return(NULL);
      }
      *(pExpr) = lExprParent;
      return lExprParent;
    }
    static expression* parse_tree(char** pExpr) {
      if (pExpr == NULL || *pExpr == NULL) return(NULL);
      expression* lExpr = new expression();
      if (lExpr == NULL) {
        perror("Can't allocate enough memory...");
        return(NULL);
      }
      expression_leaf** lSide = &lExpr->left;
      while (**pExpr != 0) {
        switch (**pExpr & 0xDF) {
        case 8: // (
          (*pExpr)++;
          if (*lSide == NULL) {
            *lSide = new expression_leaf(NULL);
          }
          if (*lSide != NULL) (*lSide)->expr = parse_tree(pExpr);
          if (*lSide == NULL || (*lSide)->expr == NULL) {
            perror("Can't allocate enough memory...");
            return(NULL);
          }
          break;
        case 9: // )
          return(lExpr);
        case 'N': // NOT?
        case 1:
          if (**pExpr == '!' || (((*pExpr)[1] & 0xDF) == 'O' && ((*pExpr)[2] & 0xDF) == 'T')) {
            if (lExpr->operand != NONE) {
              if (lExpr->right != NULL) {
                cerr << "parse_tree: Wrong expression\n";
                return(NULL);
              }
              lExpr->right = new expression_leaf(lExpr->parse_tree(pExpr));
              if (lExpr->right == NULL || lExpr->right->expr == NULL) {
                perror("Can't allocate enough memory...");
                return(NULL);
              }
              return(lExpr);
            }
            lExpr->operand = NOT;
            if (**pExpr != '!') (*pExpr) += 2;
            lSide = &lExpr->right;
          }
          break;
        case '^': // Bitwise NOT ?
          if ((*pExpr)[0] == '~') {
            if (lExpr->operand != NONE) {
              if (lExpr->right != NULL) {
                cerr << "parse_tree: Wrong expression\n";
                return(NULL);
              }
              lExpr->right = new expression_leaf(lExpr->parse_tree(pExpr));
              if (lExpr->right == NULL || lExpr->right->expr == NULL) {
                perror("Can't allocate enough memory...");
                return(NULL);
              }
              return(lExpr);
            }
            lExpr->operand = BNOT;
            lSide = &lExpr->right;
          }
          break;
        case 'A': // AND?
          if (((*pExpr)[1] & 0xDF) == 'N' && ((*pExpr)[2] & 0xDF) == 'D') {
            if (lExpr->operand != NONE) {
              if (lExpr->left_expr(&lExpr) == NULL) {
                return(NULL);
              }
            }
            lExpr->operand = AND;
            (*pExpr) += 2;
            lSide = &lExpr->right;
          }
          break;
        case 'O': // OR?
          if (((*pExpr)[1] & 0xDF) == 'R') {
            if (lExpr->operand != NONE) {
              if (lExpr->left_expr(&lExpr) == NULL) {
                return(NULL);
              }
            }
            lExpr->operand = OR;
            (*pExpr)++;
            lSide = &lExpr->right;
          }
          break;
        default:
          if (*lSide == NULL) {
            *lSide = new expression_leaf(NULL);
            if (*lSide == NULL) {
              perror("Can't allocate enough memory...");
              return(NULL);
            }
            (*lSide)->value = *pExpr;
          }
          if ((*lSide)->value == NULL) (*lSide)->value = *pExpr;
          (*lSide)->offset++;
        };
        (*pExpr)++;
      };
      return(lExpr);
    }
    //
    // expression methods
    //
    int display_expr() {
      if (this->left != NULL) {
        if (this->left->expr != NULL) {
          cout << "<";
          this->left->expr->display_expr();
          cout << ">";
        } else {
          if (this->left->value != NULL) printf("%.*s", this->left->offset, this->left->value);
        }
      }
      switch (this->operand) {
      case NONE:
        break;
      case AND:
        cout << "[AND]";
        break;
      case OR:
        cout << "[OR]";
        break;
      case NOT:
        cout << "[NOT]";
        break;
      case BNOT:
        cout << "[BNOT]";
        break;
      };
      if (this->right != NULL) {
        if (this->right->expr != NULL) {
          cout << "<";
          this->right->expr->display_expr();
          cout << ">";
        } else {
          if (this->right->value != NULL) printf("%.*s", this->right->offset, this->right->value);
        }
      }
      return(0);
    }
    int eval_expr(int* pRes) {
      int lResLeft = 0;
      int lResRight = 0;
      enum e_operand lOperand = NONE;
      if (pRes == NULL) return(1);
      *pRes = 0;
      if (this->left != NULL) {
        if (this->left->expr != NULL) {
          if (this->left->value != NULL) {
            fprintf(stderr, "eval_expr: Unexpected left value... %.*s\n", this->left->offset, this->left->value);
            return(2);
          }
          switch (this->left->expr->eval_expr(&lResLeft)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
        } else {
          if (this->left->value != NULL) {
            switch (this->left->eval_leaf(&lResLeft)) {
            case 2:
              this->display_expr(); cout << "\n";
              return(1);
            case 1:
              return(1);
            };
          }
        }
      }
      if (this->right != NULL) {
        if (this->right->expr != NULL) {
          if (this->right->value != NULL) {
            fprintf(stderr, "eval_expr: Unexpected right value... %.*s\n", this->right->offset, this->right->value);
            return(2);
          }
          switch (this->right->expr->eval_expr(&lResRight)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
        } else {
          if (this->right->value != NULL) {
            switch (this->right->eval_leaf(&lResRight)) {
            case 2:
              this->display_expr(); cout << "\n";
              return(1);
            case 1:
              return(1);
            };
          }
        }
      }
      switch (this->operand) {
      case NONE:
        if (this->left == NULL) {
          cerr << "eval_expr: Expr or value missing ...\n";
          return(2);
        }
        *pRes = lResLeft;
        break;
      case AND:
        if (this->left == NULL || this->right == NULL) {
          cerr << "eval_expr: AND operator, Expr or value missing ...\n";
          return(2);
        }
        *pRes = (lResLeft & lResRight);
        break;
      case OR:
        if (this->left == NULL || this->right == NULL) {
          cerr << "eval_expr: OR operator, Expr or value missing ...\n";
          return(2);
        }
        *pRes = (lResLeft | lResRight);
        break;
      case NOT:
        if (this->right == NULL) {
          cerr << "eval_expr: NOT operator, Expr or value missing ...\n";
          return(2);
        }
        *pRes = !lResRight;
        break;
      case BNOT:
        if (this->right == NULL) {
          cerr << "eval_expr: BNOT operator, Expr or value missing ...\n";
          return(2);
        }
        *pRes = ~lResRight;
        break;
      };
      return(0);
    }
    ///////////////// JIT ///////////////////////////
    int compile_expr(asmjit::CodeHolder* pCode, asmjit::x86::Assembler* pAssembly) {
      if (pCode == NULL) return(1);
      if (pAssembly == NULL) return(1);
      int lResLeftValue = 0;
      int lResRightValue = 0;
      int lResLeft = 0;
      int lResRight = 0;
      enum e_operand lOperand = NONE;
      if (this->left != NULL) {
        if (this->left->expr != NULL) {
          if (this->left->value != NULL) {
            fprintf(stderr, "compile_expr: Unexpected left value... %.*s\n", this->left->offset, this->left->value);
            return(2);
          }
          lResLeft = 1;
        } else {
          if (this->left->value != NULL) {
            switch (this->left->eval_leaf(&lResLeftValue)) {
            case 2:
              this->display_expr(); cout << "\n";
              return(1);
            case 1:
              return(1);
            };
          }
        }
      }
      if (this->right != NULL) {
        if (this->right->expr != NULL) {
          if (this->right->value != NULL) {
            fprintf(stderr, "compile_expr: Unexpected right value... %.*s\n", this->right->offset, this->right->value);
            return(2);
          }
          lResRight = 1;
        } else {
          if (this->right->value != NULL) {
            switch (this->right->eval_leaf(&lResRightValue)) {
            case 2:
              this->display_expr(); cout << "\n";
              return(1);
            case 1:
              return(1);
            };
          }
        }
      }
      switch (this->operand) {
      case NONE:
        if (this->left == NULL) {
          cerr << "compile_expr: Expr or value missing ...\n";
          return(2);
        }
        if (lResLeft == 1) {
          switch (this->left->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
        } else {
          pAssembly->mov(asmjit::x86::eax, (int32_t)lResLeftValue);
        }
        break;
      case AND:
        if (this->left == NULL || this->right == NULL) {
          cerr << "compile_expr: AND operator, Expr or value missing ...\n";
          return(2);
        }
        if (lResLeft == 1) {
          switch (this->left->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
        } else {
          pAssembly->mov(asmjit::x86::eax, (int32_t)lResLeftValue);
        }
        if (lResRight == 1) {
          pAssembly->push(asmjit::x86::eax);
          switch (this->right->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
          pAssembly->pop(asmjit::x86::ecx);
          pAssembly->and_(asmjit::x86::eax, asmjit::x86::ecx);
        } else {
          pAssembly->and_(asmjit::x86::eax, (int32_t)lResRightValue);
        }
        break;
      case OR:
        if (this->left == NULL || this->right == NULL) {
          cerr << "compile_expr: OR operator, Expr or value missing ...\n";
          return(2);
        }
        if (lResLeft == 1) {
          switch (this->left->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
        } else {
          pAssembly->mov(asmjit::x86::eax, (int32_t)lResLeftValue);
        }
        if (lResRight == 1) {
          pAssembly->push(asmjit::x86::eax);
          switch (this->right->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
          pAssembly->pop(asmjit::x86::ecx);
          pAssembly->or_(asmjit::x86::eax, asmjit::x86::ecx);
        } else {
          pAssembly->or_(asmjit::x86::eax, (int32_t)lResRightValue);
        }
        break;
      case NOT:
        if (this->right == NULL) {
          cerr << "compile_expr: NOT operator, Expr or value missing ...\n";
          return(2);
        }
        if (lResRight == 1) {
          switch (this->right->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
          pAssembly->not_(asmjit::x86::eax);
          pAssembly->and_(asmjit::x86::eax, (int32_t)1);
        } else {
          pAssembly->mov(asmjit::x86::eax, (int32_t)lResRightValue);
          pAssembly->not_(asmjit::x86::eax);
          pAssembly->and_(asmjit::x86::eax, (int32_t)1);
        }
        break;
      case BNOT:
        if (this->right == NULL) {
          cerr << "compile_expr: BNOT operator, Expr or value missing ...\n";
          return(2);
        }
        if (lResRight == 1) {
          switch (this->right->expr->compile_expr(pCode, pAssembly)) {
          case 2:
            this->display_expr(); cout << "\n";
            return(1);
          case 1:
            return(1);
          };
          pAssembly->not_(asmjit::x86::eax);
        } else {
          pAssembly->mov(asmjit::x86::eax, (int32_t)lResRightValue);
          pAssembly->not_(asmjit::x86::eax);
        }
        break;
      };
      return(0);
    }
    int compile_expr_start(asmjit::JitRuntime* pRunTime, asmjit::CodeHolder* pCode, int (**pCompiled_function)(void)) {
      if (pRunTime == NULL) return(1);
      asmjit::x86::Assembler lAssembly(pCode);
      lAssembly.push(asmjit::x86::ebp); // Prolog
      lAssembly.mov(asmjit::x86::ebp, asmjit::x86::esp);
      lAssembly.pushfd();
      lAssembly.push(asmjit::x86::ecx);
      if (this->compile_expr(pCode, &lAssembly) != 0) {
        return(1);
      }
      lAssembly.pop(asmjit::x86::ecx);
      lAssembly.popfd();
      lAssembly.pop(asmjit::x86::ebp); // Epilog
      lAssembly.ret();
      asmjit::Error lErr = pRunTime->add(pCompiled_function, pCode);
      if (lErr) return(1);
      return(0);
    }
    int dump_expr(asmjit::CodeHolder* pCode, int (*pCompiled_function)(void)) {
      if (pCode == NULL) return(1);
      unsigned char* lOffset;
      unsigned long lSize;
      if (pCompiled_function != NULL) {
        lOffset = (unsigned char*)pCompiled_function;
        lSize = (unsigned long)pCode->codeSize();
        while (lSize > 0) {
          printf("%02X", lOffset[0]);
          lOffset++;
          lSize--;
          if (lSize > 0) cout << " ";
        }
      }
      return(0);
    }
    int exec_expr(int (*pCompiled_function)(void), int* pRes) {
      if (pCompiled_function == NULL) return(1);
      if (pRes == NULL) return(1);
      *pRes = pCompiled_function();
      return(0);
    }
  };
}
using namespace parser;
int main(int nbargs, char* args[]) {
  expression* lExpression;
  int lRes;
  if (nbargs > 0) {
    lExpression = expression::build_expression(args[1]);
    lExpression->display_expr();
    cout << "\n";
    if (lExpression->eval_expr(&lRes) == 0) {
      cout << " => " << lRes << "\n";
    } else {
      cerr << " Error in expr...\n";
    }
    // JIT
    int (*compiled_function)(void) = NULL;
    asmjit::JitRuntime* lRunTime = new asmjit::JitRuntime();
    if (lRunTime == NULL) return(1);
    asmjit::CodeHolder lCode;
    lCode.init(lRunTime->environment());
    if (lExpression->compile_expr_start(lRunTime, &lCode, &compiled_function) == 0) {
      cout << "Compiled: " << lCode.codeSize() << " [";
      lExpression->dump_expr(&lCode, compiled_function);
      cout << "]\n";
      if (lExpression->exec_expr(compiled_function, &lRes) == 0) {
        cout << "Res JIT : " << lRes << "\n";
      } else {
        cerr << "Failed to exec JIT..\n";
      }
      lRunTime->release(compiled_function);
    } else {
      cerr << "Failed to build JIT..\n";
    }
  }
  return(0);
}
Camillacamille answered 1/5, 2021 at 15:12 Comment(0)
B
1

Writing an expression parser is easy in principle, but takes a fair amount of effort.

Here's a basic to-down recursive-descent expression parser I wrote in Java: http://david.tribble.com/src/java/tribble/parse/sql/QueryParser.java http://david.tribble.com/src/java/tribble/parse/sql/ExprLexer.java http://david.tribble.com/src/java/tribble/parse/sql/ExprLexer.java http://david.tribble.com/docs/tribble/parse/sql/package-summary.html

This may not be exactly what you're looking for, but it will give you an idea of what you need.

Busboy answered 23/9, 2009 at 13:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.