What I would like to do is parse an expression such as this one:
result = A + B + sqrt(B + 4)
Where A and B are columns of a dataframe. So I would have to translate the expression into something like this in order to get the result:
new_col = df.B + 4
result = df.A + df.B + new_col.apply(sqrt)
Where df is the dataframe.
I have tried re.sub, but it is only good for replacing the column variables (not the functions), like this:
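import re

def repl(match):
    # Wrap each matched identifier in a dataframe column lookup.
    inner_word = match.group(1)
    new_var = "df['{}']".format(inner_word)
    return new_var

eq = 'A + 3 / B'
new_eq = re.sub('([a-zA-Z_]+)', repl, eq)  # "df['A'] + 3 / df['B']"
result = eval(new_eq)  # df must already be defined in this scope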
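One possible way to keep the re.sub approach while leaving function names alone is a negative lookahead that skips any identifier followed by an opening parenthesis. This is only a sketch: the helper name qualify_columns and its frame_name parameter are made up for illustration, and evaluating the resulting string would still need sqrt to be a function that accepts a whole column (for example numpy.sqrt) and an input string you trust.

```python
import re


def qualify_columns(expression, frame_name="df"):
    """Prefix bare identifiers with a dataframe lookup, skipping function calls."""
    # An identifier that is NOT followed by "(" is treated as a column name.
    pattern = r"\b([A-Za-z_]\w*)\b(?!\s*\()"
    return re.sub(
        pattern,
        lambda m: "{}['{}']".format(frame_name, m.group(1)),
        expression,
    )


new_eq = qualify_columns("A + B + sqrt(B + 4)")
print(new_eq)
```

The rewritten string keeps sqrt as a call and only touches A and B, so the remaining question is which sqrt is in scope when the string is evaluated.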
So, my questions are:
- Is there a Python library to do this? If not, how can I achieve this in a simple way?
- Could a recursive function be the solution?
- Would using reverse Polish notation simplify the parsing?
- Would I have to use the ast module?
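To make the ast and recursion questions concrete, here is a minimal sketch of a recursive evaluator built on the standard ast module. The function name evaluate, the operator tables, and the columns and functions arguments are all assumptions chosen for this example, not part of any library. It only handles the right-hand side of the expression and is shown with plain numbers; passing a dataframe as columns and {"sqrt": numpy.sqrt} as functions should work the same way, because each operation is simply applied to whatever objects the names resolve to.

```python
import ast
import math
import operator

# Whitelisted operators (an assumption of this sketch; extend as needed).
BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression, columns, functions):
    """Evaluate an arithmetic expression, resolving names through `columns`."""

    def visit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return columns[node.id]  # column lookup, e.g. df["A"]
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            return BIN_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](visit(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in functions
            and not node.keywords
        ):
            return functions[node.func.id](*[visit(arg) for arg in node.args])
        raise ValueError("unsupported syntax: " + ast.dump(node))

    return visit(ast.parse(expression, mode="eval").body)


row = {"A": 1.0, "B": 5.0}
value = evaluate("A + B + sqrt(B + 4)", row, {"sqrt": math.sqrt})
print(value)
```

Because only whitelisted node types are evaluated, anything else (attribute access, unknown function names, and so on) raises ValueError instead of being executed, which is the main advantage over calling eval on a rewritten string.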
result = df["A"] + df["B"] + sqrt(df["B"] + 4)? It should work – Ode
sqrt function as you say I get this error: TypeError: cannot convert the series to <class 'float'>. So the function must be used with apply – Bruns
float64 values, int32 values, even numpy.nan values. – Bruns
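The TypeError quoted in the comments is what math.sqrt raises when it is handed a whole Series, since it expects a single float. A minimal sketch, assuming sqrt had been imported from math and that NumPy is installed; the data values are made up for illustration. numpy.sqrt works element by element, so apply may not be needed at all, and missing values simply stay NaN.

```python
import math

import numpy as np
import pandas as pd

# Made-up data for illustration; B mixes floats with a missing value.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [0.0, 5.0, np.nan]})

try:
    math.sqrt(df["B"] + 4)  # math.sqrt wants one float, not a Series
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# numpy.sqrt is applied element by element and returns a Series.
result = df["A"] + df["B"] + np.sqrt(df["B"] + 4)
print(result)
```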