Reciprocals in patsy

Patsy's power doesn't allow for negative integers, so, if we have some series data X,

patsy.dmatrices('X + X**(-1)', X)

returns an error. How would I add the reciprocal of X to such a patsy formula?

Carlotacarlotta answered 9/9, 2015 at 15:56 Comment(4)
Is (1 / X) also not allowed?Catinacation
Nope, 1 is reserved for constants.Carlotacarlotta
Also, the way / is defined in patsy, this would compute 1 + 1:X.Carlotacarlotta
Which is patsy for I + I * X, where I is a constant vector.Carlotacarlotta

The special patsy meaning of operators gets switched off inside embedded function calls. So if you write X + 1 / X, patsy interprets both the + and the / as its own special operators. But if you write something like X + sin(1 / X), patsy still interprets the + as a special patsy operator, while the whole sin(1 / X) expression gets passed to Python to evaluate, and Python evaluates the / as regular division.

So that's fine if we wanted to compute sin(1 / X). But we don't (why would we?). We just want plain 1 / X. So how can we do that?

Well, we can be tricky: we need a function call to trick patsy's parser into ignoring the / and giving it to Python -- but there's nothing that says that function has to do anything. We could just define an identity function:

def identity(value):
    return value

and then use that in a formula like X + identity(1 / X).
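For anyone who wants to try this, here is a minimal sketch of the identity trick. The sample values in data are made up for illustration, and it assumes identity is defined in the same scope that calls dmatrix, since patsy looks up names in the caller's environment by default.

```python
import numpy as np
from patsy import dmatrix


def identity(value):
    # Does nothing; it only exists so that patsy hands "1 / X" to Python.
    return value


# Made-up sample data; any array without zeros works.
data = {"X": np.array([1.0, 2.0, 4.0])}

# "+" is still patsy's term-joining operator, but "/" inside the call
# is evaluated by Python as ordinary elementwise division.
design = dmatrix("X + identity(1 / X)", data)

# The last column of the design matrix holds the reciprocal term.
reciprocal = np.asarray(design)[:, -1]
print(reciprocal)
```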

And in fact, this trick is so handy that patsy has already predefined a function for you, and provides it as a built-in called I(...). Generally, you can think of I(...) as a kind of quoting operator -- it's a way to say "hey patsy, please do not try to interpret anything in this region, just pass it through to Python kthx".

So to answer your original question: try writing dmatrix("X + I(1 / X)", data)
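A short sketch of that formula in use, again with made-up sample values for X. It builds the design matrix and checks by eye that the last column is the plain reciprocal:

```python
import numpy as np
from patsy import dmatrix

# Made-up sample data; X must not contain zeros.
data = {"X": np.array([1.0, 2.0, 4.0])}

# I(...) is patsy's built-in identity function, so no helper is needed.
design = dmatrix("X + I(1 / X)", data)

# Inspect which columns patsy built for the formula.
print(design.design_info.column_names)

# Columns as a plain array: intercept first, reciprocal term last.
columns = np.asarray(design)
print(columns)
```

The same formula string should also work as the right-hand side of a dmatrices formula such as "y ~ X + I(1 / X)", where y is a hypothetical response variable in your data.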

(Next question: why this weird hack with the function I and everything? The answer to that is that this is how R did it 30 years ago, and I couldn't think of anything sufficiently better to be worth breaking compatibility.)

Chosen answered 11/4, 2016 at 2:55 Comment(3)
I know I've asked an obscure question when I finally get an answer 6 months later from the guy who wrote the program that I'm asking about.Carlotacarlotta
I'm wondering how patsy would handle things like I(X %in% c("A", "B", "D")), which is very straightforward in R.Barayon
Thank you! As I love R and I have to use python, I'm so glad about this.. :) yet, this I() expression could be mentioned a bit more obvious somewhere in the documentation. Maybe or probably it is but it took me an hour until I found this post here.. however, thanks!Bergerac
