Get the right hand side variables of an R formula
R

5

22

I'm writing my first S3 class and its associated methods, and I would like to know how to subset my input data set so that it keeps only the variables specified in the formula.

data(iris)
f <- Species~Petal.Length + Petal.Width

With model.frame(f, iris) I get a subset containing all the variables in the formula. How can I automatically keep only the right-hand-side variables (in the example, Petal.Length and Petal.Width)?

Reputable answered 24/1, 2014 at 10:46 Comment(2)
model.frame(f,iris)[, -1]? - Dupery
You don't need as.formula here. Species~Petal.Length + Petal.Width is already a formula. - Foxglove
A
42

You want labels and terms; see ?labels, ?terms, and ?terms.object.

labels(terms(f))
# [1] "Petal.Length" "Petal.Width" 

In particular, labels.terms returns the "term.labels" attribute of a terms object, which excludes the LHS variable.
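
To connect this back to the question, here is a minimal sketch of using those labels to subset the data. The name iris_rhs is just illustrative. Note that the labels are terms rather than variables, so this only works directly when every term is a plain column name.

```r
f <- Species ~ Petal.Length + Petal.Width

# Term labels on the right-hand side of the formula
rhs <- labels(terms(f))

# Keep only those columns; drop = FALSE keeps a data frame even for one column
iris_rhs <- iris[, rhs, drop = FALSE]
names(iris_rhs)
# [1] "Petal.Length" "Petal.Width"

# Caveat: interactions and function calls give labels that are not column names
labels(terms(mpg ~ hp * cyl))
# [1] "hp"     "cyl"    "hp:cyl"
labels(terms(Species ~ log(Petal.Length) + Petal.Width))
# [1] "log(Petal.Length)" "Petal.Width"
```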

Adriaadriaens answered 24/1, 2014 at 10:52 Comment(0)
D
20

If you have a function in your formula, e.g., log, and want to subset the data frame based on the variables, you can use get_all_vars. This will ignore the function and extract the untransformed variables:

f2 <- Species ~ log(Petal.Length) + Petal.Width

get_all_vars(f2[-2], iris)

    Petal.Length Petal.Width
1            1.4         0.2
2            1.4         0.2
3            1.3         0.2
4            1.5         0.2
...

If you just want the variable names, all.vars is a very helpful function:

all.vars(f2[-2])

[1] "Petal.Length" "Petal.Width" 

The [-2] drops the second element of the formula, which is the left-hand side when the formula is two-sided.
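
To see why index 2 is the left-hand side, it may help to look at the formula as the call it is. The sketch below only inspects the elements of f2; rhs_only is an illustrative name.

```r
f2 <- Species ~ log(Petal.Length) + Petal.Width

length(f2)   # 3 for a two-sided formula
f2[[1]]      # the `~` operator
f2[[2]]      # Species, the left-hand side
f2[[3]]      # log(Petal.Length) + Petal.Width, the right-hand side

# Dropping element 2 leaves a one-sided formula
rhs_only <- f2[-2]
length(rhs_only)     # 2
all.vars(rhs_only)
# [1] "Petal.Length" "Petal.Width"
```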

Darling answered 24/1, 2014 at 11:55 Comment(3)
This won't take interactions into account, e.g. get_all_vars(mpg ~ hp * cyl, mtcars). - Fricandeau
@Fricandeau This is the intended behaviour of get_all_vars. - Darling
Sure, it's a great command that I didn't know before. I only intended a small side note for anyone who (like me) expected the interaction terms to appear as well, as they do in labels(terms(mpg ~ hp * cyl)) from the other solution. - Fricandeau
A
10

One way is to use subsetting to remove the LHS from the formula. Then you can use model.frame on this:

f[-2]
~Petal.Length + Petal.Width

model.frame(f[-2],iris)
    Petal.Length Petal.Width
1            1.4         0.2
2            1.4         0.2
3            1.3         0.2
4            1.5         0.2
5            1.4         0.2
6            1.7         0.4
...
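
One difference worth knowing when choosing between model.frame and get_all_vars on the one-sided formula: model.frame evaluates any transformations in the formula, while get_all_vars returns the untransformed columns. A short sketch, reusing the log example from the other answer:

```r
f2 <- Species ~ log(Petal.Length) + Petal.Width

# model.frame evaluates log() and names the column after the expression
mf <- model.frame(f2[-2], iris)
names(mf)
# [1] "log(Petal.Length)" "Petal.Width"

# get_all_vars keeps the original, untransformed variables
raw <- get_all_vars(f2[-2], iris)
names(raw)
# [1] "Petal.Length" "Petal.Width"
```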
Arneson answered 24/1, 2014 at 11:0 Comment(1)
I like this answer because it removes the dependent part, no matter how many terms it contains: formula(a + b ~ c + d)[-2] - Dianthus
B
7

The package formula.tools has a number of functions to make life easier working with formulas. In your case:

> formula.tools::rhs.vars(f)
[1] "Petal.Length" "Petal.Width"

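Relying on base R indexing can be dangerous because the left-hand side can be missing, in which case a fixed position (such as the second element of the formula in f[-2], or the first column in model.frame(f, iris)[, -1]) no longer refers to it.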
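
If you would rather stay in base R, one possible guard is to take the last element of the formula instead of dropping a fixed index, since the right-hand side is element 3 of a two-sided formula and element 2 of a one-sided one. This is only a sketch; rhs_var_names is a hypothetical helper, not a function from base R or formula.tools.

```r
# Hypothetical helper, not part of base R or any package
rhs_var_names <- function(f) {
  stopifnot(inherits(f, "formula"))
  # The right-hand side is always the last element of the formula
  all.vars(f[[length(f)]])
}

rhs_var_names(Species ~ log(Petal.Length) + Petal.Width)
# [1] "Petal.Length" "Petal.Width"

rhs_var_names(~ Petal.Length + Petal.Width)
# [1] "Petal.Length" "Petal.Width"
```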

Baum answered 30/5, 2018 at 21:14 Comment(0)
H
1

You can use the f_rhs function from the rlang package to extract the right-hand side of the formula and combine it with all.vars:

> f <- Species ~ Petal.Length + Petal.Width
> 
> # RHS
> rlang::f_rhs(f)
Petal.Length + Petal.Width
> all.vars(rlang::f_rhs(f))
[1] "Petal.Length" "Petal.Width" 
> 
> # LHS
> rlang::f_lhs(f)
Species
> all.vars(rlang::f_lhs(f))
[1] "Species"
Heard answered 10/1, 2023 at 15:54 Comment(0)
