How to run a robit model in Stan?

Asked 19/10, 2014 at 5:8 Answered 25/10, 2014 at 17:20

I would like to run a robust logistic regression (robit) in Stan. The model is suggested in Gelman & Hill's "Data Analysis Using Regression and Multilevel Methods" (2006, pp. 124), but I'm not sure how to implement it. I checked Stan's Github repository and the reference manual, but unfortunately I'm still confused. Here's some code I've used to model a regular logistic regression. What should I add to it so that the errors follow, say, a t distribution with 7 degrees of freedom? By any chance, would it be the same procedure if I run a multilevel analysis?

library(rstan)

set.seed(1)
x1 <- rnorm(100)  
x2 <- rnorm(100)
z <- 1 + 2*x1 + 3*x2      
pr <- 1/(1+exp(-z))       
y <- rbinom(100,1,pr)  

df <- list(N=100, y=y,x1=x1,x2=x2)

# Stan code
model1 <- '
data {                          
  int<lower=0> N;          
  int<lower=0,upper=1> y[N];  
  vector[N] x1;         
  vector[N] x2;
}
parameters {
  real beta_0;     
  real beta_1;        
  real beta_2; 
}
model {
  y ~ bernoulli_logit(beta_0 + beta_1 * x1 + beta_2 * x2);
}
'
# Run the model
fit <- stan(model_code = model1, data = df, iter = 1000, chains = 4)
print(fit)

Thank you!

Earldom answered 19/10, 2014 at 5:8 Comment(2)

Sorry, I do not know the answer but you might want to fix your code in line df = data.list(N=100, y=y,x1=x1,x2=x2). There is no data.list() in R; it should read: df = list(N=100, y=y,x1=x1,x2=x2). – Amontillado 19/10, 2014 at 5:44

FWIW, @johnmyleswhite has a JAGS example of robit regression here. – Calisa 19/10, 2014 at 12:13

I must be missing something, but I had trouble adapting the solution that danilofreire posted from Luc. So I just translated a model from JAGS.

I think this is correct, despite looking a bit different from Luc's solution.

library(rstan)

N <- 100
x1 <- rnorm(N)
x2 <- rnorm(N)
beta0 <- 1
beta1 <- 2
beta2 <- 3

eta <- beta0 + beta1*x1 + beta2*x2                         # linear predictor
p <- 1/(1 + exp(-eta))                                     # inv-logit
y <- rbinom(N, 1, p)                   

dlist <- list(y = y, x1 = x1, x2 = x2, N = N, nu = 3)      # adjust nu as desired df

mod_string <- "
  data{
    int<lower=0> N;
    vector[N] x1;
    vector[N] x2;
    int<lower=0, upper=1> y[N];
    real nu;
  }
  parameters{
    real beta0;
    real beta1;
    real beta2;
  }
  model{
    vector[N] pi;

    for(i in 1:N){
      pi[i] <- student_t_cdf(beta0 + beta1*x1[i] + beta2*x2[i], nu, 0, 1);
      y[i] ~ bernoulli(pi[i]);
    }
  }
"
fit1 <- stan(model_code = mod_string, data = dlist, chains = 3, iter = 1000)
print(fit1)

Plight answered 25/10, 2014 at 17:20 Comment(3)

Thanks for the answer, paulstey. Here your code works perfectly, and I actually don't know if I should accept Luc's answer or yours as the most helpful one. Yours works great! – Earldom 26/10, 2014 at 13:49

Thanks, danilofreire. Like I said, I can't claim any deep level of knowledge on this, but I just didn't really understand Luc's model. This one is a more direct adaptation of robit models I've written in JAGS. And I can't say much about speed either, since Luc's model uses vectorized code. – Plight 27/10, 2014 at 1:13

This looks like the right answer to me. The 0 and 1 in the last arguments are location and scale, so this sets scale to 1. The amount of robustness would then be controlled by the degrees of freedom parameter nu. I should warn everyone that our CDFs aren't very fast compared to logit or even probit (Phi). – Admass 16/3, 2015 at 11:4

Luc Coffeng sent me this answer on the Stan mailing list and I thought I should add it here. He said:

"Adopt a GLM as a base for your robit regression: just substitute the standard error term with e ~ student_t(7, 0, sigma_e), where sigma_e ~ cauchy(0, 2) or whatever scale you think is ok (I wouldn't go beyond 5 anyway as the inverse logit of (-5,5) covers most of the [0,1] interval. In addition to the scale of the t-error, you may also specify the df of the t-error as a parameter. See below for suggested code.

I do hope however that your data contains more information than the toy example you provided, i.e. multiple observations per individual (as below). With just one observation per individual / unit, the model is practically impossible to identify."

He then provided the following example:

library(rstan)

set.seed(1)
x1 <- rnorm(100)  
x2 <- rnorm(100)
z <- 1 + 2*x1 + 3*x2 + 0.1 * rt(100, 7)
pr <- 1/(1+exp(-z))       
y <- rbinom(100,10,pr)  

df <- list(N=100, y=y, x1=x1, x2=x2, nu = 7)

# Stan code
model1 <- '
data {                          
   int<lower=0> N;          
   int<lower=0,upper=10> y[N];  
   vector[N] x1;         
   vector[N] x2;
   real nu;
}
parameters {
   real beta_0;     
   real beta_1;        
   real beta_2; 
   real<lower=0> sigma_e;
   vector[N] e;
}
model {
   e ~ student_t(nu, 0, sigma_e);
   sigma_e ~ cauchy(0, 1);
   y ~ binomial_logit(10, beta_0 + beta_1 * x1 + beta_2 * x2 + e);
}
'
# Run the model
fit <- stan(model_code = model1, data = df, iter = 4000, chains = 2)
print(fit)

Bob Carpenter also briefly commented on the question:

"[...] And yes, you can do the same thing in a hierarchical setting, but you have to be careful because modeling the degrees of freedom can be tricky given the scale runs off to infinity as you approach normality."

Answering Bernd's question, Luc clarified why he wrote y ~ bernoulli_logit(10... in the model code:

"In the example code I provided you with, 10 is the sample size. You may have noticed that the toy data contain multiple observations per individual / unit (i.e. 10 observations per unit).

The Stan manual also provides extensive information on arguments to functions and sampling statements."

Earldom answered 21/10, 2014 at 20:57 Comment(4)

If you could privide a link to the answer on the Stan mailing list, I will award you the bounty. And, I do not get the meaning of the 10 in binomial_logit(10, .... – Amontillado 22/10, 2014 at 4:29

Thanks for your comment, Bernd. The link was added above, and I also asked them to clarify that point. – Earldom 22/10, 2014 at 11:45

Added Luc's comment to the main answer. – Earldom 22/10, 2014 at 11:53

Thanks, makes sense. I didn't read the code very carefully and overlooked the rbinom(100,10,pr) part. – Amontillado 22/10, 2014 at 12:2

Update: My translation of johnmyleswhite example to Stan Synthax isn't working. I don't understand well Stan Synthax to translate the code. Maybe someone can help? Below is the original answer.

If you check the johnmyleswhite example, mentioned by jbaums, you can see that the important piece of code is:

y[i] ~ dbern(p[i])
p[i] <- pt(z[i], 0, 1, 1)
z[i] <- a * x[i] + b

As you can see, insted of using invlogit to compute probabilites, he uses the t distribution (actually, the cumulative t). In stan, just use:

student_t_cdf

I don't know vey well Stan synthax, but I assume you can use something like the follow:

   model {
y ~ bernoulli(theta);
theta <- student_t_cdf(df, mu, sigma)
mu <- beta_0 + beta_1 * x1 + beta_2 * x2;
}

Note that you will have to put priors on df and sigma. Something like:

df_inv ~ uniform(0, 0.5);
df <- 1 / df_inv;
sigma_z <- sqrt((df-2)/df);

I'll try here to see if it works. Let me know if tweaking my answer a little makes it to work.

Lymanlymann answered 21/10, 2014 at 1:19 Comment(0)

Page 26 of the Stan 2.4 Reference Manual:

y ~ bernoulli(Phi( beta_0 + beta_1 * x1 + beta_2 * x2 ))

The general solution is y ~ bernoulli(link_function(eta)) where link_function is, e.g. Phi. There just happens to be a special function bernoulli_logit that wraps this functionality and is numerically more stable.

I recommend reading up on generalized linear models if the reason for this isn't clear. The Wikipedia page isn't such a bad review.

Shadbush answered 21/10, 2014 at 4:0 Comment(2)

If i'm not mistaken, phi is for the normal distribution, and the OP asked for t distribution. – Lymanlymann 21/10, 2014 at 18:20

I explain that in my answer, and Stan has a function student_t_cdf – Shadbush 21/10, 2014 at 19:33

Recommended topics

Hot tags