Your example is a simple dataset, so you don't need to use the survey package at all. I would also suggest that, when learning multiple regression with R, you start with simpler examples and gradually build up your understanding of the methods involved.
After all, my opinion is that the philosophies of Stata and R differ. Stata easily throws a ton of info in your face, without you knowing what it means or how it is derived. On the other hand R can be just as (or even more) powerful and more versatile, but lacks Stata's "automation" and forces you to go slow to get the results you want.
Here's a more literal translation of your Stata code:
require(foreign)
data <- read.dta("http://www.ats.ucla.edu/stat/stata/faq/eyestudy.dta")
with(data, table(carrot, lenses))
glm.out.1 <- glm(lenses ~ carrot, family="binomial", data=data)
summary(glm.out.1)
logLik(glm.out.1) # get the log-likelihood for the model, as well
glm.out.2 <- glm(lenses ~ factor(carrot, levels=c(1,0)), family="poisson"(link="log"), data=data)
summary(glm.out.2)
logLik(glm.out.2)
exp(cbind(coefficients(glm.out.2), confint(glm.out.2)))
# deriving robust estimates. vcovHC() is in the sandwich package.
require(sandwich)
robSE <- sqrt(diag(vcovHC(glm.out.2, type="HC")))[2]
rob <- coef(glm.out.2)[2]
rob <- exp(c(rob, rob+qnorm(0.025)*robSE, rob-qnorm(0.025)*robSE))
names(rob) <- c("", "2.5 %", "97.5 %")
rob
Note that the (link="log")
in the second glm()
call is not necessary, because it is the default link function when family="poisson"
.
For deriving the "robust" estimates, I had to read this, which was very helpful. You have to use the vcovHC()
function in the sandwich package to obtain a different variance-covariance matrix than the one calculated by glm()
, and use that to compute the standard error and parameter estimates.
The "robust" estimates were almost identical to what I got from Stata, down to the third decimal. All other results were completely identical; run the code and see for yourself.
Oh, and one more thing: when you find your way using glm() with non-stratified designs, then map your way across the survey
package, which contains counterparts of this and other analysis functions modified for complex designs. I also recommend you read Thomas Lumley's book "Complex Surveys: A guide to analysis using R".