I think lavaan sounds like a good option, I note that @Philip pointed me in the right direction. I just mention here how to extract a few extra model features using lavaan (particularly, r-squared and adjusted r-squared) that you might want.
For the latest version see: https://gist.github.com/jeromyanglim/9f766e030966eaa1241f10bd7d6e2812
:
# get data
library(MASS)
data("Cars93")
x <- Cars93[,c("EngineSize", "Horsepower", "RPM")]
# define sample statistics
covx <- cov(x)
n <- nrow(x)
means <- sapply(x, mean) # this is optional
fit <- lavaan::sem("EngineSize ~ Horsepower + RPM", sample.cov = covx,
sample.mean = means,
sample.nobs = n)
coef(fit) # unstandardised coefficients
standardizedSolution(fit) # Standardised coefficients
inspect(fit, 'r2') # r-squared
# adjusted r-squared
adjr2 <- function(rsquared, n, p) 1 - (1-rsquared) * ((n-1)/(n-p-1))
# update p below with number of predictor variables
adjr2(inspect(fit, 'r2'), n = inspect(fit, "nobs"), p = 2)
Custom function
And here is a bit of a function that supplies the fit from lavaan along with a few features of relevance (i.e., basically packaging most of the above). It assumes a case where you don't have the means.
covlm <- function(dv, ivs, n, cov) {
# Assumes lavaan package
# library(lavaan)
# dv: charcter vector of length 1 with name of outcome variable
# ivs: character vector of names of predictors
# n: numeric vector of length 1: sample size
# cov: covariance matrix where row and column names
# correspond to dv and ivs
# Return
# list with lavaan model fit
# and various other features of the model
results <- list()
eq <- paste(dv, "~", paste(ivs, collapse = " + "))
results$fit <- lavaan::sem(eq, sample.cov = cov,
sample.nobs = n)
# coefficients
ufit <- parameterestimates(results$fit)
ufit <- ufit[ufit$op == "~", ]
results$coef <- ufit$est
names(results$coef) <- ufit$rhs
sfit <- standardizedsolution(results$fit)
sfit <- sfit[sfit$op == "~", ]
results$standardizedcoef <- sfit$est.std
names(results$standardizedcoef) <- sfit$rhs
# use unclass to not limit r2 to 3 decimals
results$r.squared <- unclass(inspect(results$fit, 'r2')) # r-squared
# adjusted r-squared
adjr2 <- function(rsquared, n, p) 1 - (1-rsquared) * ((n-1)/(n-p-1))
results$adj.r.squared <- adjr2(unclass(inspect(results$fit, 'r2')),
n = n, p = length(ivs))
results
}
For example:
x <- Cars93[,c("EngineSize", "Horsepower", "RPM")]
covlm(dv = "EngineSize", ivs = c("Horsepower", "RPM"),
n = nrow(x), cov = cov(x))
This all produces:
$fit
lavaan (0.5-20) converged normally after 27 iterations
Number of observations 93
Estimator ML
Minimum Function Test Statistic 0.000
Degrees of freedom 0
Minimum Function Value 0.0000000000000
$coef
Horsepower RPM
0.01491908 -0.00100051
$standardizedcoef
Horsepower RPM
0.7532350 -0.5755326
$r.squared
EngineSize
0.867
$adj.r.squared
EngineSize
0.864
lm
but merely takes covariance instead of data. I.e., so that it's then easy to get things like model fits, and so on. – Alsup