I am using statsmodels for OLS regression. When I cluster the standard errors, I get a warning that seems to indicate a multicollinearity issue. However, if I fit the model without clustered errors, there is no such warning.
import statsmodels.formula.api as smf
mod = smf.ols(formula='var ~ treatment_r1 + block + has_multiple_treat', data=df)
mod_res = mod.fit(cov_type='cluster', cov_kwds={'groups': df['block']}, use_t=True)
ValueWarning: covariance of constraints does not have full rank. The number of constraints is 3, but rank is 1
'rank is %d' % (J, J_), ValueWarning)
I checked for collinearity following the post "Capturing high multi-collinearity in statsmodels" and didn't find any problem.
corr = np.corrcoef(df_new[["var", "has_multiple_treat", "treatment_r1", "block1"]], rowvar=0)
w, v = np.linalg.eig(corr)
w
np.linalg.det(corr)
var can be either a 0/1 variable or a continuous variable; treatment_r1 and has_multiple_treat are 0/1 variables; block is a categorical variable with two categories. In the df_new dataframe, I turned block into dummy variables (block1 and block2) and dropped block2.
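The dummy coding described above can also be checked directly on the design matrix instead of the correlation matrix. The following is a minimal sketch on made-up data (the arrays and their names are hypothetical stand-ins for the columns of df_new, not the actual data): it compares the column rank with and without the redundant block2 dummy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Hypothetical 0/1 columns standing in for the real data.
block1 = rng.integers(0, 2, size=n)
treatment = rng.integers(0, 2, size=n)
# Force all four (treatment, block1) combinations to appear at least once.
treatment[:4] = [0, 1, 0, 1]
block1[:4] = [0, 0, 1, 1]
block2 = 1 - block1
const = np.ones(n)

# Keeping both dummies plus an intercept is perfectly collinear
# (const == block1 + block2); dropping block2 removes the redundancy.
X_both = np.column_stack([const, treatment, block1, block2])
X_drop = np.column_stack([const, treatment, block1])

rank_both = np.linalg.matrix_rank(X_both)
rank_drop = np.linalg.matrix_rank(X_drop)
cond_drop = np.linalg.cond(X_drop)

print("rank with both dummies:", rank_both, "of", X_both.shape[1], "columns")
print("rank with block2 dropped:", rank_drop, "of", X_drop.shape[1], "columns")
print("condition number with block2 dropped:", cond_drop)
```

If the design matrix with block2 dropped has full column rank and a moderate condition number, the regressors themselves are probably not the source of the warning.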
[...] cov_type code. There can be a problem with a singular matrix in the cluster-robust cov_params if there are too few clusters relative to the within-cluster size. – Spinneret
If the groups variable "block" has only two clusters, then cluster robust doesn't work, and you need to assume more structure on the within-cluster correlation. – Spinneret
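The point made in the comments can be illustrated numerically. Below is a rough sketch using only NumPy on simulated data (all variable names and the data-generating process are assumptions for illustration; it builds the textbook sandwich formula by hand rather than calling statsmodels, and it omits any small-sample scaling, which would not change the rank). Because the OLS residuals satisfy X'u = 0, the two per-cluster score sums are negatives of each other, so the cluster-robust covariance should have rank 1 no matter how many coefficients are tested.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: two 0/1 regressors and a two-level block variable
# that is used both as a regressor and as the cluster id.
block = rng.integers(0, 2, size=n)
block[:2] = [0, 1]  # make sure both clusters are present
treatment = rng.integers(0, 2, size=n)
multiple = rng.integers(0, 2, size=n)
X = np.column_stack([np.ones(n), treatment, block, multiple]).astype(float)
y = 1.0 + 0.5 * treatment + 0.3 * block + rng.normal(size=n)

# Plain OLS fit and residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Per-cluster score sums s_g = X_g' u_g; the "meat" is sum_g s_g s_g'.
scores = np.array([X[block == g].T @ resid[block == g] for g in (0, 1)])
meat = scores.T @ scores

# Sandwich estimator: (X'X)^-1 * meat * (X'X)^-1.
bread = np.linalg.inv(X.T @ X)
cov_cluster = bread @ meat @ bread


def numerical_rank(a, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    sv = np.linalg.svd(a, compute_uv=False)
    return int((sv > rel_tol * sv.max()).sum())


score_sum_is_zero = np.allclose(scores.sum(axis=0), 0.0, atol=1e-8)
rank_cov = numerical_rank(cov_cluster)
rank_slopes = numerical_rank(cov_cluster[1:, 1:])  # the 3 slope coefficients

print("cluster scores sum to zero:", score_sum_is_zero)
print("rank of cluster-robust covariance:", rank_cov)
print("rank of the 3x3 slope block:", rank_slopes)
```

This is consistent with the warning in the question: a joint test of the three slopes imposes 3 constraints, but with only two clusters the covariance of those constraints has rank 1.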