I have a dataframe with 3 columns ['X', 'Y', 'Z'] and I would like to study how X and Y influence the distribution of Z. For that, I wanted to use the Nadaraya-Watson nonparametric regressor. In statsmodels there is a class called KernelReg that implements it.
While I am able to successfully run the code for a 1-dimensional regression (Z on X and Z on Y), I struggle to run it for the 2-dimensional regression.
My code is as follows (XYZ is my dataframe):
xv = XYZ['X'].values; yv = XYZ['Y'].values; zv = XYZ['Z'].values
from statsmodels.nonparametric.kernel_regression import KernelReg
ksrmv = KernelReg(endog=zv, exog= [xv, yv], var_type='c')
The error I get is: cannot reshape array of size 3171442 into shape (2,1)
For reference, xv.shape = yv.shape = zv.shape = (1585721,)
I already tried different alternatives for specifying exog, like
XYZ.loc[:, ['X', 'Y']] or XYZ.loc[:, ['X', 'Y']].values or np.concatenate([xv[:, None], yv[:, None]])
but I always get the same error.
According to the description of exog in statsmodels, it should be a list of independent variable(s), where each element in the list is a separate variable. I am not sure how to interpret that.
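One way to read that description is to look at the array shapes each candidate produces. The NumPy-only sketch below uses toy arrays of length 5 as stand-ins for the real columns (the values are made up for illustration). It also shows that 3171442 is exactly 2 × 1585721, which suggests both columns do reach KernelReg and that the mismatch is with the single variable declared by var_type='c'; that last part is an inference from the error message, not something the documentation states.

```python
import numpy as np

# Toy stand-ins for XYZ['X'].values and XYZ['Y'].values (illustrative only)
xv = np.arange(5, dtype=float)
yv = np.arange(5, dtype=float) * 10

# A list of two 1-D arrays: one row per variable
as_list = np.asarray([xv, yv])
print(as_list.shape)        # (2, 5)

# concatenate without axis=1 stacks the columns end to end
stacked = np.concatenate([xv[:, None], yv[:, None]])
print(stacked.shape)        # (10, 1)

# With axis=1, or with column_stack: one row per observation, one column per variable
side_by_side = np.concatenate([xv[:, None], yv[:, None]], axis=1)
columns = np.column_stack([xv, yv])
print(side_by_side.shape)   # (5, 2)
print(columns.shape)        # (5, 2)

# The size in the error message equals two columns of 1585721 values
print(2 * 1585721)          # 3171442
```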
np.column_stack([xv, yv]). Also, var_type should most likely have 2 characters, one for the type of each exog variable. – Ankerite
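A minimal sketch of what the comment above suggests, run on a small synthetic sample rather than the real dataframe. The data, the sample size n = 200, and the fixed bandwidths in bw are assumptions made only so the example runs quickly. reg_type='lc' selects the local-constant (Nadaraya-Watson) estimator, since the statsmodels default is 'll' (local linear).

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic stand-in for the XYZ dataframe (assumption, for illustration only)
rng = np.random.default_rng(0)
n = 200
xv = rng.uniform(-1, 1, n)
yv = rng.uniform(-1, 1, n)
zv = xv**2 + yv + rng.normal(scale=0.1, size=n)

# One row per observation, one column per regressor
exog = np.column_stack([xv, yv])

# 'cc': one character per exog column, both continuous.
# bw is fixed by hand here (hypothetical values) to skip the default
# cross-validated bandwidth search.
ksrmv = KernelReg(endog=zv, exog=exog, var_type='cc',
                  reg_type='lc', bw=[0.2, 0.2])

# Fitted conditional mean and marginal effects at the training points
z_hat, mfx = ksrmv.fit()
print(exog.shape, z_hat.shape, mfx.shape)
```

With about 1.6 million rows, the default bandwidth selection (bw='cv_ls') and the fit itself are likely to be very slow, because each evaluation compares a point against every observation. Trying the call on a random subsample first may be more practical.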