statsmodels 2 dimensional kernel regression

I have a dataframe with 3 columns ['X', 'Y', 'Z'] and I would like to study how X and Y influence the distribution of Z. For that, I want to use the Nadaraya-Watson nonparametric regressor. In statsmodels there is a class called KernelReg that implements it.

While I am able to successfully run the code for a 1-dimensional regression (Z on X and Z on Y), I struggle to run it for the 2-dimensional regression.

My code is as follows, where XYZ is my dataframe:

xv = XYZ['X'].values; yv = XYZ['Y'].values; zv = XYZ['Z'].values

from statsmodels.nonparametric.kernel_regression import KernelReg
ksrmv = KernelReg(endog=zv, exog= [xv, yv], var_type='c')

The error I get is: cannot reshape array of size 3171442 into shape (2,1)

All three arrays have the same shape: xv.shape == yv.shape == zv.shape == (1585721,)

I have already tried different ways of specifying exog, such as

XYZ.loc[:, ['X', 'Y']] or XYZ.loc[:, ['X', 'Y']].values or np.concatenate([xv[:, None], yv[:, None]])

but I always get the same error.

The statsmodels description of exog says it should be a list of independent variable(s), where each element in the list is a separate variable. I am not sure how to interpret that.

Ose answered 17/3, 2018 at 21:13 Comment(1)
Can you provide a full working example to check? exog should (most likely) have observations in rows, e.g. use np.column_stack([xv, yv]), and var_type should most likely have 2 characters, one for the type of each exog variable. - Ankerite
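To make the shape question in the comment concrete, here is a small NumPy-only sketch of what the different ways of building exog from the question produce. The array length of 5 and the random values are arbitrary stand-ins for the real data, not anything taken from the question.

```python
import numpy as np

# Tiny synthetic stand-ins for the real columns (assumption: 5 observations)
rng = np.random.default_rng(0)
n = 5
xv = rng.normal(size=n)
yv = rng.normal(size=n)

# A list of 1-D arrays becomes a 2-D array with one variable per row
as_list = np.asarray([xv, yv])

# column_stack puts observations in rows and variables in columns
stacked = np.column_stack([xv, yv])

# concatenate defaults to axis=0, so the two columns are stacked on top of each other
concatenated = np.concatenate([xv[:, None], yv[:, None]])

print(as_list.shape)       # (2, 5)
print(stacked.shape)       # (5, 2)
print(concatenated.shape)  # (10, 1)
```

In every case the array holds 2 * n values, which matches the size 3171442 = 2 * 1585721 reported in the error message. Note that np.concatenate would need axis=1 to give the same layout as np.column_stack.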

I believe the variable type needs to be given for each independent variable in the same string (i.e., in the var_type argument). If both variables are continuous, the code in your case would be:

ksrmv = KernelReg(endog=zv, exog= [xv, yv], var_type='cc')
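As a minimal sketch of the full two-variable case, the snippet below fits KernelReg on a small synthetic sample. The data, the sample size, and the bandwidth values are assumptions made purely for illustration. reg_type='lc' selects the local constant (Nadaraya-Watson) estimator, and passing bw explicitly skips the default 'cv_ls' cross-validation, which is likely to be slow on a dataset as large as the one in the question.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data standing in for the XYZ dataframe (assumption, not the real data)
rng = np.random.default_rng(0)
n = 200
xv = rng.uniform(-1.0, 1.0, size=n)
yv = rng.uniform(-1.0, 1.0, size=n)
zv = np.sin(np.pi * xv) + yv**2 + rng.normal(scale=0.1, size=n)

# Observations in rows, one column per regressor: shape (n, 2)
exog = np.column_stack([xv, yv])

# var_type has one character per regressor: 'cc' means two continuous variables.
# The bandwidths [0.2, 0.2] are illustrative guesses, not tuned values.
ksrmv = KernelReg(endog=zv, exog=exog, var_type='cc', reg_type='lc', bw=[0.2, 0.2])

# fit() returns the conditional mean and the marginal effects
z_hat, _ = ksrmv.fit()

# Predict at new (x, y) points, again with one point per row
new_points = np.array([[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]])
z_new, _ = ksrmv.fit(data_predict=new_points)

print(z_hat.shape, z_new.shape)
```

Leaving out the bw argument lets statsmodels choose the bandwidths by cross-validation, which may be preferable on a subsample of the data.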
Idell answered 13/1, 2019 at 19:11 Comment(0)
