Panel data regression with fixed effects using Python

I have the following panel stored in df:

	state	district	year	y	constant	x1	x2	time
0	01	01001	2009	12	1	0.956007	639673	1
1	01	01001	2010	20	1	0.972175	639673	2
2	01	01001	2011	22	1	0.988343	639673	3
3	01	01002	2009	0	1	0	33746	1
4	01	01002	2010	1	1	0.225071	33746	2
5	01	01002	2011	5	1	0.450142	33746	3
6	01	01003	2009	0	1	0	45196	1
7	01	01003	2010	5	1	0.427477	45196	2
8	01	01003	2011	9	1	0.854955	45196	3

y is the number of protests in each district
constant is a column full of ones
x1 is the proportion of the district's area covered by a mobile network provider
x2 is the population count in each district (note that it is fixed in time)

How can I run the following model in Python?
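The equation itself is not reproduced in this text. Judging from the variable definitions above and the comments in the attempt below (x2 multiplied by time, district fixed effects `delta`, state-time fixed effects `eta`), it is presumably of the following form. The coefficient names and the error term are assumptions made for illustration, not taken from the original post.

```latex
y_{it} = \beta_0 + \beta_1\, x_{1,it} + \beta_2\, (x_{2,i} \cdot t) + \delta_i + \eta_{st} + \varepsilon_{it}
```

Here i indexes districts, s the state that district i belongs to, and t the time period.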

Here's what I tried

# Transform `x2` to match model
df['x2'] = df['x2'].multiply(df['time'], axis=0)
# District fixed effects
df['delta'] = pd.Categorical(df['district'])
# State-time fixed effects
df['eta'] = pd.Categorical(df['state'] + df['year'].astype(str))
# Set indexes (set_index returns a new DataFrame, so assign the result)
df = df.set_index(['district','year'])

from linearmodels.panel import PanelOLS
m = PanelOLS(dependent=df['y'], exog=df[['constant','x1','x2','delta','eta']])

ValueError: exog does not have full column rank. If you wish to proceed with model estimation irrespective of the numerical accuracy of coefficient estimates, you can set rank_check=False.

What am I doing wrong?
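One plausible source of the rank error is that explicit dummy columns for districts and for state-year cells overlap: within a state, the state-year dummies summed over years and the district dummies summed over districts both give the state indicator. The sketch below illustrates this with NumPy and pandas only. The toy panel (two states, two districts each, three years) and the helper `design` are hypothetical, and how linearmodels expands categorical columns internally is not assumed here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: 2 states x 2 districts x 3 years (invented for illustration).
layout = {"01": ["01001", "01002"], "02": ["02001", "02002"]}
rows = [
    (state, district, year)
    for state, districts in layout.items()
    for district in districts
    for year in (2009, 2010, 2011)
]
toy = pd.DataFrame(rows, columns=["state", "district", "year"])
state_year = toy["state"] + toy["year"].astype(str)


def design(drop_first):
    """Constant plus district and state-year dummies as one float matrix."""
    district_d = pd.get_dummies(toy["district"], drop_first=drop_first, dtype=float)
    state_year_d = pd.get_dummies(state_year, drop_first=drop_first, dtype=float)
    return np.column_stack(
        [np.ones(len(toy)), district_d.to_numpy(), state_year_d.to_numpy()]
    )


for drop_first in (False, True):
    X = design(drop_first)
    # Prints the number of columns next to the matrix rank.
    print(drop_first, X.shape[1], np.linalg.matrix_rank(X))
```

In this toy setup the rank is 8 in both cases, against 11 columns with full dummy sets and 9 columns after dropping one category per factor. Dropping a reference category alone therefore does not remove the collinearity when districts are nested in states.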

# Import model
from linearmodels.panel import PanelOLS
# Model
m = PanelOLS(dependent=df['y'], exog=df[['constant','x1','x2']], entity_effects=True, time_effects=False, other_effects=df['eta'])
m.fit(cov_type='clustered', cluster_entity=True)
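For intuition about what absorbing the effects does, here is a minimal hand-rolled sketch of the within transformation on the nine rows from the question. It is not linearmodels' internal code. It assumes a balanced panel (every district in a state observed in the same years), and the names `panel`, `values`, `demeaned` and `beta` are introduced only for this illustration. By the Frisch-Waugh-Lovell theorem, the slopes from OLS on the demeaned data equal those from a regression that includes the district and state-year dummies explicitly. Standard errors are not handled here.

```python
import numpy as np
import pandas as pd

# The nine rows shown in the question (x2 is still the raw population here).
panel = pd.DataFrame({
    "state": ["01"] * 9,
    "district": ["01001"] * 3 + ["01002"] * 3 + ["01003"] * 3,
    "year": [2009, 2010, 2011] * 3,
    "y": [12, 20, 22, 0, 1, 5, 0, 5, 9],
    "x1": [0.956007, 0.972175, 0.988343,
           0.0, 0.225071, 0.450142,
           0.0, 0.427477, 0.854955],
    "x2": [639673] * 3 + [33746] * 3 + [45196] * 3,
    "time": [1, 2, 3] * 3,
})
panel["x2"] = panel["x2"] * panel["time"]  # same transform as in the question
panel["eta"] = panel["state"] + panel["year"].astype(str)

values = panel[["y", "x1", "x2"]].astype(float)

# Within transformation for a balanced panel:
# subtract district means and state-year means, add back state means.
demeaned = (
    values
    - values.groupby(panel["district"]).transform("mean")
    - values.groupby(panel["eta"]).transform("mean")
    + values.groupby(panel["state"]).transform("mean")
)

# OLS on the demeaned data; the constant is absorbed, so it is left out.
beta, *_ = np.linalg.lstsq(
    demeaned[["x1", "x2"]].to_numpy(), demeaned["y"].to_numpy(), rcond=None
)
print(dict(zip(["x1", "x2"], beta)))
```

With only three districts and three years this leaves very few degrees of freedom, so the numbers are useful only as a check on the mechanics, not as estimates.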
