Hi, I'm using the lifelines package to do Cox regression. I want to examine the effect of a categorical variable that has more than two levels. Is there a built-in way of doing this, or should I encode each category as a number? Alternatively, using KaplanMeierFitter in lifelines, is it possible to fit a curve for each level and then get a p-value for the difference between them? I'm able to make the separate plots, but I can't find how to compute the p-value.
Thank you!
Update: Okay, so after using pd.get_dummies I have a dataframe df of the form:
          event   time  categorical_1  categorical_2  categorical_3
    0         0  11.54              0              0              1
    1         0   6.95              0              0              1
    2         1   0.24              0              1              0
    3         0   3.00              0              0              1
    4         1  10.26              1              0              1
    ...     ...    ...            ...            ...            ...
    1215      1   6.80              1              0              0
I now need to drop one of the dummy variables (so the remaining ones are not collinear) and then do:
    cph.fit(df, duration_col='time', event_col='event')
If I now want to plot how each level of the categorical variable affects the survival curve, how would I go about this? I've tried:
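    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    summary = cph.summary
    for index, row in summary.iterrows():
        print(index)
        # one call per dummy column, evaluated at that column's mean
        cph.plot_covariate_groups(index, [df[index].mean()], ax=ax)
    plt.show()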
But this plots all the levels of the variable on the same curve, whereas I'd expect the curves to differ. Actually, I'm not sure whether it draws all the curves or only the last one, but the legend lists every level of the categorical variable.
Thanks