Python 3 statsmodels Logit ValueError: On entry to DLASCL parameter number 5 had an illegal value

I am working through a logistic regression example and running into trouble with the statsmodels portion. I have had difficulties in the past with Python 3 and pandas DataFrames, where the df returns an iterator rather than a list. I tried adjusting for the same issue with 'logit', but I am still receiving a ValueError.

import numpy as np
import pandas as pd
import os
import statsmodels.api as sm
import pylab as pl

df = pd.read_csv('admissions.csv')
df.head(n=5)

df.columns = ['admit', 'gre', 'gpa', 'prestige']
dummy_ranks = pd.get_dummies(df['prestige'], prefix='prestige')
cols_to_keep = ['admit', 'gre', 'gpa']
# .ix has been removed from pandas; .loc does the same label-based slice
data = df[cols_to_keep].join(dummy_ranks.loc[:, 'prestige_2':])
data['intercept'] = 1.0
train_cols = data.columns[1:]


logit = sm.Logit(data['admit'], data[train_cols])

result = logit.fit()

ValueError: On entry to DLASCL parameter number 5 had an illegal value

Electroscope answered 4/10, 2016 at 2:15 Comment(2)
You should give the link to 'admissions.csv'. - Seely
This type of error message is almost always caused by an inf or nan in the data when calling a linear algebra function. If there are missing values, then either remove them with pandas or use the missing keyword in the models. - Fluorescent

Your 'admissions.csv' has a blank value in it.

Using the data from http://www.ats.ucla.edu/stat/data/binary.csv as per the blog http://blog.yhat.com/posts/logistic-regression-python-rodeo.html works. Try deleting a value in the data and you will get the illegal value error.

Correct:

admit  gre  gpa   rank
0      380  3.61  3
1      520  2.93  4

Incorrect:

admit  gre  gpa   rank
0           3.61  3
1      520  2.93  4
Lemuel answered 6/12, 2016 at 21:14 Comment(0)
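
A quick way to confirm that a blank cell is the culprit is to count missing values per column right after loading. The sketch below uses a small inline CSV that mirrors the "Incorrect" sample above (the inline text stands in for the real file, which was not posted) and then drops the incomplete rows with pandas.

```python
import io
import pandas as pd

# Inline stand-in for admissions.csv: row 0 has no gre value
csv_text = "admit,gre,gpa,rank\n0,,3.61,3\n1,520,2.93,4\n"
df = pd.read_csv(io.StringIO(csv_text))

# read_csv turns the empty field into NaN; count NaNs per column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Keep only complete rows before building the model
clean = df.dropna()
print(len(df), "rows loaded,", len(clean), "rows kept")
```

If `missing_per_column` shows any non-zero count, the rows can be dropped as here or the source file can be corrected by hand.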
