Kaggle TypeError: slice indices must be integers or None or have an __index__ method

Asked 16/5, 2017 at 19:5 Answered 23/5, 2017 at 20:33

Solved python pandas jupyter seaborn kaggle

I am trying to plot a seaborn histogram on a Kaggle notebook in this way:

 sns.distplot(myseries, bins=50, kde=True)

but I get this error:

TypeError: slice indices must be integers or None or have an __index__ method

This is the Kaggle notebook: https://www.kaggle.com/asindico/slice-indices-must-be-integers-or-none/

Here is the head of the series:

0     5850000
1     6000000
2     5700000
3    13100000
4    16331452
Name: price_doc, dtype: int64

Lattimore answered 16/5, 2017 at 19:5 Comment(3)

Of what type is your Series? – Morita 16/5, 2017 at 19:14

@Morita I have updated the question – Lattimore 16/5, 2017 at 19:39

@Morita you were right, kde=False fixes the issue. If you post it as an answer, I can award you the bounty – Lattimore 20/5, 2017 at 6:20

As @ryankdwyer pointed out, this was a bug in the underlying statsmodels implementation that is no longer present in the 0.8.0 release.
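The error message itself can be reproduced without seaborn or statsmodels. The sketch below assumes the older statsmodels code computed its slice index with true division (`/`), which yields a float on Python 3; that is an inference from the patched line `int(m // 2+1)` further down, not something verified against the old source. It uses a plain list, and the variable names are made up for illustration.

```python
# Minimal reproduction of the error message using a plain Python list.
values = list(range(10))
m = len(values)

float_index = m / 2 + 1   # true division on Python 3 gives 6.0 (a float)
int_index = m // 2 + 1    # floor division gives 6 (an int)

try:
    values[:float_index]  # slicing with a float is not allowed
    float_slice_error = None
except TypeError as exc:
    float_slice_error = exc

# Slicing with the integer index works as expected.
head = values[:int_index]

print(float_slice_error)
print(head)
```

Recent NumPy versions reject float slice indices on arrays in the same way, which is presumably why the problem only showed up with newer environments.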

Since Kaggle does not allow internet access from a kernel/script, upgrading the package is not an option. That leaves the following two alternatives:

1. Use sns.distplot(myseries, bins=50, kde=False). This will, of course, not draw the KDE curve.
2. Manually patch the statsmodels implementation with the code from version 0.8.0. Admittedly, this is a bit hacky, but you will get the KDE plot.

Here is an example (and a proof on kaggle):

import numpy as np

def _revrt(X,m=None):
    """
    Inverse of forrt. Equivalent to Munro (1976) REVRT routine.
    """
    if m is None:
        m = len(X)
    i = int(m // 2+1)
    y = X[:i] + np.r_[0,X[i:],0]*1j
    return np.fft.irfft(y)*m

from statsmodels.nonparametric import kdetools

# replace the implementation with new method.
kdetools.revrt = _revrt

# import seaborn AFTER replacing the method. 
import seaborn as sns

# draw the distplot with the kde function
sns.distplot(myseries, bins=50, kde=True)

Why does it work? Well, it relates to the way Python loads modules. From the Python docs:

5.3.1. The module cache

The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.

Therefore, once from statsmodels.nonparametric import kdetools has run, the kdetools module sits in this module cache. When seaborn later imports it, the Python module loader returns the cached module. Since that cached module is the one we modified, our patched revrt function is used. By the way, this practice is known as monkey patching and is very handy when writing unit tests, where it is used for mocking.
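The caching behaviour can be seen in isolation with a small sketch. The module name `fake_kdetools` and its `revrt` attribute are hypothetical stand-ins created on the fly; nothing here touches statsmodels or seaborn.

```python
import sys
import types

# Create a stand-in module (hypothetical name) and register it in the cache.
fake = types.ModuleType("fake_kdetools")
fake.revrt = lambda: "original"
sys.modules["fake_kdetools"] = fake

# A reference taken BEFORE the patch keeps pointing at the old function.
early_reference = fake.revrt

# The import statement finds the module in sys.modules and returns it.
import fake_kdetools
fake_kdetools.revrt = lambda: "patched"

# A later import gets the very same module object, patch included.
import fake_kdetools as imported_again

same_object = imported_again is fake
late_result = imported_again.revrt()
early_result = early_reference()

print(same_object, late_result, early_result)
```

The last line of the sketch is the reason for importing seaborn only after the replacement: code that grabbed a reference to the function before the patch keeps using the old one.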

Chivaree answered 20/5, 2017 at 12:23 Comment(0)

This error appears to be a known issue.

https://github.com/mwaskom/seaborn/issues/1092

Potential solution: update your statsmodels package to 0.8.0.

pip install -U statsmodels
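To see whether the installed statsmodels is already at 0.8.0 or later, something like the following could be run first. It is a rough sketch that assumes Python 3.8 or newer (for `importlib.metadata`); the helper names `installed_version` and `at_least` are made up for this example, and the version comparison only looks at the leading numeric components.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def at_least(version, minimum):
    """Compare the leading numeric components of `version` to a tuple."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as "0rc1"
        parts.append(int(piece))
    return tuple(parts[:len(minimum)]) >= minimum


version = installed_version("statsmodels")
if version is None:
    print("statsmodels is not installed")
elif at_least(version, (0, 8)):
    print("statsmodels", version, "should already contain the fix")
else:
    print("statsmodels", version, "predates 0.8.0; upgrade or use a workaround")
```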

Erubescence answered 16/5, 2017 at 19:27 Comment(3)

It happens on Kaggle... I cannot uninstall or update anything there – Lattimore 16/5, 2017 at 19:35

If you can live without the KDE it might avoid this bug. – Morita 16/5, 2017 at 20:51

@kyle I have seen other Kaggle kernels successfully using it – Lattimore 17/5, 2017 at 5:9

From the seaborn issue from @ryankdwyer, it sounds like a bug in the kde. Try turning it off with kde=False.

 sns.distplot(myseries, bins=50, kde=False)

Morita answered 23/5, 2017 at 20:33 Comment(0)
