As the message says, this error occurs if you try to index a numpy array using an invalid value such as a float or a string.
Apart from the case where a float is used to index an array, e.g. arr[0.], whose solution is to convert the float into an int, like arr[0] (or, if the index is created dynamically, to cast it with arr[int(idx)]), another common case is trying to index a numpy ndarray with a string, which happens a lot when using scikit-learn.
For example, some common preprocessing functions take pandas dataframes but return numpy ndarrays upon transformation, so columns can no longer be selected by label. In that case, the solution is either to index the result as an ndarray (by position) or to create a pandas dataframe from the transformed data and select columns from that dataframe.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})
sc = MinMaxScaler()

# fit_transform returns a numpy ndarray, not a dataframe
arr = sc.fit_transform(df)
# arr['A']   # <--- IndexError (string label on an ndarray)
arr[:, 0]    # <--- OK (first column, selected by position)

# explicitly create a pandas dataframe from transformed data
df1 = pd.DataFrame(arr, columns=df.columns, index=df.index)
df1['A']     # <--- OK
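If you would rather keep working with the ndarray, one option is to translate column labels into positions using the original dataframe's columns. This is only a sketch: `arr` here is a stand-in built with `to_numpy` rather than real transformer output, and it assumes the transformer keeps the column order and count (true for scalers such as MinMaxScaler, not for transformers that add or drop columns).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# Stand-in for a transformer's ndarray output (assumption: same column order as df)
arr = df.to_numpy(dtype=float)

# Translate a single label into its integer position, then index by position
col_pos = df.columns.get_loc('A')
col_a = arr[:, col_pos]

# Translate several labels at once; the result is an integer array of positions
positions = df.columns.get_indexer(['B', 'A'])
subset = arr[:, positions]
```

Both `get_loc` and `get_indexer` are methods of the pandas Index, so the lookup stays tied to the dataframe the array came from.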
IndexError: arrays used as indices must be of integer (or boolean) type, indexing with a naked float or a list with floats gives the error in the title. – Pauper
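The float cases can be reproduced and fixed with plain numpy. The following is a minimal sketch, not a definitive recipe: `safe_get` is a hypothetical helper name, and the whole-number check is an extra safeguard that the plain `int(idx)` cast does not have.

```python
import numpy as np

arr = np.array([10, 20, 30])

def safe_get(a, idx):
    """Hypothetical helper: index `a` with `idx`, casting a float index to int."""
    if isinstance(idx, (float, np.floating)):
        # Assumption: refuse values like 1.5 instead of silently truncating them
        if not float(idx).is_integer():
            raise ValueError(f"index {idx!r} is not a whole number")
        idx = int(idx)
    return a[idx]

# A bare float index raises IndexError
try:
    arr[0.]
except IndexError as exc:
    float_error = exc

# Casting the float to int first makes the lookup work
first = safe_get(arr, 0.0)

# An index array of floats must be converted to an integer dtype before use
float_idx = np.array([0.0, 2.0])
picked = arr[float_idx.astype(int)]
```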