I'm resampling my data (multiclass) by using SMOTE.
sm = SMOTE(random_state=1)
X_res, Y_res = sm.fit_resample(X_train, Y_train)
However, I'm getting this attribute error. Can anyone help?
Short answer
You need to upgrade scikit-learn to version 0.23.1.
Long answer
The newest version 0.7.0 of imbalanced-learn seems to have an undocumented dependency on scikit-learn v0.23.1. It gives you AttributeError: 'SMOTE' object has no attribute '_validate_data' if your scikit-learn is 0.22 or below.
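To see whether an environment is affected, the installed scikit-learn version can be compared against the 0.23 threshold described above. This is only a minimal sketch: the helper names `major_minor` and `meets_minimum` are made up for illustration, and the 0.23 threshold comes from this answer rather than from an official compatibility table.

```python
import re


def major_minor(version):
    """Return (major, minor) as integers from a version string such as '0.23.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, minimum="0.23"):
    """True if `installed` is at least `minimum` (compared on major.minor only)."""
    return major_minor(installed) >= major_minor(minimum)


try:
    import sklearn  # only needed for the live check below
except ImportError:
    print("scikit-learn is not installed in this environment")
else:
    if meets_minimum(sklearn.__version__):
        print("scikit-learn", sklearn.__version__, "meets the 0.23 threshold")
    else:
        print("scikit-learn", sklearn.__version__, "is older than 0.23")
```

Run it with the same interpreter (or notebook kernel) that raises the AttributeError, since another environment may report a different version.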
If you are using Anaconda, installing scikit-learn version 0.23.1 might be tricky. conda update scikit-learn might not update scikit-learn to version 0.23 or higher, because the newest scikit-learn version Conda has at this point in time is 0.22.1. If you try to install it using conda install scikit-learn=0.23.1 or pip install scikit-learn==0.23.1, you will get tons of compatibility checks and installation might not be quick. Therefore the easiest way to install scikit-learn version 0.23.1 in Anaconda is to create a new virtual environment with minimal packages so that there are few or no conflicts. Then, in the new virtual environment, install scikit-learn version 0.23.1 followed by version 0.7.0 of imbalanced-learn.
conda create -n test python=3.7.6
conda activate test
pip install scikit-learn==0.23.1
pip install imbalanced-learn==0.7.0
Finally, you need to reinstall your IDE in the new virtual environment in order to use these packages.
However, once scikit-learn version 0.23.1 becomes available in Conda and there are no compatibility issues, you can install it in the base environment directly.
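After switching environments, it can help to confirm which interpreter your IDE or notebook is actually running, since packages installed in one environment are not visible from another. The snippet below is a small sketch using only the standard library; `describe_environment` is a hypothetical helper name, and `CONDA_DEFAULT_ENV` is the variable that `conda activate` normally sets.

```python
import os
import platform
import sys


def describe_environment():
    """Collect details that identify the interpreter running this code."""
    return {
        "python": platform.python_version(),
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; None outside an activated conda environment.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(key, "=", value)
```

If the printed executable does not live inside the new environment, the IDE is probably still using the old interpreter.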
Step 1 - Open your Jupyter notebook
Step 2 - Type pip install --upgrade scikit-learn in a cell and run it
Step 3 - Restart the kernel
Follow the steps exactly as written and scikit-learn will be upgraded.
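One caveat with the steps above: a bare pip inside a notebook can belong to a different Python than the one the kernel runs. A common workaround is to call pip through the kernel's own interpreter. The sketch below assumes that approach; `pip_upgrade_command` and `upgrade` are hypothetical helper names, not part of pip or Jupyter.

```python
import subprocess
import sys


def pip_upgrade_command(package):
    """Build a pip command bound to the interpreter that runs this code."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def upgrade(package):
    """Run the upgrade; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_upgrade_command(package))


# Nothing is installed here; this only shows the command that would be run.
# Call upgrade("scikit-learn") in a cell, then restart the kernel.
print(" ".join(pip_upgrade_command("scikit-learn")))
```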
Welcome to SO! For your next question like this, you'll probably want to include the versions of python, sklearn, and imblearn you are using.
I ran into this same problem myself and the developers have noticed it: https://github.com/scikit-learn-contrib/imbalanced-learn/issues/727
Might want to follow this page to see if a solution is posted in the next few days. It seems to be about the sklearn library not being cleaned up properly after installing imblearn.
UPDATE
This can be fixed by updating your sklearn to Version 0.23 or higher. This should be possible for you through either:
pip install --upgrade scikit-learn
OR
conda update scikit-learn
Updating sklearn did not work for me either. Setting up a new environment did, however, as proposed in one of the solutions in the link https://github.com/scikit-learn-contrib/imbalanced-learn/issues/727 mentioned in the answer.
My OS: Ubuntu MATE 18.04 x64
Had this same issue and tried other solutions to no avail.
I was originally using python 3.7.7 and got it working by using python 3.6.8 instead.
Anaconda
conda create -n myenv python=3.6.8
conda activate myenv
pip install scikit-learn
pip install imblearn
VirtualEnv - you will need python 3.6.8 already installed on your system
virtualenv --python=python3.6 myenv
source myenv/bin/activate
pip install scikit-learn
pip install imblearn
Verify versions
import sklearn
sklearn.__version__
# '0.23.1'
import imblearn
imblearn.__version__
# '0.7.0'
...
# Now works
X_res, Y_res = sm.fit_resample(X_train, Y_train)
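Once `fit_resample` runs without the AttributeError, a quick way to confirm that the resampling did something is to compare class counts before and after. The following is a sketch, not the original poster's code: `class_counts` and `resample_and_report` are hypothetical helpers, and `X_train` and `Y_train` are assumed to be the arrays from the question. With SMOTE's default settings, the minority classes should end up with as many samples as the majority class.

```python
from collections import Counter


def class_counts(labels):
    """Map each class label to the number of samples that carry it."""
    return dict(Counter(labels))


def resample_and_report(X_train, Y_train, random_state=1):
    """Apply SMOTE and print the class counts before and after resampling."""
    # Imported here so class_counts works without imbalanced-learn installed.
    from imblearn.over_sampling import SMOTE

    print("before:", class_counts(Y_train))
    X_res, Y_res = SMOTE(random_state=random_state).fit_resample(X_train, Y_train)
    print("after: ", class_counts(Y_res))
    return X_res, Y_res
```

Note that SMOTE interpolates between neighbouring samples, so a class with very few samples may need a smaller `k_neighbors` value than the default.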
Error received was: AttributeError: 'SMOTE' object has no attribute '_validate_data'
Root cause: imblearn requires scikit-learn 0.23, but conda with Python 3.7 only provides scikit-learn 0.22
Solution: Create a virtual environment with Python 3.6.8 and install scikit-learn 0.23 as below
Create Virtual Env for python 3.6.8
PS C:\Users\harish\Documents> conda create -n myenv python=3.6.8
Activate the environment
PS C:\Users\harish\Documents> conda activate myenv
Install scikit-learn and imblearn in the virtual environment
PS C:\Users\harish\Documents> pip install scikit-learn
PS C:\Users\harish\Documents> pip install imblearn --user
NOTE: this updates scikit-learn
....
Collecting scikit-learn>=0.23
PS C:\Users\harish\Documents> conda list
NOTE: it should be 0.23
...
scikit-learn 0.23.2 pypi_0 pypi
Register the environment as a Jupyter kernel
PS C:\Users\harish\Documents> python -m ipykernel install --user --name=myenv
Installed kernelspec myenv in C:\Users\harish\AppData\Roaming\jupyter\kernels\myenv
PS C:\Users\harish\Documents> cd C:\Users\harish\AppData\Roaming\jupyter\kernels\myenv
PS C:\Users\harish\AppData\Roaming\jupyter\kernels\myenv> ls
Mode      LastWriteTime         Length  Name
----      -------------         ------  ----
-a----    8/23/2020  6:41 PM       185  kernel.json
-a----    1/28/2020  2:18 AM      1084  logo-32x32.png
-a----    1/28/2020  2:18 AM      2180  logo-64x64.png
PS C:\Users\harish\AppData\Roaming\jupyter\kernels\myenv> cat kernel.json
{
  "argv": [
    "C:\\Users\\harish\\Anaconda3\\python.exe",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "myenv",
  "language": "python"
}
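The first `argv` entry in kernel.json is the interpreter the kernel launches, so it is worth checking that it points into the intended environment. In the listing above it is `C:\Users\harish\Anaconda3\python.exe`, which has no `envs\myenv` segment and may therefore be the base interpreter. The sketch below automates that check. Both function names are hypothetical, and the `envs/<name>` layout is an assumption based on conda's default environment location, so environments created elsewhere will not match.

```python
import json


def kernel_interpreter(kernel_json_path):
    """Return the first argv entry of a kernel spec, i.e. the Python it launches."""
    with open(kernel_json_path, encoding="utf-8") as handle:
        spec = json.load(handle)
    return spec["argv"][0]


def points_into_env(interpreter_path, env_name):
    """Rough check: does the path contain an 'envs/<env_name>' segment?"""
    parts = interpreter_path.replace("\\", "/").lower().split("/")
    return any(
        first == "envs" and second == env_name.lower()
        for first, second in zip(parts, parts[1:])
    )
```

For example, `points_into_env(kernel_interpreter("kernel.json"), "myenv")` returns False for the file shown above. In that case, re-running the `ipykernel install` command from inside the activated environment is one thing to try.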
Upgrading both sklearn and imblearn worked for me
!pip install --upgrade scikit-learn
!pip install --upgrade imblearn