ImportError: cannot import name 'joblib' from 'sklearn.externals'
B

10

134

I am trying to load my saved model from S3 using joblib:

import pandas as pd 
import numpy as np
import json
import subprocess
import sqlalchemy
from sklearn.externals import joblib

def load_d2v(fname, env):
    model_name = fname
    if env == 'dev':
        try:
            # Use a local copy of the model if one is already present.
            model = joblib.load(model_name)
        except Exception:
            # Otherwise download it from S3 first, then load it.
            s3_base_path = 's3://sd-flikku/datalake/doc2vec_model'
            path = s3_base_path + '/' + model_name
            command = "aws s3 cp {} {}".format(path, model_name).split()
            print('loading...' + model_name)
            subprocess.call(command)
            model = joblib.load(model_name)
    else:
        s3_base_path = 's3://sd-flikku/datalake/doc2vec_model'
        path = s3_base_path + '/' + model_name
        command = "aws s3 cp {} {}".format(path, model_name).split()
        print('loading...' + model_name)
        subprocess.call(command)
        model = joblib.load(model_name)
    return model

ENV = 'dev'
model_d2v = load_d2v('model_d2v_version_002', ENV)

But I get this error:

    from sklearn.externals import joblib
ImportError: cannot import name 'joblib' from 'sklearn.externals' (C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\externals\__init__.py)

Then I tried importing joblib directly with

import joblib

but it gave me this error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in load_d2v_from_s3
  File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "/usr/lib64/python3.7/pickle.py", line 1088, in load
    dispatch[key[0]](self)
  File "/usr/lib64/python3.7/pickle.py", line 1376, in load_global
    klass = self.find_class(module, name)
  File "/usr/lib64/python3.7/pickle.py", line 1426, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.externals.joblib'

Can you tell me how to solve this?

Boudicca answered 19/5, 2020 at 14:36 Comment(0)
K
66

It looks like your existing pickle save file (model_d2v_version_002) encodes a reference to a module in a non-standard location: a joblib that lives at sklearn.externals.joblib rather than at top level.

The current scikit-learn documentation only talks about a top-level joblib (e.g. in the 3.4.1 Persistence example), but I do see a reference in someone else's old issue to a DeprecationWarning in scikit-learn version 0.21 about an older sklearn.externals.joblib variant going away:

Python37\lib\site-packages\sklearn\externals\joblib\__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.

'Deprecation' means marking something as inadvisable to rely upon, as it is likely to be discontinued in a future release (often, but not always, with a recommended newer way to do the same thing).

I suspect your model_d2v_version_002 file was saved from an older version of scikit-learn, and you're now using scikit-learn (aka sklearn) version 0.23+, which has completely removed the sklearn.externals.joblib variant. Thus your file can't be directly or easily loaded in your current environment.
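
To check which side of that cutoff an environment is on, something like the following could be used. It is only a sketch: the function name is made up for illustration, and the 0.23 threshold is taken from the DeprecationWarning text quoted above.

```python
import re


def externals_joblib_removed(sklearn_version):
    """Return True if this scikit-learn version no longer ships sklearn.externals.joblib.

    The 0.23 cutoff comes from the DeprecationWarning quoted above.
    """
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError("unrecognised version string: " + repr(sklearn_version))
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 23)


# Typical use (needs scikit-learn installed, so it is left as a comment):
#   import sklearn
#   print(externals_joblib_removed(sklearn.__version__))
```

If this returns True for your environment, the old import path is gone and a file that references it will not load there as-is.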

But, per the DeprecationWarning, you can probably temporarily use an older scikit-learn version to load the file the old way once, then re-save it with the now-preferred way. Given the warning info, this would probably require scikit-learn version 0.21.x or 0.22.x, but if you know exactly which version your model_d2v_version_002 file was saved from, I'd try to use that. The steps would roughly be:

  • create a temporary working environment (or roll back your current working environment) with the older sklearn

  • do imports something like:

import sklearn.externals.joblib as extjoblib
import joblib
  • extjoblib.load() your old file as you'd planned, but then immediately re-joblib.dump() the file using the top-level joblib. (You likely want to use a distinct name, to keep the older file around, just in case.)

  • move/update to your real, modern environment, and only import joblib (top level) to use joblib.load(), so that no references to 'sklearn.externals.joblib' remain in either your code or your stored pickle files.
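
The load-then-re-save step above could be sketched roughly as follows. The helper name resave and the output filename model_d2v_version_002_resaved are hypothetical, chosen only for illustration. The loader and dumper are passed in as arguments so the helper itself does not depend on which scikit-learn version is installed.

```python
import os


def resave(old_path, new_path, legacy_load, modern_dump):
    """Load a file with the legacy loader and write it back with the modern dumper.

    Refuses to overwrite an existing file, so the original stays around
    as a fallback under its own name.
    """
    if os.path.exists(new_path):
        raise FileExistsError(new_path)
    obj = legacy_load(old_path)
    modern_dump(obj, new_path)
    return obj


# Intended use inside the temporary scikit-learn 0.21.x / 0.22.x environment
# (left as a comment because it only runs in that older environment):
#
#   import sklearn.externals.joblib as extjoblib
#   import joblib
#   resave("model_d2v_version_002", "model_d2v_version_002_resaved",
#          extjoblib.load, joblib.dump)
```

After that, the modern environment would only need import joblib and joblib.load() on the re-saved file.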

Kimbrough answered 19/5, 2020 at 16:4 Comment(0)
X
249

You should directly use

import joblib

instead of

from sklearn.externals import joblib
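
If the same code has to run on both old and new scikit-learn installs, a fallback import is one option. This is a minimal sketch and the helper name import_first_available is made up for illustration. Note that it only fixes the import in your own code; it does not help with a pickle file that itself refers to sklearn.externals.joblib.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Typical use (left as a comment because it needs joblib or an old sklearn):
#   joblib = import_first_available("joblib", "sklearn.externals.joblib")
```

Listing the top-level joblib first means the supported location wins whenever it is installed.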
Xylina answered 8/7, 2020 at 2:47 Comment(2)
ModuleNotFoundError: No module named 'sklearn.externals.joblib' - Ruel
To install joblib, use the pip command python3 -m pip install joblib - Abracadabra
C
19

You can install joblib as a dependency and then import it directly with import joblib.

See the documentation.

Catherin answered 19/5, 2020 at 14:46 Comment(3)
I tried importing joblib directly but it gave me ModuleNotFoundError: No module named 'sklearn.externals.joblib' - Boudicca
Did you install it with pip beforehand? - Catherin
joblib comes as part of Anaconda, and this worked for me. - Johnnyjumpup
I
10

Maybe your code is outdated. For anyone who wants to use fetch_mldata in a handwritten-digit project, you should use fetch_openml instead.

In old version of sklearn:

from sklearn.datasets import fetch_mldata
from sklearn.externals import joblib
mnist = fetch_mldata('MNIST original')

In sklearn 0.23 (stable release):

import joblib
import numpy as np
from sklearn import datasets

dataset = datasets.fetch_openml("mnist_784")

features = np.array(dataset.data, 'int16')
labels = np.array(dataset.target, 'int')

For more about the deprecation of fetch_mldata, see the scikit-learn documentation.

Inaccessible answered 10/7, 2020 at 3:44 Comment(0)
F
6

I had the same problem

What I did not realize was that joblib was already installed!

so what you have to do is replace

from sklearn.externals import joblib

with

import joblib

and that is it

Ferino answered 1/9, 2022 at 6:52 Comment(0)
C
5

None of the answers below worked for me, but with a small change the following worked:

import sklearn.externals as extjoblib
import joblib
Cinelli answered 29/4, 2022 at 21:25 Comment(0)
D
4

for this error, I had to directly use the following and it worked like a charm:

import joblib

Simple

Differential answered 6/8, 2021 at 7:12 Comment(0)
S
2

If the call to joblib is made from another .py program rather than your own code, the import still fails inside that program even after you have installed joblib, unless you change its code, which I thought would be messy. Instead, I tried creating a link (a directory junction, since mklink /J is used):

(Windows version)

Python> import joblib

Then, inside your sklearn path >......\Lib\site-packages\sklearn\externals

mklink /J ./joblib .....\Lib\site-packages\joblib

(You can run the above with a ! or % prefix, !mklink....... or %mklink......, inside your Python Jupyter notebook, or use a Python OS command...)

This effectively creates a virtual joblib folder within the "externals" folder.

Remark: to be more version-resilient, your code should first check that the sklearn version is >= 0.23.

This is an alternative to changing the sklearn version.
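
A similar effect can be sketched in pure Python by registering the old dotted name in sys.modules, without touching the file system. This is an assumption-laden sketch, not something verified against the asker's file: the helper name alias_module is made up, and whether an old pickle actually loads this way depends on the installed joblib still being able to read that file's format.

```python
import importlib
import sys


def alias_module(old_name, new_name):
    """Register the importable module new_name under old_name in sys.modules.

    Later lookups of old_name in sys.modules then find the same module
    object as new_name.
    """
    module = importlib.import_module(new_name)
    sys.modules[old_name] = module
    return module


# Hypothetical use for this question (left as a comment; needs joblib and
# scikit-learn installed, and is not guaranteed to work for every old file):
#   alias_module("sklearn.externals.joblib", "joblib")
#   alias_module("sklearn.externals.joblib.numpy_pickle", "joblib.numpy_pickle")
#   import joblib
#   model = joblib.load("model_d2v_version_002")
```

The alias only lasts for the current Python process, so it has to be set up before the old name is first needed.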


Stoll answered 23/8, 2020 at 14:44 Comment(0)
O
2

If you get an error from:

from sklearn.externals import joblib

it is because this import was deprecated in older versions and no longer works in newer ones.

For newer versions, do the following:

  1. conda install -c anaconda scikit-learn (run this in the "Anaconda Prompt")
  2. import joblib (in the Jupyter Notebook)
Omora answered 25/8, 2020 at 5:41 Comment(0)
I
-1

After a long investigation, given my computer setup, I found that this was because an SSL certificate was required to download the dataset.

Itagaki answered 19/6, 2021 at 9:0 Comment(0)
