Upgrade version of scikit-learn included in Enthought Canopy distribution

I have EPD 7.3.1 installed (nowadays called Enthought Canopy), which comes with scikit-learn v 0.11. I am running Ubuntu 12.04. I need to install v 0.12 of scikit-learn.

The scikit-learn documentation says to clone the repository, add the scikit-learn directory to your PYTHONPATH, and build the extensions in place with: python setup.py build_ext --inplace

The problem is that EPD is its own closed world (with multiple scikit directories):
./lib/python2.7/site-packages/scikits/
./lib/python2.7/site-packages/sklearn

And then there's:
./EGG-INFO/scikit_learn/

I really don't want to experiment as it has taken a very long time to get things tuned to this point. Should I follow scikit-learn's directions in this case?

Stcyr answered 31/8, 2012 at 17:1 Comment(0)

The steps described on the scikit-learn website work regardless of which scikit-learn version ships with EPD. Python uses the first scikit-learn it finds on its module search path, and directories listed in the PYTHONPATH environment variable are searched before EPD's own site-packages. So set PYTHONPATH to the directory of your Git checkout of scikit-learn.

If you use Bash on a Unix-like system, you should do the following:

  • Follow the steps to install scikit-learn's latest code (in this example I cloned it to /home/yourname/bin/scikit-learn)
  • Edit ~/.bashrc and add the line: export PYTHONPATH="/home/yourname/bin/scikit-learn"
  • Open a new terminal and start Python in interactive mode by typing python
    • Type: import sklearn
    • Type: sklearn.__version__ (this should now show '0.12-git' instead of '0.11')
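
The check in the last two steps can also be scripted. The following is a minimal sketch, not part of the original answer: describe_module is a hypothetical helper name, and the code is written for Python 3 (the thread itself uses Python 2.7). It reports both the version string and the file a package was loaded from, which makes it easy to see whether the Git checkout or EPD's copy was picked up.

```python
import importlib


def describe_module(name):
    """Import `name` and report its version string and the file it came from."""
    module = importlib.import_module(name)
    # Not every module defines __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown")
    # Built-in modules and namespace packages have no usable __file__.
    location = getattr(module, "__file__", None) or "built-in"
    return version, location


if __name__ == "__main__":
    try:
        version, location = describe_module("sklearn")
    except ImportError:
        print("sklearn is not importable from this interpreter")
    else:
        print("sklearn", version, "loaded from", location)
```

If the reported location is inside /home/yourname/bin/scikit-learn, the checkout is in use; if it is inside EPD's site-packages, PYTHONPATH was probably not picked up by the shell that started Python.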

Why does this work? Python uses the variable sys.path (a list of directories) to keep track of where it should look for modules and packages. When a module or package is imported, Python goes through this list in order until it finds a match. So a package can exist in several directories on sys.path, but only the copy in the directory that appears first in the list will be used.
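
The first-match rule can be seen with a small experiment. This is an illustrative sketch for Python 3, not something from the scikit-learn docs: demo_mod and both helper functions are made-up names, and the two temporary directories stand in for the Git checkout and EPD's site-packages.

```python
import importlib
import os
import shutil
import sys
import tempfile


def write_module(directory, version):
    """Create a tiny demo_mod.py that only records a version string."""
    with open(os.path.join(directory, "demo_mod.py"), "w") as handle:
        handle.write("VERSION = %r\n" % version)


def import_first_match(first_version="0.12-git", second_version="0.11"):
    """Put two copies of demo_mod on sys.path and return the version Python picks."""
    first_dir = tempfile.mkdtemp()
    second_dir = tempfile.mkdtemp()
    write_module(first_dir, first_version)
    write_module(second_dir, second_version)

    saved_path = list(sys.path)
    try:
        # second_dir is inserted first, so first_dir ends up ahead of it.
        sys.path.insert(0, second_dir)
        sys.path.insert(0, first_dir)
        sys.modules.pop("demo_mod", None)  # forget any earlier import
        importlib.invalidate_caches()
        return importlib.import_module("demo_mod").VERSION
    finally:
        # Restore the interpreter state and remove the temporary files.
        sys.path[:] = saved_path
        sys.modules.pop("demo_mod", None)
        shutil.rmtree(first_dir, ignore_errors=True)
        shutil.rmtree(second_dir, ignore_errors=True)


if __name__ == "__main__":
    print(import_first_match())
```

Both directories contain a module with the same name, yet only the copy in the directory nearer the front of sys.path is loaded.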

Every Python installation has its own default set of paths in sys.path. One way of extending sys.path is to list directories in PYTHONPATH. When Python starts, it reads this environment variable and places its entries near the start of sys.path: after the directory of the script being run (or the empty string for an interactive session), but before the installation's default paths. So if you add the path to another version of scikit-learn to your PYTHONPATH, (EPD's) Python will find that version first and use it instead of the version that sits further down in sys.path.
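
To see where a PYTHONPATH entry lands, one option is to start a fresh interpreter with the variable set and print its sys.path. This is a rough sketch under a few assumptions: it targets Python 3, child_sys_path is a hypothetical helper name, and a temporary directory stands in for the scikit-learn checkout.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath):
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ)
    env["PYTHONPATH"] = pythonpath
    output = subprocess.check_output(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
    )
    return json.loads(output.decode("utf-8"))


if __name__ == "__main__":
    checkout = tempfile.mkdtemp()  # stand-in for the scikit-learn checkout
    target = os.path.realpath(checkout)
    for position, entry in enumerate(child_sys_path(checkout)):
        # Skip the empty entry, which means "current directory".
        is_ours = bool(entry) and os.path.realpath(entry) == target
        print(position, repr(entry), "<-- from PYTHONPATH" if is_ours else "")
```

The marked entry should appear ahead of the interpreter's own library and site-packages directories, which is why a checkout listed in PYTHONPATH shadows the bundled copy.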

To view sys.path, import sys and then print sys.path. If you want to use scikit-learn 0.12 in only one Python program and keep 0.11 as the default in all other programs, you can leave PYTHONPATH unset and instead insert the path to scikit-learn 0.12 at the top of that program, before sklearn is imported:

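import sys
# Put the Git checkout ahead of EPD's site-packages for this program only.
sys.path.insert(0, '/home/yourname/bin/scikit-learn')
import sklearn  # resolves to the checkout because it is first on sys.path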
Stratigraphy answered 31/8, 2012 at 17:28 Comment(4)
On Stack Overflow, the community downvotes bad questions and upvotes good ones. I would leave the question up for now, but if people start to downvote it, that is a good indication that you might want to delete it. - Stratigraphy
I've run into the same problem, but on a Mac. I've only managed to install scikit-learn via MacPorts (running scikit-learn's setup.py always ends in an error midway). However, I can't find .bashrc. All I want to do is update scikit-learn in EPD (academic) to 0.13, and it's proving very difficult. Any help would be greatly appreciated! - Keffer
@Keffer Try using .profile instead of .bashrc. Let me know if it works. - Stratigraphy
@plurker Oddly, Spotlight is unable to find either .profile or .bashrc. However, I finally got it to work, though it's been a while since I did it. I believe I uninstalled the package from the Enthought distribution and then installed it using the "install" command in the terminal, using the files provided in the zip file. - Keffer
