Java's Mahout equivalent in Python

5

25

The goal of the Java-based Mahout project is to build scalable machine learning libraries. Are there any equivalent libraries in Python?

Sot answered 27/1, 2011 at 17:3 Comment(2)
You could use Jython or JPype to integrate Mahout with your Python code. See my similar question: #7492453 - Slayton
Python is not considered a good choice for computations on large datasets, since performance becomes prohibitively slow. - Dubuffet
21

scikit-learn (originally scikits.learn) is highly recommended: http://scikit-learn.sourceforge.net/

Divide answered 27/1, 2011 at 17:7 Comment(1)
Just a note: the current implementation of scikit-learn is not yet able to leverage a Hadoop cluster for distributed computing. It is, however, scalable enough for medium-sized problems (e.g. hundreds of thousands of samples and features for linear models), especially if you use sparse representations and/or memmap'ed arrays. - Osy
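To make the comment about sparse representations concrete, here is a minimal sketch of training a linear model on sparse mini-batches. The data is synthetic, the helper `make_batch` and all sizes are made up for illustration, and the use of `SGDClassifier.partial_fit` for incremental training is my own choice rather than something the comment prescribes.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
N_FEATURES = 200                         # illustrative size, not a recommendation
true_w = rng.normal(size=N_FEATURES)     # hidden weights used only to label toy data


def make_batch(n_samples, density=0.05):
    """Return one synthetic mini-batch: a CSR sparse matrix and 0/1 labels."""
    X = sparse.random(n_samples, N_FEATURES, density=density,
                      format="csr", random_state=rng)
    y = (X @ true_w > 0).astype(int)
    return X, y


clf = SGDClassifier(loss="hinge", random_state=0)
classes = np.array([0, 1])

# Feed the model one mini-batch at a time, so the full dataset
# never has to be held in memory as a dense array.
for _ in range(20):
    X_batch, y_batch = make_batch(500)
    clf.partial_fit(X_batch, y_batch, classes=classes)

X_test, y_test = make_batch(500)
accuracy = clf.score(X_test, y_test)
print("held-out accuracy on toy data: %.2f" % accuracy)
```

In a real setting, `make_batch` would be replaced by code that reads the next chunk of data from disk. This still runs on a single machine, so it does not address the distributed case the comment mentions.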
3

Spark MLlib is recommended. It is a scalable machine learning library that can read data from HDFS and, of course, runs on top of Spark.

You can access it via PySpark (see the Programming Guide's Python examples).
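As a rough sketch of what this looks like from Python, the function below fits k-means on a tiny in-memory dataset. It assumes `pyspark` and a Java runtime are installed, uses the DataFrame-based `pyspark.ml` API found in current Spark releases (not necessarily the API available when this answer was written), and the function name `cluster_with_spark` is made up for illustration.

```python
def cluster_with_spark(rows, k=2, seed=1):
    """Fit k-means on a small list of numeric rows and return the cluster centers.

    Imports are done inside the function so that merely defining it
    does not require a Spark installation.
    """
    from pyspark.sql import SparkSession
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.linalg import Vectors

    # Local mode for experimentation; on a cluster the master is set differently.
    spark = (SparkSession.builder
             .master("local[*]")
             .appName("kmeans-sketch")
             .getOrCreate())
    try:
        # KMeans expects a vector column, named "features" by default.
        # Data stored on HDFS would instead be loaded through spark.read.
        df = spark.createDataFrame(
            [(Vectors.dense(r),) for r in rows], ["features"])
        model = KMeans(k=k, seed=seed).fit(df)
        return [center.tolist() for center in model.clusterCenters()]
    finally:
        spark.stop()


if __name__ == "__main__":
    points = [[0.0, 0.0], [0.1, 0.1], [9.0, 9.0], [9.1, 8.9]]
    print(cluster_with_spark(points, k=2))
```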

Quaternion answered 12/11, 2014 at 10:25 Comment(0)
1

Orange is supposedly pretty decent, from what I've heard, but I've never used it personally. PyML might be worth taking a look at as well. Also, Monte.

Beer answered 27/1, 2011 at 17:22 Comment(2)
Orange isn't even close to being scalable. Nearly all of its algorithms are slow batch processes, and they have no intention of making them otherwise due to the academic orientation of the project. Sadly, there really isn't any Python equivalent of Mahout. - Traction
@Chris: scikit-learn is probably not there yet, but its goal is to be scalable and to avoid the pitfalls of academically oriented projects. Some of our implementations for standard problems already scale quite well. - Woothen
1

pysuggest is a Python wrapper for SUGGEST, a Top-N recommendation engine that implements a variety of recommendation algorithms for collaborative filtering.

Dzungaria answered 30/11, 2011 at 2:32 Comment(0)
0

An interesting library is crab.

As of this post, the library only has stable implementations for collaborative filtering algorithms: user-based and item-based.

An SVD implementation is included, but it is experimental. Content-based algorithms are on the roadmap.

Do check it out!
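For readers unfamiliar with the two collaborative filtering flavours mentioned above, here is a minimal pure-Python sketch of user-based and item-based rating prediction. It does not use crab's API. The toy ratings, the function names, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Toy data: user -> {item: rating}. Entirely made up.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 1, "D": 5},
    "carol": {"B": 4, "C": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def invert(user_ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in user_ratings.items():
        for item, r in items.items():
            by_item.setdefault(item, {})[user] = r
    return by_item


def predict_user_based(user_ratings, user, item):
    """Average other users' ratings of `item`, weighted by user similarity."""
    num = den = 0.0
    for other, theirs in user_ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine(user_ratings[user], theirs)
        num += sim * theirs[item]
        den += sim
    return num / den if den else None


def predict_item_based(user_ratings, user, item):
    """Average the user's own ratings, weighted by item-to-item similarity."""
    by_item = invert(user_ratings)
    target = by_item.get(item, {})
    num = den = 0.0
    for other_item, r in user_ratings[user].items():
        if other_item == item:
            continue
        sim = cosine(target, by_item[other_item])
        num += sim * r
        den += sim
    return num / den if den else None


user_pred = predict_user_based(ratings, "alice", "D")
item_pred = predict_item_based(ratings, "alice", "D")
print("user-based:", user_pred, "item-based:", item_pred)
```

Both functions return `None` when no similar user or item provides evidence, for example for an item nobody has rated.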

Apples answered 20/12, 2012 at 16:18 Comment(0)
