psutil in Apache Spark
I'm using PySpark 1.5.2. After I issue .collect(), I get the UserWarning: Please install psutil to have better support with spilling.

Why is this warning shown?

How can I install psutil?

Outdate answered 29/12, 2015 at 2:22 Comment(2)
As far as I know, psutil is a Python module. You could pip install it: pip install psutil. – Wrongdoing
I issued the pip command but it doesn't work. I'm using Python 3.5 but Spark is using Python 2.7. – Outdate
pip install psutil

If you need to install specifically for Python 2 or 3, use pip2 or pip3 respectively. Make sure you install psutil for the same interpreter that Spark actually runs. Here is the PyPI package for psutil.
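As the comment above notes, a common pitfall is installing psutil for a different interpreter than the one Spark uses (e.g. pip bound to Python 3.5 while Spark runs Python 2.7). A minimal check, using only the standard library, to see which interpreter is running and whether psutil is importable for it:

```python
import sys

# Report which interpreter this is and whether psutil is importable for it.
# If Spark's workers run a different Python, psutil must be installed for
# that interpreter as well (e.g. by pointing PYSPARK_PYTHON at it).
print(sys.executable, "Python %d.%d" % sys.version_info[:2])
try:
    import psutil
    print("psutil is available:", psutil.__version__)
except ImportError:
    print("psutil missing; try: %s -m pip install psutil" % sys.executable)
```

Running this both from your shell and from inside a PySpark shell will show whether the two environments agree.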

Attu answered 12/2, 2016 at 16:30 Comment(0)
You can clone or download the psutil project from the following link: https://github.com/giampaolo/psutil.git

then run python setup.py install to install psutil.

In spark/python/pyspark/shuffle.py you can see the following code (excerpted):

def get_used_memory():
    """ Return the used memory in MB """
    if platform.system() == 'Linux':
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) >> 10

    else:
        warnings.warn("Please install psutil to have better "
                      "support with spilling")**
        if platform.system() == "Darwin":
            import resource
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss >> 20
        # TODO: support windows

    return 0

So my guess is that if your OS is not Linux (and psutil cannot be imported), psutil is suggested so that Spark can track memory usage for spilling.
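To see why psutil is preferred, here is a sketch of the same idea (not the exact Spark source; get_used_memory_sketch is an illustrative name): psutil gives one cross-platform way to read resident memory, while the fallbacks are platform-specific tricks.

```python
import os
import platform

def get_used_memory_sketch():
    """Return approximate resident memory of this process, in MB.

    Sketch only: prefer psutil (cross-platform), otherwise fall back to
    platform-specific mechanisms, as PySpark's shuffle.py does.
    """
    try:
        import psutil  # cross-platform; hence the warning when it is missing
        return psutil.Process(os.getpid()).memory_info().rss >> 20
    except ImportError:
        pass
    system = platform.system()
    if system == "Linux":
        # /proc/self/status reports VmRSS in kB; shift by 10 to get MB
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) >> 10
    elif system == "Darwin":
        import resource
        # on macOS ru_maxrss is in bytes; shift by 20 to get MB
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss >> 20
    return 0  # unsupported platform (e.g. Windows without psutil)

print(get_used_memory_sketch())
```

With psutil installed, the first branch is taken on every OS, which is why installing it silences the warning.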

Knit answered 29/12, 2015 at 6:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.