PyCharm error: java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified

4

23

I am getting the error below while running a PySpark program in PyCharm:

java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified ......

The interpreter recognizes the python.exe file, and I have added the Content root in the project structure.

I hit a similar issue earlier when running the same program from the Windows command prompt and solved it using the approach from "What is the right way to edit spark-env.sh before running spark-shell?"

Seabrooke answered 8/8, 2021 at 23:22 Comment(1)
Welcome to Stack Overflow. There are a few posts about roughly this error message; see the search "pycharm cannot run program is:q". This one may be what you want: "PyCharm error: Cannot run program, error=2, No such file or directory". However, I don't think there is a thread about your exact error message. I'm assuming this is PySpark-specific, so any details you could add to the question would be helpful. - Posturize
H
38

Before creating your spark session, set the following environment variables in your code:

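import os
import sys
from pyspark.sql import SparkSession

# Point both the workers and the driver at the interpreter running this script,
# so Spark does not try to launch a "python3" command that is not on the PATH.
os.environ['PYSPARK_PYTHON'] = sys.executable
os.environ['PYSPARK_DRIVER_PYTHON'] = sys.executable
spark = SparkSession.builder.getOrCreate()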
Hirz answered 22/10, 2021 at 6:58 Comment(0)
B
26

Create an environment variable named PYSPARK_PYTHON with the value 'python', or with the full path to your Python executable.
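
If editing the system settings is not convenient, the same idea can be sketched in Python. This is only an illustration: the helper name `resolve_python` is hypothetical and not part of PySpark, while `shutil.which` and `sys.executable` come from the standard library. It looks for a command name on the PATH and falls back to the interpreter running the script.

```python
import os
import shutil
import sys


def resolve_python(candidates=("python3", "python")):
    """Return the full path of the first candidate found on the PATH.

    Falls back to sys.executable (the interpreter running this script)
    when none of the names resolve. Hypothetical helper, for illustration.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return sys.executable


# Set the variable before any SparkSession or SparkContext is created.
os.environ["PYSPARK_PYTHON"] = resolve_python()
print(os.environ["PYSPARK_PYTHON"])
```

If the printed path is not the interpreter you expect, pass the full path explicitly instead of relying on the PATH lookup.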

Broadax answered 22/9, 2021 at 9:55 Comment(2)
Dude, you are a lifesaver. I have been struggling with this for almost a week! - Hospitality
@NukaTejeswaraRao This resolved my issue on Windows 10 as well. Other suggestions did not work for me. Thank you for sharing your knowledge. - Denisse
D
10
  1. Go to Environment Variables and, under System variables, create a new variable named PYSPARK_PYTHON with the value python

PYSPARK_PYTHON=python

  2. Add the following lines to your PySpark code:
import os
import sys
from pyspark import SparkContext
os.environ['PYSPARK_PYTHON'] = sys.executable
os.environ['PYSPARK_DRIVER_PYTHON'] = sys.executable
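
A system variable created in step 1 is only inherited by processes started afterwards, so it can help to confirm what the running interpreter actually sees. A small sketch (the function name `pyspark_python_status` is made up for illustration):

```python
import os


def pyspark_python_status(environ=None):
    """Describe how PYSPARK_PYTHON looks to the current process."""
    env = os.environ if environ is None else environ
    value = env.get("PYSPARK_PYTHON")
    if value is None:
        return "PYSPARK_PYTHON is not set in this process"
    return f"PYSPARK_PYTHON={value}"


# Run this from the same IDE or terminal that launches the Spark job.
print(pyspark_python_status())
```

If this reports that the variable is not set even though it exists in the system settings, the IDE or terminal was probably started before the variable was created.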
Decimalize answered 29/5, 2022 at 17:32 Comment(1)
This is the way to go. Also, after setting the PYSPARK_PYTHON env var, you might need to restart whatever IDE / CMD Prompt you are using. - Autopsy
G
0

If you are working in a virtual environment, you may need to point the environment variable at that environment's Python interpreter.

For example, in my case the correct path was: os.environ['PYSPARK_PYTHON'] = r'...venv\scripts\python.exe' (written as a raw string so the backslashes are not treated as escape sequences)
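
To avoid hard-coding the path, it can be built from the virtual environment's root directory. The sketch below assumes the standard venv layout; the helper name `venv_python` is hypothetical.

```python
import os
import sys
from pathlib import Path


def venv_python(venv_dir):
    """Build the interpreter path inside a virtual environment directory.

    Assumes the standard venv layout: a Scripts folder on Windows,
    a bin folder elsewhere. Hypothetical helper, for illustration.
    """
    venv_dir = Path(venv_dir)
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


# When the script already runs inside the venv, sys.prefix is the venv root.
interpreter = venv_python(sys.prefix)
if interpreter.exists():
    os.environ["PYSPARK_PYTHON"] = str(interpreter)
```

When the script itself is already started with the venv interpreter, `sys.executable` gives the same result; the helper is mainly useful for pointing Spark at a different environment.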

Gob answered 4/7 at 12:54 Comment(0)
