I'm setting up GeoSpark Python and after installing all the pre-requisites, I'm running the very basic code examples to test it.
from pyspark.sql import SparkSession
from geo_pyspark.register import GeoSparkRegistrator
spark = SparkSession.builder.\
getOrCreate()
GeoSparkRegistrator.registerAll(spark)
df = spark.sql("""SELECT st_GeomFromWKT('POINT(6.0 52.0)') as geom""")
df.show()
I tried running it with python3 basic.py
and spark-submit basic.py
, both give me this error:
Traceback (most recent call last):
File "/home/jessica/Downloads/geo_pyspark/basic.py", line 8, in <module>
GeoSparkRegistrator.registerAll(spark)
File "/home/jessica/Downloads/geo_pyspark/geo_pyspark/register/geo_registrator.py", line 22, in registerAll
cls.register(spark)
File "/home/jessica/Downloads/geo_pyspark/geo_pyspark/register/geo_registrator.py", line 27, in register
spark._jvm. \
TypeError: 'JavaPackage' object is not callable
I'm using Java 8, Python 3, Apache Spark 2.4, my JAVA_HOME
is set correctly, I'm running Linux Mint 19. My SPARK_HOME
is also set:
$ printenv SPARK_HOME
/home/jessica/spark/
How can I fix this?
geospark
as a package on the cluster, then they can also use the location/usr/local/lib/python3.6/site-packages/geospark/jars/2_4/<JAR_FILE>
when specifyingspark.jars
, because that is the location used on EMR for both Master and Core nodes. – Perpetua