spark-submit Questions
2
I am able to run pyspark and run a script in a Jupyter notebook.
But when I try to run the file from the terminal using spark-submit, I get this error:
Error executing Jupyter command file path [Errn...
Wallaby asked 30/9, 2017 at 23:16
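For context, this error typically appears when PYSPARK_DRIVER_PYTHON is set to jupyter (the usual notebook setup), so spark-submit hands the script to Jupyter instead of plain Python. A minimal sketch of one workaround, assuming a hypothetical my_script.py and spark-submit on the PATH:
import os
import subprocess

# Launch spark-submit with the Jupyter driver overrides removed.
env = dict(os.environ)
env.pop("PYSPARK_DRIVER_PYTHON", None)       # often set to "jupyter" for notebook use
env.pop("PYSPARK_DRIVER_PYTHON_OPTS", None)  # often set to "notebook"
subprocess.run(["spark-submit", "my_script.py"], env=env, check=True)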
22
Solved
I'd like to stop the various messages that appear in the Spark shell.
I tried to edit the log4j.properties file in order to stop these messages.
Here are the contents of log4j.properties
# Define the...
Kevin asked 5/1, 2015 at 14:4
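For reference, the stock conf/log4j.properties template controls this with the line log4j.rootCategory=INFO, console (changing INFO to WARN quiets the shell). On newer Spark versions the level can also be lowered from code; a minimal PySpark sketch:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("quiet-shell").getOrCreate()
# Valid levels: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN
spark.sparkContext.setLogLevel("WARN")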
2
Solved
I submitted my code to the cluster to run, but I encountered the following error.
java.lang.IllegalArgumentException: Too large frame: 5211883372140375593
at org.sparkproject.guava.base.Precond...
Fritzfritze asked 25/9, 2020 at 9:40
1
We have a PySpark-based application and we are doing a spark-submit as shown below. The application is working as expected; however, we are seeing a weird warning message. Any way to handle this, or why ...
Contractor asked 13/7, 2021 at 8:57
3
~/spark/spark-2.1.1-bin-hadoop2.7/bin$ ./spark-submit --master spark://192.168.42.80:32141 --deploy-mode cluster file:///home/me/workspace/myproj/target/scala-2.11/myproj-assembly-0.1.0.jar
Runnin...
Skimmer asked 20/6, 2017 at 20:49
7
Solved
True... it has been discussed quite a lot.
However, there is a lot of ambiguity and some of the answers provided ... including duplicating JAR references in the jars/executor/driver configuration o...
Mcnutt asked 10/5, 2016 at 8:3
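For context: the two standard ways to put an extra JAR on both the driver and executor classpaths are the --jars flag and its configuration equivalent, spark.jars. A minimal PySpark sketch, with a hypothetical /path/to/dep.jar:
from pyspark.sql import SparkSession

# Equivalent to: spark-submit --jars /path/to/dep.jar app.py
# (the path is hypothetical; separate multiple JARs with commas)
spark = (
    SparkSession.builder
    .appName("with-extra-jars")
    .config("spark.jars", "/path/to/dep.jar")
    .getOrCreate()
)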
5
Solved
I am following the Scala tutorial at https://spark.apache.org/docs/2.1.0/quick-start.html
My Scala file:
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._...
Nanette asked 8/11, 2017 at 5:23
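For comparison, the Python counterpart of that quick-start application is only a few lines; a minimal sketch, assuming a README.md in the working directory (any text file works):
# SimpleApp.py -- run with: spark-submit SimpleApp.py
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SimpleApp").getOrCreate()
log_data = spark.read.text("README.md").cache()

num_as = log_data.filter(log_data.value.contains("a")).count()
num_bs = log_data.filter(log_data.value.contains("b")).count()
print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))

spark.stop()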
2
Solved
I've created a Spark cluster with one master and two slaves, each one in a Docker container.
I launch it with the command start-all.sh.
I can reach the UI from my local machine at localhost:8080 an...
Ishii asked 26/1, 2022 at 8:57
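For context: port 8080 serves only the master's web UI; applications attach through the master's RPC port, which defaults to 7077. A minimal sketch of connecting a session to such a cluster, assuming the Docker container publishes 7077 to the host:
from pyspark.sql import SparkSession

# spark://localhost:7077 assumes the master container maps port 7077 to the
# host; inside the Docker network you would use the master's hostname instead.
spark = (
    SparkSession.builder
    .master("spark://localhost:7077")
    .appName("docker-standalone-check")
    .getOrCreate()
)
print(spark.range(10).count())  # quick sanity check that executors respond
spark.stop()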
4
Solved
I have been fighting with it the whole day. I am able to install and use a package (graphframes) with the Spark shell or a connected Jupyter notebook, but I would like to move it to the Kubernetes-based Spark e...
Trometer asked 20/3, 2021 at 14:40
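One way to avoid per-environment --packages flags is the spark.jars.packages setting, which has Spark resolve the dependency from Maven-style coordinates. A minimal sketch; the GraphFrames coordinate shown is an assumption and must match your Spark and Scala versions:
from pyspark.sql import SparkSession

# The coordinate below is an assumption; pick the graphframes release that
# matches your Spark and Scala build (see the GraphFrames release notes).
spark = (
    SparkSession.builder
    .config("spark.jars.packages",
            "graphframes:graphframes:0.8.1-spark3.0-s_2.12")
    .getOrCreate()
)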
1
Solved
I'm trying to submit my Pyspark application to a Kubernetes cluster (Minikube) using spark-submit:
./bin/spark-submit \
--master k8s://https://192.168.64.4:8443 \
--deploy-mode cluster \
--packa...
Lemal asked 24/2, 2021 at 20:7
2
I am running into some problems in (Py)Spark on EMR (release 5.32.0). Approximately a year ago I ran the same program on an EMR cluster (I think the release must have been 5.29.0). Then I was able ...
Redroot asked 5/1, 2021 at 11:56
3
I wrote a Spark Streaming application built with sbt. It works perfectly fine locally, but after deploying on the cluster, it complains about a class I wrote which is clearly in the fat jar (checked u...
Heterocyclic asked 26/4, 2017 at 3:17
1
I have 4 Python scripts and one .txt configuration file. Of the 4 Python files, one has the entry point for the Spark application and also imports functions from the other Python files. But the config...
Manouch asked 24/9, 2020 at 9:3
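The usual pattern for this layout is to ship the helper modules with --py-files and the .txt with --files, then locate the shipped copy through SparkFiles. A minimal sketch, all file names hypothetical:
from pyspark import SparkFiles
from pyspark.sql import SparkSession

# Submitted e.g. as:
#   spark-submit --py-files utils.py,etl.py,io_helpers.py \
#                --files config.txt main.py
# (all file names here are hypothetical)
spark = SparkSession.builder.appName("multi-file-app").getOrCreate()
with open(SparkFiles.get("config.txt")) as f:  # shipped via --files
    config_text = f.read()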
1
I am new to Spark. I have started zookeeper and kafka (0.10.1.1) locally, as well as Spark standalone (2.2.0) with one master and 2 workers. My local Scala version is 2.12.3.
I was able to run wordcount on s...
Cheryle asked 8/11, 2017 at 21:12
2
Solved
I am trying to deploy a Spark job using spark-submit, which has a bunch of parameters like
spark-submit --class Eventhub --master yarn --deploy-mode cluster --executor-memory 1024m --executor-cores...
Possess asked 16/3, 2017 at 13:49
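When a submit line accumulates this many flags, spark-submit's --properties-file option can carry them as properties instead. A minimal sketch driving it from Python, with hypothetical file names:
import subprocess

# job.conf (hypothetical) collects the repeated flags as properties, e.g.:
#   spark.master              yarn
#   spark.submit.deployMode   cluster
#   spark.executor.memory     1024m
#   spark.executor.cores      2
subprocess.run(
    ["spark-submit", "--properties-file", "job.conf",
     "--class", "Eventhub", "eventhub.jar"],  # jar name is hypothetical
    check=True,
)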
1
I have Spark running in a cluster (remote).
How do I submit an application using spark-submit to the remote cluster in the following scenario:
spark-submit is executed as a command via Camel
the application r...
Wellman asked 28/11, 2019 at 14:9
2
Solved
In Spark 2.0, how do you set spark.yarn.executor.memoryOverhead when you run spark-submit?
I know that for things like spark.executor.cores you can set --executor-cores 2. Is it the same pattern fo...
Fairly asked 1/8, 2018 at 13:45
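Short answer for context: memoryOverhead has no dedicated flag like --executor-cores; it goes through the generic --conf switch. A minimal sketch, with a hypothetical my_app.py (the value is in MB):
import subprocess

# Equivalent to setting spark.yarn.executor.memoryOverhead in spark-defaults;
# newer Spark versions use spark.executor.memoryOverhead instead.
subprocess.run(
    ["spark-submit",
     "--master", "yarn",
     "--executor-cores", "2",
     "--conf", "spark.yarn.executor.memoryOverhead=1024",
     "my_app.py"],
    check=True,
)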
2
I'm writing a Spark application and running it using the spark-submit shell script (using yarn-cluster/yarn-client).
As I see it now, the exit code of spark-submit is decided according to the related YARN applica...
Merrymerryandrew asked 31/1, 2017 at 15:24
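A simple way to observe this behavior is to capture the return code from the process that launches spark-submit. A minimal sketch, with a hypothetical my_app.py:
import subprocess
import sys

result = subprocess.run(["spark-submit", "--master", "yarn",
                         "--deploy-mode", "cluster", "my_app.py"])
# Per the question's observation, in yarn-cluster mode this code reflects the
# final YARN application state rather than the driver's own exit value.
print("spark-submit exited with", result.returncode)
sys.exit(result.returncode)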
3
Can specifying num-executors in the spark-submit command override already enabled dynamic allocation (spark.dynamicAllocation.enabled true)?
Slype asked 20/1, 2018 at 5:9
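Because the answer has changed across Spark versions, the most reliable check is to print the effective configuration from inside a running job. A minimal sketch:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
conf = spark.sparkContext.getConf()
# --num-executors populates spark.executor.instances; whether an explicit
# value disables dynamic allocation depends on the Spark version, so
# inspecting the live conf is the safest test.
print("dynamicAllocation.enabled:",
      conf.get("spark.dynamicAllocation.enabled", "false"))
print("executor.instances:",
      conf.get("spark.executor.instances", "(unset)"))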
1
Solved
I am running the below code in Spark using Java.
Code
Test.java
package com.sample;
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.Datase...
Armentrout asked 22/11, 2018 at 7:37
2
I want to execute a spark-submit job on an AWS EMR cluster based on a file upload event on S3. I am using an AWS Lambda function to capture the event, but I have no idea how to submit a spark-submit job on ...
Epicurus asked 21/8, 2017 at 11:19
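A common pattern is for the Lambda handler to add a step to the running EMR cluster with boto3, wrapping the spark-submit call in command-runner.jar. A minimal sketch; the cluster ID and S3 path are hypothetical:
import boto3

def lambda_handler(event, context):
    # Hypothetical cluster ID and S3 script path.
    emr = boto3.client("emr")
    emr.add_job_flow_steps(
        JobFlowId="j-XXXXXXXXXXXXX",
        Steps=[{
            "Name": "spark-submit via Lambda",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit",
                         "s3://my-bucket/jobs/process_upload.py"],
            },
        }],
    )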
4
I have tried to write a transform method from DataFrame to DataFrame.
And I also want to test it with ScalaTest.
As you know, in Spark 2.x with Scala API, you can create SparkSession object as follo...
Christology asked 31/7, 2017 at 4:20
1
I use Spark to read from Elasticsearch, like:
select col from index limit 10;
The problem is that the index is very large; it contains 100 billion rows, and Spark generates thousands of tasks to fi...
Supermundane asked 30/11, 2017 at 2:50
0
I have a test.py file
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.externals import joblib
import tqdm
import time
print("Successful import")
I have followed this...
Mcelrath asked 16/5, 2018 at 2:9
3
Solved
To submit a Spark application to a cluster, their documentation notes:
To do this, create an assembly jar (or “uber” jar) containing your code and its dependencies. Both sbt and Maven have assem...
Seaside asked 22/2, 2017 at 17:45