google-hadoop Questions
4
I am running a Spark job (version 1.2.0), and the input is a folder inside a Google Cloud Storage bucket (i.e. gs://mybucket/folder).
When running the job locally on my Mac machine, I am getting th...
Dihedron asked 5/1, 2015 at 15:41
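The snippet cuts off before the actual error, but a common failure when reading gs:// paths from a local Spark job is that the GCS connector is not on the classpath or not registered in the Hadoop configuration. A minimal pyspark sketch, assuming the gcs-connector JAR is on the driver classpath; the key-file path is a hypothetical placeholder:

```python
from pyspark import SparkConf, SparkContext

# Local-mode Spark; the gcs-connector JAR must be on the driver classpath.
sc = SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("gcs-read"))

hconf = sc._jsc.hadoopConfiguration()
hconf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
hconf.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
# Hypothetical service-account credentials; substitute your own key file.
hconf.set("google.cloud.auth.service.account.enable", "true")
hconf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/key.json")

lines = sc.textFile("gs://mybucket/folder")
print(lines.count())
```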
3
Solved
When using the BigQuery Connector to read data from BigQuery, I found that it first copies all the data to Google Cloud Storage and then reads it into Spark in parallel, but when reading a big table it ta...
Humber asked 4/1, 2017 at 10:57
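The copy-to-GCS step is how the connector works: it runs a BigQuery export job into a temporary GCS path and then reads the exported files. A hedged sketch of doing that export step yourself with the google-cloud-bigquery client, which makes the staging explicit and lets you reuse the exported files across jobs; project, dataset, and bucket names are placeholders:

```python
from google.cloud import bigquery
from pyspark import SparkContext

# Hypothetical names; substitute your own project, table, and bucket.
client = bigquery.Client(project="my-project")
job = client.extract_table(
    "my-project.my_dataset.my_table",
    "gs://mybucket/bq-export/part-*.json",
    job_config=bigquery.ExtractJobConfig(
        destination_format="NEWLINE_DELIMITED_JSON"),
)
job.result()  # block until the export job finishes

# The exported shards can then be read in parallel with Spark.
sc = SparkContext(appName="read-bq-export")
records = sc.textFile("gs://mybucket/bq-export/part-*.json")
```

Read time for a big table is dominated by the extract job, so for repeated reads of the same table it can pay to keep the exported files around rather than re-staging on every run.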
2
Solved
I am trying to migrate existing data (JSON) from my Hadoop cluster to Google Cloud Storage.
I have explored gsutil, and it seems to be the recommended option for moving big data sets to GCS. It see...
Bysshe asked 13/8, 2014 at 16:25
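For bulk HDFS-to-GCS moves, the two common approaches are gsutil from an edge node and a distributed copy with Hadoop's distcp once the GCS connector is installed on the cluster; a sketch with placeholder paths:

```sh
# Multi-threaded recursive copy from a local/edge-node path.
gsutil -m cp -r /data/json gs://mybucket/data/

# Distributed copy straight from HDFS, parallelized across the cluster
# (requires the GCS connector on the cluster's classpath).
hadoop distcp hdfs:///data/json gs://mybucket/data
```

distcp is usually the better fit for big datasets, since the transfer runs on the cluster's workers instead of a single machine.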
1
I have a large dataset stored in a BigQuery table and I would like to load it into a pyspark RDD for ETL data processing.
I realized that BigQuery supports the Hadoop Input/Output format:
https...
Onstad asked 14/7, 2015 at 8:11
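A minimal pyspark sketch of the Hadoop InputFormat route via newAPIHadoopRDD, assuming the BigQuery connector JAR is on the classpath; the mapred.bq.* keys are the ones the connector understands, while the project, bucket, and table IDs are placeholders:

```python
import json
from pyspark import SparkContext

sc = SparkContext(appName="bq-input-format")

conf = {
    "mapred.bq.project.id": "my-project",
    "mapred.bq.gcs.bucket": "mybucket",
    "mapred.bq.temp.gcs.path": "gs://mybucket/hadoop/tmp/bigquery",
    "mapred.bq.input.project.id": "publicdata",
    "mapred.bq.input.dataset.id": "samples",
    "mapred.bq.input.table.id": "shakespeare",
}

# Each record arrives as (row id, JSON string); parse with json.loads.
table_rdd = sc.newAPIHadoopRDD(
    "com.google.cloud.hadoop.io.bigquery.JsonTextBigQueryInputFormat",
    "org.apache.hadoop.io.LongWritable",
    "com.google.gson.JsonObject",
    conf=conf)
rows = table_rdd.map(lambda kv: json.loads(kv[1]))
print(rows.take(3))
```

Note that under the hood this still stages the table to the mapred.bq.temp.gcs.path before reading, as described in the question above.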
1
Solved
With SparkR, I'm trying, as a PoC, to collect an RDD that I created from text files containing around 4M lines.
My Spark cluster is running in Google Cloud, was deployed with bdutil, and is composed w...
Mcnew asked 4/6, 2015 at 13:45
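Collecting ~4M rows pulls the whole RDD back onto the driver, so the usual fixes are to aggregate on the executors or bound what comes back. The question uses SparkR, but the same driver-memory concern applies in any Spark API; a pyspark sketch with a placeholder path:

```python
from pyspark import SparkContext

sc = SparkContext(appName="bounded-collect")
lines = sc.textFile("gs://mybucket/textfiles/")  # placeholder path

print(lines.count())       # aggregate on the executors, not the driver
sample = lines.take(1000)  # bring back only a bounded sample

# If the full dataset really must come back to the driver, give it room:
#   spark-submit --driver-memory 8g ...
```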