Export nested BigQuery data to cloud storage
G

2

8

I am trying to export BigQuery data to a Google Cloud Storage bucket via the API. I adapted a code snippet from https://cloud.google.com/bigquery/docs/exporting-data:

Job job = table.extract(format, gcsUrl);
// Wait for the job to complete
try {
  Job completedJob = job.waitFor(
      WaitForOption.checkEvery(1, TimeUnit.SECONDS),
      WaitForOption.timeout(3, TimeUnit.MINUTES));
  if (completedJob == null) {
    // Job no longer exists; guard before calling getStatus()
    System.out.println("Job no longer exists");
  } else if (completedJob.getStatus().getError() != null) {
    // Handle error case
    System.out.println(completedJob.getStatus().getError());
  } else {
    // Job completed successfully
  }
} catch (InterruptedException | TimeoutException e) {
  // Handle interrupted wait
}

I replaced format with "JSON", since my data is nested and can't be exported to CSV, and gcsUrl with "gs://mybucket/export_*.json". But the error message reports the following problem:

transfer not working  BigQueryError{reason=invalid, location=null, message=Operation cannot be performed on a nested schema. Field: totals}

Any advice what to do? JSON should be able to handle a nested format...

Grits answered 4/7, 2017 at 14:44 Comment(2)
In your code, how is format defined? - Sowder
For format I use "JSON". - Grits
S
6

According to the destinationFormat option of the extract job configuration, the value for JSON export is "NEWLINE_DELIMITED_JSON", not "JSON". Set the format variable to "NEWLINE_DELIMITED_JSON" in order to export as JSON.
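
To make the destinationFormat option concrete, here is a minimal sketch in Python of the `configuration.extract` part of a `jobs.insert` request body, with a check that rejects values such as "JSON" before the job is submitted. The helper name `build_extract_config`, the `VALID_FORMATS` set, and the project, dataset, and table names are assumptions for illustration only; the set is not claimed to be the complete list of formats the API accepts.

```python
import json

# Assumed subset of accepted destinationFormat values; check the API
# reference for the full list.
VALID_FORMATS = {"CSV", "NEWLINE_DELIMITED_JSON", "AVRO"}


def build_extract_config(project, dataset, table, destination_uri,
                         destination_format="NEWLINE_DELIMITED_JSON"):
    """Build the configuration.extract part of a jobs.insert request body.

    Hypothetical helper: it only assembles a dict, it does not call BigQuery.
    """
    if destination_format not in VALID_FORMATS:
        raise ValueError(
            f"Unsupported destinationFormat {destination_format!r}; "
            f"expected one of {sorted(VALID_FORMATS)}"
        )
    return {
        "configuration": {
            "extract": {
                "sourceTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
                "destinationUris": [destination_uri],
                "destinationFormat": destination_format,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder identifiers; only the bucket URI comes from the question.
    body = build_extract_config(
        "my-project", "my_dataset", "my_table", "gs://mybucket/export_*.json"
    )
    print(json.dumps(body, indent=2))
```

Passing "JSON" as `destination_format` raises a `ValueError` here, which surfaces the mistake locally instead of through a failed extract job.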

Sowder answered 4/7, 2017 at 16:2 Comment(0)
M
2

I know this has been marked as solved, but I got the same error in Python. The extract_table() method doesn't take a destination_format argument, so for anybody using Python, here is how to export in JSON format:

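from google.cloud import bigquery

client = bigquery.Client()

# The format is passed through job_config, not as a destination_format argument.
# Configure the job to export newline-delimited JSON.
job_config = bigquery.job.ExtractJobConfig()
job_config.destination_format = bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON

# table_id and destination_uri are defined elsewhere, e.g.
# "project.dataset.table" and "gs://mybucket/export_*.json".
extract_job = client.extract_table(
    table_id,
    destination_uri,
    job_config=job_config,
    # Location must match that of the source table.
    location="US",
)

extract_job.result()  # Wait for the job to complete.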
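
Since the export format is newline-delimited JSON, each line of an exported file is one JSON object, and nested fields stay nested. Here is a minimal sketch of reading such a file's contents back in Python. The helper `parse_ndjson` and the sample rows are hypothetical; only the nested field name `totals` comes from the error message in the question.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up rows standing in for the contents of an exported file.
sample = (
    '{"visitId": 1, "totals": {"visits": 1, "hits": 4}}\n'
    '{"visitId": 2, "totals": {"visits": 1, "hits": 9}}\n'
)

rows = parse_ndjson(sample)
print(rows[0]["totals"]["hits"])  # prints 4
```

Note that the file as a whole is not a single JSON document, so calling json.loads on the entire contents fails once there is more than one row.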
Mandatory answered 23/9, 2021 at 17:10 Comment(0)