Unable to connect from BigQuery job to Cloud SQL Postgres

I am not able to use the federated query capability from Google BigQuery to Google Cloud SQL Postgres. Google recently announced this federated query capability for BigQuery in beta.

I use the EXTERNAL_QUERY function as described in the documentation, but I am not able to connect to my Cloud SQL instance. For example, with the query

SELECT * FROM EXTERNAL_QUERY('my-project.europe-north1.my-connection', 'SELECT * FROM mytable;');

or

SELECT * FROM EXTERNAL_QUERY("my-project.europe-north1.pg1", "SELECT * FROM INFORMATION_SCHEMA.TABLES;");

I receive this error:

Invalid table-valued function EXTERNAL_QUERY Connection to PostgreSQL server failed: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.

Sometimes the error is this:

Error encountered during execution. Retrying may solve the problem.

I have followed the instructions on the page https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries and enabled the BigQuery Connection API. Some documents use different quotation marks around the EXTERNAL_QUERY arguments (“ or ‘ or ‘’’), but all the variants end with the same result.

I cannot see any errors in the Stackdriver Postgres logs. How can I correct this connectivity error? Any suggestions on how to debug it further?

Pecoraro answered 2/9, 2019 at 18:39 Comment(2)
Some additional information: Cloud SQL has a public IP, tested with pg9.6 and pg11, and tested with a totally separate Google account. All result in the same error! – Pecoraro
Now tested also with a Cloud SQL MySQL instance. Same problem, with error message "Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2013): Lost connection to MySQL server during query at [1:15]". I also tested with a totally separate account and project, same problem! What am I missing in the setup? Have any of you been able to use EXTERNAL_QUERY? – Pecoraro

I just tried and it works, as long as the BigQuery query runs in the EU (as of today, 6 October, it works). My example:

SELECT * FROM EXTERNAL_QUERY("projects/xxxxx-xxxxxx/locations/europe-west1/connections/xxxxxx", "SELECT * FROM data.datos_ingresos_netos")

Just substitute the first xxxx with your project ID and the last one with the name you gave to the connection in the BigQuery interface (not the Cloud SQL info; that goes into the query).
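For reference, the connection ID in the first argument follows the pattern projects/PROJECT/locations/LOCATION/connections/CONNECTION. A minimal sketch of a helper that assembles the full statement — all names below are placeholders, not values from any real setup:

```python
def build_external_query(project: str, location: str,
                         connection: str, remote_sql: str) -> str:
    """Assemble a BigQuery EXTERNAL_QUERY statement from its parts."""
    connection_id = f"projects/{project}/locations/{location}/connections/{connection}"
    # Escape double quotes in the remote SQL so the outer string literal stays valid.
    escaped = remote_sql.replace('"', '\\"')
    return f'SELECT * FROM EXTERNAL_QUERY("{connection_id}", "{escaped}")'

print(build_external_query("my-project", "europe-west1", "pg1",
                           "SELECT * FROM data.datos_ingresos_netos"))
```

The second argument is the SQL pushed down to the remote Postgres database, so table names there refer to the Cloud SQL schema, not to BigQuery datasets.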

Inharmonic answered 6/10, 2019 at 12:19 Comment(3)
Thanks for the update, Constantino! I re-tested this with my europe-north1 instance and now it works here as well. – Pecoraro
All is good, confirming with europe-north1. Just one thing to add: if you run your SQL instance in a private network it won't work; you need to enable 'Public IP Connectivity' for BigQuery to work with Cloud SQL (no need to add any networks, though). – Cervix
What is data.datos_ingresos_netos? Is it database and table? – Granulate

Just adding another possibility for people using private-IP-only Cloud SQL instances. I had just encountered this and was wondering why it still was not working after making sure everything else looked right. According to the docs (as of 2021-11-13): "BigQuery Cloud SQL federation only supports Cloud SQL instances with public IP connectivity. Please configure public IP connectivity for your Cloud SQL instance."
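If you are unsure whether your instance has a public address, a sketch of how to check and enable it with the gcloud CLI — the instance name is a placeholder, and the commands assume gcloud is authenticated with the Cloud SQL Admin API enabled:

```shell
# List the instance's IP addresses: type PRIMARY is the public address,
# PRIVATE is the private one.
gcloud sql instances describe my-instance --format="value(ipAddresses)"

# Assign a public IPv4 address to the instance.
gcloud sql instances patch my-instance --assign-ip
```

As the comment above notes, no authorized networks need to be added for BigQuery itself.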

Wingfield answered 13/11, 2021 at 0:23 Comment(1)
A detail very often forgotten :) – Ninebark

Unfortunately, BigQuery federated queries to Cloud SQL currently work only in US regions (September 2019). The documentation (https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries) says it should also work in other regions, but that is not the case.

I tested the setup from the original question multiple times in EU and europe-north1 but was not able to get it working. When I changed the setup to US or us-central1, it worked!

Federated queries to Cloud SQL are in preview, so the feature is evolving. Let's hope Google gets this working in other regions soon.

Pecoraro answered 17/9, 2019 at 15:2 Comment(0)

The BigQuery dataset and the Cloud SQL instance must be in the same region, or same location if the dataset is in a multi-region location such as US and EU.

Double-check this against the known issues listed.

Pilarpilaster answered 2/9, 2019 at 19:10 Comment(3)
I assume this region limit is only for Cloud Bigtable, not for Cloud SQL ("Querying data in Cloud Bigtable is currently only available in the following regions and zones:"). – Pecoraro
Yes, you are right, I linked the wrong page. Is your BigQuery dataset in the EU zone? Read the regions section here: cloud.google.com/bigquery/docs/cloud-sql-federated-queries – Pilarpilaster
I have tried datasets both in EU and in europe-north1. Cloud SQL is in europe-north1. It should work with these locations according to the document. – Pecoraro

server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.

This message usually means that you will find more information in the logs of the remote database server. No useful information could be sent to the "client" server because the connection disappeared, so there was no way to send it. So you have to look at the remote server's logs.

Mezzanine answered 2/9, 2019 at 20:8 Comment(3)
Yes, it looks like a server-side error to me too. But the Postgres error logs in Stackdriver do not show any errors. There are only heartbeat entries and log entries with metrics (CPU usage, pages, tuples, buffers, ...). Other tools are able to connect; e.g. "connect using Cloud Shell" works fine. – Pecoraro
For the connection to be lost, it presumably must have existed to start with. Can you turn log_connections = on on the remote side? – Mezzanine
Sounds totally logical, but even with log_connections set to on I cannot see log entries from BigQuery. – Pecoraro
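On Cloud SQL you cannot edit postgresql.conf or run ALTER SYSTEM directly; the log_connections suggestion above would be applied as a database flag, e.g. via gcloud (the instance name is a placeholder):

```shell
# Note: --database-flags replaces the instance's full set of flags,
# so include any flags that are already set alongside this one.
gcloud sql instances patch my-instance --database-flags=log_connections=on
```

Be aware that changing database flags may restart the instance.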

I encountered the same error, and it was simply a permissions issue. In the Connection info section you can find the service account used for the external connection; just give the BigQuery Connection Service Agent role to this service account.

That solved the issue for me; I hope this helps someone.
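Assuming the role ID is roles/bigqueryconnection.serviceAgent (an assumption — verify it in your IAM role list), the grant might look like this with gcloud; the project ID and service-account address below are placeholders:

```shell
# The actual service-account address is shown in the connection's
# "Connection info" section in the BigQuery UI.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:service-1234567890@gcp-sa-bigqueryconnection.iam.gserviceaccount.com" \
  --role="roles/bigqueryconnection.serviceAgent"
```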

Blida answered 4/4, 2023 at 9:50 Comment(1)
Hi, could you please tell me how you added the Postgres connection to BigQuery? I have a Cloud SQL database with a service account user, and in BigQuery the only way to add the connection is as external, where it asks for a password. Thank you. – Markham

Check whether the BigQuery service account has the BigQuery Connection Service Agent role. For me, the service account permission was missing. Once I fixed it, it started working.

Saxtuba answered 20/4, 2023 at 22:24 Comment(0)
