Query Failed Error: Resources exceeded during query execution: The query could not be executed in the allotted memory

I am using Standard SQL. Even though it's a basic query, it is still throwing errors. Any suggestions, please?

SELECT 
  fullVisitorId,
  CONCAT(CAST(fullVisitorId AS string),CAST(visitId AS string)) AS session,
  date,
  visitStartTime,
  hits.time,
  hits.page.pagepath
FROM
  `XXXXXXXXXX.ga_sessions_*`,
  UNNEST(hits) AS hits
WHERE
  _TABLE_SUFFIX BETWEEN "20160801"
  AND "20170331"
ORDER BY
  fullVisitorId,
  date,
  visitStartTime
Sacramentarian answered 1/9, 2017 at 17:37 Comment(0)

The simplest way to get this query to run is to remove the ORDER BY applied at the end:

SELECT 
  fullVisitorId,
  CONCAT(CAST(fullVisitorId AS string),CAST(visitId AS string)) AS session,
  date,
  visitStartTime,
  hits.time,
  hits.page.pagepath
FROM
  `XXXXXXXXXX.ga_sessions_*`,
  UNNEST(hits) AS hits
WHERE
  _TABLE_SUFFIX BETWEEN "20160801"
  AND "20170331"

The ORDER BY operation is quite expensive because the final sort cannot be processed in parallel, so try to avoid it (or apply it to a limited result set).
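
If the sorted output is only needed for inspection, the ordering can be kept by applying it to a limited result set. This is a minimal sketch of the original query with a LIMIT added; the value 1000 is an arbitrary assumption, not something from the question, and should be adjusted to the number of rows actually needed:

```sql
-- Same query as in the question, with the sort bounded by a LIMIT.
-- LIMIT 1000 is an illustrative value (assumption).
SELECT
  fullVisitorId,
  CONCAT(CAST(fullVisitorId AS STRING), CAST(visitId AS STRING)) AS session,
  date,
  visitStartTime,
  hits.time,
  hits.page.pagePath
FROM
  `XXXXXXXXXX.ga_sessions_*`,
  UNNEST(hits) AS hits
WHERE
  _TABLE_SUFFIX BETWEEN "20160801" AND "20170331"
ORDER BY
  fullVisitorId,
  date,
  visitStartTime
LIMIT 1000
```

With a LIMIT, only the top rows have to be kept during the sort, so the full result no longer has to be sorted in the memory of a single node.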

Bindery answered 1/9, 2017 at 17:43 Comment(5)
Thanks Willian. It's working, but can you tell me why it was not working when I used ORDER BY? - Sacramentarian
There were too many rows to hold in memory on a single node. If you look at the "Explanation" tab for the query, it will show where it ran out of memory. - Schmo
Thanks @ElliottBrossard - Sacramentarian
I encountered the same issue. Even weirder, the query succeeded in the web UI but not in the Python API. Removing the ORDER BY clauses solved the issue, but it's a bit weird to see the discrepancy. - Peculium
I know this is a bit old, but does OVER (PARTITION BY ...) have the same effect? - Activate

Besides the accepted answer, you might want to partition your table by date to lessen the amount of memory used with an expensive query.

Mirabelle answered 11/2, 2018 at 2:35 Comment(1)
The above query pulls the GA data, which is partitioned by date by default. _TABLE_SUFFIX BETWEEN "20160801" AND "20170331" is how I pull data for different date ranges. - Sacramentarian

To avoid gathering one big chunk of data into a single slot, you can try:

  1. Split the data into small chunks before querying,
  2. Use a LIMIT clause with an ORDER BY operation, or
  3. Remove the ORDER BY operation from the query.
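
As a rough sketch of option 1, the query from the question can be run over a narrower _TABLE_SUFFIX range, for example one month at a time, so that each sort handles fewer rows. The one-month window is an assumption chosen for illustration; whether a month of data fits in memory depends on the traffic volume of the property:

```sql
-- One chunk of the original date range: August 2016 only (illustrative window).
-- Repeat with the next range ("20160901" to "20160930", and so on) for later chunks.
SELECT
  fullVisitorId,
  CONCAT(CAST(fullVisitorId AS STRING), CAST(visitId AS STRING)) AS session,
  date,
  visitStartTime,
  hits.time,
  hits.page.pagePath
FROM
  `XXXXXXXXXX.ga_sessions_*`,
  UNNEST(hits) AS hits
WHERE
  _TABLE_SUFFIX BETWEEN "20160801" AND "20160831"
ORDER BY
  fullVisitorId,
  date,
  visitStartTime
```

Each chunk is sorted only within its own date range, so a global ordering across chunks would still have to be handled by whatever consumes the results.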

Please refer to the GCP documentation regarding this error:
[1] https://cloud.google.com/bigquery/docs/best-practices-performance-output#use_a_limit_clause_with_large_sorts
[2] https://cloud.google.com/bigquery/docs/error-messages#resourcesExceeded

Myocardiograph answered 10/8, 2022 at 6:30 Comment(0)
