How to unpivot in BigQuery?
Asked Answered
C

4

16

Not sure what functions to call, but transpose is the closest thing I can think of.

I have a table in BigQuery that is configured like this: enter image description here

but I want to query a table that is configured like this:

enter image description here

What does the SQL code look like for creating this table?

Thanks!

Coquette answered 8/1, 2015 at 2:48 Comment(0)
O
15

2020 update: fhoffa.x.unpivot()

See:

I created a public persistent UDF. If you have a table a, you can give the whole row to the UDF for it to be unpivotted:

SELECT geo_type, region, transportation_type, unpivotted
FROM `fh-bigquery.public_dump.applemobilitytrends_20200414` a
  , UNNEST(fhoffa.x.unpivot(a, '_2020')) unpivotted

It transforms a table like this:

enter image description here

Into this

enter image description here


As a comment mentions, my solution above doesn't solve for the question problem.

So here's a variation, while I look how to integrate all into one:

CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS (
(
  SELECT 
   ARRAY_AGG(STRUCT(
     REGEXP_EXTRACT(y, '[^"]+') AS key
   , REGEXP_EXTRACT(y, ':([0-9]+)') AS value
   ))
  FROM UNNEST((
    SELECT REGEXP_EXTRACT_ALL(json,'"[smlx][meaxl]'||r'[^:]+:\"?[^"]+?') arr
    FROM (SELECT TO_JSON_STRING(x) json))) y
)
);

SELECT location, unpivotted.*
FROM `robotic-charmer-726.bl_test_data.reconfiguring_a_table` x
  , UNNEST(unpivot(x)) unpivotted


Previous answer:

Use the UNION of tables (with ',' in BigQuery), plus some column aliasing:

SELECT Location, Size, Quantity
FROM (
  SELECT Location, 'Small' as Size, Small as Quantity FROM [table]
), (
  SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table]
), (
  SELECT Location, 'Large' as Size, Large as Quantity FROM [table]
)
Opinionative answered 8/1, 2015 at 3:17 Comment(9)
Thanks for the quick and helpful response @FelipeHoffa !! It didn't need the second and third 'FROM'. So in the end it worked like this: 'SELECT Location, Size, Quantity FROM ( SELECT Location, 'Small' as Size, Small as Quantity FROM [table] ), ( SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table] ), ( SELECT Location, 'Large' as Size, Large as Quantity FROM [table] )' Thanks!Coquette
right! query fixed (if you upload sample data to BQ, then I can test my queries before pasting them here)Opinionative
Good to know! How do I point you to my BigQuery dataset? (I am currently only querying things using Tableau and the BigQuery Web UI, so I have no idea how to give someone directions to my datasets. ThanksCoquette
If you make a dataset public, then anyone on StackOverflow can look at it and give a hand :). For example, this is my wikipedia pageviews log for August: bigquery.cloud.google.com/table/… (project:dataset.table)Opinionative
Awesome, thanks! I think I did this right... link I shared the table with 'All Authenticated Users' and made the cloud storage file public. ThanksCoquette
@FelipeHoffa I tried to apply the UDF to OP's case, to test with list of definites columns but it also unpivots location: SELECT location, unpivotted FROM robotic-charmer-726.bl_test_data.reconfiguring_a_table a , UNNEST(fhoffa.x.unpivot(a, '[small|medium|large|xl|xxl]')) unpivottedDisinherit
Thanks for this 2020 update congrartulations. Just to warn that this works, as far as the columns are not nested inside an struct themselves. where they key values start to do wired things. still the implementation of it had been a great source of inspiration. Thanks again.Orinasal
The regular expressions did not work for my data, which has strings in the value which contain special characters. These regular expressions work for my data. CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS ( ( SELECT ARRAY_AGG(STRUCT( REGEXP_EXTRACT(y, r'"([^"]+)"') AS key , coalesce(REGEXP_EXTRACT(y, r':\"(.+)\"'), REGEXP_EXTRACT(y, r':([^",]+)')) AS value )) FROM UNNEST(( SELECT REGEXP_EXTRACT_ALL(json,r'"[^"]+":\"[^"]+\"|"[^"]+":[^,]+') arr FROM (SELECT TO_JSON_STRING(x) json))) y ) );Zielsdorf
This is A-M-A-Z-I-N-G !! Very intelligent technique.Language
A
34

Update 2021:

A new UNPIVOT operator has been introduced into BigQuery.

Before UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:

product Q1 Q2 Q3 Q4
Kale 51 23 45 3
Apple 77 0 25 2

After UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:

product sales quarter
Kale 51 Q1
Kale 23 Q2
Kale 45 Q3
Kale 3 Q4
Apple 77 Q1
Apple 0 Q2
Apple 25 Q3
Apple 2 Q4

Query:

WITH Produce AS (
  SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
  SELECT 'Apple', 77, 0, 25, 2
)
SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))
Appointive answered 11/5, 2021 at 4:36 Comment(3)
Why was this voted down? I could solve my problem like this.Femoral
This should now be the accepted answerFrankie
How to add all columns except one into in() statement dynamically?Christabelle
O
15

2020 update: fhoffa.x.unpivot()

See:

I created a public persistent UDF. If you have a table a, you can give the whole row to the UDF for it to be unpivotted:

SELECT geo_type, region, transportation_type, unpivotted
FROM `fh-bigquery.public_dump.applemobilitytrends_20200414` a
  , UNNEST(fhoffa.x.unpivot(a, '_2020')) unpivotted

It transforms a table like this:

enter image description here

Into this

enter image description here


As a comment mentions, my solution above doesn't solve for the question problem.

So here's a variation, while I look how to integrate all into one:

CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS (
(
  SELECT 
   ARRAY_AGG(STRUCT(
     REGEXP_EXTRACT(y, '[^"]+') AS key
   , REGEXP_EXTRACT(y, ':([0-9]+)') AS value
   ))
  FROM UNNEST((
    SELECT REGEXP_EXTRACT_ALL(json,'"[smlx][meaxl]'||r'[^:]+:\"?[^"]+?') arr
    FROM (SELECT TO_JSON_STRING(x) json))) y
)
);

SELECT location, unpivotted.*
FROM `robotic-charmer-726.bl_test_data.reconfiguring_a_table` x
  , UNNEST(unpivot(x)) unpivotted


Previous answer:

Use the UNION of tables (with ',' in BigQuery), plus some column aliasing:

SELECT Location, Size, Quantity
FROM (
  SELECT Location, 'Small' as Size, Small as Quantity FROM [table]
), (
  SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table]
), (
  SELECT Location, 'Large' as Size, Large as Quantity FROM [table]
)
Opinionative answered 8/1, 2015 at 3:17 Comment(9)
Thanks for the quick and helpful response @FelipeHoffa !! It didn't need the second and third 'FROM'. So in the end it worked like this: 'SELECT Location, Size, Quantity FROM ( SELECT Location, 'Small' as Size, Small as Quantity FROM [table] ), ( SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table] ), ( SELECT Location, 'Large' as Size, Large as Quantity FROM [table] )' Thanks!Coquette
right! query fixed (if you upload sample data to BQ, then I can test my queries before pasting them here)Opinionative
Good to know! How do I point you to my BigQuery dataset? (I am currently only querying things using Tableau and the BigQuery Web UI, so I have no idea how to give someone directions to my datasets. ThanksCoquette
If you make a dataset public, then anyone on StackOverflow can look at it and give a hand :). For example, this is my wikipedia pageviews log for August: bigquery.cloud.google.com/table/… (project:dataset.table)Opinionative
Awesome, thanks! I think I did this right... link I shared the table with 'All Authenticated Users' and made the cloud storage file public. ThanksCoquette
@FelipeHoffa I tried to apply the UDF to OP's case, to test with list of definites columns but it also unpivots location: SELECT location, unpivotted FROM robotic-charmer-726.bl_test_data.reconfiguring_a_table a , UNNEST(fhoffa.x.unpivot(a, '[small|medium|large|xl|xxl]')) unpivottedDisinherit
Thanks for this 2020 update congrartulations. Just to warn that this works, as far as the columns are not nested inside an struct themselves. where they key values start to do wired things. still the implementation of it had been a great source of inspiration. Thanks again.Orinasal
The regular expressions did not work for my data, which has strings in the value which contain special characters. These regular expressions work for my data. CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS ( ( SELECT ARRAY_AGG(STRUCT( REGEXP_EXTRACT(y, r'"([^"]+)"') AS key , coalesce(REGEXP_EXTRACT(y, r':\"(.+)\"'), REGEXP_EXTRACT(y, r':([^",]+)')) AS value )) FROM UNNEST(( SELECT REGEXP_EXTRACT_ALL(json,r'"[^"]+":\"[^"]+\"|"[^"]+":[^,]+') arr FROM (SELECT TO_JSON_STRING(x) json))) y ) );Zielsdorf
This is A-M-A-Z-I-N-G !! Very intelligent technique.Language
F
2

@Felipe, I tried this using standard-sql but I get an error on the first line of your query that says: "Column name Location is ambiguous at [1:8]"

I've used an alternate query that works for me:

SELECT Location, 'Small' as Size, Small as Quantity FROM `table`
UNION ALL
SELECT Location, 'Medium' as Size, Medium as Quantity FROM `table`
UNION ALL
SELECT Location, 'Large' as Size, Large as Quantity FROM `table`
Flatter answered 24/1, 2019 at 10:39 Comment(0)
H
1

I have a solution that uses STRUCTs, ARRAYs and CROSS JOIN + UNNEST:

WITH
  My_Table_Metrics_Data AS (
  SELECT
    ...,
    [
        STRUCT('...' AS Metric, ... AS Data),
        STRUCT('...' AS Metric, ... AS Data),
    ] AS Metrics_Data
  FROM
    `My_Dataset.My_Table`
  WHERE
    ...
  )
SELECT
  ...,
  Metric_Data
FROM
  My_Table_Metrics_Data
CROSS JOIN
  UNNEST(My_Table_Metrics_Data.Metrics_Data) AS Metric_Data

Full explanation and instructions: https://yuhuisdatascienceblog.blogspot.com/2018/06/how-to-unpivot-table-in-bigquery.html

Holley answered 10/7, 2019 at 14:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.