Google BigQuery: prefix for all columns of joined table with duplicated names
Asked Answered
A

1

13

On Google BigQuery (using #standardSQL), when there is a Join between 2 tables, I need to apply a fixed prefix to all the columns of each table.

Here is the scenario, I have a structure like this:

#standardSQL
WITH user AS (
  SELECT "john" as name, "smith" as surname, 1 as parent
  UNION ALL
  SELECT "maggie" as name, "smith" as surname, 2 as parent
),

parent AS (
  SELECT 1 as id, "john" as name, "doe" as surname
  UNION ALL
  SELECT 2 as id, "jane" as name, "smith" as surname
)

User table

+-----+--------+---------+--------+
| Row |  name  | surname | parent |
+-----+--------+---------+--------+
|   1 | john   | smith   |      1 |
|   2 | maggie | smith   |      2 |
+-----+--------+---------+--------+

Parent table

+-----+----+------+---------+
| Row | id | name | surname |
+-----+----+------+---------+
|   1 |  1 | john | doe     |
|   2 |  2 | jane | smith   |
+-----+----+------+---------+

A query like this

SELECT u.*, p.* FROM user u JOIN parent p ON u.parent = p.id

produces the following error

Error: Duplicate column names in the result are not supported. Found duplicate(s): name, surname

I want to avoid performing a custom aliasing of the table like this

SELECT
  u.name as user_name,
  u.surname as user_surname,
  p.name as parent_name,
  p.surname as parent_surname
FROM user u JOIN parent p ON u.parent = p.id

+-----+-----------+--------------+-------------+----------------+
| Row | user_name | user_surname | parent_name | parent_surname |
+-----+-----------+--------------+-------------+----------------+
|   1 | john      | smith        | john        | doe            |
|   2 | maggie    | smith        | jane        | smith          |
+-----+-----------+--------------+-------------+----------------+

If the table will change on fields, I'll need every time to edit the statement (or the statements) in order to apply the new fields with the given prefix. So this approach using fixed column names is not a suitable way

Is there a way, a query operator, in order to obtain the table as mentioned up there, automatic applying a prefix? Something like:

SELECT u.* AS user_*, p.* AS parent_*
FROM user u JOIN parent p ON u.parent = p.id
Ailis answered 24/11, 2017 at 7:53 Comment(0)
L
13

The only option I can think of so far is as below

#standardSQL
WITH user AS (
  SELECT "john" AS name, "smith" AS surname, 1 AS parent UNION ALL
  SELECT "maggie" AS name, "smith" AS surname, 2 AS parent
), parent AS (
  SELECT 1 AS id, "john" AS name, "doe" AS surname UNION ALL
  SELECT 2 AS id, "jane" AS name, "smith" AS surname   
)
SELECT user, parent  
FROM user  
JOIN parent 
ON user.parent = parent.id  

with result as

Row user.name   user.surname    user.parent parent.id   parent.name parent.surname   
1   john        smith           1           1           john        doe  
2   maggie      smith           2           2           jane        smith   

It is not exactly what you expect, but the closest to it, as it wraps each row from respective joined tables into respective STRUCTs - for example:

{
"user": {"name": "john", "surname": "smith","parent": "1"},
"parent": {"id": "1","name": "john","surname": "doe"}
}
Livvi answered 24/11, 2017 at 15:23 Comment(4)
Thanks, but I need to use a custom prefix, instead of using a STRUCT like you suggestedAilis
that's understood but unfortunately not doable. so as I mentioned - above this is the closest I can think of. :o)Livvi
Perfect. My use case is just exploring data, joining, sub-setting, etc. I don't care about the select list until much later. I don't want to have to enumerate 100 columns just to speculatively examine a few tables.Grummet
@MikhailBerlyant, I am loving the simplicity of this. It works great for me. For those who need to go further: it looks like something might also be possible using INFORMATION_SCHEMA like in this example: stackoverflow.com/a/10302963Unnecessary

© 2022 - 2024 — McMap. All rights reserved.