Identify if a column is Virtual in Snowflake
Asked Answered
O

3

7

Snowflake does not document its Virtual Column capability that uses the AS clause. I am doing a migration and needing to filter out virtual columns programatically.

Is there any way to identify that a column is virtual? The Information Schema.COLLUMNS view shows nothing different between a virtual and non-virtual column definition.

Outlaw answered 6/8, 2021 at 18:46 Comment(2)
By virtual column, do you mean a column with a default value (e.g. an expression)? If you do then doesn’t this show up in INFORMATION_SCHEMA .COLUMNS.COLUMN_DEFAULT?Colubrine
@Colubrine DEFAULT(expression) and AS (expression) are two different concepts. There is no info about it in INFROMATION_SCHEMA.COLUMNS.Kilt
K
3

There is a difference between column defined as DEFAULT and VIRTUAL COLUMN(aka computed, generated column):

Virtual column

CREATE OR REPLACE TABLE T1(i INT, calc INT AS (i*i));
INSERT INTO T1(i) VALUES (2),(3),(4);
SELECT * FROM T1;

enter image description here

When using AS (expression) syntax the expression is not visible inCOLUMN_DEFAULT:

enter image description here


DEFAULT Expression

In case of the defintion DEFAULT (expression):

CREATE OR REPLACE TABLE T2(i INT, calc INT DEFAULT (i*i));
INSERT INTO T2(i) VALUES (2),(3),(4);
SELECT * FROM T2;

It is visible in COLUMN_DEFAULT:

SELECT * 
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'T2';

enter image description here


Comparing side-by-side with SHOW COLUMNS:

SHOW COLUMNS LIKE 'CALC';
-- kind: VIRTUAL_COLUMN

enter image description here

One notable difference between them is that virtual column cannot be updated:

UPDATE T1
SET calc  = 1;
-- Virtual column 'CALC' is invalid target.

UPDATE T2
SET calc = 1;
-- success
Kilt answered 7/8, 2021 at 7:42 Comment(0)
G
1

How about using SHOW COLUMNS ? you should identify them when expression field is not null.

create table foo (id bigint, derived bigint as (id * 10));
insert into foo (id) values (1), (2), (3);

SHOW COLUMNS IN TABLE foo;
SELECT "table_name", "column_name", "expression" FROM table(result_scan(last_query_id()));

| table_name | column_name | expression     |
| ---------- | ----------- | -------------- |
| FOO        | ID          | null           |
| FOO        | DERIVED     | ID*10          |
Greatcoat answered 6/8, 2021 at 19:12 Comment(1)
My question was not using the DEFAULT, but rather the AS method. I wanted a way to programmatically determine if a column was Virtual. SHOW COLUMNS cannot be referenced from a SQL statement. You reminded me that I can issue SHOW COLUMNS then go back and get the last buffer entry and query that like a table. Thank you!Outlaw
I
0

I normally use the desc table option.

First lets create the table with some example data:

create or replace temporary table ColumnTypesTest (
    id int identity(1,1) primary key,
    userName varchar(30),
    insert_DT datetime default CAST(CONVERT_TIMEZONE('UTC', CAST(CURRENT_TIMESTAMP() AS TIMESTAMP_TZ(9))) AS TIMESTAMP_NTZ(9)) not null,
    nextDayAfterInsert datetime as dateadd(dd,1,insert_DT)
);

insert into ColumnTypesTest (userName) values
('John'),
('Cris'),
('Anne');

select * from ColumnTypesTest;
ID USERNAME INSERT_DT NEXTDAYAFTERINSERT
1 John 2021-10-04 19:11:21.069 2021-10-05 19:11:21.069
2 Cris 2021-10-04 19:11:21.069 2021-10-05 19:11:21.069
3 Anne 2021-10-04 19:11:21.069 2021-10-05 19:11:21.069

Now the answer to your question

Using the 'desc table <table_name>;' you will get a column named kind which will tell you if it is virtual or not, also separately there is the default with NULL if it has no default value.

name type kind null? default primary key unique key check expression comment policy name
ID NUMBER(38,0) COLUMN N IDENTITY START 1 INCREMENT 1 Y N
USERNAME VARCHAR(30) COLUMN Y N N
INSERT_DT TIMESTAMP_NTZ(9) COLUMN N CAST(CONVERT_TIMEZONE('UTC', CAST(CURRENT_TIMESTAMP() AS TIMESTAMP_TZ(9))) AS TIMESTAMP_NTZ(9)) N N
NEXTDAYAFTERINSERT TIMESTAMP_NTZ(9) VIRTUAL Y N N DATE_ADDDAYSTOTIMESTAMP(1, INSERT_DT)

A/ With 'desc table <table_name>' you get meta data of the table with a column named kind, which will say VIRTUAL or COLUMN. In case it is VIRTUAL, then in the column expression you get how that column is calculated.

This is used in Stored Procedures, and saved in an array of arrays with javascript, from there the next query in the stored procedure is created dynamically. A while loop is used to go through the resultSet and push each row intho the array of arrays. You can then use javascript filter to just get the virtual columns. This is part of the advantage of having a mix of javascript and SQL in Snowflake Stored Procedures.

Here the documentation which doesn't say much.

Ingaingaberg answered 4/10, 2021 at 19:36 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.