COPY TEXT array from Postgres to Snowflake
Asked Answered
E

3

6

I have a few tables with Text array field to copy from Postgres to Snowflake and ran into a problem. The corresponding field in Snowflake is now defined as array.

Some values in the text array field in Postgres has a single entity string without double quotes around them like the following (abc in this example is causing an error):

{abc, "bc dcd", "de ef fg"}

It looks like this (abc) is not a problem in Postgres but Snowflake doesn't accept it as a valid value. I am using Snowflake Python connector and it complains that "JSON parsing error". Any workaround this? I guess copying it as text field (not array) in Snowflake will likely fix the issue but I am wondering if there is any workaround as the array type.

Espionage answered 4/11, 2019 at 17:35 Comment(4)
The value as shown is not a valid Postgres array. What exactly is the data type of that column? text[] or json?Plossl
The first value in the array looks almost like a symbol, but symbols are not allowed in JSON and Snowflake ARRAYs. I'm not familiar with PostgreSQL, but it doesn't look like a valid text[] either. Maybe it was transformed somehow into [abc, "bc dcd", "de ef fg"] ? Converting to text looks like a good option, but what do you really want?<br> How is this supposed to be used?Geology
text[] is the exact typeEspionage
The default representation of arrays in Postgres uses {...} not the square brackets. But anyhow: you can use to_jsonb(the_column) to change that into a valid JSON arrayPlossl
S
2

I ran into a similar issue where a Postgres array was put in a CSV and ingested to Snowflake as text. I still needed a way to make this an array in Snowflake, but had some trouble figuring out how to do that.

None of the elements in my arrays are quoted nor do they have spaces around the comma, so I'm adjusting the OP's question a bit because I haven't seen this answered anywhere.

The only way I've found to do this so far is with STRTOK_TO_ARRAY.

select strtok_to_array('{abc,bc dcd,de ef fg}', '{},')

which yields

[
  "abc",
  "bc dcd",
  "de ef fg"
]

The OP's initial "Postgres array" would yeild

[
  "abc",
  " \"bc dcd\"",
  " \"de ef fg\""
]
Succinylsulfathiazole answered 2/3, 2022 at 21:54 Comment(0)
S
0

One solution to this is if you use to_jsonb in Postgres which will convert the relevant column into JSON prior to staging in Snowflake.

Sideline answered 8/10, 2021 at 14:5 Comment(0)
T
0

Building on Brad's answer, if your Postgres data does include quotation marks and whitespace before/after commas, you can use the following to do the full clean:

select strtok_to_array(replace(regexp_replace(my_column, $$ *, *$$, ','), '"', ''), '{},')
from my_table
Tribunal answered 15/3 at 17:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.