Redshift Python UDFs Varchar limits
I have successfully created a Python UDF that accepts a varchar value from a table and extracts a substring of that value based on a regex.

The max size of that varchar column in the DDL is set to 20000 bytes, and on some occasions the UDF raises an error when I call it:

ERROR: Value too long for character type

Detail: 
  -----------------------------------------------
  error:  Value too long for character type
  code:      8001
  context:   Value too long for type character varying(256)
  query:     1127608
  location:  funcs_string.hpp:390
  process:   query0_73 [pid=25345]
  -----------------------------------------------

Is it the case that the max varchar limit for a UDF input variable is 256 bytes? If so, is this something I can change in my code?

Thanks,

Millet answered 9/11, 2015 at 10:45 Comment(0)
It turns out that you have to specify the length in your function's parameter (and return type) if it needs to be more than 256, which is the default. So I fixed this using something similar to:

CREATE OR REPLACE FUNCTION f_xxxxxx(val VARCHAR(20000)) RETURNS VARCHAR(20000) 
IMMUTABLE AS $$
<python function here>
$$ LANGUAGE plpythonu;
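For reference, a minimal sketch of what the Python body might look like, assuming a regex-based substring extraction as described in the question (the actual pattern and function name here are hypothetical, not from the original post):

```python
import re

def extract_first_number(val):
    """Hypothetical UDF body: return the first run of digits in val,
    or None when the input is NULL or contains no digits."""
    if val is None:
        return None
    m = re.search(r'\d+', val)
    return m.group(0) if m else None
```

Inside the `$$ ... $$` block you would keep only the body (Redshift wraps it in a function for you), but the extraction logic is the same.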
Millet answered 11/11, 2015 at 14:12 Comment(2)
FYI: You can use VARCHAR(MAX) if your function accepts the largest string possible. You also want to make sure that you use RETURNS VARCHAR(MAX) if it does, because apparently VARCHAR == VARCHAR(256) in Redshift. – Altheaalthee
This answer is right; it matches the official AWS answer. – Tamera
Are you using TEXT or NVARCHAR? Redshift does not support these data types internally and converts them to VARCHAR(256).

More details can be found here

Paediatrician answered 10/11, 2015 at 13:19 Comment(0)