How to split a string value based on a delimiter in DB2
Asked Answered
C

13

18

How do you split a string value in DB2?

For example, given the value:

CHG-FFH.

I want to split on the dash (-), which would result in two values:

CHG 
FFH. 

I tried using split function but it is not a function in DB2.

Any help will be appreciated.

Cavender answered 23/9, 2013 at 14:32 Comment(1)
Does this answer your question? how to transform comma separated column into multiples rows in db2Diocletian
C
6

This is what i tried and it fetched me effective result. Hence sharing with all.

select column_name, substr(column_name,1,locate('-',column_name)-1), 
substr(column_name,locate('-',column_name)+1,
length(substr(column_name,locate('-',column_name)+1))) from 
table_name where column_name is not null and column_name!='' 
and column_name like '%-%'
Cavender answered 25/9, 2013 at 13:58 Comment(1)
hello, I need a for loop, my String is variable. how to write it? thankyouColvin
D
40

Short answer:

You need to find the position of the delimiter, and then substring using it as the starting point, and a calculated length.

SELECT 
    SUBSTR('CHG-FFH', 1, LOCATE('-','CHG-FFH')-1) as FIRST_PART
  , SUBSTR('CHG-FFH', LOCATE('-','CHG-FFH')+1)   as SECOND_PART
FROM SYSIBM.SYSDUMMY1;

BONUS! If you do this often, create a user defined function to do it dynamically. Here's an example in a DB fiddle.

CREATE FUNCTION SPLITTER (input VARCHAR(4000), delimiter CHAR, part_number INTEGER)
      RETURNS VARCHAR(4000)
      LANGUAGE SQL
      READS SQL DATA
      NO EXTERNAL ACTION
      DETERMINISTIC
      RETURN 

with pos_info (first_pos, length) as (
select 
  case when part_number = 1 then 0
       else LOCATE_IN_STRING(input, delimiter,1, part_number-1, OCTETS)  
  end as first_pos, 

  case when part_number = 1 then LOCATE_IN_STRING(input, delimiter,1, part_number, OCTETS) - 1 
       when LOCATE_IN_STRING(input, delimiter,1, part_number, OCTETS) = 0
        and  LOCATE_IN_STRING(input, delimiter,1, part_number-1, OCTETS) = 0
       then 0
       when LOCATE_IN_STRING(input, delimiter,1, part_number, OCTETS) = 0
       then length(input) -  LOCATE_IN_STRING(input, '-',1, part_number - 1,    OCTETS)
       else LOCATE_IN_STRING(input, delimiter,1, part_number, OCTETS) -    LOCATE_IN_STRING(input, delimiter,1, part_number-1, OCTETS) - 1
  end as length
from sysibm.sysdummy1
)

select    
    substr(input, first_pos+1,length) as part
from pos_info;

Alternatively, you can see a different approach here at this answer: Split a VARCHAR in DB2 to retrieve a value inside.


Long answer:

DB2, along with other relational databases do not provide a single function to accomplish this.

The reason is likely that it's not an implicitly scalar function. If your string had more than one dash in it, would you want to split it into three parts? Four? So the first step is to note if your data is determinate - if it has a specific number of components that you want to split apart. In your example, you have two, so I'll start with that assumption, and then afterwards comment on how you would deal other situations.

Scenario: A string value with two components separated by a delimiter

With only two parts, you need to find the position of the delimiter, and then substring before and after it by using the position before and after it in a substring function.

  1. LOCATE the index of your delimiter.
LOCATE('-','CHG-FFH')

NOTE: DB2 provides two functions which can be used for this: POSITION (or POSSTR), and LOCATE (or LOCATE_IN_STRING). LOCATE is a bit more powerful because it allows you to specify a start position, which would be helpful if you had more than one delimiter.

  1. SUBSTR using the delimiter index.

For the first part, start your substring at position 1, up to the character before the delimiter (delimiter position - 1):

SUBSTR('CHG-FFH', 1,LOCATE('-','CHG-FFH')-1) as FIRST_PART

For the second part, start your substring at the position after delimiter index (delimiter position + 1), and get the rest of the String:

 SUBSTR('CHG-FFH', LOCATE('-','CHG-FFH')+1) as SECOND_PART

Final Result:

SELECT 
    SUBSTR('CHG-FFH', 1,LOCATE('-','CHG-FFH')-1) as FIRST_PART
  , SUBSTR('CHG-FFH', LOCATE('-','CHG-FFH')+1) as SECOND_PART
FROM SYSIBM.SYSDUMMY1;

Scenario: A string value with three components separated by a delimiter

Use the same concepts as the first scenario, but you have to determine the index of the second delimiter. Use the index of the first delimiter to specify a starting point: Note that LOCATE allows specifying a start position:

>>-LOCATE(search-string,source-string-+--------+-+--------------------+-)-><
                                      '-,start-' '-,--+-CODEUNITS16-+-'     
                                                      +-CODEUNITS32-+       
                                                      '-OCTETS------' 

Finding the second delimiter:

Use the position of the first delimiter as the starting point for finding the second delimiter.

LOCATE('-','CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE')+1)

Use that as a SUBSTR point for the second and third values, and you're all set. Note: For the second value, you have to use both of the delimiter locations to substring the value.

Final Result:

SELECT 
    SUBSTR('CHG-FFH-EEE', 1,LOCATE('-','CHG-FFH-EEE')-1) as FIRST_PART
  , SUBSTR('CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE')+1, LOCATE('-','CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE'))-1) as SECOND_PART
  , SUBSTR('CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE')+1)+1) as THIRD_PART
FROM SYSIBM.SYSDUMMY1;

You can see this strategy would get out of hand for a greater number of delimiters in your String.

Scenario: Indeterminate number of delimiters

This is a tricky problem that is best approached with a Stored Procedure. Think through things like: How do you want the parsed data to come out of the algorithm, how will you access the data? Arrays are not a native type in SQL, but they are in Stored Procedures, so what will you do with the array when you've parsed all the pieces out of your String?

One way to approach this scenario is answered here:

Split a VARCHAR in DB2 to retrieve a value inside

Depravity answered 23/9, 2013 at 16:0 Comment(4)
Could the second part be : SUBSTR('CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE')+1, LOCATE('-','CHG-FFH-EEE', LOCATE('-','CHG-FFH-EEE'))-1) as SECOND_PARTWoolgathering
Yes! that's right, @user3606336. I didn't notice before that my example didn't work properly. I'll update my answer with your correction. Thanks!Depravity
It would be very easy to implement a function scan('CHG-FFH-EEE',2,'-') which returns the second word, where words are separated by '-'Alyose
@jms thanks for the suggestion. I added a User Defined Function (UDF) to my answer, and linked to a fiddle that demonstrates it. Good idea!Depravity
C
6

This is what i tried and it fetched me effective result. Hence sharing with all.

select column_name, substr(column_name,1,locate('-',column_name)-1), 
substr(column_name,locate('-',column_name)+1,
length(substr(column_name,locate('-',column_name)+1))) from 
table_name where column_name is not null and column_name!='' 
and column_name like '%-%'
Cavender answered 25/9, 2013 at 13:58 Comment(1)
hello, I need a for loop, my String is variable. how to write it? thankyouColvin
B
6

The reason of so late answer is to show much more simple and universal way to achieve the goal. It's based on functions using regular expressions available even at the time, when the question was asked.

To get N-th token (2-nd in the example):

SELECT 
  COL
-- since 9.7 (2009)
, xmlcast(xmlquery('fn:tokenize($s, "-")[2]' passing COL as "s") as varchar(20)) as one
-- since 11.1
, REGEXP_SUBSTR(COL, '([^-]*)-?', 1, 2, '', 1) as two
FROM (VALUES 'CHG-FFH.', 'ABC-DEF-GH') TAB (COL);

The result is:

|COL       |ONE                 |TWO       |
|----------|--------------------|----------|
|CHG-FFH.  |FFH.                |FFH.      |
|ABC-DEF-GH|DEF                 |DEF       |

To tokenize a string:

-- since 9.7 (2009)
SELECT TAB.COL, TOK.SEQ, TOK.TOKEN
FROM 
  (VALUES 'CHG-FFH.', 'ABC-DEF-GH') TAB (COL)
, XMLTABLE
(
  'for $id in tokenize($s, "-") return <i>{string($id)}</i>' PASSING TAB.COL AS "s"
  COLUMNS
    SEQ   FOR ORDINALITY
  , TOKEN VARCHAR(20) PATH '.'
) TOK
ORDER BY TAB.COL, TOK.SEQ;

The result is:

|COL       |SEQ                 |TOKEN               |
|----------|--------------------|--------------------|
|ABC-DEF-GH|1                   |ABC                 |
|ABC-DEF-GH|2                   |DEF                 |
|ABC-DEF-GH|3                   |GH                  |
|CHG-FFH.  |1                   |CHG                 |
|CHG-FFH.  |2                   |FFH.                |
Biconvex answered 1/5, 2020 at 10:38 Comment(0)
S
2

Try this statement:

select substr(your_value, 1,3), substr(your_value, 4, 3) from your_table
Snare answered 23/9, 2013 at 15:29 Comment(2)
@ Peter : Thanks for your response. The solution given by you will work only for the value mentioned above i.e CHG-FFH. But if the characters are not fixed before and after "-" then what solution can be implied?Cavender
try josh hull's solutionSnare
L
2

This answer is not totally different, but it impements LOCATE_IN_STRING in a more flexible way which is useful, for example, if you have restrictions not to use functions.

To get the nth string delimited by the character '-' as in your example:

SUBSTR(THESTRING, LOCATE_IN_STRING(THESTRING, '-', 1,  n - 1) + 1, 
LOCATE_IN_STRING(THESTRING||'-', '-', 1,  n) - LOCATE_IN_STRING(THESTRING, '-', 1,  n - 1) - 1) 
                        

If you replace each occurrence of n by the column number you want, this construct will return the nth column's contents.


Some explanation:

The length of the string to be extracted by SUBSTR is calcluated from the difference between the occurrence of the nth and the (n + 1)th delimiter character (- in this case). Note that the (n + 1)th delimiter position is calculated from THESTRING plus an extra delimiter attached to its end to account for the case when we look for the last column and THESTRING does not end in a delimiter.

Lithia answered 6/8, 2020 at 13:9 Comment(0)
R
1

I know this is old post.. but thought following may help others.

I used following approach to split the given string.

SELECT TRIM(ITEM) AS ITEM FROM TABLE(<LIB1>.SF_SPLIT(I_INPUTLIST=>'AA|BB|CC|DD|EE|EE|FF', I_DELIMITER=>'|')) AS T;

SF_SPLIT is the User defined SQL function and below is definition:

CREATE OR REPLACE FUNCTION <LIB1>.SF_SPLIT(

    I_INPUTLIST VARCHAR(8000) 
  , I_DELIMITER VARCHAR(3)    

) 
RETURNS TABLE (ITEM VARCHAR(8000))

LANGUAGE SQL

RETURN

WITH R1 (ITEM, REMINDER) AS 

(SELECT SUBSTR(I_INPUTLIST, 1, LOCATE(I_DELIMITER, I_INPUTLIST)-1) AS ITEM, 

SUBSTR(I_INPUTLIST, LOCATE(I_DELIMITER, I_INPUTLIST)+1, LENGTH(I_INPUTLIST)) REMINDER

FROM SYSIBM.SYSDUMMY1

UNION ALL

SELECT SUBSTR(REMINDER, 1, LOCATE(I_DELIMITER, REMINDER)-1) AS ITEM, 
SUBSTR(REMINDER, LOCATE(I_DELIMITER, REMINDER)+1, LENGTH(REMINDER)) REMINDER 

FROM R1 WHERE LOCATE(I_DELIMITER, REMINDER) > 0

UNION ALL

SELECT SUBSTR(REMINDER, LOCATE(I_DELIMITER, REMINDER)+1, LENGTH(REMINDER)) AS ITEM,

'' AS REMINDER FROM R1 WHERE REMINDER <> '' AND LOCATE(I_DELIMITER, REMINDER) = 0

)

SELECT ITEM FROM R1;
Robi answered 19/5, 2017 at 14:57 Comment(1)
This was really close to working for me, but I had to change the length parts. I am going to make a separate answer in case yours works for someone else. It is probably a db2 difference between our environments.Enounce
U
1

In DB2

SELECT
'11,222,33,444' AS THE_F_STRING
, SUBSTR('11,222,33,444', 1, LOCATE_IN_STRING('11,222,33,444',',',1,1)-1) AS AA
, SUBSTR('11,222,33,444', LOCATE_IN_STRING('11,222,33,444',',',1,1)+1, LOCATE_IN_STRING('11,222,33,444',',',1,2)-LOCATE_IN_STRING('11,222,33,444',',',1,1)-1) AS BB 
, SUBSTR('11,222,33,444', LOCATE_IN_STRING('11,222,33,444',',',1,2)+1, LOCATE_IN_STRING('11,222,33,444',',',1,3)-LOCATE_IN_STRING('11,222,33,444',',',1,2)-1) AS CC
, SUBSTR('11,222,33,444', LOCATE_IN_STRING('11,222,33,444',',',1,3)+1, LENGTH('11,222,33,444')-LOCATE_IN_STRING('11,222,33,444',',',1,3)) AS DD
FROM SYSIBM.SYSDUMMY1;

Keep extrapolating...enjoy...

Unwitnessed answered 24/11, 2017 at 3:49 Comment(0)
B
0

If you are sure that each substrings are 3 characters long you can try this code, provided that TABLE1 is a table where there is at least X rows (X = 10 in this example):

select rc, substr(string_to_split, (rc-1)*3+rc, 3) as result from
    (select row_number() over() as rc from TABLE1 fetch first 10 rows only) TB_rowcount
    cross join
    (select 'CHG-FFH' as string_to_split from sysibm.sysdummy1) T2
    where substr(string_to_split, (rc-1)*3+rc, 3) <> '   '

If the length of the substrings are not the same you have to apply LOCATE function to find the separator

Baronial answered 31/12, 2015 at 13:41 Comment(0)
L
0

I needed to use instr, substr, trim, and messed with locate as well.. but instr, and substr are all supported. You can find a pattern. I had to go through a varchar split with ' - ' and needed to find the end and go back from there.

           select  itn, 
           substr(Message, 1 , locate(' - ', Message)) FIRST_SSR,  
           SUBSTR(Message , instr( message, ' - ', octets)+1, (instr( 
            message, '(Ref', octets)+1)) SECOND_STR ,
           Message
              from
         (
   select p.itn itn, 
          substr(p.msg,  instr( p.msg, ' - ' , octets)+21) Message
    from itnpad p
    where p.msg like '%MR - Multiple Requests%'

       ) A 
Lemmons answered 29/6, 2017 at 22:41 Comment(0)
E
0

Suraj's answer was very close to working for me, but I needed to change the length logic slightly. I also have colons in it at this point to run it outside of a stored procedure.

WITH R1 (ITEM, REMAINDER) AS 

(SELECT SUBSTR(:I_INPUTLIST, 1, LOCATE(:I_DELIMITER, :I_INPUTLIST)-1) AS ITEM, 

SUBSTR(:I_INPUTLIST, LOCATE(:I_DELIMITER, :I_INPUTLIST)+1, LENGTH(:I_INPUTLIST) - LOCATE(:I_DELIMITER, :I_INPUTLIST)) REMAINDER

FROM SYSIBM.SYSDUMMY1
UNION ALL

SELECT SUBSTR(REMAINDER, 1, LOCATE(:I_DELIMITER, REMAINDER)-1) AS ITEM, 
SUBSTR(REMAINDER, LOCATE(:I_DELIMITER, REMAINDER)+1, LENGTH(:I_INPUTLIST) - LOCATE(:I_DELIMITER, :I_INPUTLIST)) REMAINDER 

FROM R1 WHERE LOCATE(:I_DELIMITER, REMAINDER) > 0

UNION ALL

SELECT SUBSTR(REMAINDER, LOCATE(:I_DELIMITER, REMAINDER)+1, LENGTH(:I_INPUTLIST) - LOCATE(:I_DELIMITER, :I_INPUTLIST)) AS ITEM,

'' AS REMAINDER FROM R1 WHERE REMAINDER <> '' AND LOCATE(:I_DELIMITER, REMAINDER) = 0

)

SELECT ITEM FROM R1;
Enounce answered 25/6, 2021 at 16:21 Comment(0)
B
0

DB2 11.5 LUW has a set of JSON functions that can do this.

select * 
from table(
     systools.json_table(systools.json2bson('{"data":' || ([1,2,3]) || }'), 'data','s:10'))

TYPE VALUE


     16 1
     16 2
     16 3

3 record(s) selected.

you have to wrap the string in a valid json object ( which is why i concat with curly braces and a 'data' key) but then it will recursively unpack the array into rows.

Bonnes answered 12/7, 2022 at 21:10 Comment(0)
L
0
CREATE FUNCTION MYLIB.SPLITTER ( 
    ISTRING VARCHAR(4000) , 
    IDELIMITER CHAR(1) , 
    IPOS INTEGER ) 
    RETURNS VARCHAR(4000)   
    LANGUAGE SQL 
    SPECIFIC MYLIB.SPLITTER 
    DETERMINISTIC 
    CONTAINS SQL 
    CALLED ON NULL INPUT 
    SET OPTION  ALWBLK = *ALLREAD , 
    ALWCPYDTA = *OPTIMIZE , 
    COMMIT = *NONE , 
    DECRESULT = (31, 31, 00) , 
    DYNDFTCOL = *NO , 
    DYNUSRPRF = *USER , 
    SRTSEQ = *HEX   
    BEGIN 
DECLARE POS INT DEFAULT 1 ; 
DECLARE NEXT_POS INT ; 
DECLARE OPART VARCHAR ( 4000 ) ; 
DECLARE XDELIMITER CHAR ( 1 ) ; 
DECLARE XPOS INT ; 
  
SET XDELIMITER = COALESCE ( IDELIMITER , ',' ) ; 
SET XPOS = COALESCE ( IPOS , 1 ) ; 
  
WHILE POS > 0 DO 
SET NEXT_POS = LOCATE ( XDELIMITER , ISTRING , POS ) ; 
IF NEXT_POS = 0 THEN 
SET OPART = TRIM ( SUBSTR ( ISTRING , POS ) ) ; 
IF XPOS = 1 THEN 
RETURN OPART ; 
ELSE 
RETURN NULL ; 
END IF ; 
ELSE 
SET OPART = TRIM ( SUBSTR ( ISTRING , POS , NEXT_POS - POS ) ) ; 
END IF ; 
  
IF XPOS = 1 THEN 
RETURN OPART ; 
END IF ; 
  
SET POS = NEXT_POS + 1 ; 
SET XPOS = XPOS - 1 ; 
  
END WHILE ; 
  
RETURN NULL ;  -- Return NULL if iPos is out of range 
END  ; 
  
LABEL ON SPECIFIC FUNCTION MYLIB.SPLITTER 
    IS 'In(String,Delimiter,Segment) Out(parsed value)' ; 
  
GRANT ALTER , EXECUTE   
ON SPECIFIC FUNCTION MYLIB.SPLITTER 
TO QPGMR WITH GRANT OPTION ;
Lumpy answered 19/6 at 18:27 Comment(1)
Thank you for your interest in contributing to the Stack Overflow community. This question already has quite a few answers—including one that has been extensively validated by the community. Are you certain your approach hasn’t been given previously? If so, it would be useful to explain how your approach is different, under what circumstances your approach might be preferred, and/or why you think the previous answers aren’t sufficient. Can you kindly edit your answer to offer an explanation?Crucifix
S
-1
CREATE TYPE CUSTOMSTRINGARRAY AS VARCHAR(1000) ARRAY[VARCHAR(1000)];
create or replace function SPLIT_STRING(inputString varchar(1000),splitor varchar(10), pos int)
returns VARCHAR(1000)
ARRAYDEMO: BEGIN
DECLARE arraySymbols CUSTOMSTRINGARRAY;
DECLARE out_str,item,str VARCHAR(1000);
DECLARE i,occ INT;
SET i = 1;
set item = '';
set str = inputString;
set occ = LENGTH(inputString) - LENGTH(REPLACE(inputString,splitor,''));
WHILE i <= occ DO
set item = substr(str,1,LOCATE_IN_STRING(str,splitor,1));
set str = replace(str,item,'');
SET arraySymbols[i] = TRIM(replace(item,splitor,''));
SET i = i + 1;
END WHILE;
set arraySymbols[i] = str;
return(arraySymbols[pos]);
end;
Scorify answered 11/3, 2019 at 12:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.