I have a database table (Oracle 11g) of questionnaire feedback, including multiple choice, multiple answer questions. The Options column has each value the user could choose, and the Answers column has the numerical values of what they chose.
ID_NO OPTIONS ANSWERS
1001 Apple Pie|Banana-Split|Cream Tea 1|2
1002 Apple Pie|Banana-Split|Cream Tea 2|3
1003 Apple Pie|Banana-Split|Cream Tea 1|2|3
I need a query that will decode the answers, with the text versions of the answers as a single string.
ID_NO ANSWERS ANSWER_DECODE
1001 1|2 Apple Pie|Banana-Split
1002 2|3 Banana-Split|Cream Tea
1003 1|2|3 Apple Pie|Banana-Split|Cream Tea
I have experimented with regular expressions to replace values and get substrings, but I can't work out a way to properly merge the two.
WITH feedback AS (
SELECT 1001 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2' answers FROM DUAL UNION
SELECT 1002 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '2|3' answers FROM DUAL UNION
SELECT 1003 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2|3' answers FROM DUAL )
SELECT
id_no,
options,
REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, 2) second_option,
answers,
REGEXP_REPLACE(answers, '(\d)+', ' \1 ') answer_numbers,
REGEXP_REPLACE(answers, '(\d)+', REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, To_Number('2'))) "???"
FROM feedback
I don't want to have to manually define or decode the answers in the SQL; there are many surveys with different questions (and differing numbers of options), so I'm hoping that there's a solution that will dynamically work for all of them.
I've tried to split the options and answers into separate rows by LEVEL, and re-join them where the codes match, but this runs exceedingly slow with the actual dataset (a 5-option question with 600 rows of responses).
WITH feedback AS (
SELECT 1001 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2' answers FROM DUAL UNION
SELECT 1002 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '2|3' answers FROM DUAL UNION
SELECT 1003 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2|3' answers FROM DUAL )
SELECT
answer_rows.id_no,
ListAgg(option_rows.answer) WITHIN GROUP(ORDER BY option_rows.lvl)
FROM
(SELECT DISTINCT
LEVEL lvl,
REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, LEVEL) answer
FROM
(SELECT DISTINCT
options,
REGEXP_COUNT(options||'|', '(.)+?\|') num_choices
FROM
feedback)
CONNECT BY LEVEL <= num_choices
) option_rows
LEFT OUTER JOIN
(SELECT DISTINCT
id_no,
to_number(REGEXP_SUBSTR(answers, '(\d)+', 1, LEVEL)) answer
FROM
(SELECT DISTINCT
id_no,
answers,
To_Number(REGEXP_SUBSTR(answers, '(\d)+$')) max_answer
FROM
feedback)
WHERE
to_number(REGEXP_SUBSTR(answers, '(\d)+', 1, LEVEL)) IS NOT NULL
CONNECT BY LEVEL <= max_answer
) answer_rows
ON option_rows.lvl = answer_rows.answer
GROUP BY
answer_rows.id_no
ORDER BY
answer_rows.id_no
If there isn't a solution just using Regex, is there a more efficient way than LEVEL to split the values? Or is there another approach that would work?