Is it possible to query a comma separated column for a specific value?

I have (and don't own, so I can't change) a table with a layout similar to this.

ID | CATEGORIES
---------------
1  | c1
2  | c2,c3
3  | c3,c2
4  | c3
5  | c4,c8,c5,c100

I need to return the rows that contain a specific category id. I started by writing the queries with LIKE, because the values can be anywhere in the string:

SELECT id FROM table WHERE categories LIKE '%c2%';
returns rows 2 and 3.

SELECT id FROM table WHERE categories LIKE '%c3%' and categories LIKE '%c2%';
again returns rows 2 and 3, but not row 4.

SELECT id FROM table WHERE categories LIKE '%c3%' or categories LIKE '%c2%';
returns rows 2, 3, and 4.

I don't like all the LIKE statements. I've found FIND_IN_SET() in the Oracle documentation but it doesn't seem to work in 10g. I get the following error:

ORA-00904: "FIND_IN_SET": invalid identifier
00904. 00000 -  "%s: invalid identifier"

when running this query (example from the docs):

SELECT id FROM table WHERE FIND_IN_SET('c2', categories);

or this query (example from Google):

SELECT id FROM table WHERE FIND_IN_SET('c2', categories) <> 0;

I would expect it to return rows 2 and 3.

Is there a better way to write these queries instead of using a ton of LIKE statements?

Zoolatry answered 27/8, 2011 at 2:58 Comment(8)
Again, that's one of the reasons why you don't put CSV or otherwise serialized values in a relational DB. - Dolph
Also, FIND_IN_SET is a MySQL function; it won't work with Oracle. - Dolph
...and as bad as you've made it seem, it's actually much worse. Unless you filter the results later somehow, you can't just search for LIKE '%c2%'; you should specify each case like this: where (categories = 'c2' or categories like '%,c2' or categories like '%,c2,%' or categories like 'c2,%'). Unless you do that, you're going to match things like 'c20', 'c201', etc. - Enlargement
What Oracle version? Can you use a regex-based query? download.oracle.com/docs/cd/B14117_01/appdev.101/b10795/… - Zinazinah
@Enlargement Not if you add the commas to the searched value, like in my answer. - Rident
Unfortunately, I can't normalize the table. It's not my database. It's what I was given to query against and is 'legacy', so it can't be changed according to the powers that be. - Zoolatry
@Golez: adding the commas will only find the search-value when it is in the middle of a list; Gerrat's code will also find the search-value when it is at the beginning of the list, at the end of the list, and when it is the only item in the list, all of which are plausible cases that have to be considered. - Nkrumah
@Jonathn Leffer: You're incorrect. Please read my code again. - Rident

You can, using LIKE. You don't want to match for partial values, so you'll have to include the commas in your search. That also means that you'll have to provide an extra comma to search for values at the beginning or end of your text:

select 
  * 
from
  YourTable 
where 
  ',' || CommaSeparatedValueColumn || ',' LIKE '%,SearchValue,%'

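But this query will be slow. A LIKE pattern with a leading wildcard typically can't use an ordinary index, and concatenating onto the column rules one out here as well, so expect a full table scan.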

And there's always a risk. If there are spaces around the values, or values can contain commas themselves in which case they are surrounded by quotes (like in csv files), this query won't work and you'll have to add even more logic, slowing down your query even more.
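To make the whitespace case concrete, here is a minimal sketch that strips spaces before wrapping the list in commas. The sample rows (including the stray space in row 3 and the 'c20' decoy in row 4) are assumptions made up for illustration, and note that REPLACE removes every space, including any inside a value.

```sql
-- Hypothetical sample data; not taken from the real table
WITH tbl AS (
  SELECT 1 id, 'c1' categories FROM dual UNION ALL
  SELECT 2, 'c2,c3'            FROM dual UNION ALL
  SELECT 3, 'c3, c2'           FROM dual UNION ALL  -- space after the comma
  SELECT 4, 'c20,c3'           FROM dual            -- must not match 'c2'
)
SELECT id
  FROM tbl
 -- Remove spaces, then add leading/trailing commas so every item is ",item,"
 WHERE ',' || REPLACE(categories, ' ') || ',' LIKE '%,c2,%';
-- returns ids 2 and 3
```

Quoted values that contain commas would still need real CSV parsing; this only covers the stray-space case.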

A better solution would be to add a child table for these categories, or better still, a separate table for the categories and a table that cross-links them to YourTable.
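For anyone who does get to change the schema, the cross-link layout could look roughly like this. All table and column names here (category, yourtable_category, yourtable_id) are hypothetical, and the sketch assumes YourTable has a numeric key.

```sql
-- One row per category (hypothetical names)
CREATE TABLE category (
  category_id VARCHAR2(10) PRIMARY KEY
);

-- One row per (YourTable row, category) pair
CREATE TABLE yourtable_category (
  yourtable_id NUMBER       NOT NULL,  -- would reference YourTable's key
  category_id  VARCHAR2(10) NOT NULL REFERENCES category (category_id),
  PRIMARY KEY (yourtable_id, category_id)
);

-- Rows that have both c2 and c3: an indexable lookup, no string matching
SELECT yourtable_id
  FROM yourtable_category
 WHERE category_id IN ('c2', 'c3')
 GROUP BY yourtable_id
HAVING COUNT(*) = 2;
```

The composite primary key guarantees a pair appears only once, which is what makes the COUNT(*) = 2 check mean "has both".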

Rident answered 27/8, 2011 at 6:51 Comment(0)

You can write a PIPELINED table function which returns a one-column table. Each row is a value from the comma-separated string. Use something like this to pop a string from the list and pipe it out as a row:

PIPE ROW(ltrim(rtrim(substr(l_list, 1, l_idx - 1),' '),' '));

Usage:

SELECT * FROM MyTable
WHERE 'c2' IN (SELECT column_value FROM TABLE(Util_Pkg.split_string(categories)));

See more here: Oracle docs
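The answer only shows the PIPE ROW line, so here is one way the surrounding function might look. This is a sketch, not the answerer's actual code: it is written as a standalone function rather than inside Util_Pkg, and the type name t_varchar2_tab and the parameter names are assumptions.

```sql
-- Collection type returned by the function (hypothetical name)
CREATE OR REPLACE TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
/

CREATE OR REPLACE FUNCTION split_string (
  p_list  IN VARCHAR2,
  p_delim IN VARCHAR2 DEFAULT ','
) RETURN t_varchar2_tab PIPELINED
IS
  l_list VARCHAR2(32767) := p_list;
  l_idx  PLS_INTEGER;
BEGIN
  LOOP
    l_idx := INSTR(l_list, p_delim);
    -- Stop when no delimiter is left (INSTR returns NULL for a NULL list)
    EXIT WHEN l_idx = 0 OR l_idx IS NULL;
    -- Emit the item before the delimiter, trimmed of surrounding spaces
    PIPE ROW (LTRIM(RTRIM(SUBSTR(l_list, 1, l_idx - 1))));
    l_list := SUBSTR(l_list, l_idx + LENGTH(p_delim));
  END LOOP;
  -- Emit whatever remains after the last delimiter
  IF l_list IS NOT NULL THEN
    PIPE ROW (LTRIM(RTRIM(l_list)));
  END IF;
  RETURN;
END split_string;
/

-- Usage against the table from the question
SELECT *
  FROM MyTable m
 WHERE 'c2' IN (SELECT column_value FROM TABLE(split_string(m.categories)));
```

The function runs once per row of MyTable, so this is tidy to read but unlikely to be faster than the LIKE approach on a large table.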

Pathogen answered 5/6, 2014 at 11:26 Comment(0)

Yes and No...

"Yes":

Normalize the data (strongly recommended) - i.e. split the category column so that you have each category in a separate... then you can just query it in a normal fashion...

"No":
As long as you keep this "pseudo-structure" there will be several issues (performance and others) and you will have to do something similar to:

SELECT * FROM MyTable WHERE categories LIKE 'c2,%' OR categories = 'c2' OR categories LIKE '%,c2,%' OR categories LIKE '%,c2'

If you absolutely must, you could define a function named FIND_IN_SET like the following:

CREATE OR REPLACE FUNCTION FIND_IN_SET
   ( vSET IN VARCHAR2, vToFind IN VARCHAR2 )
   RETURN NUMBER
IS
    rRESULT NUMBER;
BEGIN
    rRESULT := -1;
    -- 1 if vToFind is an item of the comma-separated list vSET, otherwise 0
    SELECT COUNT(*) INTO rRESULT
      FROM DUAL
     WHERE vSET LIKE (vToFind || ',%')
        OR vSET = vToFind
        OR vSET LIKE ('%,' || vToFind || ',%')
        OR vSET LIKE ('%,' || vToFind);

    RETURN rRESULT;
END;

You can then use that function like:

SELECT * FROM MyTable WHERE FIND_IN_SET (categories, 'c2' ) > 0;
Forgive answered 27/8, 2011 at 4:35 Comment(0)

For the sake of future searchers, don't forget the regular expression way:

with tbl as (
select 1 ID, 'c1' CATEGORIES from dual
union
select 2 ID, 'c2,c3' CATEGORIES from dual
union
select 3 ID, 'c3,c2' CATEGORIES from dual
union
select 4 ID, 'c3' CATEGORIES from dual
union
select 5 ID, 'c4,c8,c5,c100' CATEGORIES from dual
)
select * 
from tbl
where regexp_like(CATEGORIES, '(^|\W)c3(\W|$)');

        ID CATEGORIES
---------- -------------
         2 c2,c3
         3 c3,c2
         4 c3

This matches the value only when it is bounded on each side by a non-word character (\W) or by the start or end of the string, so even if the comma was followed by a space it would still work. If you want to be more strict and match only where a comma separates values, replace the '\W' with a comma. At any rate, read the regular expression as: match a group of either the beginning of the line or a non-word character, followed by the target search value, followed by a group of either a non-word character or the end of the line.
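The stricter, comma-only variant would look something like this. Row 6 is a hypothetical extra row added to show that 'c30' is not picked up.

```sql
with tbl as (
  select 1 ID, 'c1' CATEGORIES from dual union all
  select 2, 'c2,c3'  from dual union all
  select 3, 'c3,c2'  from dual union all
  select 6, 'c30,c2' from dual   -- hypothetical row; must not match 'c3'
)
select ID, CATEGORIES
  from tbl
 -- c3 must sit between a comma (or string start) and a comma (or string end)
 where regexp_like(CATEGORIES, '(^|,)c3(,|$)');
-- returns IDs 2 and 3
```

With the commas spelled out, a list like 'c3, c2' (space after the comma) would no longer match on 'c2', which is the trade-off against the \W version.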

Bregma answered 24/10, 2014 at 20:33 Comment(0)

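As long as the comma-delimited list is short enough, you can also use a regular expression in this instance. Oracle's regular expression functions, e.g., REGEXP_LIKE(), limit the pattern to 512 bytes, and here the pattern is built from the list itself, so the list plus the surrounding ^( and )$ has to fit within that limit: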

SELECT id, categories
  FROM mytable
 WHERE REGEXP_LIKE('c2', '^(' || REPLACE(categories, ',', '|') || ')$', 'i');

In the above I'm replacing the commas with the regular expression alternation operator |. If your list of delimited values is already |-delimited, so much the better.

Corwin answered 3/4, 2015 at 17:18 Comment(0)
