SQL - How to check if item is in a list in a record
Asked Answered
M

5

3

I have have a column name MyRoles that contains a list of items (integers) stored in a field named UserRoles. I want to write a query that checks to see if a specific item is in the list. The list will look like this: "1,2,3"

I can't use WHERE MyRoles

What should the query look like?

This is similar to what I am thinking:

SELECT *
FROM MyTable
WHERE MyRoles CONTAINS ('1')

The fact that no answer was easy to implement and would take me farther down an ugly road really makes clear that a normalized database is always the best bet.

Marchelle answered 15/6, 2011 at 19:28 Comment(5)
Any particular reason you are not using a normalized database?Hair
There are lots of reasons why this database isn't totally normalized. I am interested in finding an answer to this question though. I am not attempting to change the database right now. Thanks though!Marchelle
Context helps - if you had said in your question that you know this isn't normalized, the question wouldn't have come up...Hair
I just assumed that no bonehead would ever create a database that wasn't totally normalized. But, I did. Of course, I am interested in how many possible workarounds I can find to a fairly typical problem. Thanks, Oded.Marchelle
I have seen many questions here featuring denormalized fields and databases, mostly for lack of knowledge.Hair
G
0

Convert it to an array:

SELECT *
FROM MyTable
WHERE ('{' || MyRoles || '}')::int[] && array[1]

Better yet, you can use an index on the above mess. Casting text to an array type outright will get rejected when building an array, but you can work around it:

create function role_scan(text) returns int[] as $$
  select ('{' || $1 || '}')::int[];
$$ language sql immutable strict;

create index on MyTable using gin (role_scan(MyRoles));

-- actually use the index
SELECT *
FROM MyTable
WHERE role_scan(MyRoles) && array[1]

There is a caveat in adding the index, that you should be aware of. The stats collector doesn't look (up to 9.1 anyway) into the actual array values. The selectivity of the overlap operator (1/200, i.e. very selective) is hard-coded for all intents and purposes. So if you're querying against very common values, you might get an index scan where inappropriate. One workaround is to call the underlying overlap method directly (which gives a selectivity of 1/3 and no potential index scan) when you know plenty of roles apply:

SELECT *
FROM MyTable
WHERE arrayoverlap(role_scan(MyRoles), array[1])
Geode answered 15/6, 2011 at 20:13 Comment(0)
H
4

You can use LIKE:

SELECT *
FROM MyTable
WHERE MyRoles LIKE ('%1%')

This is likely to perform very badly (as an index will be fairly useless for such a search). And of course, will also match on 10 even if 1 doesn't exist in the query. You can extend the approach:

SELECT *
FROM MyTable
WHERE MyRoles = '1'
  OR MyRoles LIKE '1,%'
  OR MyRoles LIKE '%,1,%'

A better solution would be to normalize your database and not have multi-valued fields. Use a many-to-many table with single role ids and item ids per row. This is much easier to query.

Some databases will have better facilities for such a query, but these would be extensions and not standard SQL - you didn't mention a specific RDBMS.

Hair answered 15/6, 2011 at 19:32 Comment(3)
Will fail if there are more than 9 roles.Hyposensitize
Of course, that only works if there are less than 10 roles. If there are more, this query will also match roles 10, 11, 21, 310, etc. Which is why a normalized database would work best.Director
John, a normalized database would work best. But that's not an option. Hence, my question.Marchelle
A
1

Watch out if you use LIKE:

If MyRoles is 2,11 then it will match LIKE('%1%') although you don't want it to.

A painful workaround is to use

SELECT *
FROM MyTable
WHERE MyRoles LIKE ('%,1,%')

but then you need to put leading and trailing commas in every MyRoles entry.

These various ugly facts are the reason everyone is telling you to change your database design and create a "Roles" table.

Algor answered 15/6, 2011 at 19:36 Comment(4)
Oops -- looks like the earlier answer has now been edited to say this... sorry! I'm new here, just trying to help. :)Algor
Chris - Please go on trying to help. There is nothing wrong with having several answers to a question.Hair
Thanks, Chris. I am completely aware of the pitfalls of using LIKE, for the exact reason you cited. Do you have another workaround?Marchelle
You can use WHERE ','+MyRoles+',' LIKE ('%,1,%') to eliminate having to update the MyRoles field in the databaseArdis
B
0
SELECT *
FROM MyTable
WHERE FIND_IN_SET(1, MyRoles)

EDIT: It works in mysql db server.

EDIT:

find_in_set function for postgres:

create function find_in_set(n int, s text) returns bigint as
$$
select z.row_number
from
(
    select row_number() over(), y.x
    from (select unnest(('{' || $2 || '}')::int[]) as x) as y
) as z
where z.x = $1
$$ language sql;
Boast answered 15/6, 2011 at 19:33 Comment(4)
FIND_IN_SET looks like a specific extension. What database is this for?Hair
It is mysql specific function. But he hasn't mentioned what is his db server.Boast
Rahim, this is exactly the type of answer I want, except I need it to work in Postgres. Thanks!Marchelle
@Rahim: your solution automatically leads to a seq scan.Geode
B
0

maybe regular expressions can help here:

SELECT *
FROM MyTable
WHERE MyRoles ~ ('^(.*,)*' || 1 || '(,.*)*$')
Babarababassu answered 15/6, 2011 at 20:5 Comment(0)
G
0

Convert it to an array:

SELECT *
FROM MyTable
WHERE ('{' || MyRoles || '}')::int[] && array[1]

Better yet, you can use an index on the above mess. Casting text to an array type outright will get rejected when building an array, but you can work around it:

create function role_scan(text) returns int[] as $$
  select ('{' || $1 || '}')::int[];
$$ language sql immutable strict;

create index on MyTable using gin (role_scan(MyRoles));

-- actually use the index
SELECT *
FROM MyTable
WHERE role_scan(MyRoles) && array[1]

There is a caveat in adding the index, that you should be aware of. The stats collector doesn't look (up to 9.1 anyway) into the actual array values. The selectivity of the overlap operator (1/200, i.e. very selective) is hard-coded for all intents and purposes. So if you're querying against very common values, you might get an index scan where inappropriate. One workaround is to call the underlying overlap method directly (which gives a selectivity of 1/3 and no potential index scan) when you know plenty of roles apply:

SELECT *
FROM MyTable
WHERE arrayoverlap(role_scan(MyRoles), array[1])
Geode answered 15/6, 2011 at 20:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.