Using same column multiple times in WHERE clause
Asked Answered
G

8

14

I have a following table structure.

USERS

USERS data

PROPERTY_VALUE

PROPERTY_VALUE data

PROPERTY_NAME

PROPERTY_NAME data

USER_PROPERTY_MAP

USER_PROPERTY_MAP data

I am trying to retrieve user/s from the users table who have matching properties in property_value table.

A single user can have multiple properties. The example data here has 2 properties for user '1', but there can be more than 2. I want to use all those user properties in the WHERE clause.

This query works if user has a single property but it fails for more than 1 properties:

SELECT * FROM users u
INNER JOIN user_property_map upm ON u.id = upm.user_id
INNER JOIN property_value pv ON upm.property_value_id = pv.id
INNER JOIN property_name pn ON pv.property_name_id = pn.id
WHERE (pn.id = 1 AND pv.id IN (SELECT id FROM property_value WHERE value like '101')
AND pn.id = 2 AND pv.id IN (SELECT id FROM property_value WHERE value like '102')) and u.user_name = 'user1' and u.city = 'city1'

I understand since the query has pn.id = 1 AND pn.id = 2 it will always fail because pn.id can be either 1 or 2 but not both at the same time. So how can I re-write it to make it work for n number of properties?

In above example data there is only one user with id = 1 that has both matching properties used in the WHERE clause. The query should return a single record with all columns of the USERS table.

To clarify my requirements

I am working on an application that has a users list page on the UI listing all users in the system. This list has information like user id, user name, city etc. - all columns of the in USERS table. Users can have properties as detailed in the database model above.

The users list page also provides functionality to search users based on these properties. When searching for users with 2 properties, 'property1' and 'property2', the page should fetch and display only matching rows. Based on the test data above, only user '1' fits the bill.

A user with 4 properties including 'property1' and 'property2' qualifies. But a user with only one property 'property1' would be excluded due to the missing 'property2'.

Greer answered 17/11, 2017 at 13:28 Comment(6)
OR instead of AND?Scintilla
OR returns 2 results for the same user, for 'property1' and 'property2' each. Since both these properties belong to the same user i would like to get the single record in returnGreer
Need OR, group by and having count(*)=2Infiltrate
@Infiltrate tried with 'OR, group by and having count(*)=2' but didn't help. In my case i had to add 6 columns in group by but it always returns empty result.Greer
Please post data as text, never as image. And you should provide table definitions (CREATE TABLE statements).Mariquilla
@ErwinBrandstetter copy that. But I thought pictorial representations will help understand/visualize table relationships better.Greer
M
12

This is a case of . I added the tag.

Indexes

Assuming a PK or UNIQUE constraint on USER_PROPERTY_MAP(property_value_id, user_id) - columns in this order to make my queries fast. Related:

You should also have an index on PROPERTY_VALUE(value, property_name_id, id). Again, columns in this order. Add the the last column id only if you get index-only scans out of it.

For a given number of properties

There are many ways to solve it. This should be one of the simplest and fastest for exactly two properties:

SELECT u.*
FROM   users             u
JOIN   user_property_map up1 ON up1.user_id = u.id
JOIN   user_property_map up2 USING (user_id)
WHERE  up1.property_value_id =
      (SELECT id FROM property_value WHERE property_name_id = 1 AND value = '101')
AND    up2.property_value_id =
      (SELECT id FROM property_value WHERE property_name_id = 2 AND value = '102')
-- AND    u.user_name = 'user1'  -- more filters?
-- AND    u.city = 'city1'

Not visiting table PROPERTY_NAME, since you seem to have resolved property names to IDs already, according to your example query. Else you could add a join to PROPERTY_NAME in each subquery.

We have assembled an arsenal of techniques under this related question:

For an unknown number of properties

@Mike and @Valera have very useful queries in their respective answers. To make this even more dynamic:

WITH input(property_name_id, value) AS (
      VALUES  -- provide n rows with input parameters here
        (1, '101')
      , (2, '102')
      -- more?
      ) 
SELECT *
FROM   users u
JOIN  (
   SELECT up.user_id AS id
   FROM   input
   JOIN   property_value    pv USING (property_name_id, value)
   JOIN   user_property_map up ON up.property_value_id = pv.id
   GROUP  BY 1
   HAVING count(*) = (SELECT count(*) FROM input)
   ) sub USING (id);

Only add / remove rows from the VALUES expression. Or remove the WITH clause and the JOIN for no property filters at all.

The problem with this class of queries (counting all partial matches) is performance. My first query is less dynamic, but typically considerably faster. (Just test with EXPLAIN ANALYZE.) Especially for bigger tables and a growing number of properties.

Best of both worlds?

This solution with a recursive CTE should be a good compromise: fast and dynamic:

WITH RECURSIVE input AS (
   SELECT count(*)     OVER () AS ct
        , row_number() OVER () AS rn
        , *
   FROM  (
      VALUES  -- provide n rows with input parameters here
        (1, '101')
      , (2, '102')
      -- more?
      ) i (property_name_id, value)
   )
 , rcte AS (
   SELECT i.ct, i.rn, up.user_id AS id
   FROM   input             i
   JOIN   property_value    pv USING (property_name_id, value)
   JOIN   user_property_map up ON up.property_value_id = pv.id
   WHERE  i.rn = 1

   UNION ALL
   SELECT i.ct, i.rn, up.user_id
   FROM   rcte              r
   JOIN   input             i ON i.rn = r.rn + 1
   JOIN   property_value    pv USING (property_name_id, value)
   JOIN   user_property_map up ON up.property_value_id = pv.id
                              AND up.user_id = r.id
   )
SELECT u.*
FROM   rcte  r
JOIN   users u USING (id)
WHERE  r.ct = r.rn;          -- has all matches

dbfiddle here

The manual about recursive CTEs.

The added complexity does not pay for small tables where the additional overhead outweighs any benefit or the difference is negligible to begin with. But it scales much better and is increasingly superior to "counting" techniques with growing tables and a growing number of property filters.

Counting techniques have to visit all rows in user_property_map for all given property filters, while this query (as well as the 1st query) can eliminate irrelevant users early.

Optimizing performance

With current table statistics (reasonable settings, autovacuum running), Postgres has knowledge about "most common values" in each column and will reorder joins in the 1st query to evaluate the most selective property filters first (or at least not the least selective ones). Up to a certain limit: join_collapse_limit. Related:

This "deus-ex-machina" intervention is not possible with the 3rd query (recursive CTE). To help performance (possibly a lot) you have to place more selective filters first yourself. But even with the worst-case ordering it will still outperform counting queries.

Related:

Much more gory details:

More explanation in the manual:

Mariquilla answered 27/11, 2017 at 13:23 Comment(3)
But an user can have any number of properties not just 2. Based on your solution, for n number of properties I will have to join user_property_map n number of times, right? I am going through the link you shared. It seems really useful. Thank youGreer
@ivish: I added more explanation for underlying performance mechanisms.Mariquilla
For anyone trying to use the 3rd query in SQLite, move "(property_name_id, value)" to after "input" as SQLite doesn't accept column definitions as is. :) Gotta be amazed how powerful SQLite is these days.Plast
I
6
SELECT *
  FROM users u
 WHERE u.id IN(
         select m.user_id
           from property_value v
           join USER_PROPERTY_MAP m
             on v.id=m.property_value_id 
          where (v.property_name_id, v.value) in( (1, '101'), (2, '102') )
          group by m.user_id
         having count(*)=2
      )

OR

SELECT u.id
  FROM users u
 INNER JOIN user_property_map upm ON u.id = upm.user_id
 INNER JOIN property_value pv ON upm.property_value_id = pv.id
 WHERE (pv.property_name_id=1 and pv.value='101')
    OR (pv.property_name_id=2 and pv.value='102')
 GROUP BY u.id
HAVING count(*)=2

No property_name table needed in query if propery_name_id are kown.

Infiltrate answered 17/11, 2017 at 14:32 Comment(1)
Thanks. I will try it and get back.Greer
O
5

If you want just to filter:

SELECT users.*
FROM users
where (
    select count(*)
    from user_property_map
    left join property_value on user_property_map.property_value_id = property_value.id
    left join property_name on property_value.property_name_id = property_name.id
    where user_property_map.user_id = users.id -- join with users table
    and (property_name.name, property_value.value) in (
        values ('property1', '101'), ('property2', '102') -- filter properties by name and value
    )
) = 2 -- number of properties you filter by

Or, if you need users ordered descending by number of matches, you could do:

select * from (
    SELECT users.*, (
        select count(*) as property_matches
        from user_property_map
        left join property_value on user_property_map.property_value_id = property_value.id
        left join property_name on property_value.property_name_id = property_name.id
        where user_property_map.user_id = users.id -- join with users table
        and (property_name.name, property_value.value) in (
            values ('property1', '101'), ('property2', '102') -- filter properties by name and value
        )
    )
    FROM users
) t
order by property_matches desc
Odaniel answered 21/11, 2017 at 15:49 Comment(1)
Thank you. I will verify your solution and update you.Greer
I
3
SELECT * FROM users u
INNER JOIN user_property_map upm ON u.id = upm.user_id
INNER JOIN property_value pv ON upm.property_value_id = pv.id
INNER JOIN property_name pn ON pv.property_name_id = pn.id
WHERE (pn.id = 1 AND pv.id IN (SELECT id FROM property_value WHERE value 
like '101') )
OR ( pn.id = 2 AND pv.id IN (SELECT id FROM property_value WHERE value like 
'102'))

OR (...)
OR (...)

You can't do AND because there is no such a case where id is 1 and 2 for the SAME ROW, you specify the where condition for each row!

If you run a simple test, like

SELECT * FROM users where id=1 and id=2 

you will get 0 results. To achieve that use

 id in (1,2) 

or

 id=1 or id=2

That query can be optimised more but this is a good start I hope.

Invention answered 17/11, 2017 at 13:53 Comment(7)
Agreed, id=1 and id=2 will not work. But using OR returns multiple records for the same user and that's not what i wantGreer
Can you show what is the output you try to get? Because of course you can group by user as well, but not sure how you want to see the rsultsInvention
edited the post to explain what exactly i desire in the query outputGreer
Thank you, can you actually type the result how it should look like? As to me still not clear, do you want a row per user? Or just columns from table users? Sorry but it is not easy sometimes to understand exact need :)Invention
Apologies for not being able to explain it better. I would like to get all the records from user table that match the properties used in the WHERE clause. In above example data there is only one user that has the 2 properties used in the WHERE clause. So i want query to return that single user record from Users table.Greer
@ivish: What do you mean by match? Do you mean that if table property_value have n properties, return only those users who has all of the n properties?Heteronomy
yeah still confusing, I actually need to see the result expected... anyway, try to do select count(*), user_id from users u INNER JOIN user_property_map upm ON u.id = upm.user_id INNER JOIN property_value pv ON upm.property_value_id = pv.id INNER JOIN property_name pn ON pv.property_name_id = pn.id WHERE (pn.id = 1 AND pv.id IN (SELECT id FROM property_value WHERE value like '101') ) OR ( pn.id = 2 AND pv.id IN (SELECT id FROM property_value WHERE value like '102')) group by user_idInvention
P
2

you are using AND operator between two pn.id=1 and pn.id=2. then how you getting the answer is between that:

(SELECT id FROM property_value WHERE value like '101') and
(SELECT id FROM property_value WHERE value like '102') 

So like above comments , Use or operator.

Update 1:

SELECT * FROM users u
INNER JOIN user_property_map upm ON u.id = upm.user_id
INNER JOIN property_value pv ON upm.property_value_id = pv.id
INNER JOIN property_name pn ON pv.property_name_id = pn.id
WHERE pn.id in (1,2) AND pv.id IN (SELECT id FROM property_value WHERE value like '101' or value like '102');
Piane answered 17/11, 2017 at 13:40 Comment(4)
using OR doesn't produce what i desireGreer
@Greer Try my updated query . if doesn't satisfy please add your desired output in your question :)Piane
Tried it. Like OR it also returns multiple (2) records. I have edited the post to include desired outputGreer
@Greer if you add desired output table, it will be more helpfulPiane
N
2

If you just want the distinct columns in U, it is:

SELECT DISTINCT u.* 
  FROM Users u INNER JOIN USER_PROPERTY_MAP upm ON u.id = upm.[user_id]
                INNER JOIN PROPERTY_VALUE pv ON upm.property_value_id = pv.id
                INNER JOIN PROPERTY_NAME pn ON pv.property_name_id = pn.id

  WHERE (pn.id = 1 AND pv.[value] = '101')
     OR (pn.id = 2 AND pv.[value] = '102')

Notice I used pv.[value] = instead of the subquery to reacquire id... this is simplification.

Noonan answered 17/11, 2017 at 14:39 Comment(1)
Thanks. I will try it and get back.Greer
X
2

If I understand your question correctly I would do it like this.

SELECT u.id, u.user_name, u.city FROM users u 
WHERE (SELECT count(*) FROM property_value v, user_property_map m 
WHERE m.user_id = u.id AND m.property_value_id = v.id AND v.value IN ('101', '102')) = 2

This should return a list of users that have all the properties listed in the IN clause. The 2 represents the number of properties searched for.

Xyloid answered 27/11, 2017 at 14:56 Comment(0)
G
0

Assuming you want to select all the fields in the USERS table

SELECT u.* 
FROM USERS u
INNER JOIN 
(
    SELECT USERS.id as user_id, COUNT(*) as matching_property_count
    FROM USERS
    INNER JOIN (
        SELECT m.user_id, n.name as property_name, v.value
        FROM PROPERTY_NAME n
        INNER JOIN PROPERTY_VALUE v ON n.id = v.property_name_id
        INNER JOIN USER_PROPERTY_MAP m ON m.property_value_id = v.property_value_id
        WHERE  (n.id = @property_id_1 AND v.value = @property_value_1) -- Property Condition 1
            OR (n.id = @property_id_2 AND v.value = @property_value_2) -- Property Condition 2
            OR (n.id = @property_id_3 AND v.value = @property_value_3) -- Property Condition 3
            OR (n.id = @property_id_N AND v.value = @property_value_N) -- Property Condition N
    ) USER_PROPERTIES ON USER_PROPERTIES.user_id = USERS.id
    GROUP BY USERS.id
    HAVING COUNT(*) = N     --N = the number of Property Condition in the WHERE clause
    -- Note : 
    -- Use HAVING COUNT(*) = N if property matches will be "MUST MATCH ALL"
    -- Use HAVING COUNT(*) > 0 if property matches will be "MUST MATCH AT LEAST ONE"
) USER_MATCHING_PROPERTY_COUNT ON u.id = USER_MATCHING_PROPERTY_COUNT.user_id
Governess answered 28/11, 2017 at 3:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.