Understanding the GROUP BY statement's behaviour
Asked Answered
M

3

7

The question is this..

Table is this..

+--------------------------+---------+------+-----+---------+----------------+
| Field                    | Type    | Null | Key | Default | Extra          |
+--------------------------+---------+------+-----+---------+----------------+
| facility_map_id          | int(10) | NO   | PRI | NULL    | auto_increment |
| facility_map_facility_id | int(10) | NO   | MUL | NULL    |                |
| facility_map_listing_id  | int(10) | NO   |     | NULL    |                |
+--------------------------+---------+------+-----+---------+----------------+

Data is this..

+-----------------+--------------------------+-------------------------+
| facility_map_id | facility_map_facility_id | facility_map_listing_id |
+-----------------+--------------------------+-------------------------+
|             248 |                        1 |                      18 |
|             259 |                        1 |                      19 |
|             206 |                        1 |                      20 |
|             244 |                        1 |                      21 |
|             249 |                        2 |                      18 |
|             207 |                        2 |                      20 |
|             208 |                        3 |                      20 |
|             245 |                        3 |                      21 |
|             260 |                        4 |                      19 |
|             261 |                        5 |                      19 |
|             246 |                        6 |                      21 |
|             250 |                        7 |                      18 |
|             247 |                        8 |                      21 |
+-----------------+--------------------------+-------------------------+

I run the this query :

SELECT facility_map_listing_id 
FROM facility_map 
WHERE facility_map_facility_id IN(1, 2) 
GROUP BY facility_map_listing_id 
HAVING count(DISTINCT facility_map_facility_id) >= 2 

and get this..

+-------------------------+
| facility_map_listing_id |
+-------------------------+
|                      18 |
|                      20 |
+-------------------------+
2 rows in set (0.00 sec)

Which is correct! - but can anyone explain, why the GROUP BY needs to be in the statement?

if it Isnt and I run the same query leaving out the GROUP BY I get..

+-------------------------+
| facility_map_listing_id |
+-------------------------+
|                      18 |
+-------------------------+
1 row in set (0.00 sec)

Can any one explain this to me? Thank you!

Marlea answered 27/5, 2012 at 15:42 Comment(0)
B
6

Without a group by, an aggregate like count works on the set as a whole. So this query returns either zero or one row:

SELECT facility_map_listing_id 
FROM facility_map 
WHERE facility_map_facility_id IN(1, 2)
HAVING count(DISTINCT facility_map_facility_id) >= 2 

It will return one row if the having condition is met, and an empty set otherwise.

Now, with the group by, it evaluates the having condition for each value of facility_map_listing_id. That can return up to as many rows as there are distinct values of facility_map_listing_id.

Biannual answered 27/5, 2012 at 15:48 Comment(1)
Brill - explained things perfectly. Thanks for your help!Marlea
S
6

I think this should explain things:

If you omit group by, all the rows not excluded by the where clause return as a single group.

So, basically, you are still using a group by...just by the entire set.

Spring answered 27/5, 2012 at 15:48 Comment(1)
Thank you for the resource! I understand things much better now. I upped your answer!Marlea
O
1

we can use HAVING clause without a GROUP BY clause but SELECT columns and HAVING columns should be used with aggregate functions:

  • AVG() - Returns the average value
  • COUNT() - Returns the number of rows
  • FIRST() - Returns the first value
  • LAST() - Returns the last value
  • MAX() - Returns the largest value
  • MIN() - Returns the smallest value
  • SUM() - Returns the sum

HAVING is generally used with the GROUP BY clause, and mainly used to manipulate and filter the data by using aggregate functions. When GROUP BY is not used, HAVING behaves like a WHERE clause.

HAVING sets conditions for the group by clause, similar to the way in which where sets conditions for the select clause.

See This and This one

Optimal answered 27/5, 2012 at 15:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.