What can an aggregate function do in the ORDER BY clause?
Asked Answered
M

4

15

Lets say I have a plant table:

id fruit
1  banana
2  apple
3  orange

I can do these

SELECT * FROM plant ORDER BY id;
SELECT * FROM plant ORDER BY fruit DESC;

which does the obvious thing.

But I was bitten by this, what does this do?

SELECT * FROM plant ORDER BY SUM(id);
SELECT * FROM plant ORDER BY COUNT(fruit);
SELECT * FROM plant ORDER BY COUNT(*);
SELECT * FROM plant ORDER BY SUM(1) DESC;

All these return just the first row (which is with id = 1).

  1. What's happening underhood?
  2. What are the scenarios where aggregate function will come in handy in ORDER BY?
Meaghanmeagher answered 27/10, 2012 at 10:23 Comment(0)
C
24

Your results are more clear if you actually select the aggregate values instead of columns from the table:

SELECT SUM(id) FROM plant ORDER BY SUM(id)

This will return the sum of all id's. This is of course a useless example because the aggregation will always create only one row, hence no need for ordering. The reason you get a row qith columns in your query is because MySQL picks one row, not at random but not deterministic either. It just so happens that it is the first column in the table in your case, but others may get another row depending on storage engine, primary keys and so on. Aggregation only in the ORDER BY clause is thus not very useful.

What you usually want to do is grouping by a certain field and then order the result set in some way:

SELECT fruit, COUNT(*)
FROM plant
GROUP BY fruit
ORDER BY COUNT(*)

Now that's a more interesting query! This will give you one row for each fruit together with the total count for that fruit. Try adding some more apples and the ordering will actually start making sense:

Complete table:

+----+--------+
| id | fruit  |
+----+--------+
|  1 | banana |
|  2 | apple  |
|  3 | orange |
|  4 | apple  |
|  5 | apple  |
|  6 | banana |
+----+--------+

The query above:

+--------+----------+
| fruit  | COUNT(*) |
+--------+----------+
| orange |        1 |
| banana |        2 |
| apple  |        3 |
+--------+----------+
Coronation answered 27/10, 2012 at 10:28 Comment(2)
Will this count all the rows twice?Cogswell
@RobertRocha In some cases. I don't think it would in this example. You can learn more by reading about "query plans" and "EXPLAIN".Determinate
A
5

All these queries will all give you a syntax error on any SQL platform that complies with SQL standards.

SELECT * FROM plant ORDER BY SUM(id);
SELECT * FROM plant ORDER BY COUNT(fruit);
SELECT * FROM plant ORDER BY COUNT(*);
SELECT * FROM plant ORDER BY SUM(1) DESC;

On PostgreSQL, for example, all those queries will raise the same error.

ERROR: column "plant.id" must appear in the GROUP BY clause or be used in an aggregate function

That means you're using a domain aggregate function without using GROUP BY. SQL Server and Oracle return similar error messages.

MySQL's GROUP BY is known to be broken in several respects, at least as far as standard behavior is concerned. But the queries you posted were a new broken behavior to me, so +1 for that.

Instead of trying to understand what it's doing under the hood, you're probably better off learning to write standard GROUP BY queries. MySQL will process standard GROUP BY statements correctly, as far as I know.

Earlier versions of MySQL docs warned you about GROUP BY and hidden columns. (I don't have a reference, but this text is cited all over the place.)

Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.

More recent versions are a little different.

You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

Personally, I don't consider indeterminate a feature in SQL.

Autumnautumnal answered 27/10, 2012 at 10:40 Comment(0)
N
2
  1. When you use an aggregate like that, the query gets an implicit group by where the entire result is a single group.

  2. Using an aggregate in order by is only useful if you also have a group by, so that you can have more than one row in the result.

Nightshirt answered 27/10, 2012 at 10:34 Comment(0)
B
0

You can also use the aggregate function in order by clause if you want a specific order of columns while performing pivoting of a table using a dynamic query. If the pivoting column and ordering column are different, you can use the Max(OrderingCol) in the order by clause to order it accordingly.

Bourn answered 9/1, 2024 at 4:33 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Ramires

© 2022 - 2025 — McMap. All rights reserved.