Important: Please consider upgrading to MySQL 8+ and using the defined and documented ROW_NUMBER() function, and ditch the old hacks tied to an ancient, feature-limited version of MySQL.
Now here's one of those hacks:
The answers here that use in-query variables mostly (perhaps all) seem to ignore the fact that the documentation says (paraphrased):
Don't rely on items in the SELECT list being evaluated in order from top to bottom. Don't assign variables in one SELECT item and use them in another one.
As such, there's a risk they will churn out the wrong answer, because they typically do this:
    select
      (row number variable that uses partition variable),
      (assign partition variable)
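If these are ever evaluated bottom up, the row number will stop working: the partition variable already holds the current row's value by the time it is compared, so the numbering never restarts (no partitions).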
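For illustration, the fragile pattern described above usually looks something like the following. This is a sketch rather than a quote from any particular answer; the alias `prevcol_assigned` is made up here, and `t` / `col` are the same placeholder names used in the rest of this answer.

```sql
-- Fragile: only correct if the SELECT items happen to be evaluated top to bottom
SELECT
  t.*,
  @r := IF(col = @prevcol, @r + 1, 1) AS rn,   -- reads @prevcol
  @prevcol := col AS prevcol_assigned          -- writes @prevcol in a separate item
FROM
  t,
  (SELECT @r := 0, @prevcol := NULL) x
ORDER BY col;
```

The read and the write of @prevcol live in two different SELECT items, which is exactly what the documentation says not to rely on.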
So we need to use something with a guaranteed order of execution. Enter CASE WHEN:
    SELECT
      t.*,
      @r := CASE
        WHEN col = @prevcol THEN @r + 1
        WHEN (@prevcol := col) = null THEN null
        ELSE 1 END AS rn
    FROM
      t,
      (SELECT @r := 0, @prevcol := null) x
    ORDER BY col
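If you want to try the query above, a throwaway table along these lines is enough. The `id` column and the sample values are assumptions made purely for this sketch; only the names `t` and `col` come from the query.

```sql
-- Hypothetical sample data for experimenting with the query above
CREATE TABLE t (
  id  INT PRIMARY KEY,
  col VARCHAR(10)
);

INSERT INTO t (id, col) VALUES
  (1, 'a'), (2, 'a'),
  (3, 'b'), (4, 'b'), (5, 'b');
```

The aim is for rn to count up within the 'a' rows and then restart at 1 for the 'b' rows.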
As outlined, the order of assignment of prevcol is important - prevcol has to be compared to the current row's value before we assign it a value from the current row (otherwise it would be the current row's col value, not the previous row's col value).
Here's how this fits together:
The first WHEN is evaluated. If this row's col is the same as the previous row's col then @r is incremented and returned from the CASE. This returned value is stored in @r. It's a feature of MySQL that an assignment also returns the newly assigned value, so the value put into @r appears in the result rows as rn.
For the first row of the result set, @prevcol is null (it is initialised to null in the subquery), so this predicate is not true. This first predicate also fails every time col changes (current row is different to previous row). This causes the second WHEN to be evaluated.
The second WHEN predicate is never true, and it exists purely to assign a new value to @prevcol. Because this row's col is different to the previous row's col (we know this because if it were the same, the first WHEN would have been used), we have to assign the new value to keep it for testing next time. Because the assignment is made and then the result of the assignment is compared with null, and anything compared with null using = yields null rather than true, this predicate never matches. But evaluating it did its job of keeping the value of col from this row, so it can be compared against the next row's col value.
Because the second WHEN never matches, in situations where the column we are partitioning by (col) has changed, it is the ELSE that gives a new value for @r, restarting the numbering from 1.
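The behaviour of the second WHEN can be checked in isolation. A minimal sketch, using a literal 'a' in place of a real column value (an assumption for the demo):

```sql
SET @prevcol := NULL;

-- The predicate from the second WHEN: assign first, then compare the result with NULL.
-- A comparison with NULL using = yields NULL, so this can never be true.
SELECT (@prevcol := 'a') = NULL AS second_when_predicate;

-- Run as a separate statement to see whether the assignment took effect
SELECT @prevcol AS kept_for_next_row;
```

Reading the variable in its own statement, rather than in another item of the same SELECT, avoids the evaluation-order problem described at the top.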
We thus get to a situation where this:
    SELECT
      t.*,
      ROW_NUMBER() OVER(PARTITION BY pcol1, pcol2, ... pcolX ORDER BY ocol1, ocol2, ... ocolX) rn
    FROM
      t
Has the general form:
    SELECT
      t.*,
      @r := CASE
        WHEN pcol1 = @pcol1 AND pcol2 = @pcol2 AND ... AND pcolX = @pcolX THEN @r + 1
        WHEN (@pcol1 := pcol1) = null OR (@pcol2 := pcol2) = null OR ... OR (@pcolX := pcolX) = null THEN null
        ELSE 1
      END AS rn
    FROM
      t,
      (SELECT @r := 0, @pcol1 := null, @pcol2 := null, ..., @pcolX := null) x
    ORDER BY pcol1, pcol2, ..., pcolX, ocol1, ocol2, ..., ocolX
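As a concrete instance of the general form, here is a sketch with two partition columns and one ordering column. The table `orders` and its columns `customer_id`, `region` and `order_date`, as well as the variable names `@pcust` and `@preg`, are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: orders(customer_id INT, region VARCHAR(20), order_date DATE)
-- Intended to emulate:
--   ROW_NUMBER() OVER (PARTITION BY customer_id, region ORDER BY order_date)
SELECT
  o.*,
  @r := CASE
    -- same partition as the previous row: keep counting
    WHEN o.customer_id = @pcust AND o.region = @preg THEN @r + 1
    -- never true; exists only to remember this row's partition values
    WHEN (@pcust := o.customer_id) = NULL OR (@preg := o.region) = NULL THEN NULL
    -- partition changed: restart the numbering
    ELSE 1
  END AS rn
FROM
  orders o,
  (SELECT @r := 0, @pcust := NULL, @preg := NULL) x
ORDER BY o.customer_id, o.region, o.order_date;
```

Each comparison in the second WHEN yields NULL, so the OR chain cannot stop early on a true value and every assignment in it gets a chance to run.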
Footnotes:
The p in pcol means "partition", the o in ocol means "order" - in the general form I dropped the "prev" from the variable names to reduce visual clutter.
The brackets around the assignment in (@pcolX := pcolX) = null are important. Without them, := binds more loosely than =, so you'll assign null (the result of the comparison) to @pcolX and things stop working.
It's a compromise that the result set has to be ordered by the partition columns too, for the previous-column compare to work out. You thus can't have your row number ordered according to one column but your result set ordered according to another. You might be able to resolve this with subqueries, but I believe the docs also state that subquery ordering may be ignored unless LIMIT is used, and this could impact performance.
I haven't delved into it beyond testing that the method works, but if there is a risk that the predicates in the second WHEN will be optimised away (anything compared to null is null/false, so why bother running the assignment) and not executed, the method also stops working. This doesn't seem to happen in my experience, but I'll gladly accept comments and propose a solution if it could reasonably occur.
It may be wise to cast the nulls that create @pcolX to the actual types of your columns, in the subquery that creates the @pcolX variables, viz: select @pcol1 := CAST(null AS SIGNED), @pcol2 := CAST(null AS DATE) (MySQL's CAST takes SIGNED or UNSIGNED for integer types rather than INT).
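The footnote about brackets can be seen directly with a small experiment. This is a sketch; the variable name `@p` and the literal 'a' are placeholders for the demo.

```sql
-- Without brackets, := has the lowest precedence, so this is parsed as
--   @p := ('a' = NULL)
-- and the NULL result of the comparison is what gets assigned.
SELECT @p := 'a' = NULL AS without_brackets;
SELECT @p;   -- NULL: the value 'a' was never stored

-- With brackets, the assignment runs first and only its result is compared.
SELECT (@p := 'a') = NULL AS with_brackets;
SELECT @p;   -- run separately to inspect what was stored
```

In the row-numbering query, the unbracketed form would leave every @pcolX as null, so the first WHEN could never match and the numbering would never get past 1.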