I wandered in here with essentially the same question as whytheq and found David’s solution, but then had to review my old self-tutorial notes regarding DENSE_RANK because I use it so rarely: why DENSE_RANK instead of RANK or ROW_NUMBER, and how does it actually work? In the process, I updated that tutorial to include my version of David’s solution for this particular problem, and then thought it might be helpful for SQL newbies (or others like me who forget stuff).
The whole tutorial text can be copy/pasted into a query editor and then each example query can be (separately) uncommented and run, to see their respective results. (By default, the solution to this problem is uncommented at the bottom.) Or, each example can be copied separately into their own query-edit instance but the TBLx CTE must be included with each.
--WITH /* DB2 version */
--TBLx (Col_A, Col_B) AS (VALUES
-- ( 7, 7 ),
-- ( 7, 7 ),
-- ( 7, 7 ),
-- ( 7, 8 ))
WITH /* SQL-Server version */
TBLx (Col_A, Col_B) AS
(SELECT 7, 7 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 7, 8)
/*** Example-A: demonstrates the difference between ROW_NUMBER, RANK and DENSE_RANK ***/
--SELECT Col_A, Col_B,
-- ROW_NUMBER() OVER(PARTITION BY Col_A ORDER BY Col_B) AS ROW_NUMBER_,
-- RANK() OVER(PARTITION BY Col_A ORDER BY Col_B) AS RANK_,
-- DENSE_RANK() OVER(PARTITION BY Col_A ORDER BY Col_B) AS DENSE_RANK_
--FROM TBLx
/* RESULTS:
Col_A Col_B ROW_NUMBER_ RANK_ DENSE_RANK_
7 7 1 1 1
7 7 2 1 1
7 7 3 1 1
7 8 4 4 2
ROW_NUMBER: Just increments for the three identical rows and increments again for the final unique row.
That is, it’s an order-value (based on "sort" order) but makes no other distinction.
RANK: Assigns the same rank value to the three identical rows, then jumps to 4 for the fourth row,
which is *unique* with regard to the others.
That is, each identical row is ranked by the rank-order of the first row-instance of that
(identical) value-set.
DENSE_RANK: Also assigns the same rank value to the three identical rows but the fourth *unique* row is
assigned a value of 2.
That is, DENSE_RANK identifies that there are (only) two *unique* row-types in the row set.
*/
/*** Example-B: to get only the distinct resulting "count-of-each-row-type" rows ***/
-- SELECT DISTINCT -- For unique returned "count-of-each-row-type" rows, the DISTINCT operator is necessary because
-- -- the calculated DENSE_RANK value is appended to *all* rows in the data set. Without DISTINCT,
-- -- its value for each original-data row-type would just be replicated for each of those rows.
--
-- Col_A, Col_B,
-- DENSE_RANK() OVER(PARTITION BY Col_A ORDER BY Col_B) AS DISTINCT_ROWTYPE_COUNT_
-- FROM TBLx
/* RESULTS:
Col_A Col_B DISTINCT_ROWTYPE_COUNT_
7 7 1
7 8 2
*/
/*** Example-C.1: demonstrates the derivation of the "count-of-all-row-types" (finalized in Example-C.2, below) ***/
-- SELECT
-- Col_A, Col_B,
--
-- DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B DESC) AS ROW_TYPES_COUNT_DESC_,
-- DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B ASC) AS ROW_TYPES_COUNT_ASC_,
--
-- -- Adding the above cases together and subtracting one gives the same total count for on each resulting row:
--
-- DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B DESC)
-- +
-- DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B ASC)
-- - 1 /* (Because DENSE_RANK values are one-based) */
-- AS ROW_TYPES_COUNT_
-- FROM TBLx
/* RESULTS:
COL_A COL_B ROW_TYPES_COUNT_DESC_ ROW_TYPES_COUNT_ASC_ ROW_TYPES_COUNT_
7 7 2 1 2
7 7 2 1 2
7 7 2 1 2
7 8 1 2 2
*/
/*** Example-C.2: uses the above technique to get a *single* resulting "count-of-all-row-types" row ***/
SELECT DISTINCT -- For a single returned "count-of-all-row-types" row, the DISTINCT operator is necessary because the
-- calculated DENSE_RANK value is appended to *all* rows in the data set. Without DISTINCT, that
-- value would just be replicated for each original-data row.
-- Col_A, Col_B, -- In order to get a *single* returned "count-of-all-row-types" row (and field), all other fields
-- must be excluded because their respective differing row-values will defeat the purpose of the
-- DISTINCT operator, above.
DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B DESC)
+
DENSE_RANK() OVER ( PARTITION BY Col_A ORDER BY Col_B ASC)
- 1 /* (Because DENSE_RANK values are one-based) */
AS ROW_TYPES_COUNT_
FROM TBLx
/* RESULTS:
ROW_TYPES_COUNT_
2
*/
COUNT
withORDER BY
instead ofPARTITION BY
is ill-defined in 2008. I'm surprised it's letting you have it at all. Per the documentation, you're not allowed anORDER BY
for an aggregate function. – Apostolic