SQL: group by only rows which are in sequence

Say I have the following table:

MyTable
---------
| 1 | A |
| 2 | A |
| 3 | A |
| 4 | B |
| 5 | B |
| 6 | B |
| 7 | A |
| 8 | A |
---------

I need a SQL query that outputs the following:

---------
| 3 | A |
| 3 | B |
| 2 | A |
---------

Basically I'm doing a GROUP BY, but only over rows that are adjacent in the sequence. Any ideas?

Note that the database is SQL Server 2008. There is a post on this topic, but it uses Oracle's LAG() function, which SQL Server 2008 does not support.

Bevins answered 1/12, 2010 at 13:2 Comment(1)
Hi, where is the post that uses Oracle's lag function? - Racket
L
30

This is known as the "islands" problem. Using Itzik Ben-Gan's approach:

;WITH YourTable AS
(
SELECT 1 AS N, 'A' AS C UNION ALL
SELECT 2 AS N, 'A' AS C UNION ALL
SELECT 3 AS N, 'A' AS C UNION ALL
SELECT 4 AS N, 'B' AS C UNION ALL
SELECT 5 AS N, 'B' AS C UNION ALL
SELECT 6 AS N, 'B' AS C UNION ALL
SELECT 7 AS N, 'A' AS C UNION ALL
SELECT 8 AS N, 'A' AS C
),
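     T
     AS (SELECT N,
                C,
                -- Rank over all rows minus rank within each C value:
                -- the difference stays constant across a run of consecutive rows with the same C.
                DENSE_RANK() OVER (ORDER BY N) - 
                DENSE_RANK() OVER (PARTITION BY C ORDER BY N) AS Grp
         FROM   YourTable)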
SELECT COUNT(*),
       C
FROM   T
GROUP  BY C,
          Grp 
ORDER BY MIN(N)
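
To see why grouping on Grp works, it can help to look at the intermediate values. The query below is a sketch against the same sample rows; the column aliases RankAll and RankInC are illustrative names, not part of the answer above, and the VALUES row constructor assumes SQL Server 2008 or later.

```sql
-- Show both ranks and their difference for every sample row.
SELECT N,
       C,
       DENSE_RANK() OVER (ORDER BY N)                AS RankAll,
       DENSE_RANK() OVER (PARTITION BY C ORDER BY N) AS RankInC,
       DENSE_RANK() OVER (ORDER BY N) -
       DENSE_RANK() OVER (PARTITION BY C ORDER BY N) AS Grp
FROM   (VALUES (1, 'A'), (2, 'A'), (3, 'A'),
               (4, 'B'), (5, 'B'), (6, 'B'),
               (7, 'A'), (8, 'A')) AS YourTable (N, C)
ORDER  BY N;

-- N  C  RankAll  RankInC  Grp
-- 1  A  1        1        0
-- 2  A  2        2        0
-- 3  A  3        3        0
-- 4  B  4        1        3
-- 5  B  5        2        3
-- 6  B  6        3        3
-- 7  A  7        4        3
-- 8  A  8        5        3
```

Within a run both ranks advance by one per row, so their difference does not change. The B run and the second A run both end up with Grp = 3, which is why the outer query groups by C and Grp together rather than by Grp alone.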
Larch answered 1/12, 2010 at 13:8 Comment(1)
Fantastic solution! That's going in the toolbox. - Jinja
A
0

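This will work for you, assuming the columns of MyTable are named N and C:

SELECT 
  Total=COUNT(*), C 
FROM 
(
 SELECT 
 -- The difference of the two row numbers is constant within each run of equal C values
 NGroup = ROW_NUMBER() OVER (ORDER BY N) - ROW_NUMBER() OVER (PARTITION BY C ORDER BY N),
 N,
 C
 FROM MyTable 
)RegroupedTable
GROUP BY C,NGroup
ORDER BY MIN(N) -- without this, the order of the runs in the output is not guaranteed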
Aria answered 1/12, 2010 at 13:42 Comment(0)
A
0

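Just for fun, without any SQL-specific functions and NOT assuming that the ID column is monotonically increasing. It works for the sample data but is not general: it breaks as soon as a name has more than one run that is followed by a different name (for example A, B, A, B).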

WITH starters(name, minid, maxid) AS (
    SELECT
        a.name, MIN(a.id), MAX(a.id)
    FROM
        mytable a RIGHT JOIN
        mytable b ON
            (a.name <> b.name AND a.id < b.id) 
    WHERE 
        a.id IS NOT NULL
    GROUP BY 
        a.name
),
both(name, minid, maxid) AS (
    SELECT
        name, minid, maxid
    FROM
        starters
    UNION ALL
    SELECT
        name, MIN(id), MAX(id)
    FROM
        mytable
    WHERE
        id > (SELECT MAX(maxid) from starters)
    GROUP BY
        name
)
SELECT
    COUNT(*), m.name, minid
FROM 
    both INNER JOIN 
    mytable m ON
        id BETWEEN minid AND maxid
GROUP BY
    m.name, minid
ORDER BY
    minid

Result (ignore the minid column):

(No column name)    name    minid
3   A   1
3   B   4
2   A   7
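
As a possible alternative in the same spirit (no window functions), here is a minimal sketch that is not the query above but a different technique: each row is labelled with the number of earlier rows that carry a different name. That count stays the same inside a run and grows whenever a differently named row intervenes, so (name, grp) identifies a run. The column names id and name follow this answer; the alias grp and the inline sample data are assumptions for illustration.

```sql
WITH mytable (id, name) AS (
    SELECT 1, 'A' UNION ALL
    SELECT 2, 'A' UNION ALL
    SELECT 3, 'A' UNION ALL
    SELECT 4, 'B' UNION ALL
    SELECT 5, 'B' UNION ALL
    SELECT 6, 'B' UNION ALL
    SELECT 7, 'A' UNION ALL
    SELECT 8, 'A'
)
SELECT COUNT(*)  AS cnt,
       t.name,
       MIN(t.id) AS minid
FROM (
    SELECT m.id,
           m.name,
           -- number of earlier rows whose name differs from this row's name
           (SELECT COUNT(*)
            FROM   mytable x
            WHERE  x.id < m.id
              AND  x.name <> m.name) AS grp
    FROM   mytable m
) t
GROUP BY t.name, t.grp
ORDER BY MIN(t.id);

-- cnt  name  minid
-- 3    A     1
-- 3    B     4
-- 2    A     7
```

The correlated subquery makes this quadratic in the number of rows, so it is a sketch for small tables rather than a replacement for the ranking approach.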
Acetophenetidin answered 1/12, 2010 at 13:44 Comment(0)
