GROUP BY for continuous rows in SQL
Asked Answered
S

5

5

Given the following table:

ID   State  Date
12   1      2009-07-16 10:00
45   2      2009-07-16 13:00
67   2      2009-07-16 14:40
77   1      2009-07-16 15:00
89   1      2009-07-16 15:30
99   1      2009-07-16 16:00

Question:
How can i GROUP by the field "State", while still maintaining the borders between the state changes?

SELECT MIN(ID) AS ID, State, MIN(Date) AS Date, COUNT(ID) AS Count
FROM table GROUP BY State

results in the following:

ID   State  Date              Count
12   1      2009-07-16 10:00  4
45   2      2009-07-16 13:00  2

but this is the desired output:

ID   State  Date              Count
12   1      2009-07-16 10:00  1
45   2      2009-07-16 13:00  2
77   1      2009-07-16 15:00  3

Is this possible in SQL? I didn't find a solution so far...
Spheroidal answered 16/7, 2009 at 10:1 Comment(0)
C
10
SELECT  MIN(id) AS id, MIN(ts) AS ts, MIN(state) AS state, COUNT(*) cnt
FROM    (
        SELECT  @r := @r + (@state != state) AS gn,
                @state := state AS sn,
                s.*
        FROM    (
                SELECT  @r := 0,
                        @state := 0
                ) vars,
                t_state s
        ORDER BY
                ts
        ) q
GROUP BY
        gn

Table creation scripts for testing:

CREATE TABLE t_state (id INT NOT NULL PRIMARY KEY, state INT NOT NULL, ts DATETIME NOT NULL);

INSERT
INTO  t_state
VALUES
(12,   1,      '2009-07-16 10:00'),
(45,   2,      '2009-07-16 13:00'),
(67,   2,      '2009-07-16 14:40'),
(77,   1,      '2009-07-16 15:00'),
(89,   1,      '2009-07-16 15:30'),
(99,   1,      '2009-07-16 16:00');
Compilation answered 16/7, 2009 at 10:19 Comment(0)
P
3

This is how to do it with CTEs on MSSQL server

-- DROP TABLE MyLog
CREATE TABLE MyLog(
        ID          INT PRIMARY KEY
        , State     INT
        , Date      DATETIME
        )
INSERT MyLog
SELECT 12, 1, '2009-07-16 10:00' UNION ALL
SELECT 45, 2, '2009-07-16 13:00' UNION ALL
SELECT 67, 2, '2009-07-16 14:40' UNION ALL
SELECT 77, 1, '2009-07-16 15:00' UNION ALL
SELECT 89, 1, '2009-07-16 15:30' UNION ALL
SELECT 99, 1, '2009-07-16 16:00'

;WITH   CTE
AS      (
        SELECT  ROW_NUMBER() OVER(ORDER BY ID) AS RowNo
                , *
        FROM    MyLog
        )
, MyLogGroup
AS      (
        SELECT  l.*
                , ( SELECT  MAX(ID)
                    FROM    CTE c
                    WHERE   NOT EXISTS (SELECT * FROM CTE
                                        WHERE RowNo = c.RowNo-1 AND State = c.State)
                            AND c.ID <= l.ID) AS GroupID
        FROM    MyLog l
        )
SELECT  *
FROM    MyLogGroup
Polypropylene answered 16/7, 2009 at 10:48 Comment(1)
as this is mysql related and i do not have any MS SQL server, i can not test your code. anyways .. thank you for the effort and your contribution, it will help other people with similar problems.Spheroidal
E
1

Here is a lengthier description of how solutions like the one offered by Quassnoi work

Ectophyte answered 18/1, 2012 at 22:16 Comment(0)
E
0

I might be stating the obvious here, but if you're willing to make use of Transact-SQL, you can iterate through the rows of the table and build your own result set, which probably seems like a hassle, but it will definitely work. The iteration can be done without the use of cursors.

Extrinsic answered 16/7, 2009 at 10:21 Comment(0)
F
0

I created a solution for BigQuery:

WITH offset_state_table AS (
  SELECT
    id,
    date,
    state,
    LEAD(state)
      OVER(PARTITION BY id ORDER BY date ASC) AS offset_state,
  FROM
    `my_project.my_dataset.my_table`
),
grouped_table AS (
  SELECT
    id,
    date,
    state,
  FROM
    offset_state_table
  WHERE
    offset_state != state
  OR
    offset_state IS NULL
)
SELECT
  id,
  state,
  date AS start_date,
  LEAD(date)
    OVER(PARTITION BY id ORDER BY date ASC) AS end_date
FROM
  grouped_table
Flofloat answered 16/8, 2023 at 12:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.