How do I group on continuous ranges
Asked Answered
B

5

31

I know some basic SQL, but this one is beyond me. I have looked high and low but no dice. I need a view of the following data, I can do this in the application layer code. But unfortunately for this particular one, the code must be put in the data layer.

I am using T-SQL.

Table

Date      Crew       DayType
01-02-11  John Doe  SEA  
02-02-11  John Doe  SEA  
03-02-11  John Doe  SEA  
04-02-11  John Doe  HOME  
05-02-11  John Doe  HOME  
06-02-11  John Doe  SEA 

I need a view like this

DateFrom  DateTo    Name      DayType
01-02-11  03-02-11  John Doe  SEA
04-02-11  05-02-11  John Doe  HOME
06-02-11  06-02-11  John Doe  SEA

Unfortunately, the base table is required for the application layer to be in the format shown. Is this possible to do in a query?

Barbellate answered 14/4, 2011 at 11:30 Comment(0)
P
36
WITH    q AS
        (
        SELECT  *,
                ROW_NUMBER() OVER (PARTITION BY crew, dayType ORDER BY [date]) AS rnd,
                ROW_NUMBER() OVER (PARTITION BY crew ORDER BY [date]) AS rn
        FROM    mytable
        )
SELECT  MIN([date]), MAX([date]), crew AS name, dayType
FROM    q
GROUP BY
        crew, dayType, rnd - rn

This article may be of interest to you:

Placer answered 14/4, 2011 at 11:36 Comment(9)
+1, good as always. Did you examine if the ROW_NUMBER approach is faster compared to recursive one on larger sets?Verjuice
+1 out of interest, if there were multiple Daytypes for a crew name for a given date (i.e. add 02-02-11,John Doe,Home into sample DDL), would it be best to use DENSE_RANK for rn or is there a better way instead?Dementia
But (a) the data already exists in Relational form, (b) the view required is Relational (c) SQL handles Relational data. Why convert it to a file (complete with a Record ID), and then process it as a file, at massive resource cost ? The method for solving this Relationally, without 1970-style Record IDs or CTEs, is given in this AnswerHypoploid
The Derived Table with ROW_NUMBER(), that drives the WITH. Check your SHOWPLAN.Hypoploid
@PerformanceDBA: sorry, I don't get it. What does the derived table have to do with files? What am I supposed to find in the plan?Placer
(0) The data is Relational, it does not have a Record ID, it has a good PK (1) When you create a Derived Table, with a Record ID, you are converting the Relational data, to the format of a pre-1970 ISAM file, which is non-relational. (2) Then you process the file (Derived Table) using processing methods, which employs a pre-relational Record Filing System mindset. (3) Both steps have a cost, which will be visible in the plan & stats. (4) Whereas, you could simply SELECT from the Relational data, using Relational methods, which would avoid those costs. But that requires a Relational mindset.Hypoploid
@PerformanceDBA: sorry, it seems you are unaware of SQL Server's architecture. SQL Server does not use any ISAM files.Placer
Let us continue this discussion in chat.Hypoploid
Nowhere in my comments did I suggest that SQL Server uses ISAM Files. You will be aware of course that you can create an ISAM File structure using in SQL Server, via CREATE TABLE or by using a Derived Table (as you have). Everywhere, I stated that your solution converts the Relational data to a ISAM File format (inside SQL Server), and then processes it using File processing methods. Instead of Relational methods over the source data, without the ISAM File steps in-between.Hypoploid
T
15
WITH grouped AS (
  SELECT
    *,
    grp = DATEDIFF(day, 0, Date) -
          ROW_NUMBER() OVER (PARTITION BY Crew, DayType ORDER BY Date)
  FROM @testtable
)
SELECT
  DateFrom = MIN(Date),
  DateTo = MAX(Date),
  Name = Crew,
  DayType
FROM grouped
GROUP BY Crew, DayType, grp;

Basically, same as Quassnoi's solution, but using one ROW_NUMBER fewer yields a better execution plan.

Tisbe answered 14/4, 2011 at 13:46 Comment(2)
Use of DATEDIFF(day, 0, Date) ensures that dates are continuous when grouping. IMO this is a better answerWicopy
Be careful because this version requires that the dates be continuous within a group - no missing days. If there's a weekend/holiday with no rows, DateDiff moves more than row_number and you get a new group, right?Capps
D
0
SELECT MIN(Date) AS DateFrom,MAX(Date) AS DateTo, Crew, DayType FROM yourTableName GROUP BY Crew, DayType
Dominate answered 14/4, 2011 at 11:35 Comment(1)
Doesn't this answer just gives the min & max dates for each Crew/DayType combo but not split into continuous date ranges.Veritable
R
-1

Something Like:

SELECT Crew, DayType, MIN(Date) AS SomeDate1, MAX(Date) AS SomeDate2 
FROM Mytable 
GROUP BY Crew, DayType
Ruskin answered 14/4, 2011 at 11:34 Comment(0)
P
-2
Select Min(Date) DateFrom, Max(Date) DateTo, Crew Name,DayType From Mytable Group By Crew,DayType 

try this.

Putty answered 14/4, 2011 at 11:40 Comment(1)
This returns only two lines. 01-02-11 06-02-11 John Doe SEA and 04-02-11 05-02-11 John Doe HOME.Distilled

© 2022 - 2024 — McMap. All rights reserved.