Detect consecutive dates ranges using SQL
Asked Answered
E

9

39

I want to fill the calendar object which requires start and end date information. I have one column which contains a sequence of dates. Some of the dates are consecutive (have one day difference) and some are not.

InfoDate  

2013-12-04  consecutive date [StartDate]
2013-12-05  consecutive date
2013-12-06  consecutive date [EndDate]

2013-12-09                   [startDate]
2013-12-10                   [EndDate]

2014-01-01                   [startDate]
2014-01-02 
2014-01-03                   [EndDate]

2014-01-06                   [startDate]
2014-01-07                   [EndDate]

2014-01-29                   [startDate]
2014-01-30 
2014-01-31                   [EndDate]

2014-02-03                   [startDate]
2014-02-04                   [EndDate]

I want to pick each consecutive dates range’s start and end date (the first one and the last one in the block).

StartDate     EndDate

2013-12-04    2013-12-06
2013-12-09    2013-12-10
2014-01-01    2014-01-03
2014-01-06    2014-01-07
2014-01-29    2014-01-31
2014-02-03    2014-02-04

I want to solve the problem using SQL only.

Elison answered 5/12, 2013 at 14:5 Comment(1)
What do the empty lines in the second listing mean? Do you really need to solve this in SQL? This seems pretty hard to express in SQL (at least in the standard one), the obvious algorithm is pretty much sequetial and could be easily written in a procedural language. If SQL is really needed, I would use a stored procedure.Septemberseptembrist
U
72

No joins or recursive CTEs needed. The standard gaps-and-island solution is to group by (value minus row_number), since that is invariant within a consecutive sequence. The start and end dates are just the MIN() and MAX() of the group.

WITH t AS (
  SELECT InfoDate d,ROW_NUMBER() OVER(ORDER BY InfoDate) i
  FROM @d
  GROUP BY InfoDate
)
SELECT MIN(d),MAX(d)
FROM t
GROUP BY DATEDIFF(day,i,d)
Unwitnessed answered 5/12, 2013 at 16:51 Comment(6)
I think the group by should be changed to: GROUP BY DATEADD(day,-i,d)Chuckwalla
@Chuckwalla GROUP BY DATEDIFF(day,i,d) and GROUP BY DATEADD(day,-i,d) will generate identical groups.Unwitnessed
Maybe been downvoted as you say "no need to use CTEs" - and then use a CTE! But you could of course substitute the CTE in for t in your final SELECT so you are still correct...Obituary
Hi TommCatt, sorry id doesn't work with INPUT in form StartDate and EndDate.Jahnke
Sorry, new to CTE. I got ERROR: column "day" does not exist. ``` WITH t AS ( SELECT date d,ROW_NUMBER() OVER(ORDER BY date) i FROM <table_name> GROUP BY date ) SELECT MIN(d),MAX(d) FROM t GROUP BY DATEDIFF(DAY,i,d) ```Dialyse
Great solution. Also, if you change ROW_NUMBER() to DENSE_RANK(), then this code will work for the case where there are duplicate dates.Ufo
L
6

Here you go..

;WITH CTEDATES
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY Infodate asc ) AS ROWNUMBER,infodate FROM YourTableName  

),
 CTEDATES1
AS
(
   SELECT ROWNUMBER, infodate, 1 as groupid FROM CTEDATES WHERE ROWNUMBER=1
   UNION ALL
   SELECT a.ROWNUMBER, a.infodate,case datediff(d, b.infodate,a.infodate) when 1 then b.groupid else b.groupid+1 end as gap FROM CTEDATES A INNER JOIN CTEDATES1 B ON A.ROWNUMBER-1 = B.ROWNUMBER
)

select min(mydate) as startdate, max(infodate) as enddate from CTEDATES1 group by groupid
Luisaluise answered 5/12, 2013 at 14:49 Comment(1)
You should be using OVER (ORDER BY Infodate) instead of OVER (ORDER BY (SELECT 1)). Also, change min(mydate) with min(infodate). Other than that, this is a good answerCowpoke
P
2

--MS SQL

with cte as (
select start_date, end_date,
    dateadd(d, -row_number() over (order by start_date), start_date) as GRN
from projects)
select min(start_date), max(end_date) from cte group by grn order by grn;

--Oracle

with cte as(
select start_date, end_date, 
    start_date - row_number() over (order by start_date) as GRN 
    from projects)
select min(start_date), max(end_date) from cte  group by grn order by grn;
Parapodium answered 22/7, 2022 at 7:35 Comment(0)
D
0

I have inserted these values into a table called #consec and then perforemed the following:

select t1.*
,t2.infodate as binfod
into #temp1
from #consec t1
left join #consec t2 on dateadd(DAY,1,t1.infodate)=t2.infodate

select t1.*
,t2.infodate as binfod
into #temp2
from #consec t1
left join #consec t2 on dateadd(DAY,1,t2.infodate)=t1.infodate
;with cte as(
select infodate,  ROW_NUMBER() over(order by infodate asc) as seq from #temp1
where binfod is null
),
cte2 as(
select infodate, ROW_NUMBER() over(order by infodate asc) as seq from #temp2
where binfod is null
)

select t2.infodate as [start_date]
,t1.infodate as [end_date] from cte t1
left join cte2 t2 on t1.seq=t2.seq 

As long as your date periods are not overlapping, that should do the job for you.

Deuced answered 5/12, 2013 at 14:48 Comment(0)
F
0

Here it is my sample with test data:

--required output
-- 01 - 03
-- 08 - 09
-- 12 - 14

DECLARE @maxRN int;
WITH #tmp AS (
                SELECT CAST('2013-01-01' AS date) DT
    UNION ALL   SELECT CAST('2013-01-02' AS date)
    UNION ALL   SELECT CAST('2013-01-03' AS date)
    UNION ALL   SELECT CAST('2013-01-05' AS date)
    UNION ALL   SELECT CAST('2013-01-08' AS date)
    UNION ALL   SELECT CAST('2013-01-09' AS date)
    UNION ALL   SELECT CAST('2013-01-12' AS date)
    UNION ALL   SELECT CAST('2013-01-13' AS date)
    UNION ALL   SELECT CAST('2013-01-14' AS date)
),
#numbered AS (
    SELECT 0 RN, CAST('1900-01-01' AS date) DT
    UNION ALL
    SELECT ROW_NUMBER() OVER (ORDER BY DT) RN, DT
    FROM #tmp
)

SELECT * INTO #tmpTable FROM #numbered;
SELECT @maxRN = MAX(RN) FROM #tmpTable;

INSERT INTO #tmpTable
SELECT @maxRN + 1, CAST('2100-01-01' AS date);

WITH #paired AS (
    SELECT 
    ROW_NUMBER() OVER(ORDER BY TStart.DT) RN, TStart.DT DTS, TEnd.DT DTE
    FROM #tmpTable TStart
    INNER JOIN #tmpTable TEnd 
    ON TStart.RN = TEnd.RN - 1
    AND DATEDIFF(dd,TStart.DT,TEnd.DT) > 1  
)

SELECT TS.DTE, TE.DTs 
FROM #paired TS
INNER JOIN #paired TE ON TS.RN = TE.RN -1
AND TS.DTE <> TE.DTs -- you could remove this filter if you want to have start and end on the same date

DROP TABLE #tmpTable

Replace #tmp data with your actual table.

Foreandafter answered 5/12, 2013 at 14:56 Comment(0)
D
0

You can do like this and here is the sqlfiddle

select
  min(ndate) as start_date,
  max(ndate) as end_date
from
(select
  ndate,
  dateadd(day, -row_number() over (order by ndate), ndate) as rnk
 from dates
 ) t
 group by
   rnk
Demodulator answered 8/3, 2020 at 3:34 Comment(0)
T
0

Another simple solution that could work here is -

with tmp as 
(
select
datefield
, dateadd('day',-row_number() over(order by date asc),datefield) as date_group 
from table
)
select
min(datefield) as start_date
, max(datefield) as end_date 
from tmp
group by date_group
Tiedeman answered 22/11, 2020 at 8:43 Comment(0)
E
0

with cte as( select start_date, end_date, start_date - row_number() over (order by start_date) as GRN from projects)
select min(start_date), max(end_date) from cte
group by grn order by count(grn),min(start_date)

Elroyels answered 12/7, 2023 at 16:22 Comment(1)
Thank you for contributing to the Stack Overflow community. This may be a correct answer, but it’d be really useful to provide additional explanation of your code so developers can understand your reasoning. This is especially useful for new developers who aren’t as familiar with the syntax or struggling to understand the concepts. Would you kindly edit your answer to include additional details for the benefit of the community?Acceptant
P
-1
SELECT InfoDate ,
    CASE
      WHEN TRUNC(InfoDate - 1) = TRUNC(lag(InfoDate,1,InfoDate) over (order by InfoDate))
      THEN NULL
      ELSE InfoDate
    END STARTDATE,
    CASE
      WHEN TRUNC(InfoDate + 1) = TRUNC(lead(InfoDate,1,InfoDate) over (order by InfoDate))
      THEN NULL
      ELSE InfoDate
    END ENDDATE
  FROM TABLE;
Phail answered 22/12, 2019 at 21:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.