MySQL: Find Missing Dates Between a Date Range
Asked Answered
B

6

6

I need some help with a mysql query. I've got db table that has data from Jan 1, 2011 thru April 30, 2011. There should be a record for each date. I need to find out whether any date is missing from the table.

So for example, let's say that Feb 2, 2011 has no data. How do I find that date?

I've got the dates stored in a column called reportdatetime. The dates are stored in the format: 2011-05-10 0:00:00, which is May 5, 2011 12:00:00 am.

Any suggestions?

Balsa answered 1/7, 2011 at 17:19 Comment(4)
So you're saying that a single date wouldn't have a row at all or that it has a row and there's no data in another column? If it doesn't have a row the only way I can think of would be to loop through each date and check it.Yearwood
You mean reportdatetime has the entry of Feb 2, 2011 but other fields associated to that date are not present?Bergamo
Possibly duplicate of #3539358Machinery
And another dupe #5523201Machinery
T
13

This is a second answer, I'll post it separately.

SELECT DATE(r1.reportdate) + INTERVAL 1 DAY AS missing_date
FROM Reports r1
LEFT OUTER JOIN Reports r2 ON DATE(r1.reportdate) = DATE(r2.reportdate) - INTERVAL 1 DAY
WHERE r1.reportdate BETWEEN '2011-01-01' AND '2011-04-30' AND r2.reportdate IS NULL;

This is a self-join that reports a date such that no row exists with the date following.

This will find the first day in a gap, but if there are runs of multiple days missing it won't report all the dates in the gap.

Trautman answered 1/7, 2011 at 18:32 Comment(5)
Modified it slightly to account for duplicate date value that caused the self join to go crazy: SELECT DISTINCT DATE(r1.date) + INTERVAL 1 DAY AS missing_date FROM mytable r1 LEFT OUTER JOIN (SELECT DISTINCT date FROM mytable ) r2 ON DATE(r1.date) = DATE(r2.date) - INTERVAL 1 DAY WHERE r1.date BETWEEN '2011-06-19' AND '2012-08-30' AND r2.date IS NULLMachinery
This work only if the missing date is one day but if you have more than one its failBlane
@MartinVarta, yes that's true. I already said that in my answer above.Trautman
@BillKarwin this solution can help, can change the range. SELECT * FROM ( SELECT DATE_ADD('2014-01-01', INTERVAL t4+t16+t64+t256+t1024 DAY) missing FROM (SELECT 0 t4 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 ) t4, (SELECT 0 t16 UNION ALL SELECT 4 UNION ALL SELECT 8 UNION ALL SELECT 12 ) t16, (SELECT 0 t64 UNION ALL SELECT 16 UNION ALL SELECT 32 UNION ALL SELECT 48 ) t64, (SELECT 0 t256 UNION ALL SELECT 64 UNION ALL SELECT 128 UNION ALL SELECT 192) t256 ) b WHERE missing NOT IN (SELECT fDesde262 FROM py262) AND ausente<NOW();Blane
@MartinVarta, great, why don't you post that as an answer?Trautman
T
8
  1. CREATE TABLE Days (day DATE PRIMARY KEY);

  2. Fill Days with all the days you're looking for.

    mysql> INSERT INTO Days VALUES ('2011-01-01');
    mysql> SET @offset := 1;
    mysql> INSERT INTO Days SELECT day + INTERVAL @offset DAY FROM Days; SET @offset := @offset * 2;
    

    Then up-arrow and repeat the INSERT as many times as needed. It doubles the number of rows each time, so you can get four month's worth of rows in seven INSERTs.

  3. Do an exclusion join to find the dates for which there is no match in your reports table:

    SELECT d.day FROM Days d 
    LEFT OUTER JOIN Reports r ON d.day = DATE(r.reportdatetime) 
    WHERE d.day BETWEEN '2011-01-01' AND '2011-04-30' 
        AND r.reportdatetime IS NULL;`
    
Trautman answered 1/7, 2011 at 17:27 Comment(2)
Hi Bill, thank you for the comment. As Sunny, mentioned, step 2 would be a challenge. Any suggestion on how to do that other than manually?Balsa
I've edited the above with an example of filling the table with 128 days worth of rows.Trautman
M
0

It could be done with a more complicated single query, but I'll show a pseudo code with temp table just for illustration:

Get all dates for which we have records:

CREATE TEMP TABLE AllUsedDates

SELECT DISTINCT reportdatetime
INTO AllUsedDates;

now add May 1st so we track 04-30

INSERT INTO AllUsedData ('2011-05-01')

If there's no "next day", we found a gap:

SELECT A.NEXT_DAY
FROM
    (SELECT reportdatetime AS TODAY, DATEADD(reportdatetime, 1) AS NEXT_DAY FROM AllUsed Dates) AS A
WHERE
    (A.NEXT_DATE NOT IN (SELECT reportdatetime FROM AllUsedDates)
    AND
    A.TODAY <> '2011-05-01') --exclude the last day
Menhir answered 1/7, 2011 at 17:39 Comment(0)
B
0

If you mean reportdatetime has the entry of "Feb 2, 2011" but other fields associated to that date are not present like below table snap

reportdate  col1    col2
5/10/2011   abc xyz
2/2/2011        
1/1/2011    bnv oda

then this query works fine

select reportdate from dtdiff where reportdate not in (select df1.reportdate from dtdiff df1, dtdiff df2 where df1.col1 = df2.col1)
Bergamo answered 1/7, 2011 at 17:42 Comment(2)
Hi Rahul, thank you for the message. Sorry that I wasn't clearer. I meant that Feb 2, 2011 doesn't have an entry in the table. It's missing from the reportdatetime column.Balsa
@Laximidi, in that case you have to have another table filled with all the dates between Jan 1, 2011 thru April 30, 2011 and then you can simply find the missing date by issueing the query select distinct repdt from dtall where repdt >= '1/1/2011' and repdt <= '1/6/2011' and repdt not in (select distinct dt from dtdiff)Bergamo
R
0

Try this

SELECT DATE(t1.datefield) + INTERVAL 1 DAY AS missing_date FROM table t1 LEFT OUTER JOIN table t2 ON DATE(t1.datefield) = DATE(t2.datefield) - INTERVAL 1 DAY WHERE DATE(t1.datefield) BETWEEN '2020-01-01' AND '2020-01-31' AND DATE(t2.datefield) IS NULL;

If you want to get missing dates in a datetime field use this.

SELECT CAST(t1.datetime_field  as DATE) + INTERVAL 1 DAY AS missing_date FROM table t1 LEFT OUTER JOIN table t2 ON CAST(t1.datetime_field  as DATE) = CAST(t2.datetime_field  as DATE) - INTERVAL 1 DAY WHERE CAST(t1.datetime_field  as DATE) BETWEEN '2020-01-01' AND '2020-07-31' AND CAST(t2.datetime_field  as DATE) IS NULL;
Robbegrillet answered 16/7, 2020 at 13:4 Comment(0)
A
0

The solutions above seem to work, but they seem EXTREMELY slow (taking possibly hours, I waited for 30 min only) at least in my database.

This clause takes less than a second in same database (of course you need to repeat it manually dozen times and possibly change function names to find the actual dates). pvm = my datetime, WEATHER = my table.

mysql> select year(pvm) as _year,count(distinct(date(pvm))) as _days from WEATHER where year(pvm)>=2000 and month(pvm)=1 group by _year order by _year asc;

--ako

Aplite answered 1/8, 2022 at 14:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.