Fill missing dates in PostgreSQL with zero
Asked Answered
E

1

6

I've a query like this in PostgreSQL:

select count(id_student) students, date_beginning_course from 
data.sessions_courses
left join my_schema.students on id_session_course=id_sesion 
where course_name='First course'
group by date_beginning_course

What I obtain with this query is the number of students that have attended a session of "First course" in several dates, for example:

Students  Date_beginning_course
____________________________________

5         2019-06-26
1         2019-06-28
5         2019-06-30
6         2019-07-01
2         2019-07-02

I'd like to fill this table with the missing date values, and, for each missing value, assign a '0' in Students column, because there are no students for this date. Example:

Students  Date_beginning_course
____________________________________

5         2019-06-26
0         2019-06-27  <--new row
1         2019-06-28
0         2019-06-29  <--new row
5         2019-06-30
6         2019-07-01
2         2019-07-02

Could you help me? Thanks! :)

Equipment answered 11/9, 2019 at 23:42 Comment(3)
Use a calendar table (probably generate-series()) and left join to that.Monotint
Indeed. Join to a subquery that uses generate_series() to create a range of dates, as per @wildplassers answerSpinner
I've tried to use a generate_series query with no success, I think I'm not doing it correctly.Equipment
H
7

You could generate a list of dates with the handy Postgres set-returning function generate_series() and LEFT JOIN it with the sessions_courses and students table:

SELECT
    COUNT(s.id_student) students, 
    d.dt
FROM 
    (
        SELECT dt::date 
        FROM generate_series('2019-06-26', '2019-07-02', '1 day'::interval) dt
    ) d
    LEFT JOIN data.sessions_courses c 
        ON c.date_beginning_course  = d.dt 
        AND c.course_name='First course'
    LEFT JOIN my_schema.students s 
        ON s.id_session_course = c.id_session 
GROUP BY d.dt 

You can change the date range by modifying the first two parameters of generate_series().

NB: it is a general good practive to index the column names in the query with the relevant table names (or table alias), so it is explicit to which table each column belongs. I changed your query accordingly, and had to make a few assumptions, that you might need to adapt.

Herriott answered 11/9, 2019 at 23:56 Comment(3)
I'm afraid this is not working well, because when I add more filters in WHERE clause my table returns all the results, not filtered :(Equipment
@belfastcowboy24: not sure what you mean... Anyway there was a mistake in the query: I just removed the WHERE clause; all filtering criterias should be placed in the ON clause of the relevant LEFT JOIN.Herriott
@Equipment . . . This appears to answer the question that you asked. You should accept this question. If you have a different question, then ask it as a new question.Audet

© 2022 - 2024 — McMap. All rights reserved.