Subtract hours from the now() function
Asked Answered
G

2

19

We have a machine running 24x7. Every day I report the number of pieces it produced per hour. In our case one working day means '2015-06-16 06:00:00' to '2015-06-17 06:00:00' for example.

Here is my code:

select date_trunc('hour', t_el_eventlog.eventtime at time zone 'CET') as hours,
       count (distinct t_el_eventlog.serialnumber) as count
from t_el_eventlog
where eventtime at time zone 'CET' between '2015-06-16 06:00:00'
                                       and '2015-06-17 06:00:00'
and sourceid = '44'
group by hours
order by hours asc
  • My Postgres version: "PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 32-bit"

  • The data types of two columns which I am dealing with:

    eventtime timestamp without time zone
    sourceid  integer NOT NULL
    
  • Time zone is "Europe/Berlin".

With the above query I get the information I want, but I have to change the date every day. Is it possible to use the now() function as default value for my case instead, so that I don't have to change the date manually everyday?

Giraudoux answered 17/6, 2015 at 14:18 Comment(7)
Your time frame does not make sense. The last six hours fall into "tomorrow" and would never be counted this way (because tomorrow, the new time frame is effective). Please clarify.Unfathomable
i want to know the previous day output...something like '2015-06-16 06:00:00' and '2015-06-17 06:00:00'Giraudoux
'2015-06-16 06:00:00' and '2015-06-17 06:00:00' is one working day and i want to know the output of the above working day on 2015-06-17 at 9 am/10 amGiraudoux
With a question like this you need to provide your exact table definition (and as always your version of Postgres). Exact data type matters. Also, tell us more about your time zone. at time zone 'CET' is most probably not what you want.Unfathomable
So you operate with timestamp without time zone, which is unaware of time zones. Why do you add at time zone 'CET' in your query? (It's wrong in any case: either incorrect or very inefficient.)Unfathomable
@ErwinBrandstetter: intially even i used above lines without timezone, but later a friend of mine from it background had to told me to use time zone CET (I think it is because of daylight settings..)Giraudoux
@ErwinBrandstetter: apart from these timezone thing... i would like to know one more thing from you..The above 8 lines of codes taking lot of time..can i make it faster in any other wayGiraudoux
U
25

Answer for timestamp

You need to understand the nature of the data types timestamp (timestamp without time zone) and timestamptz (timestamp with time zone). If you don't, read this first:

The AT TIME ZONE construct transforms a timestamp to timestamptz, which is almost certainly the wrong move for your case:

WHERE eventtime AT TIME ZONE 'CET' BETWEEN '2015-06-16 06:00:00'
                                       AND '2015-06-17 06:00:00'

First, it kills performance. Applying AT TIME ZONE to the column eventtime makes the expression not sargable. Postgres cannot use plain indexes on eventtime. But even without index, sargable expressions are cheaper. Adjust filter values instead of manipulating every row value.
You could compensate with a matching expression index, but it's probably just a misunderstanding and wrong anyway.

What happens in that expression?

  1. AT TIME ZONE 'CET' transforms the timestamp value eventtime to timestamptz by appending the time offset of your current time zone. When using a time zone name (not a numeric offset or an abbreviation), this also takes DST rules (daylight saving time) into account, so you get a different offset for "winter" timestamps. Basically you get the answer to the question:

    What's corresponding UTC timestamp for the given timestamp in the given time zone?

    When displaying the result to the user it is formatted as local timestamp with the according time offset for the current time zone of the session. (May or may not be the same as the one used in the expression).

  2. The string literals on the right side have no data type to them, so the type is derived from the assignment in the expression. Since that's timestamptz now, both are cast to timestamptz, assuming the current time zone of the session.

    What's the corresponding UTC timestamp for the given timestamp for the time zone setting of the current session.

    The offset can vary with DST rules.

Long story short, if you always operate with the same time zone: CET or 'Europe/Berlin' - same thing for present-day timestamps, but not for historic or (possibly) future ones, you can just cut the cruft.

The second problem with the expression: BETWEEN is almost always wrong with timestamp values. See:

SELECT date_trunc('hour', eventtime) AS hour
     , count(DISTINCT serialnumber)  AS ct  -- sure you need distinct?
FROM   t_el_eventlog
WHERE  eventtime >= now()::date - interval '18 hours'
AND    eventtime <  now()::date + interval '6 hours'
AND    sourceid  =  44  -- don't quote the numeric literal
GROUP  BY 1
ORDER  BY 1;

now() is the Postgres implementation of the SQL standard CURRENT_TIMESTAMP. Both return timestamptz (not timestamp!). You can use either.
now()::date is equivalent to CURRENT_DATE. Both depend on the current time zone setting.

You should have an index of the form:

CREATE INDEX foo ON t_el_eventlog(sourceid, eventtime)

Or, to allow index-only scans:

CREATE INDEX foo2 ON t_el_eventlog(sourceid, eventtime, serialnumber)

If you operate in different time zones, things get more complicated and you should use timestamptz for everything.

Alternative for timestamptz

Before the question update, it seemed like time zones matter. When dealing with different time zones, "today" is a functional dependency of the current time zone. People tend to forget that.

To just work with the current time zone setting of the session, use the same query as above. If executed in a different time zone, the results are wrong in actuality. (Applies to the above as well.)

To guarantee a correct result for a given time zone ('Europe/Berlin' in your case) regardless of the current time zone setting of the session, use this expression instead:

    ((now() AT TIME ZONE 'Europe/Berlin')::date - interval '18 hours')
            AT TIME ZONE 'Europe/Berlin'  -- 2nd time to convert back

Be aware that the AT TIME ZONE construct returns timestamp for timestamptz input and vice-versa.

As mentioned at the outset, all the gory details here:

Unfathomable answered 17/6, 2015 at 15:35 Comment(2)
But why does your solution need an hourly breakdown with generate_series? Generally, the expression tstz AT TIME ZONE 'tz' BETWEEN ts_lower AND ts_upper could be made sargable more simply, like: tstz BETWEEN ts_lower AT TIME ZONE 'tz' AND ts_upper AT TIME ZONE 'tz'Josie
@pozs: The hourly breakdown was one approach. I was working on the updated solution already - just like you hinted.Unfathomable
S
6

Your can use CURRENT_DATE:

 select date_trunc('hour', t_el_eventlog.eventtime at time zone 'CET') as hours,
        count(distinct t_el_eventlog.serialnumber) as count
 from t_el_eventlog
 where eventtime at time zone 'CET' between CURRENT_DATE + interval '6 hour' and
                                            CURRENT_DATE + interval '30 hour' and
       sourceid = '44'
 group by hours
 order by hours asc;

EDIT:

Erwin's comment is about the question not this answer. Using between for date/times is a bad idea. I suppose this should be repeated in every question that does this. But the problem is that the date/time values that are boundaries between days are counted twice.

The correct logic is:

 select date_trunc('hour', t_el_eventlog.eventtime at time zone 'CET') as hours,
        count(distinct t_el_eventlog.serialnumber) as count
 from t_el_eventlog
 where eventtime at time zone 'CET' >= CURRENT_DATE + interval '6 hour' and
       eventtime at time zone 'CET' < CURRENT_DATE + interval '30 hour' and
       sourceid = '44'
 group by hours
 order by hours asc;

Note the "<" for the second limit. Here is a good blog on this subject. Although Aaron is focused on SQL Server, the warnings (and some of the solutions) apply to other databases as well.

Stomatic answered 17/6, 2015 at 14:23 Comment(5)
thanks Gordon...it worked well to know the present working day output.Giraudoux
but what i want is previous day output and i tried like these CURRENT_DATE + interval '6 hour' and CURRENT_DATE - interval '18 hour'Giraudoux
You are including the upper bound, creating a corner case 25th hour. And you are completely ignoring time zones and actual data types. It is also not sargable and thus very inefficient. I doubt that people voting for this have been thinking it through.Unfathomable
@ErwinBrandstetter . . . I am aware of that. This is intended to implement the query that the OP asked for. You should really add that comment on the question.Stomatic
Fair enough. The problems originated in the question. But leading users probably shouldn't copy obvious errors without any hints. Unsuspecting visitors will in return copy that in good faith.Unfathomable

© 2022 - 2024 — McMap. All rights reserved.