sum every 3 rows of a table
Asked Answered
B

3

7

I have the following query to count all data every minute.

$sql= "SELECT COUNT(*) AS count, date_trunc('minute', date) AS momento
FROM p WHERE fk_id_b=$id_b GROUP BY date_trunc('minute', date) 
ORDER BY momento ASC";

What I need to do is get the sum of the count for each row with the count of the 2 past minutes.

For example with the result of the $sql query above
|-------date---------|----count----|
|2012-06-21 05:20:00 |      12     |
|2012-06-21 05:21:00 |      14     |
|2012-06-21 05:22:00 |      10     |
|2012-06-21 05:23:00 |      20     |
|2012-06-21 05:24:00 |      25     |
|2012-06-21 05:25:00 |      30     |
|2012-06-21 05:26:00 |      10     |  

I want this result:

|-------date---------|----count----|
|2012-06-21 05:20:00 |      12     |   
|2012-06-21 05:21:00 |      26     |     12+14
|2012-06-21 05:22:00 |      36     |     12+14+10
|2012-06-21 05:23:00 |      44     |     14+10+20
|2012-06-21 05:24:00 |      55     |     10+20+25
|2012-06-21 05:25:00 |      75     |     20+25+30
|2012-06-21 05:26:00 |      65     |     25+30+10
Banebrudge answered 29/6, 2012 at 20:15 Comment(3)
Very tricky with SQL. Can't your application do the work? You have to fetch the rows anyway, and you could do that in the same loop.Pushbike
@Parth Bhatt: The edit you approved is clearly asking a different question. Please do not approve such edits.Pedagogy
@interjay: Sorry my mistake, I interpreted it the wrong way.Stupendous
M
12

This is not so tricky with lag() window function (also on SQL Fiddle):

CREATE TABLE t ("date" timestamptz, "count" int4);
INSERT INTO t VALUES
('2012-06-21 05:20:00',12),
('2012-06-21 05:21:00',14),
('2012-06-21 05:22:00',10),
('2012-06-21 05:23:00',20),
('2012-06-21 05:24:00',25),
('2012-06-21 05:25:00',30),
('2012-06-21 05:26:00',10);

SELECT *,
    "count"
    + coalesce(lag("count", 1) OVER (ORDER BY "date"), 0)
    + coalesce(lag("count", 2) OVER (ORDER BY "date"), 0) AS "total"
  FROM t;
  1. I've double-quoted date and count columns, as these are reserved words;
  2. lag(field, distance) gives me the value of the field column distance rows away from the current one, thus first function gives previous row's value and second call gives the value from the one before;
  3. coalesce() is required to avoid NULL result from lag() function (for the first row in your query there's no “previous” one, thus it's NULL), otherwise the total will also be NULL.
Monasticism answered 29/6, 2012 at 20:28 Comment(2)
Excellent! I love the concept of window functions but keep forgetting to use them. Have an upvote :-)Pushbike
+1 for the right solution. I wrote an answer, because comment is not enough.Mahon
U
14

Here's a more general solution for the sum of values from current and N previous rows (N=2 in your case).

SELECT "date", 
sum("count") OVER (order by "date" ROWS BETWEEN 2 preceding AND current row)
FROM t
ORDER BY "date";

You can change N between 0 and "Unbounded". This approach gives you a chance to have a parameter in your app "count of the N past minutes". Also, no need for handling default values if out of bounds.

You can find more on this in PostgreSQL docs (4.2.8. Window Function Calls)

Used answered 29/6, 2012 at 20:15 Comment(0)
M
12

This is not so tricky with lag() window function (also on SQL Fiddle):

CREATE TABLE t ("date" timestamptz, "count" int4);
INSERT INTO t VALUES
('2012-06-21 05:20:00',12),
('2012-06-21 05:21:00',14),
('2012-06-21 05:22:00',10),
('2012-06-21 05:23:00',20),
('2012-06-21 05:24:00',25),
('2012-06-21 05:25:00',30),
('2012-06-21 05:26:00',10);

SELECT *,
    "count"
    + coalesce(lag("count", 1) OVER (ORDER BY "date"), 0)
    + coalesce(lag("count", 2) OVER (ORDER BY "date"), 0) AS "total"
  FROM t;
  1. I've double-quoted date and count columns, as these are reserved words;
  2. lag(field, distance) gives me the value of the field column distance rows away from the current one, thus first function gives previous row's value and second call gives the value from the one before;
  3. coalesce() is required to avoid NULL result from lag() function (for the first row in your query there's no “previous” one, thus it's NULL), otherwise the total will also be NULL.
Monasticism answered 29/6, 2012 at 20:28 Comment(2)
Excellent! I love the concept of window functions but keep forgetting to use them. Have an upvote :-)Pushbike
+1 for the right solution. I wrote an answer, because comment is not enough.Mahon
M
9

@vyegorov's answer covers it mostly. But I have more gripes than fit into a comment.

  1. Don't use reserved words like date and count as identifiers at all. PostgreSQL allows those two particular key words as identifier - other than every SQL standard. But it's still bad practice. The fact that you can use anything inside double-quotes as identifier, even "; DELETE FROM tbl;" does not make it a good idea. The name "date" for a timestamp is misleading on top of that.

  2. Wrong data type. Example displays timestamp, not timestamptz. Does not make a difference here, but still misleading.

  3. You don't need COALESCE(). With the window functions lag() and lead() you can can provide a default value as 3rd parameter:

Building on this setup:

CREATE TABLE tbl (ts timestamp, ct int4);
INSERT INTO tbl VALUES
  ('2012-06-21 05:20:00', 12)
, ('2012-06-21 05:21:00', 14)
, ('2012-06-21 05:22:00', 10)
, ('2012-06-21 05:23:00', 20)
, ('2012-06-21 05:24:00', 25)
, ('2012-06-21 05:25:00', 30)
, ('2012-06-21 05:26:00', 10);

Query:

SELECT ts, ct + lag(ct, 1, 0) OVER (ORDER BY ts)
              + lag(ct, 2, 0) OVER (ORDER BY ts) AS total
FROM   tbl;

Or better yet: use a single sum() as window aggregate function with a custom window frame:

SELECT ts, sum(ct) OVER (ORDER BY ts ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM   tbl;

Same result.
Related:

Mahon answered 29/6, 2012 at 21:14 Comment(1)
Default value to lag/lead is a nice one!Monasticism

© 2022 - 2024 — McMap. All rights reserved.