Aggregate $group for multiple date ranges
Asked Answered
H

2

21

In my aggregate, each document in the stream will have a date on it.

I need to sum some values over date ranges..

My documents look like:

{ value: 3,  date: [SoME TIME STAMP] },
{ value: 4,  date: [SoME TIME STAMP] },
{ value: 1,  date: [SoME TIME STAMP] },
{ value: -6, date: [SoME TIME STAMP] }

I want to be able to group these document base on the date range. i.e.: 1-7 days ago, 8-15 days ago. and 15-30 days ago.

I probably apply 3 different aggregate queries with 3 different $match on dates.

But is it possible to do all the $group and sum the "value" field in one run?

Huff answered 11/11, 2015 at 18:21 Comment(3)
I don't think you can do that in one go. You should really go for 3 countBrathwaite
@MateoBarahona of course you can do it in one go. This is largely what operators such as $cond are actually for.Gezira
@Blakes Seven: Ho you're rightBrathwaite
G
39

You need to conditionally determine the grouping key based on where the current date falls in between the range. This is basically achieved via $cond with nested condtions and the logical variant of $lt:

// work out dates somehow
var today = new Date(),
    oneDay = ( 1000 * 60 * 60 * 24 ),
    thirtyDays = new Date( today.valueOf() - ( 30 * oneDay ) ),
    fifteenDays = new Date( today.valueOf() - ( 15 * oneDay ) ),
    sevenDays = new Date( today.valueOf() - ( 7 * oneDay ) );

db.collection.aggregate([
    { "$match": {
        "date": { "$gte": thirtyDays }
    }},
    { "$group": {
        "_id": {
            "$cond": [
                { "$lt": [ "$date", fifteenDays ] },
                "16-30",
                { "$cond": [
                    { "$lt": [ "$date", sevenDays ] },
                    "08-15",
                    "01-07"
                ]}
            ]
        },
        "count": { "$sum": 1 },
        "totalValue": { "$sum": "$value" }
    }}
])

As $cond is a ternary operator, the first condition is evaluated to see if the condition is true, and when true the second argument is returned otherwise the third is returned when false. So by nesting another $cond in the false case you get the logical test on where the date falls, either "less that the 15 day date" which means its in the oldest range, or "less than 7 days" which means the middle range, or of course it's in the newest range.

I'm just prefixing the numbers here less than 10 with a 0 so it gives you something to sort on if you want, since the output of "keys" in $group is not in itself ordered.

But that is how you do this in a single query. You just work out what the grouping key should be based on where the date falls and accumulate for each key.

Gezira answered 12/11, 2015 at 5:19 Comment(2)
+1 Awesome! Just I think you should swap conditions. I've tested and it works fine... Nice job Blakes!Stratosphere
@amir Kindly shareDragrope
L
2

This is a nice use case for the $bucket stage, combined with $dateDiff introduced in Mongo 5:

// { date: ISODate("2021-12-04"), value: 3  } <= last 7 days
// { date: ISODate("2021-11-25"), value: 5  } <= last 15 days
// { date: ISODate("2021-11-24"), value: 1  } <= last 15 days
// { date: ISODate("2021-11-12"), value: 12 } <= last 30 days
// { date: ISODate("2021-10-04"), value: 8  } <= too old
db.collection.aggregate([

  { $set: {
    diff: { $dateDiff: { startDate: "$$NOW", endDate: "$date", unit: "day" } }
  }},
  // { value: 3,  diff: 0   }
  // { value: 5,  diff: -9  }
  // { value: 1,  diff: -10 }
  // { value: 12, diff: -22 }
  // { value: 8,  diff: -61 }

  { $match: { diff: { $gte: -30 } } },
  // { value: 3,  diff: 0   }
  // { value: 5,  diff: -9  }
  // { value: 1,  diff: -10 }
  // { value: 12, diff: -22 }

  { $bucket: {
    groupBy: "$diff",
    boundaries: [-30, -15, -7, 1],
    output: { total: { $sum: "$value" } }
  }}
])
// { _id: -30, total: 12 } <= 30 to 16 days ago
// { _id: -15, total: 6  } <= 15 to 8  days ago
// { _id: -7,  total: 3  } <= 7  to 0  days ago

This:

  • first computes (with $dateDiff) the number of days of difference between today ("$$NOW") and the document's date
    • if the date is 3 days ago, diff will be set to -3
  • then filters out any document older than 30 days, based on diff
  • and finally buckets documents based on diff within buckets whose boundaries are defined by boundaries: [-30, -15, -7, 1]
    • and for each bucket, we sum bucketed values
Late answered 4/12, 2021 at 12:23 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.