MongoDB aggregate, how to addToSet each element of array in group pipeline
Asked Answered
T

3

10

I have documents that contains a tags fields. It's a simple array with tag names inside, no object nor _id inside. Just plain tags like this ["Protocol", "Access", "Leverage", "Capability"].

And in my group pipeline I tried something like 'selectedTags': { $addToSet: '$tags' } but then I end up with an array containing arrays of tags. And I get the same with $push.

I tried to use $each or $pushAll but they are not supported as grouping operator as my shell tell me.

Can someone help me on this one please ?

Thank you

Edit:

Sample docs:

{
    "_id" : "HWEdDGsq86x4ikDSQ",
    "teamId" : "AdLizGnPuqbWNsFHe",
    "ownerId" : "Qb5EigWjqn2t3bfxD",
    "type" : "meeting",
    "topic" : "Grass-roots hybrid knowledge user",
    "fullname" : "Guidouil",
    "startDate" : ISODate("2017-07-30T09:00:05.513Z"),
    "shareResults" : true,
    "open" : true,
    "language" : "fr",
    "tags" : [
        "Protocol",
        "Challenge",
        "Artificial Intelligence",
        "Capability"
    ],
    "isDemo" : true,
    "createdAt" : ISODate("2017-11-15T19:24:05.513Z"),
    "participantsCount" : 10,
    "ratersCount" : 10,
    "averageRating" : 3.4,
    "hasAnswers" : true,
    "updatedAt" : ISODate("2017-11-15T19:24:05.562Z")
}
{
    "_id" : "rXvkFndpXwJ6KAvNo",
    "teamId" : "AdLizGnPuqbWNsFHe",
    "ownerId" : "Qb5EigWjqn2t3bfxD",
    "type" : "meeting",
    "topic" : "Profit-focused modular system engine",
    "fullname" : "Guidouil",
    "startDate" : ISODate("2017-07-24T12:00:05.564Z"),
    "shareResults" : true,
    "open" : true,
    "language" : "fr",
    "tags" : [
        "Initiative",
        "Artificial Intelligence",
        "Protocol",
        "Utilisation"
    ],
    "isDemo" : true,
    "createdAt" : ISODate("2017-11-15T19:24:05.564Z"),
    "participantsCount" : 33,
    "ratersCount" : 33,
    "averageRating" : 2.9393939393939394,
    "hasAnswers" : true,
    "updatedAt" : ISODate("2017-11-15T19:24:05.753Z")
}

Aggregation:

db.surveys.aggregate(
  { $match: query },
  {
    $group: {
      '_id': {
        'year': { $year: '$startDate' },
        'day': { $dayOfYear: '$startDate' },
      },
      'participants': { $sum: '$ratersCount' },
      'rating': { $avg: '$averageRating' },
      'surveys': { $push: '$_id' },
      'selectedTags': { $addToSet: '$tags' },
      'peoples': { $addToSet: '$fullname' },
    }
  },
  { $sort: { _id: 1 } }
);

then I tried to change the selectedTags to { $push: { $each: '$tags' } } or { $pushAll: '$tags' } but this does not execute :(

Edit 2:

In javascript I do it like that:

return Surveys.aggregate(
  { $match: query },
  { $group: {
    _id: dateGroup,
    participants: { $sum: '$ratersCount' },
    rating: { $avg: '$averageRating' },
    surveys: { $push: '$_id' },
    selectedTags: { $push: '$tags' },
    peoples: { $addToSet: '$fullname' },
  } },
  { $project: {
    _id: null,
    selectedTags: {
      $reduce: {
        input: "$selectedTags",
        initialValue: [],
        in: { $setUnion: ["$$value", "$$this"] }
      }
    },
  } }
);
Turbosupercharger answered 16/11, 2017 at 15:1 Comment(1)
Could you show your attempts? It might shed some light on what you are trying to achieve.Shawannashawl
S
31

To mimic functionality of $addToSet update operator with $each modifier in aggregation pipeline you can use a combination of $push on grouping stage and $reduce + $setUnion on projection stage. E.g.:

db.collection.aggregate([
    {$group:{
       _id: null,
       selectedTags: { $push: '$tags' }      
    }},
    {$project: {
        selectedTags: { $reduce: {
            input: "$selectedTags",
            initialValue: [],
            in: {$setUnion : ["$$value", "$$this"]}
        }}
    }}
])

results with a single document which contains a distinct list of tags from all documents in selectedTags array.

Shawannashawl answered 16/11, 2017 at 15:40 Comment(8)
You are right, I'm getting closer to what I want, now I have an array containing an array of tags. Like this : SelectedTags : [ ["Leverage", "Utilisation", "Capability", "Initiative"] ]Turbosupercharger
How? The snippet in the answer returns a flat array of tag.Shawannashawl
I still need the other fields, so I did thisTurbosupercharger
hmm, other fields don't matter really. Just replace selectedTags: { $push: '$tags' } in group stage, and project it with the expression I posted. Leave the rest fields as needed.Shawannashawl
I tried this gist.github.com/guidouil/147774bc36d0d62e1376eedaa1b54ff0 and it looks like it just returns me the original group partTurbosupercharger
Dear @Guidouil, please avoid external links. Include all the information within the question itself in the future. The nested array must be coming from javascript library you are using. I tried your query in mongodb shell and got a flat SelectedTags array per group. As a side note, you probably want to include rest of the fields in the projection stage. Otherwise you are losing them.Shawannashawl
You are right, my issue was linked to javascript library I'm using : Meteor. Currently it's on MongoDB 3.2 and it doesn't have reduce. I found a workaround to get it connected to a mongo 3.4 and now everything is great thanks to you. Merci beaucoup AlexTurbosupercharger
If you don't want to lose all other fields, you may want to use $addFields instead of $projectObnubilate
D
21

You can also use $unwind to get result:

db.collection.aggregate([
  {$unwind: "$tags"},
  {$group:{
     _id: null,
     selectedTags: { $addToSet: '$tags' }      
  }}
])
Dimeter answered 19/9, 2018 at 4:18 Comment(2)
Thank you, this is super simple!Vaulting
This is an clean elegant looking solution. Memory usage is a real concern at the unwind tags stage though.Leverett
W
0

Dannyxu and Alex Beck's answers both worked, but only partially when used with a group stage. I needed to combine both to get the desired result of a single flat array of tags:

Model.aggregate()
        .match({ /** some query */ })
        .group({
            _id: '$teamId',
            tagsSet: { $push: '$tags' },
            numRecords: { $sum: 1 },
        })
        .project({
            _id: 0,
            numRecords: 1,
            tagsSet: {
                $reduce: {
                    input: '$tagsSet',
                    initialValue: [],
                    in: { $setUnion: ['$$value', '$$this'] },
                },
            },
        })
        .unwind({ path: '$tagsSet' })
        .group({
            _id: null,
            selectedTags: { $addToSet: '$tagsSet' },
            numRecords: { $sum: '$numRecords' },
        })
Wiliness answered 20/12, 2021 at 20:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.