How to delete documents returned by an aggregation query in mongodb
Asked Answered
B

3

15

I am attempting to delete all the documents returned by an aggregation in Mongodb.

The query I have is as follows:

db.getCollection("Collection")
  .aggregate([
    {
      $match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
    },
    {
      $project: {
        yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
      }
    },  
    { $match: { yearMonthDay: { $eq: "2019-08-06" } } }
  ])
  .forEach(function(doc) {
    db.getCollection("Collection").remove({});
  });

I tried this query but it removes all the data in the database, any suggestions please?

Bolan answered 6/11, 2019 at 12:7 Comment(0)
W
31

Since the remove doesn't have a query condition its going to match with all the documents and delete irrespective of the aggregation result.

Solution (match the ids of the current cursor doc):

db.getCollection("Collection")
  .aggregate([
    {
      $match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
    },
    {
      $project: {
        yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
      }
    },
    { $match: { yearMonthDay: { $eq: "2019-08-06" } } }
  ])
  .forEach(function(doc) {
    db.getCollection("Collection").remove({ "_id": doc._id });
  });

Another better solution would be to have single round trip to db while deletion is get a list of ids from the aggregation cursor() via cursor.map()

var idsList = db
  .getCollection("Collection")
  .aggregate([
    {
      $match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
    },
    {
      $project: {
        yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
      }
    },
    { $match: { yearMonthDay: { $eq: "2019-08-06" } } }
  ])
  .map(function(d) {
    return d._id;
  });

//now delete those documents via $in operator
db.getCollection("Collection").remove({ _id: { $in: idsList } });
Whiffle answered 6/11, 2019 at 12:43 Comment(11)
Thanks for nswer, I will try that then i get u backBolan
Thnaks it works I have another question please, I can execute the second one using a js file ?Bolan
Yes it'll work in the shell as well as in a js file.Whiffle
thanks, but I will have to mention the connectionstring to mongo or I put just this code ?Bolan
U can execute a js file directly from the mongo shell but I sense that's not what u'are looking for, and probably u want this code to run in node.js or something.. then yes connection string and collection instance and all that is required. linkWhiffle
I want to execut it in linux script, it gives somme error :2019-11-06T14:33:53.151+0100 E QUERY [thread1] SyntaxError: expected expression, got '.' @(shell):1:0Bolan
look i created two files, file.js where I put your second suggestion and a script.sh contains mongo < file.js my quetion is ; I have to mention the ConnectionString in js file ??Bolan
Are you trying the first method -> executing a js file in mongo shell If yes, This has some detailed info. Also its not entirely clear from the error.. Would suggested to ask a another self contained question to reach better visibility & help.Whiffle
does the better solution work if the idsList is very large (millions of records)? What is the disadvantage of the first solution?Demitasse
@Demitasse I believe even if the idList is very large the single round trip query would be faster because _id is an indexed field, even though I don't have any benchmark. When compared to doing million IO for each _id. Also the "better solution" should only make a difference when the number of ids returned are very large.Whiffle
After .map(d => d._id) it was necessary to chain it with .toArray(), otherwise it's still a cursor and not collected into an array.Tacy
B
5

As per your query its not required to filter by aggregation and remove by another methods, you can apply this query filters in remove() method's filters,

  • the $expr Allows the use of aggregation expressions within the query language,
db.getCollection("Collection").remove({ 
  $and: [
    { status: { $in: ["inserted", "done", "duplicated", "error"] } },
    {
      $expr: {
        $eq: [
          { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
          "2019-08-06"
        ]
      }
    }
  ]
});

This can also support in deleteOne and deleteMany methods.

Berlin answered 13/5, 2021 at 16:2 Comment(0)
M
0

You can add expires to your document, and update this field in the $merge stage of your aggregation.

// 1) add this to your schema
deleteAt: {
  type: Date,
  expires: 0,
}

// 2) set the value of `deleteAt` for the documents to be deleted

// 3) add this at the end of the aggregation 
$merge: {
  into: "Collections" // or "Collection"
}
Mcclure answered 21/10, 2023 at 15:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.