MongoDB: upsert sub-document
E

5

50

I have documents that look something like this, with a unique index on bars.name:

{ name: 'foo', bars: [ { name: 'qux', somefield: 1 } ] }

I want to either update the sub-document matched by { name: 'foo', 'bars.name': 'qux' } with $set: { 'bars.$.somefield': 2 }, or create a new sub-document { name: 'qux', somefield: 2 } under { name: 'foo' }.

Is it possible to do this using a single query with upsert, or will I have to issue two separate ones?

Related: 'upsert' in an embedded document (it suggests changing the schema so that the sub-document identifier is the key, but it is from two years ago and I'm wondering if there are better solutions now).

Epstein answered 5/5, 2014 at 10:49 Comment(0)
W
61

No, there isn't really a better solution to this, but an explanation of why may help.

Suppose you have a document in place that has the structure as you show:

{ 
  "name": "foo", 
  "bars": [{ 
       "name": "qux", 
       "somefield": 1 
  }] 
}

If you do an update like this

db.foo.update(
    { "name": "foo", "bars.name": "qux" },
    { "$set": { "bars.$.somefield": 2 } },
    { "upsert": true }
)

Then all is fine because a matching document was found. But if you change the value of "bars.name":

db.foo.update(
    { "name": "foo", "bars.name": "xyz" },
    { "$set": { "bars.$.somefield": 2 } },
    { "upsert": true }
)

Then you will get a failure. The only thing that has really changed here is that in MongoDB 2.6 and above the error is a little more succinct:

WriteResult({
    "nMatched" : 0,
    "nUpserted" : 0,
    "nModified" : 0,
    "writeError" : {
        "code" : 16836,
        "errmsg" : "The positional operator did not find the match needed from the query. Unexpanded update: bars.$.somefield"
    }
})

That is better in some ways, but you really do not want to "upsert" anyway. What you want to do is add the element to the array where the "name" does not currently exist.

So what you really want is the "result" from the update attempt without the "upsert" flag to see if any documents were affected:

db.foo.update(
    { "name": "foo", "bars.name": "xyz" },
    { "$set": { "bars.$.somefield": 2 } }
)

Yielding in response:

WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })

So when nMatched is 0, you know you want to issue the following update:

db.foo.update(
    { "name": "foo" },
    { "$push": { "bars": {
        "name": "xyz",
        "somefield": 2
    }}}
)

There really is no other way to do exactly what you want. Because the additions to the array are not strictly a "set" type of operation, you cannot use $addToSet combined with the "bulk update" functionality to "cascade" your update requests.

In this case it seems like you need to check the result, or otherwise accept reading the whole document and checking whether to update or insert a new array element in code.
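The check-then-push sequence described above could be wrapped up roughly as follows. This is only a sketch for mongosh, where updateOne returns its result synchronously; the function name setOrPushBar and its parameters are made up for illustration, and the $ne guard on the second step is borrowed from the follow-up comment below rather than from the answer itself.

```javascript
// Sketch for mongosh. `coll` is assumed to be a collection handle such as db.foo.
// setOrPushBar is a hypothetical helper name, not part of any MongoDB API.
function setOrPushBar(coll, parentName, barName, somefield) {
    // Step 1: update the array element in place if it already exists.
    const updated = coll.updateOne(
        { name: parentName, "bars.name": barName },
        { $set: { "bars.$.somefield": somefield } }
    );
    if (updated.matchedCount > 0) {
        return "updated";
    }

    // Step 2: nothing matched, so append a new element. The $ne condition
    // makes this a no-op if an element with that name has appeared meanwhile.
    const pushed = coll.updateOne(
        { name: parentName, "bars.name": { $ne: barName } },
        { $push: { bars: { name: barName, somefield: somefield } } }
    );
    return pushed.matchedCount > 0 ? "pushed" : "none";
}
```

A return value of "none" means either that no parent document with that name exists, or that another writer added the element between the two steps; in the second case, running the function again should take the "updated" path.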

Whitleather answered 6/5, 2014 at 7:25 Comment(4)
Yep, that's what I currently have - $set followed by a $push if no matching documents were found... was just wondering if there's a better way to go about this. Thanks for the explanation! - Epstein
How do you make this atomic if you have multiple processes attempting to update at the same time? Seems like there could be a race condition and you might call $push more than once and end up with more than one record in the array. Is there a way to insert the new value in an atomic way? - Spillage
@MichaelMoser Only just saw the comment. But by adding an inequality test like { "name": "foo", "bars.name": { "$ne": "xyz" } } as the query, you ensure that you are not duplicating "pushed" elements. - Whitleather
Must the first parameter contain two elements? Can't I do: db.foo.update( { "name": "foo" }, { "$set": { "bars.$.somefield": 2 } }, { "upsert": true } ) if name is unique? - Constructivism
E
6

I was digging for the same feature, and found that in version 4.2 or above, MongoDB provides a new feature called Update with aggregation pipeline.
This feature, combined with some other techniques, makes it possible to achieve an upsert sub-document operation with a single query.

It's a very verbose query, but I believe it's viable if you know that you won't have too many elements in the sub-document array. Here's an example of how to achieve this:

const documentQuery = { _id: '123' }
const subDocumentToUpsert = { name: 'xyz', id: '1' }

collection.update(documentQuery, [
    {
        $set: {
            sub_documents: {
                $cond: {
                    if: { $not: ['$sub_documents'] },
                    then: [subDocumentToUpsert],
                    else: {
                        $cond: {
                            if: { $in: [subDocumentToUpsert.id, '$sub_documents.id'] },
                            then: {
                                $map: {
                                    input: '$sub_documents',
                                    as: 'sub_document',
                                    in: {
                                        $cond: {
                                            if: { $eq: ['$$sub_document.id', subDocumentToUpsert.id] },
                                            then: subDocumentToUpsert,
                                            else: '$$sub_document',
                                        },
                                    },
                                },
                            },
                            else: { $concatArrays: ['$sub_documents', [subDocumentToUpsert]] },
                        },
                    },
                },
            },
        },
    },
])
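
To make the three branches of that pipeline easier to follow, here is a rough plain-JavaScript model of what the nested $cond / $map / $concatArrays expression computes for one document. It runs client-side and is not a replacement for the pipeline; the function name upsertIntoArray is hypothetical.

```javascript
// Plain-JavaScript model of the pipeline's logic, for illustration only.
// upsertIntoArray is a made-up name; the real work happens server-side.
function upsertIntoArray(subDocuments, subDocumentToUpsert) {
    // Branch 1: the field is missing or null, so start a new array.
    if (subDocuments == null) {
        return [subDocumentToUpsert];
    }
    // Branch 2: an element with the same id exists, so replace it in place.
    if (subDocuments.some((d) => d.id === subDocumentToUpsert.id)) {
        return subDocuments.map((d) =>
            d.id === subDocumentToUpsert.id ? subDocumentToUpsert : d
        );
    }
    // Branch 3: no element has that id, so append the new one.
    return [...subDocuments, subDocumentToUpsert];
}

const existing = [{ id: '1', name: 'old' }, { id: '2', name: 'other' }];
const replaced = upsertIntoArray(existing, { id: '1', name: 'xyz' });
const appended = upsertIntoArray(existing, { id: '3', name: 'new' });
```

Note that the matching element is replaced wholesale, as in the pipeline, so any fields not present in the new sub-document are dropped.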
Earthworm answered 27/11, 2020 at 0:0 Comment(0)
A
5

If you don't mind changing the schema a bit and having a structure like so:

{ "name": "foo", "bars": { "qux": { "somefield": 1 },
                           "xyz": { "somefield": 2 },
                  }
}

You can perform your operations in one go. This reiterates the suggestion from 'upsert' in an embedded document, for completeness.
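
With the keyed schema above, "one go" could look roughly like the sketch below: a single $set on a dotted path with upsert enabled. The helper name keyedBarUpsert is made up for illustration, and the sketch assumes the bar name is safe to use as a field name (non-empty, no dots, no leading dollar sign).

```javascript
// Builds the arguments for a single upsert against the keyed schema.
// keyedBarUpsert is a hypothetical helper, not a MongoDB API.
function keyedBarUpsert(parentName, barName, somefield) {
    // The bar name becomes part of a dotted field path, so reject values
    // that would change the meaning of that path.
    if (barName === "" || barName.includes(".") || barName.startsWith("$")) {
        throw new Error("barName is not safe to use as a field name");
    }
    return {
        filter: { name: parentName },
        update: { $set: { ["bars." + barName + ".somefield"]: somefield } },
        options: { upsert: true },
    };
}

const args = keyedBarUpsert("foo", "xyz", 2);
// In mongosh: db.foo.updateOne(args.filter, args.update, args.options)
```

Because $set creates intermediate embedded documents as needed, the same call covers a missing parent document, a missing key, and an existing key.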

Acrylonitrile answered 6/1, 2017 at 7:19 Comment(1)
In the case where the values for bars.name are not known ahead of time, this schema becomes difficult to validate and query. - Ratty
J
2

There's a way to do it in two queries that still works in a bulkWrite.

This is relevant because in my case not being able to batch it is the biggest hangup. With this solution, you don't need to collect the result of the first query, which allows you to do bulk operations if you need to.

Here are the two successive queries to run for your example:

// Update subdocument if existing
collection.updateMany({
    name: 'foo', 'bars.name': 'qux' 
}, {
    $set: { 
        'bars.$.somefield': 2 
    }
})
// Insert subdocument otherwise
collection.updateMany({
    name: 'foo', 'bars.name': { $ne: 'qux' }
}, {
    $push: { 
        bars: {
            somefield: 2, name: 'qux'
        }
    }
})

This also has the added benefit of not having corrupted data / race conditions if multiple applications are writing to the database concurrently. You won't risk ending up with two bars: {somefield: 2, name: 'qux'} subdocuments in your document if two applications run the same queries at the same time.
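
For batching, the same pair of operations can be expressed in the form bulkWrite accepts. The following is a minimal sketch, assuming a Node.js driver or mongosh collection handle named collection; the helper name barUpsertOps is hypothetical, and the negated condition is written with $ne on the array field.

```javascript
// Builds the update-then-push pair for one sub-document in bulkWrite form.
// barUpsertOps is a made-up helper name.
function barUpsertOps(parentName, barName, somefield) {
    return [
        // Update the sub-document if it already exists.
        {
            updateMany: {
                filter: { name: parentName, "bars.name": barName },
                update: { $set: { "bars.$.somefield": somefield } },
            },
        },
        // Otherwise push it; the $ne filter skips documents that have it.
        {
            updateMany: {
                filter: { name: parentName, "bars.name": { $ne: barName } },
                update: { $push: { bars: { name: barName, somefield: somefield } } },
            },
        },
    ];
}

// Several sub-documents can be batched into one call.
const ops = [...barUpsertOps("foo", "qux", 2), ...barUpsertOps("foo", "xyz", 3)];
// collection.bulkWrite(ops)  // ordered by default, so each pair runs in sequence
```

The order within each pair matters: if the push ran first, the update that follows would match the freshly pushed element and simply set the same value again, which is harmless but wasteful.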

Juan answered 9/8, 2019 at 8:55 Comment(0)
S
0

I was trying to do something similar: I wanted to create the document if it didn't exist, add a subdocument to an array if it wasn't already there, or increment the count field in the subdocument if it was already in the array. Based on these and other answers, here's what I came up with using Spring Boot MongoTemplate. I believe this is safe under concurrency, in that if another process is simultaneously updating elements it will not result in duplicates and will still achieve the desired results. Unfortunately it requires up to 3 database operations; I'm not sure if there's a more efficient way to do this.


@Document(collection = "dealerships")
public class Dealership {

    public static class Car {
        String carBrand;
        int count;
        public Car(String brand) {
            this.carBrand = brand;
            count = 1;
        }
    }

    String id;
    List<Car> carsOnLot = new ArrayList<>();    //important - create an empty array when inserting new document

    //Constructor, getters, setters...
}

//First, let's try inserting a new Dealership document; if that Dealership already exists this will fail
try {
    mongoTemplate.insert(new Dealership("Annapolis"));
} catch (org.springframework.dao.DuplicateKeyException dex) {            
    System.out.println("Annapolis dealer exists!");
}

//----- At this point the Annapolis dealer document exists, but we don't know if our Car is in the carsOnLot array ------------

//This is the query and update for adding to the array "carsOnLot", if the particular car is not already listed
Query addCarQuery = new Query().addCriteria(Criteria.where("id").is("Annapolis"));
addCarQuery.addCriteria(Criteria.where("carsOnLot.carBrand").ne("Audi"));

Update addCarUpdate = new Update();
addCarUpdate.addToSet("carsOnLot", new Car("Audi"));    //this will not duplicate an existing element


//Let's try adding this car brand to the array
UpdateResult ur = mongoTemplate.updateFirst(addCarQuery, addCarUpdate, Dealership.class);
if (ur.getModifiedCount() == 1)  // we added it - our job is done
    return;
else   
    System.out.println("Already Audis on the lot!");


//------ At this point we already have an entry for our brand, so let's increment the count

//This is the query for incrementing the count of cars
Query updateQuery = new Query().addCriteria(Criteria.where("id").is("Annapolis"));
updateQuery.addCriteria(Criteria.where("carsOnLot").elemMatch(Criteria.where("carBrand").is("Audi")));

Update updateUpdate = new Update().inc("carsOnLot.$.count", 1); //increment the count for Audis by 1

ur = mongoTemplate.updateFirst(updateQuery, updateUpdate, Dealership.class);
if (ur.getModifiedCount() == 1)    // we incremented it - our job is done
    return;

// Log a failure, something isn't right
Scary answered 22/4, 2023 at 21:9 Comment(0)
