Correct way to update multiple documents inside a transaction

TL;DR: Should I use Transaction.getAll(), or use a for loop and update the documents one by one using Transaction.get()?

Consider the following schema:

- someTopLevelCollection
   - accountId1
       - projects
       - tasks
       - users
   - accountId2
       - projects
       - tasks
       - users

Whenever a project or task is created in the projects or tasks subcollection, the corresponding counter (say, projectsCount or tasksCount) is updated in the users collection.

References to the users are kept inside each project and task as an array of userIds, as follows:

Note: other fields removed for the sake of brevity.

project/tasks structure:

{
   "name": "someName",
   "status": "someStatus",
   "users": [
       "userId1",
       "userId2",
       ....
   ],
   ...
}

Now I need to update, say, a counter for every userId in the users array using a transaction.

Method 1:

const accountId = context.params.accountId;
const userIds = snapshot.data().users;

userIds.forEach(userId => {

    const userDocRef = db.collection(someTopLevelCollection)
        .doc(accountId)
        .collection('users')
        .doc(userId);

    let transaction = db.runTransaction(transaction => {
        return transaction.get(userDocRef)
            .then(doc => {
                const snapshotData = doc.data();
                let newCounterValue = snapshotData[counterName] + 1;
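                // Computed key: writes the field named by counterName, not a field literally called "counterName"
                transaction.update(userDocRef, {[counterName]: newCounterValue});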
                return Promise.resolve(`Incremented ${counterName} to ${newCounterValue}`);
            });

    }).then(result => {
        console.log('Transaction success!', result);
        return true;

    }).catch(err => {
        console.error('Transaction failure:', err);
        return false;
    });

});

Method 2:

const accountId = context.params.accountId;
const userIds = snapshot.data().users;

let userDocRefs = [];
userIds.forEach(userId => {
    const userDocRef = db.collection(someTopLevelCollection)
        .doc(accountId)
        .collection('users')
        .doc(userId);
    userDocRefs.push(userDocRef)
});

let transaction = db.runTransaction(transaction => {
    // In the Node.js SDK, getAll() takes the refs as separate arguments, so spread the array
    return transaction.getAll(...userDocRefs)
        .then(docs => {

            docs.forEach(doc => {
                const snapshotData = doc.data();
                let newCounterValue = snapshotData[counterName] + 1;
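                // Computed key: writes the field named by counterName, not a field literally called "counterName"
                transaction.update(doc.ref, {[counterName]: newCounterValue});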
            });
            return Promise.resolve('Completed transaction successfully');

        });

}).then(result => {
    console.log('Transaction success!', result);
    return true;

}).catch(err => {
    console.error('Transaction failure:', err);
    return false;

});

The following are my questions:

  1. When a document is changed externally while getAll() is executing, are all the documents fetched again to maintain consistency? If so, what's the use case for getAll()?
  2. When transactions are run one by one in a for loop and the document currently being modified is changed externally, is the transaction retried just for that document?

Thanks.

Janitajanith answered 24/5, 2019 at 12:17

First the brief answers, then explanations:

  1. Yes, when changes happen externally to a ref in your getAll(), the entire transaction will be retried (i.e. all the docs will be re-read). More nuance below.
  2. No, the transaction isn't retried for just that one document. Transactions are all-or-none. It will either all succeed, or everything will be re-run again if it fails (including all reads and all writes). But the answer's subtler, given the context you've provided; more below on that.

What's With getAll?

The deeper answer to your first question is subtle because Firestore locks differently between mobile/web and server.

  • Because of longer latency for mobile/web, transactions lock doc refs optimistically. So if you're coming from mobile/web, writes can indeed happen to one of your docs while you're in the middle of a transaction. In those cases, the entire transaction will be retried (i.e. the entire function you passed to db.runTransaction will be run again — that is, you'll literally re-pull the latest versions of all the docs, then attempt to write to each of them).
  • But server-side Firestore calls (e.g. using the Admin SDK) lock pessimistically because it's assumed servers have low latency and good connections. When it comes to data contention, your getAll() call on the server blocks all other attempts to write to those docs until your transaction completes. In other words, it's not possible in server-side Firestore transactions for a "document change [to] occur externally." You'll definitely read the docs successfully, and they won't change from outside while you're mid-transaction.

And then you mentioned "why getAll"? The reason is that you want the atomicity guaranteed by a batch fetch. Firestore transactions require that all reads are done before any writes. When you need to read and edit a lot of docs atomically, you'll likely want to getAll.
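
A minimal sketch of that read-everything-then-write pattern, assuming the Node.js Admin SDK, where getAll() takes the references as separate arguments (hence the spread). The function name incrementCountersTogether and the db parameter (e.g. admin.firestore()) are placeholders for illustration and are not from the question:

```javascript
// Hypothetical helper: `db` is assumed to be a Firestore instance from the
// Node.js Admin SDK, `userDocRefs` an array of DocumentReferences.
async function incrementCountersTogether(db, userDocRefs, counterName) {
  return db.runTransaction(async (transaction) => {
    // All reads first: getAll() takes the refs as separate arguments.
    const docs = await transaction.getAll(...userDocRefs);

    // Then all writes. Either every update commits, or none does.
    docs.forEach((doc) => {
      // Treat a missing document or missing field as a counter of 0.
      const current = (doc.data() || {})[counterName] || 0;
      transaction.update(doc.ref, { [counterName]: current + 1 });
    });

    return docs.length; // number of user docs queued for update
  });
}
```

Note that update() fails for a document that doesn't exist, so this sketch assumes every user doc is already present.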

So in your example "Method #1", the full set of documents you're editing isn't guaranteed to be changed atomically. Method #1 only provides atomicity at the per-document level, meaning that if these changes somehow need to be consistent with each other, Method #1 is not the right way to do things. I suspect you'll want to use Method #2, given your description of how the user records interrelate with the root document that has all the userIds in it.

Note, FWIW, that Firestore now supports atomic increments which accomplish what Method #1 wants. Atomic increments guarantee, without needing transactions, that a number will be written with an incremented value atomically (i.e. without others writing to it in between your reading and writing the number). But these atomic increments are still guaranteed only at the per-doc (well, per-number) level — so if your change needs to ripple across a bunch of docs in order to be consistent with each other, you'll still want to use Method #2.
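
As a rough sketch of the increment approach: the helper below issues one independent update per user doc using FieldValue.increment(1). The function name incrementCountersIndependently is made up for illustration, and FieldValue is assumed to be passed in from the SDK (e.g. admin.firestore.FieldValue in the Node.js Admin SDK):

```javascript
// Hypothetical helper: `FieldValue` is assumed to be the SDK's FieldValue
// class, `userDocRefs` an array of DocumentReferences.
function incrementCountersIndependently(FieldValue, userDocRefs, counterName) {
  // One independent update per user doc; no read and no transaction needed.
  // Each increment is atomic on its own, but the set of updates is not:
  // some may succeed while others fail.
  return Promise.all(
    userDocRefs.map((ref) =>
      ref.update({ [counterName]: FieldValue.increment(1) })
    )
  );
}
```

This replaces Method #1's per-document transactions, not Method #2's all-or-none guarantee.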

Transaction Retries

Suppose you're on mobile or web, so your transaction holds only an optimistic lock on all the doc refs. In this case, it's possible for another client to write to one of your docs while you're mid-transaction.

Since transactions are all-or-none, when someone writes to even a single doc that you've read in your transaction, the entire transaction is re-run. The whole function you passed to runTransaction is attempted again; everything is re-read, everything is re-updated. (Don't ask me whether you get charged reads/writes again on a failed transaction's reattempt... I have no idea.) By default, Firestore clients will automatically attempt your transaction up to five times, after which they give up and throw an exception.
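
One practical consequence is that the function you pass to runTransaction should be safe to run more than once. The sketch below counts how many times the body executes. It assumes the Node.js Admin SDK, whose runTransaction accepts an optional { maxAttempts } setting as far as I know; the name countAttempts is hypothetical:

```javascript
// Hypothetical helper: `db` is assumed to be a Firestore instance and
// `docRef` a DocumentReference to an existing document.
async function countAttempts(db, docRef, counterName) {
  let attempts = 0;

  const newValue = await db.runTransaction(async (transaction) => {
    attempts += 1; // incremented on every attempt, including retries

    // The read and the write below are both repeated on each retry.
    const doc = await transaction.get(docRef);
    const next = ((doc.data() || {})[counterName] || 0) + 1;
    transaction.update(docRef, { [counterName]: next });
    return next;
  }, { maxAttempts: 5 }); // assumed option; 5 mirrors the default noted above

  return { attempts, newValue };
}
```

If attempts comes back greater than 1, the whole body was re-run at least once, which is why side effects such as logging or sending notifications are better kept outside the transaction function.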

Re-attempting the whole transaction, instead of just the one document that was changed, is necessary because the entire premise of transaction locking is to guarantee that a set of interdependent changes either happens to all documents or to no documents. So re-attempting only a single doc read/write won't be what you want (because you chose to use a transaction presumably because the doc changes are interrelated).

Thicket answered 4/2, 2023 at 9:2
