How do distributed transactions work (eg. MSDTC)?
Asked Answered
M

1

14

I understand, in a fuzzy sort of way, how regular ACID transactions work. You perform some work on a database in such a way that the work is not confirmed until some kind of commit flag is set. The commit part is based on some underlying assumption (like a single disk block write is atomic). In the event of a catastrophic error, you can just clear out the uncommitted data in the recovery phase.

How do distributed transactions work? In some of the MS documentation I have read that you can somehow perform a transaction across databases and filesystems (among other things).

This technology could be (and probably is) used for installers, where you want the program to be fully installed or fully absent. You simply begin a transaction at the start of the installer. Next you could connect to the registry and filesystem, making the changes that define the installation. When the job is done, simply commit, or rollback if the installation fails for some reason. The registry and filesystem are automatically cleaned for you by this magical distributed transaction coordinator.

How is it possible that two disparate systems can be transacted upon in this fashion? It seems to me that it is always possible to leave the system in an inconsistent state, where the filesystem has committed its changes and the registry has not. I think in MSDTC it is even possible to perform a transaction across the network.

I have read http://blogs.msdn.com/florinlazar/archive/2004/03/04/84199.aspx, but it feels like only the beginning of the explanation, and that step 4 should be expanded considerably.

Edit: From what I gather on http://en.wikipedia.org/wiki/Distributed_transaction, it can be accomplished by a two-phase commit (http://en.wikipedia.org/wiki/Two-phase_commit). After reading this, I'm still not understanding the method 100%, it seems like there is a lot of room for error between the steps.

Molybdenite answered 11/9, 2008 at 6:1 Comment(1)
There is a great deal of room for error. In particular, it relies on "COMMIT PREPARED" always working. Reality differs.Raimes
T
4

About "step 4":

The transaction manager coordinates with the resource managers to ensure that all succeed to do the requested work or none of the work if done, thus maintaining the ACID properties.

This of course requires all participants to provide the proper interfaces and (error-free) implementations. The interface looks like vaguely this:

public interface ITransactionParticipant {
    bool WouldCommitWork();
    void Commit();
    void Rollback();
}

The Transaction manager at commit-time queries all participants whether they are willing to commit the transaction. The participants may only assert this if they are able to commit this transaction under all allowable error conditions (validation, system errors, etc). After all participants have asserted the ability to commit the transaction, the manager sends the Commit() message to all participants. If any participant instead raises an error or times out, the whole transaction aborts and individual members are rolled back.

This protocol requires participants to have recorded their whole transaction content before asserting their ability to commit. Of course this has to be in a special local transaction log structure to be able to recover from various kinds of failures.

Turoff answered 11/9, 2008 at 6:31 Comment(10)
What happens if you have participants A and B, A gets committed and returns success, then the B gets committed and returns failure but before A can be rolled back, the network drops? Another cases would be the network failure could prevent B from being committed.Casefy
B may not fail on Commit() after returning true on WouldCommitWork() and A would not be committed before all participants return true on WouldCommitWork(). All steps of the protocol have associated timeouts which cause automatic rollbacks when hit. If a part of the cluster fails it will have to replay the log from a non-failing member to be able to join again. Fail-proofing clusters against byzantine errors is a hot research topic and requires more than two participants.Turoff
The only way B can prevent failure is to lock out any changes that could cause failure, but OK so scratch the first option. OTOH hardware failure still remains. -- The cases I'm interested in is where any link can fail while both system continue to provide service to other clients. In that case the system need to be coherent even while a dropped link is down. I don't see how they can ever agree on if transaction is to be committed and be sure that every one comes to the same conclusion.Casefy
That's the point of WouldCommitWork(): only legal transactions pass on to Commit(). Regarding the split-brain possibility of having both A and B servicing users without being able to communicate anymore: You need at least 2N+1 (N>0) servers to avoid that and servers who find themselves in a minority group must shutdown. See techthoughts.typepad.com/managing_computers/2007/10/… and also en.wikipedia.org/wiki/Two-phase_commit_protocolTuroff
What about the simpler case where there is no cluster providing a service, just the 2 systems A and B trying to participate in a single transaction. I don't see how WouldCommitWork() can possibly guarantee that a call to Commit will work; after a successful call to B's WouldCommitWork() and A's Commit(), but before B's Commit() is called, B's hardware might fail/the network might die/etc... by which time A has already been committed and thus presumably cannot be rolled back? (Does this relate to the Two Generals Problem?)Kean
Actually I think my concern is addressed by the assumptions listed on Wikipedia's Two-Phase Commit page: "no node crashes forever, that the data in the write-ahead log is never lost or corrupted in a crash". I think this means that when the node is eventually back up (network or hardware) then it will eventually be committed.Kean
Exactly: As soon as a single node has actually Committed(), the transaction is committed and any failure recovery after this point has to take care to keep the transaction committed.Turoff
@DavidSchmitt, after reading this thread of comments I'm understanding that at least one person is concerned about uncontrollable circumstances inside the Commit method. However, correct me if I'm wrong, but transactions don't exist to protect against uncontrollable circumstances like hardware failure, they exist for logical operations that should stop the commit. Generally speaking, in our organization, if uncontrollable circumstances like that arise we end up having to roll up our sleeves, read logs and messages, and re-queue those transactions by hand. Does that sound right?Thracophrygian
@MichaelPerrenoud: As I've written in the last paragraph, the WouldCommitWork() function already has written the transaction's contents to some store, marked as pending commit. When restarting the system after a crash, the transaction manager can and has to decide for each in flight transaction whether it must be committed (when everyone has the pending commit data stored) or whether it must be rolled back.Turoff
What if the rollback causes another error, forcing a rollback of the rollback, which results in another error... etc, until you have recursive rollbacks...?Dihydric

© 2022 - 2024 — McMap. All rights reserved.