How does raft handle committing entries from previous one?
Asked Answered
M

5

9

In raft paper section 5.4.2

If a leader crashes before committing an entry, future leaders will attempt to finish replicating the entry. However, a leader cannot immediately conclude that an entry from a previous term is committed once it is stored on a majority of servers. There could be a situation where an old log entry is stored on a majority of servers, yet can still be overwritten by a future leader.

The author mentioned that to avoid the situation above

To eliminate problems like the one in Figure 8, Raft never commits log entries from previous terms by counting replicas. Only log entries from the leader’s current term are committed by counting replicas; once an entry from the current term has been committed in this way, then all prior entries are committed indirectly because of the Log Matching Property.

But wouldn't the same problem still occur?

Given the following situation that the author provided

edge case

When S5 is elected leader it only looks at its current committed log which is (term3, index1) and this is going to override term2 entries in all followers.

How does making a leader looking at its own committed log solve the problem?

Menashem answered 14/9, 2017 at 15:46 Comment(0)
J
6

Read the caption on this image. Both (d) and (e) are possible resolutions to the state of the log produced by (a), (b), and (c). The problem is, even though in (c) entry (2, 2) was replicated to a majority of the cluster, this is illustrating that it could still be overwritten when S5 is elected in (d). So, the solution is to only allow nodes to commit entries from their own term. In other words, replicating an entry on a majority of nodes does not equal commitment. In (c), entry (2, 2) is replicated to a majority of the cluster, but because it's not in the leader's term (at least 4) it's not committed. But in (e), after the leader replicates an entry from its current term (4), that prevents the situation in (d) from occurring since S5 can no longer be elected leader.

Jennette answered 15/9, 2017 at 18:5 Comment(3)
OP asked me when I answered something similar, but I didn't know the answer. In order not to wait for OP, let me ask you the same: assume S1 replicated 4 to S2 and S3 as shown in the picture, considered it committed and at the same time was cut off from the other nodes. How do S2 and/or S3 know to prevent S5 from becoming the leader?Vassily
@Vassily Reading the receiver's implementation described in the protocol, the candidate does not get the vote of a follower unless its log is not at least up to date with the follower's log.Technique
@Jennette say we ended up in case (d), according to the rule in the paper, we never committed 3. So now: (1) we might have interleaved committed and non committed log entries? the paper seems saying once current entry is committed, we can conclude all previous entries are committed. have interleaved committed and non committed log entries contradict this. (2) say server 1 crashes, and then reboots, since committed info is volatile and not stored, server 1 would faithfully reapply all log entries it has captured. thus it will execute 3, an entry that is never committed, thus causing discrepancy?Quintuplet
R
0

After S1 replicates entry 4 with a higher term than 2 and 3. S5 will no longer be elected as leader, since the leader election strategy of Raft:

Raft determines which of two logs is more up-to-date by comparing the index and term of the last entries in the logs. If the logs have last entries with different terms, then the log with the later term is more up-to-date. If the logs end with the same term, then whichever log is longer is more up-to-date.

So, in my opinion, the appended log entry 4 in (e) implicitly promote all the entries' term before it. Because what we only care about is the term or the last entry, rather than entry 2 any more.

This is just like what the proposer do in Phase 2 of Paxos:

If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

That's say, propose the learned value 2 with a higher propose number.

Rimrock answered 30/11, 2017 at 10:14 Comment(0)
H
0

I think both situations in figure 8 (d) and (e) are legal in Raft because the paper says:

To eliminate problems like the one in Figure 8, Raft never commits log entries from previous terms by counting replicas. Only log entries from the leader’s current term are committed by counting replicas.

In figure 8(d) the entries with term 2 is not in the local log of leader S5, and they are not committed to the state machine. It is ok to overwrite them with entries with term 3. Only entries in leader's current log are eligible to be considered as committed by counting the number of replicas.

Heirdom answered 7/3, 2019 at 3:59 Comment(0)
T
0

If we allow entry from the previous term being committed, after (c), entry numbered 2 will be committed. After that, if 3 is selected as the leader, it will overwrite the committed 2. Thus, S5 and S1 will execute different commands. To prevent that, we will not allow 2 committed. Thus, the commands that are executed in all state machines will become consistent.

Technique answered 16/11, 2019 at 10:45 Comment(0)
V
-1

I think the question is that the leader crashed after few followers applied log to the state machine. In (c) in the figure, the Leader has copied the log of term 2 to more than half of the nodes, and then the Leader updates the commitIndex to 2 and sends the heartbeat. And before leader crash, only S2 received the heartbeat and applies the log of term 2 to the state machine .According to the paper, S5 can be new leader by votes from S3 and S4, and try to append the log of term 3 to S2~S4. But, this operation should not be allowed, because S2 has already applied the log at index 2 to state machine. Seems Raft does not cover this situation

Vena answered 10/12, 2019 at 8:57 Comment(1)
I think i know the problem is. In term 4, S1 will not send rpc with commit index 2 because it does not know whether the log entry for term 2 has been replicated to major nodes.Vena

© 2022 - 2025 — McMap. All rights reserved.