How does the messenger maintain the sequencing of the messages during chat and when users log in again?
Asked Answered
P

1

5

I was asked this question in an interview and was unable to answer it.

How does FB messenger order the messages on user side when two messages are concurrent in order to avoid view difference in display order during the chat period and when user visits the messenger again. I thought that we can store a timestamp with each message, which is the time the message is received by the server. However, this will not ensure the correct ordering of messages for clients.

Take a scenario where the server timestamp cannot determine the exact order of messages would look like this:

  1. User-1 sends a message M1 to the server for User-2.
  2. The server receives M1 at T1.
  3. Meanwhile, User-2 sends a message M2 to the server for User-1.
  4. The server receives the message M2 at T2, such that T2 > T1.
  5. The server sends message M1 to User-2 and M2 to User-1.
  6. So User-1 will see M1 first and then M2, whereas User-2 will see M2 first and then M1.

I read that resolve this issue, we can use Vector clocks but was unable to understand how the message sequencing be preserved for different users during the chat and when users log in again.

In the above scenario, user1 will see M1 followed by M2 whereas user2 will see M2 followed by M1. Now if each user also generates a sequence number or timestamp for each of its message to each of the client (separately). Then in scenario above user1 will send message M1 with sequence <1 (user1 seq), 0(user2 seq) > and user2 will send message M2 with sequence <0 (user1 seq), 1(user2 seq) >. So when both the message arrives at user1 and user2 they will have: M1 <1, 0> M2 <0, 1>

Now let’s say user1 sends more messages M3 <2, 1> and M4 <3, 1> then each of client will have following msgs. M1 <1, 0> M2 <0, 1> M3 <2, 1> M4 <3, 1>

So in this case when the user is logged in the display order for user-1 and user-2 during chat will be M1,M2,M3,M4 and M2,M1,M3,M4 respectively. Now, I want to know how will the same order be preserved for user-1 and user-2 end when the login again ?

Thanks.

Pythian answered 29/1, 2021 at 11:22 Comment(4)
I think you're overthinking it. We can assume that the servers have some reliable atomic sequencing such that no two messages between people can have the exact same sequence. As long as that sequence is included in the response then they can order them on the client side too, even if that means a message changes position when messages are sent very quickly. This can be observed with current implementations of popular messaging services.Malaguena
Isn't a realtime DB that atomically tracks the sequence, present on servers as default?!. Also when the client side commit is made, then the client app can store state locally which can then be retrieved from locally stored DB on both sides upon next login. This info from client apps can then be synchronized(as soon as logged in) with server which should correct any invalid sequences in its own DB store and dispatch pending messages to either side.Legume
every messenger apps use some kinda local db caches for uncommited messages to server when offline, if for some reason the db is out of sync the latest info can be fetched from server and re-written locally. if the app is completely online, then there is no need for local cache, as there will be only one central server that will maintain sequence.Legume
@Legume : The comments written by you is not correct. It's a distributed system. Challenges are pretty different.Pythian
E
8

The problem here is how we will generate a consistent chat conversation for each user from these sequence numbers.

Let's assume a conversation between Alice and Bob.

Message sequence structure:

message<Alice seq number,  Bob sequence number>

One thing to note is numbers in M1, M2, M3,... are just used to differentiate the messages and don't have any relation with the actual message sequence.

View from Alice side:

1) Alice sends M1<1,0>
2) Bob sends M2<1,1>
3) Alice sends M3<2,1>
Now, Bob sends one message(M5) but before Alice gets that, Alice sends one more message.
4) Alice sends M4<3,1>
And now, she received a message from Bob.
5) Bob sends M5<2,2> 
Since Bob didn't get M4 before sending M5 the Alice sequence number in M5 is 2. 
If he would have got that, the M5 would look like M5<3,2>.

Now, View from Bob side:

1) Alice sends M1<1,0>
2) Bob sends M2<1,1>
3) Alice sends M3<2,1>
Now, Bob sends message M5 before getting M4 from Alice
4) Bob sends M5<2,2>
5) Alice sends M4<3,1>

Now when Alice logins next time server will fetch the data and sort it:

1) First sort with Bob sequence number. 
2) if two or more messages have the same Bob's sequence number then sort it in Alice's sequence number within them.

Similarly for Bob

1. First sort the message-ids with respect to Alice sequence number.
2. if two or more messages have the same Alice's sequence number then sort it in Bob's sequence number within them.

So for Alice, it would be in the order of Bob's sequence number:

M1<1,0>  
M2<1,1>  
M3<2,1>  
M4<3,1>  
M5<2,2>  

For Bob, it would be in the order of Alice's sequence number:

M1<1,0>  
M2<1,1>  
M3<2,1>  
M5<2,2>  
M4<3,1>

How we will store the message sequences in the database:

enter image description here

How a client will know which is his/her sequence number?

In our example we decide that the first number will be Alice sequence number and the second will be Bob's. But in real-time how this decision will be made. This can easily be solved if we make a convention that the first sequence number will always be the sender's sequence number and the second one is receivers. So when someone receives a message then he knows that the first sequence number is the sender's sequence number. and when he prepares the next message he increments his sequence number from the last received message and puts it in the first place and takes the sender's sequence number from the received message and put it in second place.

How server will know that which sequence number has to be stored where?

Now since we defined the above convention if the server gets a message from Alice the first field will be Alice sequence number and the second will be Bob's sequence number so it will store in that way. Similarly, it does it for Bob also.

Note: I was also looking for the solution for the above problem but didn't get anything on the net that can help so made my own solution. Please correct me if it breaks any use case so that we can improve it or try something else.

Extraterritorial answered 18/9, 2021 at 6:34 Comment(1)
How will this work in group chat ?Emendation

© 2022 - 2024 — McMap. All rights reserved.