What do matrix clocks solve but vector clocks can't?

Asked 26/1, 2014 at 3:12 Answered 10/8, 2014 at 11:41

distributed-computing clock vector-clock

I understand the need for vector clocks in terms of scalar logical clocks failing to provide enough information to tell whether there is an update conflict in a key value store update for example.

But I am not sure what problem is still unsolved by vector clocks and then solved by the more bulky matrix clocks?

Bowens answered 26/1, 2014 at 3:12 Comment(0)

In an eventual consistency environment all messages ever created by a system need to be kept until every peer has received the message (== eventual consistency). But you don't want to keep messages for ever, so you need to have a way to tell which messages were received by all nodes and can be deleted, this is why you use matrix clocks.

Matrix clocks are a list of vector clocks, so you know the current state of each node in the system. Based on this you can know which peer received already which messages. When you exchange messages with another node in the system you compare the matrix clocks and remember always the highest values for each node. Afterwards you can delete messages which were sent before, because the node already must have received them.

This is a very brief description of TSAE (timestamped anti-entropy) protocol. You can read more about it in the dissertation project Weak-consistency group communication and membership by Richard Andrew Golding from 1992 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.7385&rep=rep1&type=pdf) starting from chapter 5.

Determination answered 28/1, 2014 at 1:48 Comment(8)

Couldn't you find out what other nodes know eventually through the use of vector clocks? (or would this take a lot longer, and the number of messages would be a lot longer?) – Bearish 13/11, 2014 at 12:19

yes, you'd need to talk to every node p2p to get there vector clocks and know what they know. matrix clocks you the info from every node you communicate with – Determination 13/11, 2014 at 12:21

Are there any real world systems that uses matrix clock for this very purpose of eventual consistency? Or do newer eventual consistency systems do something different? – Distinguished 19/11, 2022 at 8:40

It seems for this approach to work with dynamic number of processes entering and leaving an eventually consistent system, there needs to be some process discovery algorithm to let each host know which hosts has shut down and no longer needs to be kept in the 'dictionary (instead of vector) clock' – Distinguished 19/11, 2022 at 8:46

Another paper for this purpose being described by peter Paper: Discarding Obsolete Information in a Replicated Database System (1987, TSE Transactions on Software Engineering) S. K. Sarin groups.csail.mit.edu/tds/papers/Lynch/tse87.pdf – Distinguished 19/11, 2022 at 8:51

If we resort to the node talking to every other node to get their vector clock, each host wil still need to cache them locally as it awaits updates from outdated hosts, resulting in creating a matrix clock by definition. Hence, it might be faster to just send matrix clock with a few peers to allow the information to propagate transitively instead of talking to every node. – Distinguished 19/11, 2022 at 9:23

I haven't come across a real world system using matrix clocks yet. These highly available distributed systems would only work, if all concurrent messages are commutative. And if you have enough participants in the system, most messages will be concurrent to other messages, basically all of them have to be commutative, to guarantee that everyone reaches the same state eventually. And this is difficult to achieve. You could try to use CRDTs, but you can't model everything with CRDTs. – Determination 23/11, 2022 at 10:55

Additionally, in practice you often times need to implement sharding. So you don't even want everyone to have the same set of information. You could look at systems like syncthing.net which allows you to sync files between different nodes, and no node is the server. But maybe they just do a diff of the file database, because that's easier than storing a long list of events and making sure everything was propagated correctly. Also how would you deal with peers which disappear for a very long time, you never know if they ever come back online, if you need to keep those events around. – Determination 23/11, 2022 at 11:1

The distinctions among Lamport clock (scalar logical clock, in your term), vector clock, and matrix clock lie in that they represent different levels of knowledge.

For vector clock $vt_i[1 \ldots n]$ in site $i$, the entry $vt_i[k]$ represents the knowledge the site $S_i$ has about site $S_k$. The knowledge has the form of "$i$ knows $k$ that $\ldots$".

For matrix clock $mt_i[1 \ldots n, 1 \ldots n]$ in site $S_i$, the entry $mt_i[k,l]$ represents the knowledge the site $S_i$ has about the knowledge by $S_k$ about site $S_l$. The knowledge here the form of "$i$ knows $k$ knows $l$ that $\ldots$".

Intuitively, we can do more things with more knowledge.

The following description is mainly quoted from [1]:

Vector clocks and matrix clocks are widely used in asynchronous distributed message-passing systems.

Some example areas using vector clocks are checkpointing, causal memory, maintaining consistency of replicated files, global snapshot, global time approximation, termination detection, bounded multiwriter construction of shared variables, mutual exclusion and debugging (predicate detection).

Some example areas that use matrix clocks are designing fault-tolerant protocols and distributed database protocols, including protocols to discard obsolete information in distributed databases, and protocols to solve the replicated log and replicated dictionary problems.

For matrix clock, we notice that $min_k(mt_i[k,i]) \ge t$ means that site $S_i$ knows that every other site $k$ knows its progress till its local time $t$.

It is this property that allows a site to no longer send an information with a local time $\le t$ or to discard obsolete information.

[1] Concurrent Knowledge and Logical Clock Abstractions Ajay D. Kshemkalyani 2000

Injection answered 10/8, 2014 at 11:41 Comment(0)

Recommended topics

Hot tags