In Kafka, is each message replicated across all partitions of a topic?
P

3

49

If a topic has 4 partitions and a publisher sends a message to the topic, will that same message be replicated across all four partitions, or written to only one?

Phelips answered 27/6, 2017 at 18:36 Comment(4)
why down vote this?????Phelips
Please check this little article on Kafka replication: meuslivros.github.io/kafka/ch04s04.html. I just upvoted the question. The one who downvoted without any reason is not cool.Kallman
In fact, replication is not done across partitions; it is done across the nodes of a multi-broker Kafka cluster.Kallman
I didn't vote on this but a valid reason for down-votes can be (as per SO guidelines): "This question does not show any research effort". I'd agree with that and it's not uncool to downvote such questions.Sopping
H
43

Partitioning and replication are two different things.

Partitioning is for scalability. A topic is split into one or more partitions distributed across different brokers, so that more consumers can connect to those brokers and receive messages from the same topic but from different partitions. Adding partitions increases scalability and allows more consumers to read from the same topic in parallel. To answer your question: each message sent to a topic goes into exactly one partition of that topic.

Replication is for fault tolerance. You can specify a replication factor when creating a topic; it means that every partition of that topic is stored as that many replicas, each on a different broker. One replica is the "leader", to which producers send messages and from which consumers read them; the other replicas are "followers", which hold copies of the messages from the leader. If the broker hosting the leader replica goes down, one of the followers becomes the new leader.
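To make the difference concrete, here is a minimal sketch in plain Python (no Kafka client involved). It is a toy model, not Kafka's actual placement or election algorithm: the function names `assign_replicas` and `fail_broker` are made up for illustration, the round-robin placement is an assumption, and real leader election only considers in-sync replicas.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Toy placement: put each partition's replicas on distinct brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for p in range(num_partitions):
        # The first broker in each list plays the role of the leader.
        assignment[p] = [
            brokers[(p + i) % len(brokers)] for i in range(replication_factor)
        ]
    return assignment


def fail_broker(assignment, dead):
    """Drop a dead broker; the next surviving replica moves to the front."""
    return {p: [b for b in replicas if b != dead]
            for p, replicas in assignment.items()}


# 4 partitions, 3 brokers, replication factor 2.
assignment = assign_replicas(num_partitions=4, brokers=[0, 1, 2],
                             replication_factor=2)
after_failure = fail_broker(assignment, dead=0)

for p in assignment:
    print(f"partition {p}: replicas {assignment[p]} -> {after_failure[p]}")
```

A message written to partition 0 exists only in partition 0; what gets copied to other brokers is that partition's log, not the message into the other three partitions.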

Huzzah answered 28/6, 2017 at 6:17 Comment(6)
thank you for your detailed answer. So, does this mean that if you split a topic into 2 partitions you won't be able to guarantee the order in which they are consumed?Phelips
Kafka guarantees message ordering only per partition, not per topic. This means that if you need a specific order for some messages, you have to assign them a key, so that the producer sends messages with the same key to the same partition and you get the order you need. If you don't use a key, messages are spread across the partitions in round robin and ordering across them is lost.Huzzah
ok, but how about if you want to horizontally scale your consumers, say 5 consumers. They all need to subscribe to the same topic and the load of the topic needs to be spread across them. If we use partitions then how do we deal with the "Broadcast" case where we want exactly the same message to go to each consumer..do we need to put it on each partition? Can a single producer write the same message to all partitions?Phelips
It works using consumer groups. When consumers belong to the same consumer group, each of them reads from one or more partitions of the topic. Remember that, within a group, each partition can be read by only one consumer. For example, with a topic of 4 partitions and 2 consumers, each consumer reads from 2 partitions; with 4 consumers, each gets 1 partition; with 5 consumers, the last one sits idle because no partition is left for it. Consumers in different consumer groups all receive the same messages, like a "broadcast".Huzzah
Please clarify what "receive same messages as a broadcast" means. If consumers are in different groups (or no group was set), do they receive all of the topic's messages, meaning they read all partitions?Marquise
For example, take 1 topic with 2 partitions, 0 and 1, and two consumer groups, A and B. If A has one consumer, it reads from both partitions 0 and 1. If B has one consumer, it also reads from the same partitions 0 and 1, so it gets the same messages. Consumers split the partitions among themselves only when they belong to the same consumer group.Huzzah
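The numbers from the comments above (4 partitions with 2, 4, or 5 consumers, and two independent groups) can be reproduced with a small plain-Python sketch. It does not talk to Kafka; `assign_partitions` is a hypothetical helper, and its simple round-robin split is only an assumption standing in for whichever assignor a real consumer group is configured with.

```python
def assign_partitions(partitions, consumers):
    """Split partitions among the consumers of ONE group.

    Each partition ends up with exactly one owner inside the group.
    """
    owned = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        owned[consumers[i % len(consumers)]].append(p)
    return owned


partitions = [0, 1, 2, 3]

# Same group, 2 consumers: each one reads 2 partitions.
two = assign_partitions(partitions, ["c1", "c2"])

# Same group, 5 consumers: one consumer is left with nothing to read.
five = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5"])

# Different groups are assigned independently, so a lone consumer in
# group A and a lone consumer in group B both read every partition.
groups = {
    "A": assign_partitions([0, 1], ["a1"]),
    "B": assign_partitions([0, 1], ["b1"]),
}

print(two)
print(five)
print(groups)
```

The "broadcast" case therefore needs no extra writes by the producer: the message is stored once, in one partition, and every group reads it from there.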
A
12

Replication does not occur across partitions. Each message goes into a single partition of the topic, no matter how many partitions the topic has.

If you have set the topic's replication factor to a number larger than 1 (assuming you have multiple brokers running in the cluster), then each partition of the topic is replicated across those brokers.

Amandie answered 27/6, 2017 at 19:19 Comment(0)
S
1

"In Kafka, is each message replicated across all partitions of a topic?" - Answer: No.

Each Kafka topic's partitions can hold distinct data. Which partition receives a record is decided by the Kafka producer. When sending data to a topic, you can optionally supply a key. If a key is provided, the default partitioner picks the partition by hashing the key (murmur2). If no key is provided, or it is null, records are spread evenly across the partitions: round-robin in older clients, and a "sticky" batch-by-batch strategy by default since Kafka 2.4.
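As a rough illustration of that decision, the sketch below mimics a producer-side partitioner in plain Python. It is not the real client code: `ToyPartitioner` is a hypothetical class, `zlib.crc32` is used as a stand-in for the murmur2 hash, and the unkeyed branch shows the simple round-robin behaviour rather than the newer sticky strategy.

```python
import itertools
import zlib


class ToyPartitioner:
    """Pick one partition per record, the way a producer would."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition(self, key):
        if key is None:
            # No key: rotate through the partitions in turn.
            return next(self._counter) % self.num_partitions
        # Keyed: a deterministic hash, so equal keys map to the same
        # partition (crc32 here is only a stand-in for murmur2).
        return zlib.crc32(key) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=4)

# Records with the same key always land in the same single partition.
keyed = [partitioner.partition(b"order-42") for _ in range(3)]

# Records without a key are spread over the partitions.
unkeyed = [partitioner.partition(None) for _ in range(4)]

print(keyed, unkeyed)
```

Either way, every call returns a single partition number, which is why a message is never written to all partitions of the topic.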

Within a single consumer group, each partition of a topic can be assigned to only one consumer. However, the same topic can be read by multiple consumer groups.

Slaby answered 5/4 at 5:43 Comment(0)
