Can a Kafka producer create topics and partitions?

I am currently evaluating different messaging systems, and there is a question about Apache Kafka that I could not answer myself.

Is it possible for a Kafka producer to create topics and partitions dynamically (including adding partitions to existing topics)? If so, are there any disadvantages to doing this?

Thanks in advance.

Jacie answered 22/4, 2017 at 19:58 Comment(0)
25

Updated:

The Kafka broker has a property: auto.create.topics.enable

If it is set to true, the broker automatically creates a topic when a producer publishes a message to a topic name that does not exist yet.

The Confluent team recommends against this: depending on your environment, the resulting explosion of topics can become unwieldy, and every auto-created topic gets the same broker defaults. It is important to have a replication factor of at least 3 to ensure the durability of your topics in the event of disk failure.

Floppy answered 24/4, 2017 at 4:55 Comment(1)
Thanks. In my case I want to have a topic/partition for each device (producer). I do not know how many devices there will be, so I want to add them dynamically. The above solution sounds a bit "sluggish". I guess a classic Pub/Sub system might work better. Jacie
6

When you start your Kafka broker, you can define a number of properties in the conf/server.properties file. One of these properties is auto.create.topics.enable; if it is set to true (the default), Kafka automatically creates a topic when you send a message to a nonexistent topic. The number of partitions is taken from the default settings in the same file.
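
As a rough illustration, the relevant entries in server.properties might look like the fragment below. The property names are standard broker settings; the values are only example assumptions, not recommendations for any particular cluster.

```properties
# Let the broker create a topic the first time a message is sent to it
auto.create.topics.enable=true
# Defaults applied to every auto-created topic (example values)
num.partitions=3
default.replication.factor=3
```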

Disadvantage: as far as I know, topics created this way always get the same default settings (partitions, replicas, ...).

Volume answered 25/4, 2017 at 14:15 Comment(1)
Thus, because of the downside of having the same number of partitions for ALL topics, this is not a viable solution. Parttime
3

From Java you can create a topic if needed. Whether that is recommended depends on the use case; for example, it can be useful if the topic name is a function of the incoming payload to the producer. The following code snippet works with Kafka 0.10.x:

void createTopic(String zookeeperConnect, String topicName) throws InterruptedException {
    int sessionTimeoutMs = <some-int-value>;
    int connectionTimeoutMs = <some-int-value>;

    ZkClient zkClient = new ZkClient(zookeeperConnect, sessionTimeoutMs, connectionTimeoutMs, ZKStringSerializer$.MODULE$);

    boolean isSecureKafkaCluster = false;
    ZkUtils zkUtils = new ZkUtils(zkClient, new ZkConnection(zookeeperConnect), isSecureKafkaCluster);

    // Empty properties: no topic-level config overrides
    Properties topicConfig = new Properties();
    int partitions = 1;
    int replicationFactor = 1;
    try {
        AdminUtils.createTopic(zkUtils, topicName, partitions, replicationFactor, topicConfig,
            RackAwareMode.Disabled$.MODULE$);
    } catch (TopicExistsException ex) {
        // The topic already exists; log it and continue
    } finally {
        // Release the ZooKeeper connection even if topic creation fails
        zkClient.close();
    }
}

Note: The number of partitions of an existing topic can only be increased, never decreased.

Marceau answered 27/4, 2017 at 12:37 Comment(2)
We use a similar approach to create topics on the fly. What about partitions? Propylaeum
@Propylaeum The AdminUtils.createTopic() method takes both the number of partitions and the replication factor as arguments, so you can choose them accordingly. Marceau
1

For any messaging system, I don't think it is recommended to have the producer create topics, partitions, or queues dynamically.

For your use case, you can probably use device_id as the partition key to distinguish the messages. That way you can use a single topic.
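
To make the partition-key idea more concrete, here is a minimal sketch of how a key can be mapped to a partition. It is not Kafka's actual code: Kafka's default partitioner hashes the serialized key with murmur2, whereas this sketch uses String.hashCode() as a stand-in, so the concrete partition numbers will differ. The class name, the helper partitionFor, and the partition count of 6 are assumptions made up for illustration.

```java
final class DevicePartitionSketch {
    // Hypothetical helper: maps a key to a partition index in [0, numPartitions).
    // String.hashCode() is a stand-in for the murmur2 hash Kafka uses on the
    // serialized key, so real partition numbers will not match these.
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 6; // assumed partition count of the single shared topic
        String[] deviceIds = {"device-1", "device-2", "device-42", "device-1"};
        for (String id : deviceIds) {
            // The same device id always maps to the same partition
            System.out.println(id + " -> partition " + partitionFor(id, numPartitions));
        }
    }
}
```

The point of the sketch is that a device id never has to be registered in advance: any new key maps to one of the existing partitions, and all messages from one device stay in order on the same partition.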

Bermuda answered 26/4, 2017 at 4:43 Comment(3)
I thought about that. The problem is, I do not know all devices/device-ids. Or in other words, I want to add devices that publish data dynamically.Jacie
I don't think you need to worry about anticipating the keys (i.e. the devices). By default, Kafka picks the partition from a hash of the message key, so new keys need no setup. If you want to separate by device (i.e. key), you can create a stream that filters on the key name. Salmagundi
@Girdhar If device_id is used as the key, does the consumer first have to read all messages in the topic and then filter by device_id to get the relevant data? Adamant
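
The filtering discussed in these comments can be sketched in plain Java as follows. This is only an illustration of the idea, not Kafka consumer or Kafka Streams code: the records are modeled as simple key/value entries, and the class name, the helpers valuesForDevice and sampleRecords, and the sample data are all hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DeviceFilterSketch {
    // Hypothetical helper: keeps only the values whose key equals the wanted device id.
    static List<String> valuesForDevice(List<Map.Entry<String, String>> records, String deviceId) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> record : records) {
            if (deviceId.equals(record.getKey())) {
                result.add(record.getValue());
            }
        }
        return result;
    }

    // Made-up sample data standing in for messages read from one shared topic
    static List<Map.Entry<String, String>> sampleRecords() {
        List<Map.Entry<String, String>> records = new ArrayList<>();
        records.add(new SimpleEntry<>("device-1", "t=21.5"));
        records.add(new SimpleEntry<>("device-2", "t=19.0"));
        records.add(new SimpleEntry<>("device-1", "t=21.7"));
        return records;
    }

    public static void main(String[] args) {
        // Every record is read; only those keyed by "device-1" are kept
        System.out.println(valuesForDevice(sampleRecords(), "device-1"));
    }
}
```

In this sketch the consumer does see every record and discards the ones it does not need, which is the cost the question above is asking about.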
