After a Kafka topic has been created by a producer or an administrator, how would you change the number of replicas of this topic?
Edit: I was proven to be wrong - please check excellent answer from Łukasz Dumiszewski.
I'm leaving my original answer for completness for now.
I don't think you can. Normally it would be something like
./kafka-topics.sh --zookeeper localhost:2181 --alter --topic test2 --replication-factor 3
but it says
Option "[replication-factor]" can't be used with option"[alter]"
It is funny that you can change number of partitions on the fly (which is often hugely destructive action when done in runtime), but cannot increase replication factor, which should be transparent. But remember, it is 0.10, not 10.0... Please see here for enhancement request https://issues.apache.org/jira/browse/KAFKA-1543
To increase the number of replicas for a given topic you have to:
1. Specify the extra replicas in a custom reassignment json file
For example, you could create increase-replication-factor.json and put this content in it:
{"version":1,
"partitions":[
{"topic":"signals","partition":0,"replicas":[0,1,2]},
{"topic":"signals","partition":1,"replicas":[0,1,2]},
{"topic":"signals","partition":2,"replicas":[0,1,2]}
]}
2. Use the file with the --execute option of the kafka-reassign-partitions tool
[or kafka-reassign-partitions.sh - depending on the kafka package]
For example:
$ kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
3. Verify the replication factor with the kafka-topics tool
[or kafka-topics.sh - depending on the kafka package]
$ kafka-topics --zookeeper localhost:2181 --topic signals --describe
Topic:signals PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=1000000000
Topic: signals Partition: 0 Leader: 2 Replicas: 0,1,2 Isr: 2,0,1
Topic: signals Partition: 1 Leader: 2 Replicas: 0,1,2 Isr: 2,0,1
Topic: signals Partition: 2 Leader: 2 Replicas: 0,1,2 Isr: 2,0,1
See also: the part of the official documentation that describes how to increase the replication factor.
{ "topics": [ { "topic": "YOUR_TOPIC_NAME_1" }, { "topic": "YOUR_TOPIC_NAME_2" } ], "version": 1 }
The command then looks like kafka-reassign-partitions.sh --zookeeper #.#.#.#:2181,#.#.#.#:2181,#.#.#.#:2181 --broker-list #,#,# --topics-to-move-json-file reassignment.topics.json --generate
–
Electrometer kafka-reassign-partitions
cause any downtime? I have some topics with a replication factor of 1 (default, forgot to specify when creating), and I'm wondering if my producers will get errors while partitions reassigned. –
Den You can also use kafkactl for this:
# first run with --validate-only to see what kafkactl will do
kafkactl alter topic my-topic --replication-factor 2 --validate-only
# then do the replica reassignment
kafkactl alter topic my-topic --replication-factor 2
Note that the Kafka API that kafkactl is using for this is only available for Kafka ≥ 2.4.0.
Disclaimer: I am contributor to this project
Edit: I was proven to be wrong - please check excellent answer from Łukasz Dumiszewski.
I'm leaving my original answer for completness for now.
I don't think you can. Normally it would be something like
./kafka-topics.sh --zookeeper localhost:2181 --alter --topic test2 --replication-factor 3
but it says
Option "[replication-factor]" can't be used with option"[alter]"
It is funny that you can change number of partitions on the fly (which is often hugely destructive action when done in runtime), but cannot increase replication factor, which should be transparent. But remember, it is 0.10, not 10.0... Please see here for enhancement request https://issues.apache.org/jira/browse/KAFKA-1543
Łukasz Dumiszewski's answer is correct but manually generating that file is a bit hard. Luckily there are some easy ways to achieve what @Łukasz Dumiszewski said.
If you are using
kafka-manager
tool, from version2.0.0.2
you can change the replication factor inGenerate Partition Assignment
section in a topic view. Then you should click onReassign Partitions
to apply the generated partition assignment (if you select a different replication factor, you will get a warning but you can click onForce Reassign
afterward).If you have ruby installed you can use this helper script
- If you prefer nodejs you can generate the file with this gist too.
The scripted answer of @Дмитрий-Шепелев did not include a solution for topics with multiple partitions. This updated version does:
#!/bin/bash
brokerids="1,2,3"
topics=`kafka-topics --list --zookeeper zookeeper:2181`
while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
"partitions":['
for t in $topics; do
sep=","
pcount=$(kafka-topics --describe --zookeeper zookeeper:2181 --topic $t | awk '{print $2}' | uniq -c |awk 'NR==2{print $1}')
for i in $(seq 0 $[pcount - 1]); do
if [ "${t}" == "${lines[-1]}" ] && [ "$[pcount - 1]" == "$i" ]; then sep=""; fi
randombrokers=$(echo "$brokerids" | sed -r 's/,/ /g' | tr " " "\n" | shuf | tr "\n" "," | head -c -1)
echo " {\"topic\":\"${t}\",\"partition\":${i},\"replicas\":[${randombrokers}]}$sep"
done
done
echo ' ]
}'
Note: it also randomizes the brokers and picks two replicas per partition. So make sure the brokerid's in the script are correctly defined.
Execute as follows:
$ ./reassign.sh > reassign.json
$ kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file reassign.json --execute
This script may help you, if you want change replication factor for all topics:
#!/bin/bash
topics=`kafka-topics --list --zookeeper zookeeper:2181`
while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
"partitions":[' > tmp.json
for t in $topics; do
if [ "${t}" == "${lines[-1]}" ]; then
echo " {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}" >> tmp.json
else
echo " {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}," >> tmp.json
fi
done
echo ' ]
}' >> tmp.json
kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file tmp.json --execute
If you have a lot of partitions, using kafka-reassign-partitions
to generate the json file required by Łukasz Dumiszewski's answer (and the official documentation) can be a timesaver. Here is an example of replicating a 64 partition topic from 1 to 2 servers without having to specify all the partitions:
expand_topic=TestTopic
current_server=111
new_servers=111,222
echo '{"topics": [{"topic":"'${expand_topic}'"}], "version":1}' > /tmp/topics-to-expand.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file /tmp/topics-to-expand.json --broker-list "${current_server}" --generate | tail -1 | sed s/\\[${current_server}\\]/\[${new_servers}\]/g | tee /tmp/topic-expand-plan.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /tmp/topic-expand-plan.json --execute
/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic ${expand_topic}
Outputs:
Topic:TestTopic PartitionCount:64 ReplicationFactor:2 Configs:retention.ms=6048000
Topic: TestTopic Partition: 0 Leader: 111 Replicas: 111,222 Isr: 111,222
Topic: TestTopic Partition: 1 Leader: 111 Replicas: 111,222 Isr: 111,222
....
1. Copy all topics to json file
#!/bin/bash
topics=`kafka-topics.sh --zookeeper localhost:2181 --list`
while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
"topics":['
for t in $topics; do
echo -e ' { "topic":' \"$t\" '},'
done
echo ' ]
}'
bash alltopics.sh > alltopics.json
2. Run kafka-reassign-partitions.sh to generate rebalanced file
kafka-reassign-partitions.sh --zookeeper localhost:2181 --broker-list "0,1,2" --generate --topics-to-move-json-file alltopics.json > reassign.json
3. Cleanup reassign.json file it contains existing and proposed values
4. Run kafka-reassign-partitions.sh to rebalance topics
kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file reassign.json --execute
In the first step we need to alter topics with replicas
./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2
then in the next step we need to identify brokers list where we need to sync our replicas and it requires topic rebalance to do this create a json file and define all the ISR brokers and topic
{"version":1,
"partitions":[
{"topic":"test2","partition":0,"replicas":[0,10]},
{"topic":"test2","partition":1,"replicas":[10,20]}
]}
In the last we need to rebalance the topics for partitions
./kafka-reassign-partitions.sh --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --reassignment-json-file /tmp/increase-replication-factor.json --execute
To verify
[root@prod-az-p2-kafka02 bin]# ./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2
Topic: test2 TopicId: -LoL36ztSeyC8rzvnp4YMw PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: test2 Partition: 0 Leader: 10 Replicas: 0,10 Isr: 10
Topic: test2 Partition: 1 Leader: 20 Replicas: 10,20 Isr: 20,10
The answer by Lukas is correct, but it leaves open the question about how best to generate the topic assignment JSON files that kafka-reassign-partitions
needs as input.
I like to use the DataDog topicmappr tool to create the topic re-assignments in an intelligent way. The tool is deterministic, inspects the current layout, and can optimize it in various configurable ways.
For example:
topicmappr rebuild --brokers "-2" --topics .\* --topics-exclude __.\* \
--replication 2 --optimize-leadership --force-rebuild --skip-no-ops \
--out-path remaps/ --zk-addr $zk
would rebalance all topics (excluding topics starting with "__") with a replication factor of 2, optimize leadership so that leadership for the given topics is spread evenly across the available brokers, force a map rebuild, skip anything that hasn't changed, and output all the resulting JSONs to the remaps
directory.
The tool can optimize partition placement for even storage (for unbalanced partitions) or partition counts, is rack-aware, and has various other useful options.
The tool is completely safe to use, as all it does is output a summary of everything it is doing, and the JSONs needed for the remapping. It doesn't make any changes itself.
This script will generate the JSON for kafka-reassign-partitions.sh
and feed it into that script to increase the replication factor. The new set of replicas will:
- Keep the current replicas
- Add new unique brokers (this will prevent unneeded data migrations)
This script was tested with 2.8.0 Kafka scripts. Only the variables at the top of the file will need modified.
#!/bin/bash
KAFKA_BIN="./bin"
KAFKA_CONNECTION_ARGS="--bootstrap-server localhost:9094"
broker_ids="1,2,3"
topic="topic_foobar"
new_replication_factor=3 # New replication factor
reassignment_file="./reassignment.json"
#~~~~ Don't change anything after this line ~~~~#
# Generate a list of "partition|replicas"
topic_data="$("$KAFKA_BIN/kafka-topics.sh" $KAFKA_CONNECTION_ARGS --describe --topic "$topic" | tail -n +2 | sed -E 's/.*Partition:\s+([0-9]+).*Replicas:\s+([0-9,]+).*/\1|\2/g')"
partition_count=$(echo "$topic_data" | wc -l)
echo '{
"version": 1,
"partitions": [' > "$reassignment_file"
log_dirs="$(yes '"any"' | head -n $new_replication_factor | sed -e ':a;N;$!ba;s/\n/,/g')"
obj_sep=","
while read -r partition_data; do
partition=$(echo "$partition_data" | cut -d '|' -f 1)
replicas=$(echo "$partition_data" | cut -d '|' -f 2)
# Randomize the replicas (using this list as a queue)
random_replicas="$(echo $broker_ids | tr "," "\n" | shuf)"
# Loop until the replicas has desired RF - 1 commas
while [ "$(echo "$replicas" | tr -dc , | wc -c)" != $((new_replication_factor-1)) ]; do
# Pick the next replica, add it to the list if it isn't already there, otherwise advance the queue
next_replica="$(echo "$random_replicas" | head -1)"
if [[ $replicas != *$next_replica* ]]; then
replicas="$replicas,$next_replica"
else
random_replicas="$(echo "$random_replicas" | tail -n +2)"
fi
done
# Don't add a comma on the last object
if [ "$((partition_count-1))" == "$partition" ]; then obj_sep=""; fi
echo ' {
"topic": "'"$topic"'",
"partition": '"$partition"',
"replicas": ['"$replicas"'],
"log_dirs": ['"$log_dirs"']
}'$obj_sep >> "$reassignment_file"
done < <(echo "$topic_data")
echo ' ]
}' >> "$reassignment_file"
cat "$reassignment_file"
read -p "Apply the above reassignment? (Ctrl-C to exit): "
"$KAFKA_BIN/kafka-reassign-partitions.sh" $KAFKA_CONNECTION_ARGS --execute --reassignment-json-file "$reassignment_file"
For Kafka versions 0.8.x
to later
, the --replica-assignment
option to modify the number of replicas.
For Kafka versions 2.4.0 and later, kafka-replica-assignment
tool option to modify the number of replicas.
I will recommende to use below Post by confluence
You can use kafka-ui web application to manage your kafka cluster, including changing your replication factor for topics.
Open the UI, select your topic from right hand corner and choose "Edit settings". Change to your desired replication factor and save.
To increase the number of replicas for a given topic you have to:
1. Specify the extra partitions to the existing topic with below command(let us say increase from 2 to 3)
bin/kafktopics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3
2. Specify the extra replicas in a custom reassignment json file
For example, you could create increase-replication-factor.json and put this content in it:
{"version":1,
"partitions":[
{"topic":"topic-to-increase","partition":0,"replicas":[0,1,2]},
{"topic":"topic-to-increase","partition":1,"replicas":[0,1,2]},
{"topic":"topic-to-increase","partition":2,"replicas":[0,1,2]}
]}
3. Use the file with the --execute option of the kafka-reassign-partitions tool
bin/kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
4. Verify the replication factor with the kafka-topics tool
bin/kafka-topics --zookeeper localhost:2181 --topic topic-to-increase --describe
© 2022 - 2024 — McMap. All rights reserved.