I am running a simple 3-node Kafka cluster, with a 5-node Zookeeper ensemble supporting it. I would like to know what a good way of backing up my Kafka is, and the same for my Zookeeper.
For the moment I just export my data directory to an S3 bucket...
Thanks.
Zalando has recently published a pretty good article on how to back up Kafka and Zookeeper. Generally there are 2 paths for Kafka backup:
1. Maintain a second Kafka cluster into which all topics are replicated (for example with Kafka MirrorMaker).
2. Use connectors (e.g. via Kafka Connect) to dump the topics to cold storage such as an S3 bucket.
The preferred backup solution will depend on your use case. E.g. for streaming applications the first solution may give you less pain, while when using Kafka for event sourcing the second solution may be more desirable.
Regarding Zookeeper, Kafka keeps information about topics there (persistent store), as well as data for broker discovery and leader election (ephemeral). Zalando settled on using Burry, which simply iterates over the Zookeeper tree structure, dumps it to a file structure, and then zips it and pushes it to cloud storage. It suffers from a little problem, but most probably it does not impact the backup of Kafka's persistent data (TODO verify). Zalando describes there that, when restoring, it is better to first create the Zookeeper cluster, then connect a new Kafka cluster to it (with new, unique broker IDs), and then restore Burry's backup. Burry will not overwrite existing nodes, so it will not restore the ephemeral information about old brokers that is stored in the backup.
Note: although they mention using Exhibitor, it is not really needed when backing up with Burry.
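If you want a feel for what such a dump involves, here is a minimal sketch of walking the Zookeeper tree and writing each node's data to a local directory, using the Python kazoo client. The connection string, output directory and the decision to skip ephemeral nodes are assumptions for illustration; this is not Burry's actual implementation.

```python
import os
from kazoo.client import KazooClient

def dump_tree(zk, path, out_dir):
    """Recursively copy a Zookeeper subtree into a local directory."""
    data, stat = zk.get(path)
    # Skip ephemeral nodes (e.g. live broker registrations); they should
    # not be restored into a new cluster anyway.
    if stat.ephemeralOwner != 0:
        return
    node_dir = os.path.join(out_dir, path.lstrip("/"))
    os.makedirs(node_dir, exist_ok=True)
    with open(os.path.join(node_dir, "data"), "wb") as f:
        f.write(data or b"")
    for child in zk.get_children(path):
        dump_tree(zk, path.rstrip("/") + "/" + child, out_dir)

zk = KazooClient(hosts="localhost:2181")  # assumed connection string
zk.start()
dump_tree(zk, "/", "zk-backup")  # zip "zk-backup" and push it to cloud storage afterwards
zk.stop()
```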
Apache Kafka already keeps your data distributed and also provides strong, consistent replication capabilities.
From an architectural design point of view, we first need to understand what a backup means for us: is it for surviving a data center failure?
As you said in the comment, imagine the case where your entire datacenter is down: everything running in that datacenter is gone, not just Kafka. To handle that kind of failure you need to design a real-time replication strategy to a different datacenter, and you can use Kafka MirrorMaker for that. You set up a Kafka cluster in a different data center (not necessarily with the same hardware resources) and then configure your current data center's Kafka to be mirrored onto that other datacenter.
In the case of a datacenter-wide failure, all of your services will run from this fallback datacenter, using the mirrored Kafka as the primary Kafka.
Then, once the original data center is back, you can set up the mirror in the opposite direction and move back to your old (previously failed) datacenter.
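In practice you would run the kafka-mirror-maker tool that ships with Kafka. Purely to illustrate what mirroring amounts to, here is a minimal consume-and-republish loop using the kafka-python package; the broker addresses and topic name are placeholders.

```python
from kafka import KafkaConsumer, KafkaProducer

SOURCE_BROKERS = ["dc1-kafka-1:9092"]   # placeholder: primary datacenter
TARGET_BROKERS = ["dc2-kafka-1:9092"]   # placeholder: fallback datacenter

# Consume from the source cluster...
consumer = KafkaConsumer(
    "my-topic",                          # placeholder topic
    bootstrap_servers=SOURCE_BROKERS,
    group_id="mirror-maker-sketch",
    auto_offset_reset="earliest",
)
# ...and republish every record to the target cluster.
producer = KafkaProducer(bootstrap_servers=TARGET_BROKERS)

for record in consumer:
    producer.send("my-topic", key=record.key, value=record.value)
```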
Kafka Connect has a couple of out-of-the-box connectors for transporting data from Kafka with consistency guarantees. So you could choose AWS S3 as your backup store, and an S3 sink connector (for example the Confluent S3 sink connector) can do that for you.
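As a rough illustration, such a connector is registered through the Kafka Connect REST API. The connector name, topic, bucket, region and flush settings below are placeholders you would adapt to your setup.

```python
import json
import requests

connector = {
    "name": "s3-backup-sketch",                  # placeholder connector name
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": "1",
        "topics": "my-topic",                    # placeholder topic list
        "s3.bucket.name": "my-kafka-backup",     # placeholder bucket
        "s3.region": "eu-west-1",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "1000",
    },
}

# Kafka Connect's REST API usually listens on port 8083.
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
```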
Pinterest has the Secor service, which transfers data to AWS S3, Google and Microsoft cloud storage. I am sure you can also find dedicated connectors for all the big cloud providers. A few things need to be considered when backing up Kafka data to highly available cloud storage:
Kafka has a per-topic data retention policy, so old data will be removed from the Kafka brokers by Kafka itself, but it will still stay in your AWS S3 bucket. If you copy it straight back in a restore event, you will see much more data on the Kafka brokers than expected, and it is also not a good idea to restore the entire data set into an existing, running Kafka cluster, because you would then start processing old data again. So be selective and careful in this process.
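To make that concrete, a restore should pick only the objects you actually need rather than replaying the whole bucket. The sketch below lists backup objects for one topic, keeps only those written after a cutoff, and republishes their records; the bucket name, key prefix, JSON-lines format (matching the connector config sketched above) and the separate restore topic are all assumptions.

```python
from datetime import datetime, timezone

import boto3
from kafka import KafkaProducer

BUCKET = "my-kafka-backup"                  # placeholder bucket
PREFIX = "topics/my-topic/"                 # assumed key layout written by the sink connector
CUTOFF = datetime(2024, 1, 1, tzinfo=timezone.utc)  # restore only data newer than this

s3 = boto3.client("s3")
producer = KafkaProducer(bootstrap_servers=["localhost:9092"])

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        if obj["LastModified"] < CUTOFF:
            continue  # skip data older than we want to restore
        body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        # Assuming JSON-lines files, one record per line.
        for line in body.splitlines():
            # Restore into a separate topic so the running cluster does not
            # start reprocessing old data on its original topic.
            producer.send("my-topic-restored", value=line)

producer.flush()
```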
For Zookeeper, you can also copy the data to AWS S3, but you need to be careful when restoring because of the ephemeral nodes. I have found a few links which can help:
https://jobs.zalando.com/tech/blog/backing-up-kafka-zookeeper/
https://www.elastic.co/blog/zookeeper-backup-a-treatise
https://medium.com/@Pinterest_Engineering/zookeeper-resilience-at-pinterest-adfd8acf2a6b
In the end, "Prevention is better than cure". So if you are running in a cloud provider setup like AWS then you can deploy your cluster setup by keeping failures upfront in your mind. Below link has some information.
https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-kafka-on-aws/