Messaging platform with QoS / Kafka partition overloading

I'm having a recurrent issue with Kafka: I partition messages by customer id, and sometimes it happens that a customer gets a huge amount of messages. As a result, the messages of this customer and all other customers in the same partition get delayed.

Are there well-known ways to handle this issue? Possibly with other messaging platforms?

Ideally, only the messages of that one customer would be delayed. Other customers' messages would get an equal share of the consumers' bandwidth.

Note: I must partition by customer id, because I want to consume the messages of any given customer in order. However, I can consume the messages of two different customers in any order.

Heyer answered 23/5, 2018 at 9:35 Comment(3)
Is the number of customers fixed? I am not sure dynamic partitions are recommended. Do you have enough capacity to consume the topics? – Lesson
Also, this statement doesn't make sense: "I partition messages by customer id, and sometimes it happens that a customer gets a huge amount of messages. As a result, the messages of this customer and all other customers in the same partition get delayed." Are messages partitioned by customer ID or not? – Lesson
Check this article; to me it is a good overview: jack-vanlightly.com/blog/2017/12/4/… – Encomiastic

I will try to answer based on the limited information provided.

Kafka partitions are the smallest unit of scalability. For example, if you have 10 parallel consumers (Kafka topic listeners) in the same consumer group, you should give your topic at least that many partitions; otherwise some of your listeners will be starved, because Kafka assigns each partition to only one consumer in the group. This protects the message order within a partition. The reverse is supported: a single consumer can handle more than one partition at a time.
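To make the starvation point concrete, here is a minimal sketch of a one-owner-per-partition assignment. It is a plain round-robin simulation, not Kafka's actual assignor, and `assign_partitions` and the consumer names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Give every partition exactly one owner, round-robin (illustrative only)."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


consumers = [f"consumer-{i}" for i in range(10)]

# Fewer partitions than consumers: some consumers own nothing and sit idle.
few = assign_partitions(4, consumers)
idle = [consumer for consumer, parts in few.items() if not parts]
print(f"{len(idle)} idle consumers with 4 partitions")

# More partitions than consumers: every consumer owns several partitions.
many = assign_partitions(20, consumers)
print(f"consumer-0 owns partitions {many['consumer-0']} with 20 partitions")
```

With 4 partitions, only 4 of the 10 consumers receive anything; with 20 partitions, each consumer handles 2.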

My design approach would be to first decide how much capacity you plan to allocate for the consumer (microservice) instances. That number will guide you to the right number of partitions.

I would avoid using a dynamic number of partitions, as this does not scale well. Use a number that matches the capacity you plan to allocate, plus some spare in case you need to scale up in the future. Say you have 5 new customers tomorrow: adding partitions is neither easy nor wise, because with a hash-based partitioner a change in the partition count changes which partition an existing key maps to.

Kafka will make sure the messages stay in order per partition, so this comes for free in your use case. What you need is for the consumer end to be able to handle the messages of different customer IDs in the right order. To keep the messages of a single customer from getting out of order, your partition key must be a higher-level category of customers; I can think of customer type, region, size, and so on. The idea is that all of a single customer's messages stay in the same topic partition.

Your partition key must relate to the size of the messages/data so that your messages spread evenly over your Kafka cluster. This helps with the scaling and redundancy of the Kafka cluster itself.

Deciding on the right partitioning strategy is hard, but it is worth the time spent planning it.

One design solution that comes up a lot is hashing: use a hash of the customer ID to map each customer to a partition number. Again, decide on a fixed number of partitions and let the hash map the customer ID to your partition key.
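A minimal sketch of that hash-then-modulo mapping, assuming a fixed partition count. `NUM_PARTITIONS` and `partition_for` are hypothetical names, and `zlib.crc32` is only a stand-in for whatever hash your producer's partitioner really uses (Python's built-in `hash()` is avoided because it is randomized per process for strings).

```python
import zlib

NUM_PARTITIONS = 12  # assumed fixed partition count, chosen up front


def partition_for(customer_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a customer ID to a stable partition number via hash modulo."""
    digest = zlib.crc32(customer_id.encode("utf-8"))
    return digest % num_partitions


# The same customer always lands on the same partition, so per-customer
# ordering is preserved as long as NUM_PARTITIONS never changes.
first = partition_for("customer-42")
second = partition_for("customer-42")
print(first == second)
```

Note that this spreads customers evenly, not message volume: a single very busy customer still makes its partition hot.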

Using a modulo over the partitions:

X customers have a lot of messages and need one topic per customer. In this case you map one customer per topic, so your modulo will be the number of these customers.

Y customers are low-traffic customers. For these, use a different modulo, Y/5 for example, so that 5 customers share a topic.

Make sure you offset the partition numbers of the Y group by X so the two groups don't overlap.

The only issue I see is that this is not flexible: you cannot change the mapping if the number of customers changes. You might allow more topics in each group to support future partitions.
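The two-tier mapping above could be sketched roughly as follows. Everything here is an assumption for illustration: `build_router`, the customer names, and the use of `zlib.crc32` as the hash. A "slot" stands for whichever unit you pick (a topic or a partition).

```python
import zlib


def build_router(heavy_customers, num_shared_slots):
    """Return a function mapping a customer ID to a slot number.

    Heavy customers get dedicated slots 0..X-1; everyone else is hashed
    into num_shared_slots shared slots placed after them.
    """
    dedicated = {cid: slot for slot, cid in enumerate(sorted(heavy_customers))}
    offset = len(dedicated)  # X: shift shared slots so the groups never overlap

    def route(customer_id: str) -> int:
        if customer_id in dedicated:
            return dedicated[customer_id]
        digest = zlib.crc32(customer_id.encode("utf-8"))
        return offset + digest % num_shared_slots

    return route


heavy = {"acme", "globex"}   # X = 2 high-traffic customers (hypothetical)
light_count = 20             # Y = 20 low-traffic customers (hypothetical)
route = build_router(heavy, num_shared_slots=light_count // 5)

print(route("acme"), route("globex"))  # dedicated slots
print(route("small-shop-7"))           # one of the shared slots
```

Because the set of heavy customers and the number of shared slots are baked into the mapping, changing either one moves customers between slots, which is the inflexibility mentioned above.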

Lesson answered 2/6, 2018 at 8:30 Comment(2)
I'm using a hashing partitioner: customer IDs are hashed, and the hash is mapped to a partition by using a modulo. As a result, customers are evenly distributed across all partitions, but nothing guarantees that partitions will get a similar amount of traffic: if a single customer has 10 times more messages than the others, his partition gets more messages. The number of partitions is already quite large. – Heyer
I see. Maybe you can aggregate small customers together and leave the busy customers on a per-customer topic. Customers with heavy traffic will use a larger modulo (one customer per topic), and customers with lower traffic will share topics. – Lesson
