When to use Apache Kafka instead of ActiveMQ [closed]
I am working on Apache Kafka. I want to know which one is better: Kafka or ActiveMQ. What is the main difference between these two technologies? I want to implement Kafka in Spring MVC.

Gollin answered 28/6, 2017 at 2:15 Comment(1)
Possible duplicate of ActiveMQ or RabbitMQ or ZeroMQ? – Lexicostatistics
Kafka and ActiveMQ may have some overlaps, but they were originally designed for different purposes, so comparing them is like comparing apples and oranges.

Kafka

Kafka is a distributed streaming platform with very good horizontal scaling capability. It allows applications to process and re-process streamed data on disk. Due to its high throughput, it is commonly used for real-time data streaming.
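The "process and re-process" idea is easiest to see in miniature. Below is a minimal, self-contained Python sketch (not the Kafka API; the `Log` and `Consumer` classes here are invented for illustration) of Kafka's core abstraction: an append-only log that each consumer reads by offset, so the same data can be replayed at will.

```python
# Toy model of an append-only log with offset-based consumption.
# Real Kafka partitions such logs across brokers and persists them on disk.

class Log:
    def __init__(self):
        self.records = []               # append-only list of messages

    def append(self, msg):
        self.records.append(msg)
        return len(self.records) - 1    # offset of the new record

class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0                 # each consumer tracks its own position

    def poll(self):
        batch = self.log.records[self.offset:]
        self.offset = len(self.log.records)
        return batch

    def seek(self, offset):
        self.offset = offset            # rewind to re-process old records

log = Log()
for m in ["a", "b", "c"]:
    log.append(m)

c = Consumer(log)
print(c.poll())   # ['a', 'b', 'c']
c.seek(0)         # replay from the beginning
print(c.poll())   # ['a', 'b', 'c'] again
```

Note that the broker (the `Log`) never deletes or dispatches anything; position is entirely the consumer's business, which is the key difference from a traditional queue.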

ActiveMQ

ActiveMQ is a general-purpose message broker that supports several messaging protocols such as AMQP, STOMP, and MQTT. It supports more complicated message routing patterns as well as the Enterprise Integration Patterns. In general, it is mainly used for integration between applications/services, especially in a Service-Oriented Architecture.
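Some of those protocols are simple enough to see on the wire. As a rough sketch (the destination name `/queue/orders` is invented for illustration), here is what a STOMP 1.2 SEND frame looks like: a command line, headers, a blank line, the body, and a NUL terminator.

```python
# Build a STOMP SEND frame by hand to show the text-based wire format
# that brokers like ActiveMQ accept. Illustrative only; real clients
# would use a library such as stomp.py.

def stomp_send_frame(destination: str, body: str) -> str:
    headers = (
        f"destination:{destination}\n"
        f"content-type:text/plain\n"
        f"content-length:{len(body.encode())}\n"
    )
    # command line, headers, blank line, body, NUL terminator
    return f"SEND\n{headers}\n{body}\x00"

frame = stomp_send_frame("/queue/orders", "hello")
print(repr(frame))
```

The human-readable framing is one reason these protocols are easy to implement across many languages and devices.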

Haynie answered 28/6, 2017 at 2:58 Comment(2)
First thought was comparing Apple Inc. with an orange – Indispensable
@Indispensable even I thought it was comparing Apple Inc. and Opera until I saw the O... – Forebear
I hear this question every week... While ActiveMQ (like IBM MQ, or JMS in general) is used for traditional messaging, Apache Kafka is used as a streaming platform (messaging + distributed storage + processing of data). Both are built for different use cases.

You can use Kafka for "traditional messaging", but you cannot use MQ for Kafka-specific scenarios.

The article “Apache Kafka vs. Enterprise Service Bus (ESB): Friends, Enemies, or Frenemies?” (https://www.confluent.io/blog/apache-kafka-vs-enterprise-service-bus-esb-friends-enemies-or-frenemies/) discusses why Kafka is not competitive with, but complementary to, integration and messaging solutions (including ActiveMQ), and how to integrate the two.

Oceanus answered 27/7, 2018 at 7:13 Comment(0)
I think one thing that should be noted in a discussion about which broker to use (and when Kafka comes up) is that the frequently referenced Kafka benchmark shows the upper limit of any modern distributed system. Today's brokers all have about the same total capacity in MB/s. Kafka does extremely well with small messages (10-1024 bytes) when compared to other brokers, but still tops out at around the ~75 MB/s mark (per broker).

There is frequently an apples-to-oranges comparison, especially when talking about "clustering". ActiveMQ and other enterprise brokers cluster both the publishing of messages and the tracking of consumer subscriptions. Kafka clusters the publishing but requires each consumer to track its own subscription offsets. That seems minimal, but it's a significant difference.

All brokers have the same back-pressure issues. Kafka can do "lazy persistence", where the producer isn't waiting around for the broker to sync to disk; this is good for a lot of use cases, but probably not for the I-care-about-every-single-message scenario ppatierno mentions in his slide show.
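The durability trade-off behind "lazy persistence" can be modeled in a few lines. This toy Python simulation (not real Kafka; the `Broker` class and its methods are invented for illustration) shows why a fire-and-forget send is fast but lossy: anything buffered in memory at crash time is gone.

```python
# Toy model of the durability trade-off between lazy (buffered) and
# synchronous persistence. Not a real broker API.

class Broker:
    def __init__(self):
        self.disk = []      # durably stored messages
        self.buffer = []    # in-memory, not yet synced to disk

    def send_lazy(self, msg):
        self.buffer.append(msg)   # returns immediately: high throughput

    def send_sync(self, msg):
        self.disk.append(msg)     # caller waits for the simulated "fsync"

    def flush(self):
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def crash(self):
        self.buffer.clear()       # everything not yet flushed is lost

b = Broker()
b.send_lazy("m1")
b.flush()
b.send_lazy("m2")
b.crash()                 # m2 was only in memory, so it is lost
print(b.disk)             # ['m1']
```

In real Kafka the knob is the producer's acknowledgment/flush configuration: the less you wait for, the faster you go and the more you can lose on failure.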

Kafka is really good at horizontal scaling for things like big-data processing of small messages. ActiveMQ is better suited to the class of use cases frequently referred to as enterprise messaging (this is just a term; it doesn't mean Kafka isn't good for the enterprise): transacted data (although Kafka is adding this), kiosks, retail stores, store-and-forward, DMZ traversal, data center-to-data center publishing, etc.

Springs answered 28/6, 2017 at 13:20 Comment(7)
Can you say why Kafka isn't what you want for I-care-about-every-single-message scenarios? A message queue where the receiver keeps track of where it's up to, and the sender keeps a backlog of sent messages so that the receiver can roll back, reconnect, and request old messages again, is very reliable, isn't it? And it gets way better throughput. Like this: cedanet.com.au/ceda/persistent-message-queue.php – Bobsled
The default behavior of send() in the Kafka Producer API is asynchronous. A process failure while messages are buffered in memory will result in message loss. Split brain and partition-leader failover can also lead to message loss. There isn't a silver bullet; it's all benefits and trade-offs. FWIW, producer-side fanout + JMS-like persistence gets my vote for the best distributed-computing option for not losing messages. – Springs
To solve the throughput question: produce via multiple threads. Single-threaded blocking isn't always 'bad'; it is reliable and provides the best available preservation of message ordering. Again, benefits and trade-offs. Receiver rollback-and-reprocess is very reliable. The headaches are (imho) due to a lack of readily available samples of how to do it most effectively, so programmers new to messaging frequently struggle with it. Idempotent/replay has its downsides and reliability issues as well. – Springs
Q: How is CEDA different from store-and-forward? It simply looks like a local producer thread to a local broker... then the local broker forwards to a remote broker, which writes it to disk. – Springs
75 MB/s isn't at all representative of Kafka scale. That's about 1% of what I've seen in production. – Jenkins
@Jenkins what is your per-broker number? – Springs
@Jenkins The throughput number (55-75) seems to be a magic number for lots of distributed systems (probably due to latency) and has nothing to do with any app having a 'better' architecture. For example, a 1 TiB/hr Kafka cluster sees 55 MiB per broker: twitter.com/jakekorab/status/1179038733443551232 – Springs
Kafka scores over traditional messaging brokers like ActiveMQ due to its high throughput, partitioning, replication, and fault-tolerance features.

The main reason for Kafka's high throughput is its pull-based consumption model: each application consumes at its own speed.

The built-in partitioning feature in Kafka improves the scalability of an application. It allows a single topic to be distributed across multiple brokers.
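The partitioning idea can be sketched in a few lines of Python. This is illustrative only (Kafka's actual default partitioner hashes the key bytes with murmur2; the simple hash below is a stand-in): a stable hash of the message key picks the partition, so records with the same key always land on the same partition while different keys spread the load across brokers.

```python
# Key-based partitioning sketch: map a message key to one of N partitions
# with a stable hash, so per-key ordering is preserved while load spreads.

def partition_for(key: str, num_partitions: int) -> int:
    # Simple polynomial rolling hash over the key bytes (stand-in for murmur2).
    h = 0
    for byte in key.encode():
        h = (h * 31 + byte) % (2 ** 32)
    return h % num_partitions

keys = ["user-1", "user-2", "user-1", "user-3"]
print([partition_for(k, 3) for k in keys])
```

Because the mapping is deterministic, all records for `"user-1"` stay on one partition (keeping their relative order), and adding consumers up to the partition count scales reads horizontally.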

In addition to its message broker role, Kafka is a distributed event streaming platform. It can send, store, and process events.

The event processing capability allows Kafka to be used in a variety of use cases, which are listed on the official documentation page: messaging, website activity tracking, log aggregation, stream processing, commit log, etc.

Spiritless answered 20/4, 2023 at 13:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.