Does the Kafka Python API support stream processing?

Asked 19/8, 2018 at 14:59 Answered 12/12, 2023 at 18:23

Solved python apache-kafka apache-kafka-streams kafka-python stream-processing

I have used Kafka Streams in Java, but I could not find a similar API in Python. Does Apache Kafka support stream processing in Python?

Ladylike answered 19/8, 2018 at 14:59 Comment(2)

There is github.com/wintoncode/winton-kafka-streams -- this is not part of Apache Kafka. I don't know how stable it is and if it's suitable for production yet. – Potence 19/8, 2018 at 17:43

And there is also github.com/robinhood/faust – Ataman 20/8, 2018 at 7:5

Kafka Streams is only available as a JVM library, but there are a few comparable Python implementations:

robinhood/faust (Not maintained as of 2020, but was forked)
wintoncode/winton-kafka-streams (appears not to be maintained)
fluvii (see discussion)
bytewax

In theory, you could try using Jython or Py4J to work with the JVM implementation, but that would probably require more work than necessary.

Outside of those options, you can also try Apache Beam, Flink or Spark, but they each require an external cluster scheduler to scale out (and also require a Java installation).

If you are okay with HTTP methods, then running a ksqlDB instance (again, requiring Java for that server) and invoking its REST interface from Python with the built-in SQL functions can work. However, building your own functions there requires writing JVM-compiled code, last I checked.
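As a rough illustration of the REST approach, here is a minimal sketch that posts a statement to a ksqlDB server using only the standard library. The server address (`http://localhost:8088`) and the helper names `build_ksql_request` and `run_statement` are assumptions for this example; the `/ksql` path and the `application/vnd.ksql.v1+json` content type are what I recall from the ksqlDB REST docs, so check them against your server version.

```python
import json
import urllib.request

# Assumption: a ksqlDB server reachable on its usual REST port.
KSQLDB_URL = "http://localhost:8088"


def build_ksql_request(statement, server_url=KSQLDB_URL):
    """Build a POST request for the ksqlDB statement endpoint (/ksql)."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/ksql",
        data=body,
        headers={
            "Content-Type": "application/vnd.ksql.v1+json; charset=utf-8",
            "Accept": "application/vnd.ksql.v1+json",
        },
        method="POST",
    )


def run_statement(statement, server_url=KSQLDB_URL):
    """Send the statement and return the decoded JSON response."""
    request = build_ksql_request(statement, server_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a running ksqlDB server, so it is left commented out):
# print(run_statement("SHOW STREAMS;"))
```

Statements such as `SHOW STREAMS;` or `CREATE STREAM ...` go through this endpoint; streaming query results use a different endpoint and are not covered by this sketch.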

If none of those options are suitable, then you're stuck with the basic consumer/producer methods.
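For reference, the plain consumer/producer fallback might look something like the sketch below, written against the kafka-python client. The topic names, group id, and broker address are hypothetical, and `transform` and `process` are illustrative helpers, not part of any library. It only does a stateless consume-transform-produce loop, with none of the state stores, windowing, or joins that Kafka Streams provides.

```python
def transform(value: bytes) -> bytes:
    """Example stateless step: upper-case the message payload."""
    return value.upper()


def process(consumer, producer, out_topic, limit=None):
    """Read records, transform each value, and forward it to out_topic.

    `consumer` is any iterable of records with a `.value` attribute and
    `producer` is any object with `send(topic, value=...)`, which matches
    kafka-python's KafkaConsumer and KafkaProducer.
    Returns the number of records forwarded.
    """
    count = 0
    for record in consumer:
        producer.send(out_topic, value=transform(record.value))
        count += 1
        if limit is not None and count >= limit:
            break
    return count


def main():
    # Imported here so the helpers above can be used without the library.
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        "input-topic",                       # hypothetical topic name
        bootstrap_servers="localhost:9092",  # assumed broker address
        group_id="python-stream-demo",       # hypothetical consumer group
    )
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    try:
        process(consumer, producer, "output-topic")
    finally:
        producer.flush()
        consumer.close()
```

Calling `main()` against a running broker starts the loop. Anything stateful (aggregations, joins) would have to be built on top of this by hand, which is the gap the libraries listed above try to fill.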

Scevo answered 19/8, 2018 at 15:59 Comment(5)

Is there any example or tutorials to use docs.confluent.io/current/ksql/docs/tutorials/… with faust streaming? – Cluny 8/4, 2019 at 6:47

KSQL is implemented in Java, so I'm not sure I understand the question – Scevo 8/4, 2019 at 22:31

@circket_007, KSQL is not available in python. This is what you mean. Am I right? – Cluny 9/4, 2019 at 4:9

@Maha KSQL server has a REST API, so you can submit queries from any language – Scevo 11/4, 2019 at 0:58

btw: here is the direct link to the forked project: github.com/faust-streaming/faust – Machinate 4/11, 2022 at 11:25

If you are using Apache Spark, you can use Kafka as the producer and Spark Structured Streaming as the consumer. There is no need to rely on third-party libraries like Faust.

To consume Kafka data streams in Spark, use the Structured Streaming + Kafka Integration Guide.

Keep in mind that you will have to add the spark-sql-kafka package when using spark-submit:

spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1 StructuredStreaming.py

This solution has been tested with Spark 3.0.1 and Kafka 2.7.0 with PySpark.
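As a minimal sketch of what a script like StructuredStreaming.py could contain (this is not the exact script used for the test above), the following reads a topic and prints each message to the console. The broker address, topic name, and the helper `kafka_source_options` are assumptions for illustration.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options passed to Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def main():
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("StructuredStreaming").getOrCreate()

    # Assumed broker address and hypothetical topic name.
    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options("localhost:9092", "input-topic"))
        .load()
    )

    # Kafka keys and values arrive as binary, so cast them to strings.
    messages = raw.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    # Print each micro-batch to the console until the job is stopped.
    query = messages.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()
```

When submitting with spark-submit, add a call to `main()` at the end of the file. The console sink is only for trying things out; a real job would write to another topic, a table, or files.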

This resource can also be useful.

Curley answered 21/3, 2021 at 14:29 Comment(1)

And if you're a Python person, you can write the originating code (the code that is, for example, probing a vibration sensor) in Python, and use either the Kafka Python library to publish messages directly, or fluentd to publish JSON provided by a Python script – Weeden 5/10, 2023 at 13:19

Previously, a KStream Python API was not available, but one is now available with the new kstreams Python library: https://pypi.org/project/kstreams/

Features:

Produce events
Consume events with Streams
Prometheus metrics and custom monitoring
TestClient
Custom Serialization and Deserialization
Easy to integrate with any async framework. Not tied to any library!
Yield events from streams
Store (kafka streams pattern)
Stream Join
Windowing

Appalling answered 2/1, 2023 at 16:38 Comment(1)

Those last three features are not implemented, according to the docs – Scevo 22/1, 2023 at 14:34

There is a relatively new library called FastStream:

FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.

It looks really good and quite simple to use, although I haven't personally used it yet. It is in constant development and supports more brokers besides Kafka (like RabbitMQ).

Madelyn answered 12/12, 2023 at 18:23 Comment(0)
