Multiple Streams support in Apache Flink Job
My question is regarding the Apache Flink framework.

Is there any way to support more than one streaming source, such as Kafka and Twitter, in a single Flink job? Is there any workaround? Can we process more than one streaming source at a time in a single Flink job?

I am currently working with Spark Streaming, and this is a limitation there.

Is this achievable with other streaming frameworks like Apache Samza, Storm, or NiFi?

Any response is much appreciated.

Explode answered 6/11, 2016 at 17:35 Comment(0)

Yes, this is possible in Flink and Storm (no clue about Samza or NiFi...).

You can add as many source operators as you want, and each can consume from a different source.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties properties = ... // see Flink webpage for more details    

DataStream<String> stream1 = env.addSource(new FlinkKafkaConsumer08<>("topic", new SimpleStringSchema(), properties));
DataStream<String> stream2 = env.readTextFile("/tmp/myFile.txt");

DataStream<String> allStreams = stream1.union(stream2);

For Storm, using the low-level API, the pattern is similar. See: An Apache Storm bolt receive multiple input tuples from different spout/bolt
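As a rough sketch of the Storm side (Spout1, Spout2, and MergeBolt are placeholder class names you would implement yourself, not real Storm classes), a single bolt can subscribe to several spouts at once:

```java
import org.apache.storm.topology.TopologyBuilder;

// Sketch only: Spout1, Spout2, and MergeBolt are hypothetical user-defined
// components (e.g., a Kafka spout and a custom spout).
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("source1", new Spout1());
builder.setSpout("source2", new Spout2());

// One bolt subscribing to both spouts -- the Storm analogue of Flink's union().
builder.setBolt("merge", new MergeBolt())
       .shuffleGrouping("source1")
       .shuffleGrouping("source2");
```

Each shuffleGrouping call adds another input stream to the same bolt, so tuples from both spouts arrive interleaved at MergeBolt.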

Middlebrooks answered 7/11, 2016 at 3:40 Comment(7)
Right. Thanks for the answer. Can we add this Flink receiver in a Spark project? Is there any middleware to join Flink streaming with Apache Spark?Explode
I never used Spark. No clue. Furthermore, I am not aware of any middleware to combine Flink and Spark -- and I am wondering why you want to do this in the first place...Middlebrooks
Actually, I am working on a Spark project, but I can't stream data from multiple streaming sources in a single job there using Spark Streaming. So I want to overcome this problem using Flink, and I really want to know how to join both.Explode
I have no idea how this could be done... For sure, you cannot mix both in the same application code. You might want to use a layer in between. For example, do some processing with Flink, write result somewhere (maybe Kafka) and read it into Spark afterwards.Middlebrooks
Yes. This is what I wanted to know. Thanks :)Explode
There is a typo: see should be envEldon
Thanks @lasclocker. Feel free to edit directly next time :)Middlebrooks
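The work-around suggested in the comments above (Flink writes its result to Kafka, and a separate Spark Streaming job reads it back) could be sketched on the Flink side like this; the broker address, topic name, and job name are placeholders, and FlinkKafkaProducer08 assumes the same Kafka 0.8 connector used in the answer:

```java
// Sketch: write the unioned stream back to Kafka so a separate Spark
// Streaming job can consume it. Broker list and topic are placeholders.
DataStream<String> allStreams = stream1.union(stream2);

allStreams.addSink(new FlinkKafkaProducer08<>(
        "localhost:9092",          // Kafka broker list (placeholder)
        "flink-output-topic",      // topic the Spark job would read from
        new SimpleStringSchema()));

env.execute("flink-to-kafka-bridge");
```

The Spark side would then use its own Kafka receiver on flink-output-topic, keeping the two frameworks fully decoupled.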

Some solutions have already been covered; I just want to add that in a NiFi flow you can ingest many different sources and process them either separately or together.

It is also possible to ingest a source, and have multiple teams build flows on this without needing to ingest the data multiple times.

Sower answered 5/4, 2019 at 11:27 Comment(0)
