Apache Ignite vs. Apache Storm (in-depth)

Apache Ignite and Apache Storm differ in many respects, especially since Storm has one very specific use case, while Ignite bundles quite a large set of tools under one roof. As I understand it, the core of Ignite is its in-memory storage. Built on that is its data-locality-aware computation, and built on that are all kinds of cool "toys". The one I am interested in is the streaming functionality, which is basically a query listener on the changing in-memory cache.

If I set the sliding window to one tuple, Ignite provides, like Storm, one-tuple-at-a-time processing. Ignite stores the data in memory. Storm does not "store" the data the way an in-memory data store does, but the tuples in flight are of course also held in memory. So in both cases I have streaming, I have data in memory, and I am able to distribute my computation.
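To make the "sliding window of one tuple" idea concrete, here is a minimal sketch in plain Java. It uses neither the Ignite nor the Storm API; `CountWindow` and `snapshots` are hypothetical names chosen for illustration, and the listener merely stands in for whatever query or bolt would react to the window contents.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical count-based sliding window; not part of Ignite or Storm. */
final class CountWindow<T> {
    private final int size;
    private final Deque<T> buffer = new ArrayDeque<>();
    private final Consumer<List<T>> onUpdate;

    CountWindow(int size, Consumer<List<T>> onUpdate) {
        if (size < 1) throw new IllegalArgumentException("size must be >= 1");
        this.size = size;
        this.onUpdate = onUpdate;
    }

    /** Adds a tuple, evicts the oldest one if the window is full, then notifies the listener. */
    void add(T tuple) {
        if (buffer.size() == size) buffer.removeFirst();
        buffer.addLast(tuple);
        onUpdate.accept(new ArrayList<>(buffer));
    }

    /** Feeds all tuples through a window and returns every snapshot the listener saw. */
    static <T> List<List<T>> snapshots(int size, List<T> tuples) {
        List<List<T>> seen = new ArrayList<>();
        CountWindow<T> window = new CountWindow<>(size, seen::add);
        tuples.forEach(window::add);
        return seen;
    }
}
```

With `size` set to 1, the listener sees exactly one tuple per update, which is the one-tuple-at-a-time behaviour described above; larger sizes give overlapping windows.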

I get the sense that programs with many data transformation steps might be easier to write in Storm, given the abstractions each technology offers. What can be said about that?

Second question: what about performance? I'd guess Ignite's data locality might give it an advantage. On the other hand, I think multiple steps might be better distributed in Storm (different bolts on all kinds of machines), while an Ignite program might not be split so easily.

If I still wanted to distribute the stream (not just by data, but also with the steps on different machines), I guess I would have to write multiple Ignite streamers that communicate through caches, right? This would sound more difficult to write than in Storm (bringing us back to the first question).

Bonaventura answered 26/11, 2015 at 10:46 Comment(0)

I get the sense that programs with many data transformation steps might be easier to write in Storm, given the abstractions each technology offers. What can be said about that?

You are probably right about that. It does seem like multiple transformations would be easier in Storm, although Ignite also has decent support for them: you can stream newly produced tuples into another cache.

What about performance? I'd guess Ignite's data locality might give it an advantage. On the other hand, I think multiple steps might be better distributed in Storm (different bolts on all kinds of machines), while an Ignite program might not be split so easily.

From what I hear within the community, Ignite should be an order of magnitude faster than Storm.

If I still wanted to distribute the stream (not just by data, but also with the steps on different machines), I guess I would have to write multiple Ignite streamers that communicate through caches, right?

Yes, you are right. Having multiple caches in Ignite is not a bad thing, and is actually recommended. Most users end up having a dozen or two.
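The "stages communicating through caches" pattern can be sketched roughly as follows. This is plain Java and deliberately does not use the Ignite API: `ListenableCache` and `TwoStagePipeline` are hypothetical stand-ins, with an update listener playing the role that a streamer or continuous query would play in a real deployment. Each stage reacts to updates in one cache and writes its results into the next.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a cache that notifies one listener on every update. */
final class ListenableCache<K, V> {
    private final Map<K, V> data = new ConcurrentHashMap<>();
    private BiConsumer<K, V> listener = (k, v) -> { };

    void onUpdate(BiConsumer<K, V> l) { listener = l; }

    void put(K key, V value) {
        data.put(key, value);
        listener.accept(key, value); // hand the new entry to the next stage
    }

    V get(K key) { return data.get(key); }

    Map<K, V> snapshot() { return new HashMap<>(data); }
}

final class TwoStagePipeline {
    /** Runs lines through two stages (split into words, then count) linked by caches. */
    static Map<String, Integer> run(String... lines) {
        ListenableCache<Integer, String> raw = new ListenableCache<>();
        ListenableCache<Integer, String> words = new ListenableCache<>();
        ListenableCache<String, Integer> counts = new ListenableCache<>();
        AtomicInteger wordId = new AtomicInteger();

        // Stage 1: split each raw line into lower-case words and emit them to `words`.
        raw.onUpdate((id, line) -> {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) words.put(wordId.getAndIncrement(), word);
            }
        });

        // Stage 2: keep a running count per word in `counts`.
        words.onUpdate((id, word) -> {
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        });

        for (int i = 0; i < lines.length; i++) raw.put(i, lines[i]);
        return counts.snapshot();
    }
}
```

For example, `TwoStagePipeline.run("Storm bolt", "storm spout")` yields a count of 2 for `storm` and 1 each for `bolt` and `spout`. In this single-process sketch the stages run synchronously; the point is only that each step is decoupled from the next by a cache, so the steps could in principle be placed on different nodes.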

This would sound more difficult to write than in Storm (bringing us back to the first question).

It sounds like you need to decide how important performance is to you.

Appear answered 28/11, 2015 at 5:35 Comment(5)
1) Could you please point me to where I can read about this "decent support"? 2) Why would Ignite be orders of magnitude faster than Storm? I don't really see any technical reason. (It is kind of hard to believe.) - Bonaventura
By "decent support" I meant exactly what you were suggesting: streaming the results of one stream into another cache. - Appear
As for performance, just test it yourself. The performance difference often comes from code efficiency, and not necessarily from the architecture. - Appear
Regarding 1: I meant which Ignite interface I should use: org.apache.ignite.IgniteCache, or is there an even more convenient one for exactly this use case? - Bonaventura
Ignite has a separate streaming API that should better fit your use case. Take a look at the IgniteDataStreamer and StreamReceiver interfaces. The documentation page has more details: apacheignite.readme.io/docs/streaming--cep - Contributory
