Vertx clustering alternative
Does anyone with real-world experience of Vertx cluster managers other than Hazelcast have advice on our requirement below?

For our (real-time sensor data) system we have hundreds of verticles in multiple JVMs, but we do not need, or want, the eventbus to span multiple physical servers.

We're running Vertx on multiple servers, but our platform is less complex if we don't share a single eventbus between all of them (we prefer to be explicit about passing messages between servers).

Hazelcast is the wrong cluster manager for us. We don't need its peer discovery between servers, but, crucially, any release change of Hazelcast means that new clients cannot join a cluster whose existing clients are running the previous version. Bringing up one new verticle compiled with Vertx 3.6.3 into an existing cluster is therefore not possible unless we stop the entire cluster and restart it with all the verticles recompiled to 3.6.3. This seriously impacts our development. It's helpful for the verticles to be more plug-and-play, and Vertx can do that, but Hazelcast can't (due to constant version incompatibilities).

Can anyone recommend a vertx cluster manager that fits our use case?
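
For reference, Vertx makes the cluster manager pluggable via VertxOptions, so any recommendation should drop in along these lines (a minimal sketch; Ignite is used here purely to illustrate the SPI, not as a conclusion):

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;
import io.vertx.core.spi.cluster.ClusterManager;
import io.vertx.spi.cluster.ignite.IgniteClusterManager; // from the vertx-ignite artifact

public class ClusteredMain {
  public static void main(String[] args) {
    // Any ClusterManager implementation can be plugged in here;
    // Ignite is just one of the alternatives Vertx ships support for.
    ClusterManager mgr = new IgniteClusterManager();

    VertxOptions options = new VertxOptions().setClusterManager(mgr);
    Vertx.clusteredVertx(options, res -> {
      if (res.succeeded()) {
        Vertx vertx = res.result();
        // deploy verticles on the clustered instance here
      } else {
        res.cause().printStackTrace();
      }
    });
  }
}
```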

Repast answered 8/3, 2019 at 9:56 Comment(7)
It sounds like you don't need a cluster manager, but a pubsub solution.Suburbanize
Which Hazelcast version are you using with Vertx? From Hazelcast 3.6, clients can connect to servers running later versions as long as the minimum Hazelcast version on the client is 3.6.Trucking
@Alexey, Vertx is a pubsub solution, using Hazelcast for peer discovery (which we use) and sharing state between nodes (which we don't). We have a large complex event-driven message-passing environment as part of a real-time urban sensor data research platform with hundreds of processes subscribing to real-time data, re-publishing e.g. predictions, all asynch. IMHO Vertx is great for this application, and we represent a real-world use-case. Hazelcast seems designed for another use-case (academic compute clusters?). Vertx supports alternatives, so any suggestions? Thanks.Repast
@Trucking - thanks. Most recently we were on Vertx 3.5.2, which embedded Hazelcast 3.8.2 (I think). Now we're moving to Vertx 3.6.3 w/Hazelcast 3.10.5. Existing successful nodes say "... [3.8.2] Cluster version set to 3.8"; the new 3.10.5 node says something along the lines of "existing cluster different version, cannot join" (I don't have a record of the exact message). Vertx w/Hazelcast has done this throughout the four years we've been using it and we've coped, but it seems about time to accept the overhead of trying a different cluster manager.Repast
I'm well aware of what Vert.x is (and what it isn't - a pubsub solution). What I'm suggesting is not to stop using Vert.x, but to send messages to something like redis.io/topics/pubsub, if Hazelcast isn't working for you.Suburbanize
@Repast aha.. so it's not the clients; it seems you are trying to upgrade the version of the Hazelcast servers in a running cluster. This problem can easily be solved by the Rolling Upgrades feature, which is available in the Enterprise version. I'm not sure if Redis has that capability at all.Trucking
Thanks, I think this is really a vertx question, rather than a Hazelcast one. Vertx is a great platform with non-blocking verticles communicating pub/sub via its core EventBus (we've been using it for four years). Tim Fox successfully applied his RabbitMQ experience. We want to start/stop JVMs containing Vertx verticles on a single server and have them leave/join the eventbus on that server with the minimum of fuss. Vertx also supports Ignite, Infinispan and Zookeeper, and maybe one of those is a better fit for us. If not then we could consider not using the EventBus, but that's really core code.Repast
I've now had time to review each of the alternatives Vertx directly supports as a 'cluster manager' (Hazelcast, Zookeeper, Ignite, Infinispan) and we're proceeding with a Zookeeper architecture for our system, replacing Hazelcast:

Zookeeper / Vertx multi-server architecture

Here's the background to our decision:

We started as a fairly typical (if there is such a thing) Vertx development, with multiple verticles in a JVM responding to external events (urban sensor data entering our java/vertx feed handlers) published on the eventbus, and the data being processed asynchronously in many other vertx verticles, often publishing new derived data as further asynchronous messages.
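
As a rough sketch of that pattern (the eventbus addresses and payloads here are invented for illustration):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.json.JsonObject;

// Feed-handler side: publish each incoming sensor reading on the eventbus.
public class FeedHandlerVerticle extends AbstractVerticle {
  @Override
  public void start() {
    // In reality the data arrives from an external feed; a timer stands in here.
    vertx.setPeriodic(1000, id ->
        vertx.eventBus().publish("sensor.reading",               // hypothetical address
            new JsonObject().put("sensor", "s42").put("value", 17.3)));
  }
}

// Processing side: subscribe, derive new data, re-publish asynchronously.
class PredictionVerticle extends AbstractVerticle {
  @Override
  public void start() {
    vertx.eventBus().<JsonObject>consumer("sensor.reading", msg -> {
      JsonObject derived = msg.body().copy().put("predicted", true);
      vertx.eventBus().publish("sensor.prediction", derived);   // hypothetical address
    });
  }
}
```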

Quite quickly we wanted to use multiple JVMs, mainly to isolate the feedhandlers from the rest of the code so that if things broke the feedhandlers would keep running (as a failsafe they persist the data as well as publishing it). So we added Vertx clustering (easily) so the JVMs on the same machine could communicate and all verticles could publish/subscribe messages in the same system. We used the default cluster manager, Hazelcast, and modified the config so the vertx clustering is limited to the single server (we run multiple versions of the entire platform on different servers and don't want them confusing each other). We have hundreds of verticles in half-a-dozen JVMs.
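
For illustration, limiting Hazelcast to a single machine amounts to disabling multicast discovery and pinning the cluster to the loopback interface. This is a hedged sketch using Hazelcast's programmatic Config (the same can be expressed in cluster.xml, and the exact API varies a little between Hazelcast 3.x releases):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.config.NetworkConfig;
import io.vertx.spi.cluster.hazelcast.HazelcastClusterManager;

public class LocalOnlyHazelcast {
  public static HazelcastClusterManager localOnlyManager() {
    Config hzConfig = new Config();
    NetworkConfig network = hzConfig.getNetworkConfig();

    // Disable multicast discovery and pin membership to the loopback
    // interface so the cluster never reaches beyond this server.
    JoinConfig join = network.getJoin();
    join.getMulticastConfig().setEnabled(false);
    join.getTcpIpConfig().setEnabled(true).addMember("127.0.0.1");
    network.getInterfaces().setEnabled(true).addInterface("127.0.0.1");

    return new HazelcastClusterManager(hzConfig);
  }
}
```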

Our environment (search SmartCambridge vertx) is fairly dynamic, with rapid development cycles (e.g. creating a new feedhandler and having it publish its data on the eventbus), which means we commonly want to start up a JVM containing new verticles and have it join an existing vertx cluster, maybe permanently, maybe just for a while. Vertx/Hazelcast treats joining a (vertx) cluster as a fairly serious operation: Hazelcast has (I believe) a concept of Hazelcast cluster members and Hazelcast clients, where clients can come and go easily but joining a Hazelcast cluster as a member requires considerable code compatibility between the existing cluster and the new member. Each time we upgraded our Vertx library the embedded Hazelcast library version would change, and this made it impossible for a newly compiled vertx verticle to join an existing vertx cluster.

Note we have experimented with having the Vertx eventbus flow between multiple servers, and also with extending the eventbus into the browser/javascript, but in both cases we found it simpler and more robust to be explicit about routing messages from server to server, and we have written verticles specifically for that purpose.
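
Such a routing verticle might look something like this hypothetical sketch, forwarding selected local eventbus traffic to a peer server over plain HTTP (the eventbus address, host, port and path here are all invented):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.http.HttpClient;
import io.vertx.core.json.JsonObject;

// Bridges selected local eventbus traffic to a peer server over HTTP,
// keeping inter-server message routing explicit.
public class ServerBridgeVerticle extends AbstractVerticle {
  @Override
  public void start() {
    HttpClient client = vertx.createHttpClient();
    vertx.eventBus().<JsonObject>consumer("bridge.outbound", msg ->   // hypothetical address
        client.post(8080, "peer-server.internal", "/eventbus/inbound",
            resp -> {
              if (resp.statusCode() != 200) {
                System.err.println("peer rejected message: " + resp.statusCode());
              }
            })
            .putHeader("content-type", "application/json")
            .end(msg.body().encode()));
  }
}
```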

So the new plan (after several years of Vertx development), given our environment of 5 production/development servers but with the vertx eventbus always limited to a single server, is to implement a single Zookeeper cluster across all 5 servers so we get Zookeeper's native resilience goodness, and to configure each production server to use a different znode root (the default is 'io.vertx' but this is a simple config option).
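
A minimal sketch of that per-server configuration, assuming the vertx-zookeeper artifact is on the classpath (the host names and the 'prod-a' root path value are placeholders; 'rootPath' is the one setting that keeps each server's eventbus separate):

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;
import io.vertx.core.json.JsonObject;
import io.vertx.spi.cluster.zookeeper.ZookeeperClusterManager;

public class ZkClusteredMain {
  public static void main(String[] args) {
    JsonObject zkConfig = new JsonObject()
        // the 5-server Zookeeper ensemble (placeholder host names)
        .put("zookeeperHosts", "srv1:2181,srv2:2181,srv3:2181,srv4:2181,srv5:2181")
        // a different znode root per production server; the default is "io.vertx"
        .put("rootPath", "io.vertx.prod-a")
        .put("sessionTimeout", 20000)
        .put("connectTimeout", 3000)
        .put("retry", new JsonObject()
            .put("initialSleepTime", 100)
            .put("maxTimes", 5));

    ZookeeperClusterManager mgr = new ZookeeperClusterManager(zkConfig);
    Vertx.clusteredVertx(new VertxOptions().setClusterManager(mgr), res -> {
      if (res.succeeded()) {
        // verticles deployed here see only the eventbus rooted at "io.vertx.prod-a"
      } else {
        res.cause().printStackTrace();
      }
    });
  }
}
```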

This design has an attractively simple minimum build on a single server (i.e. Zookeeper + Vertx) so ad-hoc development on a random machine (e.g. a laptop) is still possible, but we can trivially extend our platform to have multiple servers in a single vertx cluster by setting a common znode root.

Repast answered 2/4, 2019 at 13:59 Comment(3)
Why don't you switch to Kafka?Levitical
Because Kafka is a different product, with different objectives. We went through a period where "Kafka" was the answer to any question involving streaming, as we did before that with "MongoDB" for storage. Basically Kafka is well suited to applications where the messages are large, important 'transactions', trivially partitionable, and need guaranteed delivery in the right order even when there has been an outage. It is unsuited to very high-volume, low-latency spatiotemporal sensor data.Repast
How many events are you publishing/consuming per second? Kafka is well suited for high-volume, low-latency stuff IMO. You can configure it however you want. You already mentioned the partitioning. You can configure/disable acknowledgments (which might need idempotence checks on the consuming side, etc).Arabeila
