Middleware to build data-gathering and monitoring for a distributed system [closed]

S

2

7

I am currently looking for a good middleware to build a solution to for a monitoring and maintenance system. We are tasked with the challenge to monitor, gather data from and maintain a distributed system consisting of up to 10,000 individual nodes.

The system is clustered into groups of 5-20 nodes. Each group produces data (as a team) by processing incoming sensor data. Each group has a dedicated node (blue boxes) acting as a facade/proxy for the group, exposing data and state from the group to the outside world. These clusters are geographically separated and may connect to the outside world over different networks (one may run over fiber, another over 3G/Satellite). It is likely we will experience both shorter (seconds/minutes) and longer (hours) outages. The data is persisted by each cluster locally.

This data needs to be collected (continuously and reliably) by external & centralized server(s) (green boxes) for further processing, analysis and viewing by various clients (orange boxes). Also, we need to monitor the state of all nodes through each groups proxy node. It is not required to monitor each node directly, even though it would be good if the middleware could support that (handle heartbeat/state messages from ~10,000 nodes). In case of proxy failure, other methods are available to pinpoint individual nodes.

Furthermore, we need to be able to interact with each node to tweak settings etc. but that seems to be more easily solved since that is mostly manually handled per-node when needed. Some batch tweaking may be needed, but all-in-all it looks like a standard RPC situation (Web Service or alike). Of course, if the middleware can handle this too, via some Request/Response mechanism that would be a plus.

Monitoring

Requirements:

1000+ nodes publishing/offering continuous data
Data needs to be reliably (in some way) and continuously gathered to one or more servers. This will likely be built on top of the middleware using some kind of explicit request/response to ask for lost data. If this could be handled automatically by the middleware this is of course a plus.
More than one server/subscriber needs to be able to be connected to the same data producer/publisher and receive the same data
Data rate is max in the range of 10-20 per second per group
Messages sizes range from maybe ~100 bytes to 4-5 kbytes
Nodes range from embedded constrained systems to normal COTS Linux/Windows boxes
Nodes generally use C/C++, servers and clients generally C++/C#
Nodes should (preferable) not need to install additional SW or servers, i.e. one dedicated broker or extra service per node is expensive
Security will be message-based, i.e. no transport security needed

We are looking for a solution that can handle the communication between primarily proxy nodes (blue) and servers (green) for the data publishing/polling/downloading and from clients (orange) to individual nodes (RPC style) for tweaking settings.

There seems to be a lot of discussions and recommendations for the reversed situation; distributing data from server(s) to many clients, but it has been harder to find information related to the described situation. The general solution seems to be to use SNMP, Nagios, Ganglia etc. to monitor and modify large number of nodes, but the tricky part for us is the data gathering.

We have briefly looked at solutions like DDS, ZeroMQ, RabbitMQ (broker needed on all nodes?), SNMP, various monitoring tools, Web Services (JSON-RPC, REST/Protocol Buffers) etc.

So, do you have any recommendations for an easy-to-use, robust, stable, light, cross-platform, cross-language middleware (or other) solution that would fit the bill? As simple as possible but not simpler.

Staal answered 20/11, 2012 at 23:11 Comment(2)

Maintaining reliable communication with 1000+ publishers is not an easy task for a single Monitor Server. Are you allowed to do any load balancing? Also, assuming an average message size of 2 kbytes and 15 messages per second per blue box, the network should be able to deal with an aggregate of 2x15x1,000+=30,000+ kbytes per second = 240+mbit; another reason to think about partitioning your data flows. And do you have any multicast at your disposal on the network? – Nertie 21/11, 2012 at 22:9

Yes, a possible solution is to partition publishers into different groups, handled by multiple servers/subscribers. In reality, the sheer task of monitoring 1000 nodes (plus sub-nodes) is of course also tricky to solve in a good and manageable way. However, we want to keep the basic solution as simple, performant and robust as possible. Although we need to plan for the numbers provided, it is not likely we will experience such large setups from start (depends on our customers). Plan for the worst - hope for the best. We do not yet know if we have multicast available for all networks. – Dorris 23/11, 2012 at 8:8

M

3

Seems ZeroMQ will fit the bill easily, with no central infrastructure to manage. Since your monitoring servers are fixed, it's really quite a simple problem to solve. This section in the 0MQ Guide may help:

http://zguide.zeromq.org/page:all#Distributed-Logging-and-Monitoring

You mention "reliability", but could you specify the actual set of failures you want to recover? If you are using TCP then the network is by definition "reliable" already.

Mccarley answered 21/11, 2012 at 6:20 Comment(3)

I added some additional info to the question. Since our clusters will be spread out geographically over large areas we will need to support all kinds of networks, both good and bad. We will likely experience bad networks (2.5G/3G/Satellite), cables being severed (physically broken), power outages to infrastructure etc. All data we need to get will be persisted by the publishers (in db/file) for several reasons so we are not primarily looking for a solution to persist messages automatically but it should be easy to implement a method to be able to ask for old/missing data. – Dorris 21/11, 2012 at 10:0

Take a look at the FileMQ project, which is a large-scale file pubsub system built over 0MQ. This may not be a complete answer but it gives you full persistence, a very simple API (the filesystem), and will recover from failures. You haven't specified your requirements in terms of throughput but I'm assuming your networks will be much slower than your file systems. See zguide.zeromq.org/page:all#Large-scale-File-Publishing – Mccarley 22/11, 2012 at 4:40

Thanks, it is really good to see such good documentation, both API-wise and general into/best practice documentation (The Guide). Our networks will most likely be much slower than our file systems, yes. We will sometimes experience very slow networks and will likely have to be able to provide different API:s (thin/rich) to be able to handle all cases. Btw, we have successfully hooked up ZeroMQ into our test rig using clrzmq. So far it works as advertised, looks really promising! – Dorris 23/11, 2012 at 8:14

N

6

Disclosure: I am a long-time DDS specialist/enthusiast and I work for one of the DDS vendors.

Good DDS implementations will provide you with what you are looking for. Collection of data and monitoring of nodes is a traditional use-case for DDS and should be its sweet spot. Interacting with nodes and tweaking them is possible as well, for example by using so-called content filters to send data to a particular node. This assumes that you have a means to uniquely identify each node in the system, for example by means of a string or integer ID.

Because of the hierarchical nature of the system and its sheer (potential) size, you will probably have to introduce some routing mechanisms to forward data between clusters. Some DDS implementations can provide generic services for that. Bridging to other technologies, like DBMS or web-interfaces, is often supported as well.

Especially if you have multicast at your disposal, discovery of all participants in the system can be done automatically and will require minimal configuration. This is not required though.

To me, it looks like your system is complicated enough to require customization. I do not believe that any solution will "fit the bill easily", especially if your system needs to be fault-tolerant and robust. Most of all, you need to be aware of your requirements. A few words about DDS in the context of the ones you have mentioned:

1000+ nodes publishing/offering continuous data

This is a big number, but should be possible, especially since you have the option to take advantage of the data-partitioning features supported by DDS.

Data needs to be reliably (in some way) and continuously gathered to one or more servers. This will likely be built on top of the middleware using some kind of explicit request/response to ask for lost data. If this could be handled automatically by the middleware this is of course a plus.

DDS supports a rich set of so-called Quality of Service (QoS) settings specifying how the infrastructure should treat that data it is distributing. These are name-value pairs set by the developer. Reliability and data-availability area among the supported QoS-es. This should take care of your requirement automatically.

More than one server/subscriber needs to be able to be connected to the same data producer/publisher and receive the same data

One-to-many or many-to-many distribution is a common use-case.

Data rate is max in the range of 10-20 per second per group

Adding up to a total maximum of 20,000 messages per second is doable, especially if data-flows are partitioned.

Messages sizes range from maybe ~100 bytes to 4-5 kbytes

As long as messages do not get excessively large, the number of messages is typically more limiting than the total amount of kbytes transported over the wire -- unless large messages are of very complicated structure.

Nodes range from embedded constrained systems to normal COTS Linux/Windows boxes

Some DDS implementations support a large range of OS/platform combinations, which can be mixed in a system.

Nodes generally use C/C++, servers and clients generally C++/C#

These are typically supported and can be mixed in a system.

Nodes should (preferable) not need to install additional SW or servers, i.e. one dedicated broker or extra service per node is expensive

Such options are available, but the need for extra services depends on the DDS implementation and the features you want to use.

Security will be message-based, i.e. no transport security needed

That certainly makes life easier for you -- but not so much for those who have to implement that protection at the message level. DDS Security is one of the newer standards in the DDS ecosystem that provides a comprehensive security model transparent to the application.

Nertie answered 26/11, 2012 at 4:13 Comment(0)

M

3

Seems ZeroMQ will fit the bill easily, with no central infrastructure to manage. Since your monitoring servers are fixed, it's really quite a simple problem to solve. This section in the 0MQ Guide may help:

http://zguide.zeromq.org/page:all#Distributed-Logging-and-Monitoring

You mention "reliability", but could you specify the actual set of failures you want to recover? If you are using TCP then the network is by definition "reliable" already.

Mccarley answered 21/11, 2012 at 6:20 Comment(3)

I added some additional info to the question. Since our clusters will be spread out geographically over large areas we will need to support all kinds of networks, both good and bad. We will likely experience bad networks (2.5G/3G/Satellite), cables being severed (physically broken), power outages to infrastructure etc. All data we need to get will be persisted by the publishers (in db/file) for several reasons so we are not primarily looking for a solution to persist messages automatically but it should be easy to implement a method to be able to ask for old/missing data. – Dorris 21/11, 2012 at 10:0

Take a look at the FileMQ project, which is a large-scale file pubsub system built over 0MQ. This may not be a complete answer but it gives you full persistence, a very simple API (the filesystem), and will recover from failures. You haven't specified your requirements in terms of throughput but I'm assuming your networks will be much slower than your file systems. See zguide.zeromq.org/page:all#Large-scale-File-Publishing – Mccarley 22/11, 2012 at 4:40

Thanks, it is really good to see such good documentation, both API-wise and general into/best practice documentation (The Guide). Our networks will most likely be much slower than our file systems, yes. We will sometimes experience very slow networks and will likely have to be able to provide different API:s (thin/rich) to be able to handle all cases. Btw, we have successfully hooked up ZeroMQ into our test rig using clrzmq. So far it works as advertised, looks really promising! – Dorris 23/11, 2012 at 8:14

Recommended topics

Hot tags