How to enforce the order of messages passed to an IoT device over MQTT via a cloud-based system (API design issue)
A

3

8

Suppose I have an IoT device that I want to control (let's say switch on/off) and monitor (e.g. collect temperature readings). It seems MQTT could be the right fit. I could publish messages to the device to control it, and the device could publish messages to a broker to report temperature readings. So far so good.

The problems start to occur when I try to design the API to control the device.

Let's say the device subscribes to two topics:

  • /device-id/control/on
  • /device-id/control/off

Then I publish messages to these topics in some order. But given the fact that messaging is typically an asynchronous process there are no guarantees on the order of messages received by the device.

So in case two messages are published in the following order:

  1. /device-id/control/on
  2. /device-id/control/off

they could be received in reverse order, leaving the device turned on, which can have dramatic consequences depending on the context.

Of course the API could be designed in some other way, for example there could be just one topic

  1. /device-id/control

and the payload of individual messages would carry the meaning of an individual message (on/off). So in case messages are published to this topic in a given order they are expected to be received in the exact same order on the device.

But what if the order of publishes to individual topics cannot be guaranteed? Suppose the following architecture of a system for IoT devices:

                       / control service \
application -> broker -> control service -> broker -> IoT device
                       \ control service /

The components of the system are:

  • an application which effectively controls the device by publishing messages to a broker
  • a typical message broker
  • a control service with some business logic

The important part is that, as in most modern distributed systems, the control service is a distributed, multi-instance entity capable of processing multiple control messages from the application at a time. Therefore the order of messages published by the application can end up totally mixed by the time they are delivered to the IoT device.

Now, given that most MQTT brokers only implement QoS0 and QoS1 but not QoS2, it gets even more interesting, as such control messages could potentially be delivered multiple times (assuming QoS1 - see https://mcmap.net/q/401475/-is-message-order-preserved-in-mqtt-messages).

My point is that using separate topics for control messages is a bad idea. The same goes for a single topic. In both cases there are no guarantees on message delivery order.

The only solution to this particular issue that comes to my mind is message versioning, so that old (outdated) messages can simply be skipped when they are delivered after another message with a more recent version property.
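As a rough illustration of the versioning idea, here is a minimal device-side sketch in Python. It assumes each control message carries an integer version assigned by the original publisher; the `Switch` class and its `handle` method are hypothetical names invented for this example, not part of any MQTT library.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    """Hypothetical device-side state holder (names are illustrative only)."""

    is_on: bool = False
    last_version: int = -1  # version of the most recently applied command

    def handle(self, version: int, turn_on: bool) -> bool:
        """Apply a command unless it is stale or a duplicate.

        Returns True if the command was applied, False if it was skipped.
        """
        if version <= self.last_version:
            return False  # older than, or same as, what was already applied
        self.is_on = turn_on
        self.last_version = version
        return True


switch = Switch()
# "on" (version 1) and "off" (version 2) arrive in reverse order,
# and "off" is delivered twice, as QoS1 permits.
results = [
    switch.handle(2, False),
    switch.handle(1, True),
    switch.handle(2, False),
]
```

The same comparison also drops the redelivered duplicate, so the device ends up off regardless of arrival order.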

  • Am I missing something?
  • Is message versioning the only solution to this problem?
Albumin answered 29/1, 2016 at 23:43 Comment(0)
P
5

Am I missing something?

Most definitely. The example you brought up is a generic control system, being attached to some message-oriented scheme. There are a number of patterns that can be used when referring to a message-based architecture. This article by Microsoft categorizes message patterns into two primary classes:

  • Commands and
  • Events

The most generic pattern of command behavior is to issue a command, then measure the state of the system to verify the command was carried out. If you forget to verify, your system has an open loop. Such open loops are (unfortunately) common in IT systems (because it's easy to forget), and often result in bugs and other bad behaviors such as the one described above. So, the proper way to handle a command is:

  1. Issue the command
  2. Inquire as to the state of the system
  3. Evaluate next action

Events, on the other hand, are simply fired off. As the publisher of an event, it is not my business to worry about who receives the event, in what order, etc. Now, it should also be pointed out that the use of any decent message broker (e.g. RabbitMQ) generally carries strong guarantees that messages will be delivered in the order in which they were originally published. Note that this does not mean they will be processed in order.

So, if you treat a command as an event, your system is guaranteed to act up sooner or later.

Is message versioning the only solution to this problem?

Message versioning typically refers to a property of the message class itself, rather than a particular instance of the class. It is often used when multiple versions of a message-based API exist and must be backwards-compatible with one another.

What you are instead referring to is unique message identifiers. GUIDs are particularly handy for making sure that each message gets its own unique id. However, I would argue that de-duplication in message-based architectures is an anti-pattern. One of the consequences of using messaging is that duplicates are possible, so you should try to design your system behaviors to be stateless and idempotent. If this is not possible, you should consider whether messaging is the correct communication solution for the need.

Using the command-event dichotomy as an example, you could perform the following transaction:

  1. The controller issues the command, assigning a unique identifier to the command.
  2. The control system receives the command and turns on.
  3. The control system publishes the "light on" event notification, containing the unique id of the command that was used to turn on the light.
  4. The controller receives the notification and correlates it to the original command.

In the event that the controller doesn't receive notification after some timeout, the controller can retry the command. Note that "light on" is an idempotent command, in that multiple calls to it will have the same effect.
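The transaction above can be sketched in Python roughly as follows. This is an in-memory simulation under stated assumptions, not a real MQTT client: `Light`, `send_with_retry`, and `lossy_send` are hypothetical names, and a return value of `None` stands in for "no notification arrived before the timeout".

```python
import uuid
from typing import Callable, Optional


class Light:
    """Hypothetical device: applies a command and reports the resulting state."""

    def __init__(self) -> None:
        self.is_on = False

    def handle(self, command: dict) -> dict:
        # Idempotent: applying the same command again leaves the same state.
        self.is_on = command["state"] == "on"
        return {"event": "light " + command["state"], "command_id": command["id"]}


def send_with_retry(
    send: Callable[[dict], Optional[dict]], state: str, max_attempts: int = 3
) -> bool:
    """Issue a command and retry until a matching notification comes back."""
    command = {"id": str(uuid.uuid4()), "state": state}
    for _ in range(max_attempts):
        event = send(command)  # None models a timeout
        if event is not None and event["command_id"] == command["id"]:
            return True  # notification correlated to the original command
    return False


light = Light()
attempts = []


def lossy_send(command: dict) -> Optional[dict]:
    """Assumed transport that loses the first notification on the way back."""
    attempts.append(command["id"])
    event = light.handle(command)
    return None if len(attempts) == 1 else event


confirmed = send_with_retry(lossy_send, "on")
```

Because the command is idempotent, the retry reuses the same id and the device can safely apply it a second time.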

Propman answered 2/2, 2016 at 14:40 Comment(1)
Thank you very much for your answer. It's very interesting, I'll definitely have a closer look at it. Nevertheless, I'll leave the question open for a while to see if there are more valuable approaches to this problem. Albumin
C
1

When the state changes, send the new state immediately, and after that resend it periodically every x seconds. With this solution your system eventually reaches the desired state, even if it temporarily disconnects from the network (e.g. because of a low battery).
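A minimal Python sketch of this approach might look like the following. The `StateAnnouncer` class is a hypothetical helper, the `publish` callback stands in for whatever actually sends the message, and time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Callable, Optional


class StateAnnouncer:
    """Hypothetical helper: publish on change, then repeat every interval_s seconds."""

    def __init__(self, publish: Callable[[str], None], interval_s: float) -> None:
        self._publish = publish
        self._interval_s = interval_s
        self._state: Optional[str] = None
        self._last_sent_at: Optional[float] = None

    def set_state(self, state: str, now: float) -> None:
        """Record the new desired state and send it immediately."""
        self._state = state
        self._send(now)

    def tick(self, now: float) -> None:
        """Call regularly; resends the current state once the interval has elapsed."""
        if self._state is not None and now - self._last_sent_at >= self._interval_s:
            self._send(now)

    def _send(self, now: float) -> None:
        self._publish(self._state)
        self._last_sent_at = now


sent = []
announcer = StateAnnouncer(sent.append, interval_s=10)
announcer.set_state("on", now=0)    # sent immediately
announcer.tick(now=5)               # too early, nothing sent
announcer.tick(now=10)              # interval elapsed, "on" resent
announcer.set_state("off", now=12)  # change sent immediately
announcer.tick(now=22)              # "off" resent
```

A device that missed the first "off" while offline would still receive it on a later repeat.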

BTW: You did not miss anything.

Commandant answered 30/1, 2016 at 9:56 Comment(0)
M
1

Apart from the comment that most brokers don't support QoS2 (I suspect you mean that a number of broker-as-a-service offerings, such as Amazon's AWS IoT service, don't support QoS2), you have covered most of the major points.

If message order really is that important then you will have to include some form of ordering marker in the message payload, be this a counter or timestamp.
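For instance, a publisher could stamp each payload with a counter and a timestamp before sending it. The sketch below assumes a JSON payload with hypothetical field names `seq`, `ts`, and `state`, and a single publisher that owns the counter; `CommandStamper` is an illustrative name, not an existing API.

```python
import itertools
import json
import time
from typing import Callable


class CommandStamper:
    """Hypothetical publisher-side helper that adds ordering markers to payloads."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._counter = itertools.count(1)  # monotonically increasing sequence
        self._clock = clock

    def stamp(self, state: str) -> str:
        """Return a JSON payload carrying a sequence number and a timestamp."""
        return json.dumps(
            {"seq": next(self._counter), "ts": self._clock(), "state": state}
        )


# A fixed clock keeps the example deterministic.
stamper = CommandStamper(clock=lambda: 1454150460.0)
payloads = [stamper.stamp("on"), stamper.stamp("off")]

# Simulate out-of-order arrival, then recover the original order from "seq".
received = list(reversed(payloads))
ordered = sorted((json.loads(p) for p in received), key=lambda m: m["seq"])
```

A counter avoids relying on synchronized clocks, while the timestamp is mainly useful for diagnostics or for discarding commands that are too old.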

Macomber answered 30/1, 2016 at 10:1 Comment(0)
