How do Real Time Operating Systems work?

Asked 11/2, 2009 at 12:11 Answered 15/12, 2012 at 6:2

I mean how and why are realtime OSes able to meet deadlines without ever missing them? Or is this just a myth (that they do not miss deadlines)? How are they different from any regular OS and what prevents a regular OS from being an RTOS?

Interphone answered 11/2, 2009 at 12:11 Comment(1)

It's also important to notice the difference between a 'soft' real time system and a 'hard' real time system. – Alexandrina 11/2, 2009 at 14:29

Meeting deadlines is a function of the application you write. The RTOS simply provides facilities that help you with meeting deadlines. You could also program on "bare metal" (without an RTOS) in a big main loop and meet your deadlines.

Also keep in mind that unlike a more general purpose OS, an RTOS has a very limited set of tasks and processes running.

Some of the facilities an RTOS provides:

Priority-based Scheduler
System Clock interrupt routine
Deterministic behavior

Priority-based Scheduler

Most RTOSes have between 32 and 256 possible priorities for individual tasks/processes. The scheduler will run the task with the highest priority. When a running task gives up the CPU, the next highest priority task runs, and so on.

The highest priority task in the system will have the CPU until:

it runs to completion (i.e. it voluntarily gives up the CPU)
a higher priority task is made ready, in which case the original task is pre-empted by the new (higher priority) task.

As a developer, it is your job to assign the task priorities such that your deadlines will be met.
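The pre-emption rule above can be sketched with a toy simulation. This is a simplified model in Python, not real RTOS code: time advances in whole ticks, the task names and numbers are made up, and a lower number meaning a higher priority is an assumption (some kernels use the opposite convention).

```python
def simulate(tasks, horizon):
    """Simulate fixed-priority pre-emptive scheduling, one tick at a time.

    tasks maps a name to (priority, release_tick, run_ticks).
    Assumption: a lower priority number means a higher priority.
    Returns the name of the task that ran in each tick (None when idle).
    """
    remaining = {name: run for name, (_, _, run) in tasks.items()}
    timeline = []
    for tick in range(horizon):
        # A task is ready once it has been released and still has work left.
        ready = [name for name, (_, release, _) in tasks.items()
                 if release <= tick and remaining[name] > 0]
        if not ready:
            timeline.append(None)
            continue
        # Always run the highest-priority ready task, even if a
        # lower-priority one was running in the previous tick.
        current = min(ready, key=lambda name: tasks[name][0])
        remaining[current] -= 1
        timeline.append(current)
    return timeline


# Hypothetical task set: a low-priority logger and a high-priority control task.
TASKS = {
    "logger": (5, 0, 4),   # priority 5, ready at tick 0, needs 4 ticks
    "control": (1, 2, 2),  # priority 1, ready at tick 2, needs 2 ticks
}
TIMELINE = simulate(TASKS, 8)
print(TIMELINE)
```

The logger runs for two ticks, is pre-empted as soon as the control task becomes ready, and only resumes once the control task has finished.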

System Clock Interrupt routines

The RTOS will typically provide some sort of system clock (with a tick anywhere from 500 µs to 100 ms) that allows you to perform time-sensitive operations. If you have a 1 ms system clock, and you need to do a task every 50 ms, there is usually an API that allows you to say "In 50 ms, wake me up". At that point, the task would be sleeping until the RTOS wakes it up.

Note that just being woken up does not ensure you will run exactly at that time. It depends on the priority. If a task with a higher priority is currently running, you could be delayed.
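A rough sketch of such a "wake me up in N ms" service, written as a toy Python model. The class and method names (`TickClock`, `delay_ms`, `tick`) are hypothetical and do not belong to any particular RTOS. Waking a task here only marks it as ready; whether it actually runs at that moment is still up to the scheduler.

```python
class TickClock:
    """Toy model of an RTOS system tick with a 'wake me in N ms' service."""

    def __init__(self, tick_ms):
        self.tick_ms = tick_ms
        self.now_ticks = 0
        self.sleepers = {}  # task name -> tick count at which it becomes ready

    def delay_ms(self, task, ms):
        # Round up to whole ticks; the clock cannot resolve anything finer.
        ticks = -(-ms // self.tick_ms)
        self.sleepers[task] = self.now_ticks + ticks

    def tick(self):
        """Called once per (simulated) clock interrupt; returns tasks made ready."""
        self.now_ticks += 1
        woken = [task for task, due in self.sleepers.items()
                 if due <= self.now_ticks]
        for task in woken:
            del self.sleepers[task]
        return woken


clock = TickClock(tick_ms=1)
clock.delay_ms("sampler", 50)      # "In 50 ms, wake me up"
woken_at_ms = None
for _ in range(60):
    if "sampler" in clock.tick():
        woken_at_ms = clock.now_ticks * clock.tick_ms
        break
print(woken_at_ms)
```

With a coarser tick the request is rounded up, so a 25 ms delay on a 10 ms tick becomes 3 ticks in this model.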

Deterministic Behavior

The RTOS goes to great lengths to ensure that whether you have 10 tasks or 100 tasks, it does not take any longer to switch context, determine what the next highest priority task is, etc.

In general, the RTOS operation tries to be O(1).
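One common way to keep "find the next highest priority task" independent of the number of tasks is a ready bitmap. The sketch below is a Python illustration, assuming at most one task per priority level and that priority 0 is the highest (both are assumptions for the example). On real hardware the lookup typically maps to a count-zeros instruction or a small table; Python integers merely stand in for a machine word here.

```python
class ReadyBitmap:
    """Ready set as a bitmask: bit p is set when priority level p has a ready task."""

    def __init__(self):
        self.bits = 0

    def make_ready(self, prio):
        self.bits |= 1 << prio

    def make_blocked(self, prio):
        self.bits &= ~(1 << prio)

    def highest(self):
        """Return the lowest set bit index (0 = highest priority), or None if idle."""
        if self.bits == 0:
            return None
        # bits & -bits isolates the lowest set bit; no loop over tasks is needed.
        return (self.bits & -self.bits).bit_length() - 1


ready = ReadyBitmap()
for prio in (7, 3, 20):
    ready.make_ready(prio)
print(ready.highest())   # priority 3 is the most urgent ready level
ready.make_blocked(3)
print(ready.highest())   # now priority 7
```

The cost of `highest()` does not depend on how many priority levels are occupied, which is the property the answer describes.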

One of the prime areas for deterministic behavior in an RTOS is the interrupt handling. When an interrupt line is signaled, the RTOS immediately switches to the correct Interrupt Service Routine and handles the interrupt without delay (regardless of the priority of any task currently running).

Note that most hardware-specific ISRs would be written by the developers on the project. The RTOS might already provide ISRs for serial ports, system clock, maybe networking hardware but anything specialized (pacemaker signals, actuators, etc...) would not be part of the RTOS.

This is a gross generalization and as with everything else, there is a large variety of RTOS implementations. Some RTOS do things differently, but the description above should be applicable to a large portion of existing RTOSes.

Partiality answered 11/2, 2009 at 13:37 Comment(3)

"This task will run to completion" sounds like Windows 3.1! Then you mean RTOSes are non preemptive? – Interphone 11/2, 2009 at 14:3

No, if you are the highest priority you run until you voluntarily give up, OR a higher priority task than you becomes ready, at which time the (old) high priority gets pre-empted. I'll clarify in the main text. Thanks! – Partiality 11/2, 2009 at 14:24

Thanks for the great answer. Could you please clarify how System Clock Interrupt routines behavior you described is specific to RTOS? I mean, how is it different from a standard timer API each general-purpose OS has? – Incertitude 25/8, 2018 at 8:48

In RTOSes, the most critical parameters are low latency and time determinism, which the RTOS achieves by following certain policies and tricks.

In GPOSes, by contrast, the critical parameter is high throughput, along with acceptable latencies. You cannot count on a GPOS for time determinism.

RTOSes have tasks which are much lighter than processes/threads in GPOS.

Rozanneroze answered 15/12, 2012 at 6:2 Comment(0)

It is not that they are able to meet deadlines, it is rather that they have deadlines fixed whereas in a regular OS there is no such deadline.

In a regular OS the task scheduler is not really strict. That is, the processor will execute so many instructions per second, but it may occasionally not do so. For example, a task might be pre-empted to allow a higher priority one to execute (and maybe for a longer time). In an RTOS the processor will always execute the same number of tasks.

Additionally there is usually a time limit for tasks to complete, after which a failure is reported. This does not happen in a regular OS.

Obviously there is a lot more detail to explain, but the above are two of the important design aspects that are used in an RTOS.

Claudineclaudio answered 11/2, 2009 at 12:14 Comment(0)

Your RTOS is designed in such a way that it can guarantee timings for important events, like hardware interrupt handling and waking up sleeping processes exactly when they need to be.

This exact timing allows the programmer to be sure that his (say) pacemaker is going to output a pulse exactly when it needs to, not a few tens of milliseconds later because the OS was busy with another inefficient task.

It's usually a much simpler OS than a fully-fledged Linux or Windows, simply because it's easier to analyse and predict the behaviour of simple code. There is nothing stopping a fully-fledged OS like Linux being used in an RTOS environment, and it has RTOS extensions. Because of the complexity of the code base, it will not be able to guarantee its timings down to as small a scale as a smaller OS.

The RTOS scheduler is also more strict than a general purpose scheduler. It's important to know the scheduler isn't going to change your task priority because you've been running a long time and don't have any interactive users. Most OSes would internally reduce the priority of this type of process to favour short-term interactive programs where the interface should not be seen to lag.

Beardsley answered 11/2, 2009 at 12:51 Comment(0)

You might find it helpful to read the source of a typical RTOS. There are several open-source examples out there, and the following yielded links in a little bit of quick searching:

A commercial RTOS that is well documented, available in source code form, and easy to work with is µC/OS-II. It has a very permissive license for educational use, and (a mildly out of date version of) its source can be had bound into a book describing its theory of operation using the actual implementation as example code. The book is MicroC OS II: The Real Time Kernel by Jean Labrosse.

I have used µC/OS-II in several projects over the years, and can recommend it.

Cognate answered 28/4, 2009 at 7:39 Comment(0)

"Basically, you have to code each "task" in the RTOS such that they will terminate in a finite time."

This is actually correct. The RTOS will have a system tick defined by the architecture, say 10 millisec., with all tasks (threads) both designed and measured to complete within specific times. For example in processing real time audio data, where the audio sample rate is 48kHz, there is a known amount of time (in milliseconds) at which the prebuffer will become empty for any downstream task which is processing the data. Therefore using the RTOS requires correct sizing of the buffers, estimating and measuring how long this takes, and measuring the latencies between all software layers in the system. Then the deadlines can be met. Otherwise the applications will miss the deadlines. This requires analysis of the worst-case data processing throughout the entire stack, and once the worst-case is known, the system can be designed for, say, 95% processing time with 5% idle time (this processing may not ever occur in any real usage, because worst-case data processing may not be an allowed state within all layers at any single moment in time).
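The buffer-sizing arithmetic can be made concrete with a few lines of Python. The 48 kHz sample rate and 10 ms tick come from the paragraph above; the buffer size and the worst-case execution time are made-up numbers for illustration only.

```python
SAMPLE_RATE_HZ = 48_000   # audio sample rate from the example above
TICK_MS = 10              # system tick from the example above
BUFFER_SAMPLES = 960      # hypothetical prebuffer size


def drain_time_ms(buffer_samples, sample_rate_hz):
    """Time until a full buffer runs empty if nothing refills it."""
    return 1000 * buffer_samples / sample_rate_hz


def samples_per_tick(sample_rate_hz, tick_ms):
    """Samples consumed downstream during one system tick."""
    return sample_rate_hz * tick_ms // 1000


def utilization(worst_case_ms, period_ms):
    """Fraction of each period spent processing in the worst case."""
    return worst_case_ms / period_ms


print(drain_time_ms(BUFFER_SAMPLES, SAMPLE_RATE_HZ))  # deadline for the refill, in ms
print(samples_per_tick(SAMPLE_RATE_HZ, TICK_MS))      # samples needed every tick
print(utilization(9.5, TICK_MS))                      # hypothetical 9.5 ms worst case
```

With these numbers the producer has 20 ms to refill the buffer, must supply 480 samples per tick, and a measured worst case of 9.5 ms per 10 ms tick corresponds to the 95% processing / 5% idle split mentioned above.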

Example timing diagrams for the design of a real time operating system network app are in this article at EE Times, PRODUCT HOW-TO: Improving real-time voice quality in a VoIP-based telephony design http://www.eetimes.com/design/embedded/4007619/PRODUCT-HOW-TO-Improving-real-time-voice-quality-in-a-VoIP-based-telephony-design

Ermelindaermengarde answered 13/7, 2011 at 23:39 Comment(0)

What is important is realtime applications, not realtime OS. Usually realtime applications are predictable: many tests, inspections, WCET analysis, proofs, ... have been performed which show that deadlines are met in any specified situations.
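As one small example of the kind of analysis meant here: the classic rate-monotonic utilization bound (Liu and Layland) gives a sufficient, but not necessary, check that a set of periodic tasks meets its deadlines. The sketch below assumes independent periodic tasks, deadlines equal to periods, and priorities assigned by period (shorter period = higher priority). The task sets are invented for illustration.

```python
def rm_utilization_bound(n):
    """Liu-Layland bound for n periodic tasks under rate-monotonic priorities."""
    return n * (2 ** (1 / n) - 1)


def passes_rm_test(tasks):
    """tasks is a list of (wcet, period) pairs in the same time unit.

    Returns True when total utilization is within the bound. A False result
    does not prove a deadline miss; it only means this simple test is
    inconclusive and a finer analysis is needed.
    """
    total = sum(wcet / period for wcet, period in tasks)
    return total <= rm_utilization_bound(len(tasks))


light = [(1, 4), (1, 5), (2, 10)]   # hypothetical set, utilization 0.65
heavy = [(2, 4), (2, 5), (1, 10)]   # hypothetical set, utilization 1.00
print(passes_rm_test(light))
print(passes_rm_test(heavy))
```

The WCET figures that go into such a test still have to come from measurement or static analysis of the application, which is the point of the answer above.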

It happens that RTOSes help doing this work (building the application and verifying its RT constraints). But I've seen realtime applications running on standard Linux, relying more on hardware horsepower than on OS design.

Jewelljewelle answered 11/2, 2009 at 12:21 Comment(2)

An RTOS makes very strict guarantees on things that are important, like interrupt servicing times, task switching latency, etc. Real-time applications are NOT possible without a proper RTOS. – Beardsley 11/2, 2009 at 12:37

I am just speaking of what I have seen. And more than often, realtime problems are solved by huge CPU frequencies and a lot of time margin. – Jewelljewelle 11/2, 2009 at 13:3

I haven't used an RTOS, but I think this is how they work.

There's a difference between "hard real time" and "soft real time". You can write real-time applications on a non-RTOS like Windows, but they're 'soft' real-time:

As an application, I might have a thread or timer which I ask the O/S to run 10 times per second ... and maybe the O/S will do that, most of the time, but there's no guarantee that it will always be able to ... this lack of guarantee is why it's called 'soft'. The reason why the O/S might not be able to is that a different thread might be keeping the system busy doing something else. As an application, I can boost my thread priority to for example HIGH_PRIORITY_CLASS, but even if I do this the O/S still has no API which I can use to request a guarantee that I'll be run at certain times.
A 'hard' real-time O/S does (I imagine) have APIs which let me request guaranteed execution slices. The reason why the RTOS can make such guarantees is that it's willing to abend threads which take more time than expected / than they're allowed.

Stringency answered 11/2, 2009 at 12:41 Comment(2)

It's not just scheduling - the OS must make sure that no random things kick in like garbage collection or memory address space defragmentation, so that you know that malloc() will always return without a delay, so (for example) the aeroplane the autopilot is controlling will not crash. – Staid 11/2, 2009 at 12:54

And presumably hardware interrupts too. – Stringency 11/2, 2009 at 12:56

... well ...

A real-time operating system tries to be deterministic and meet deadlines, but it all depends on the way you write your application. You can make an RTOS very non real-time if you don't know how to write "proper" code.

Even if you know how to write proper code: It's more about trying to be deterministic than being fast.

When we talk about determinism it's

1) event determinism

For each set of inputs the next states and outputs of a system are known

2) temporal determinism

… also the response time for each set of outputs is known

This means that if you have asynchronous events like interrupts, your system is, strictly speaking, no longer temporally deterministic (and most systems use interrupts).

If you really want to be deterministic poll everything.

... but maybe it's not necessary to be 100% deterministic
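A "poll everything" design is often structured as a fixed loop that checks every input once per cycle, always in the same order. A toy Python sketch follows; the input names and the cycles at which events appear are invented for illustration.

```python
def cyclic_executive(pollers, cycles):
    """Call every poller once per cycle, always in the same order.

    pollers is a list of (name, poll) pairs; poll(cycle) returns an event
    or None. Returns a log of (cycle, name, event) tuples.
    """
    log = []
    for cycle in range(cycles):
        for name, poll in pollers:
            event = poll(cycle)
            if event is not None:
                log.append((cycle, name, event))
    return log


# Hypothetical inputs: a button seen at cycle 2, UART data at cycles 1 and 2.
POLLERS = [
    ("button", lambda cycle: "pressed" if cycle == 2 else None),
    ("uart", lambda cycle: "byte" if cycle in (1, 2) else None),
]
LOG = cyclic_executive(POLLERS, 4)
print(LOG)
```

The order of handling is fixed and repeatable, at the price of latency: an event that arrives just after its poll waits up to one full cycle before it is noticed.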

Raffish answered 18/5, 2009 at 11:58 Comment(10)

"If you really want to be deterministic poll everything." - What if you miss an event of higher priority inbetween poll cycles? Will this not make the OS response non real time for those events? – Interphone 22/5, 2009 at 12:14

Of course it will, but you did your analysis and made sure that all the events from outside of the OS come within certain time boundaries (something like a sporadic server for your inputs). In a fault condition (cracked cable) you should throw away the events anyhow. What you ensure by polling and not using any interrupts is that interrupts no longer degrade determinism. – Raffish 25/5, 2009 at 7:25

Are you trying to say that this is effectively a trade off between latency and determinism? IMO the "events at well defined boundaries" model fails when you have an event hierarchy (i.e. prioritized events). There is no reason why a totally unrelated event should have to respect the time boundaries of a low priority (LP) event/task. The LP task needs to be preempted even if the HP event occurs at t0+dt. Where dt is an infinitesimally small period of time and t0 is the time when the LP task started. – Interphone 25/5, 2009 at 16:36

You first need to define what (hard) real-time means for you and your system. Do you need (hard) real-time? 1) deterministic (event+time), which strictly speaking means no interrupts. 2) best effort - not so deterministic anymore, most of the times you will have low latency. But what if this unexpected thing happens where you suddenly have all this noise on one of your interrupt lines together with the timer tick interrupt and some user pressed some button which causes another INT - while the system is executing code from cache - you are not time deterministic anymore - maybe low latency? – Raffish 26/5, 2009 at 6:23

Noise on interrupt lines sounds like a hardware problem - use a H/W low pass filter (LPF). If that is not an option then how about selectively masking the noisy interrupt line (for example until the ISR returns) instead of disabling all interrupts? I think when you choose to say that you will have prioritized events you are effectively declaring that in the presence of High priority tasks the tasks with lower priorities do not have to be real time. – Interphone 26/5, 2009 at 15:17

Sometimes it's not as easy as filtering out noise with a low pass filter, like when a sensor or cable is broken and sending funny stuff. Also I would not go for polling without interrupts because of this issue. In case the low priority tasks don't have real-time requirements we are not talking about a deterministic hard real-time system anymore. Anyhow, just imagine a happy real-time system with only a timer interrupt, not even cache. Even this is strictly speaking non time deterministic due to the drift of the crystal from which the timer tick is derived. Add cache and determinism goes away. – Raffish 27/5, 2009 at 8:3

Ok I think the difference of opinion stems from the fact that I am making an implicit assumption here that it's impossible to guarantee hard real time deadlines in a system with purely random events hence RTOSes are always "best effort" (IMO). Now if we are talking about crystal drift then the main clock generator (used by the microprocessor) is also susceptible which implies polling cannot be guaranteed to be deterministic either. – Interphone 27/5, 2009 at 19:47

Oh yes, here we go. You got it! Berger's theorem: Strictly speaking there's nothing like a hard real-time system. With a polled system you might get closer though than with one which uses interrupts. My conclusion is that you should use as few interrupts as possible and that a real-time system in practice is one which passes some test suites to prove that it does what it's supposed to do. That's not for purists, though! – Raffish 28/5, 2009 at 22:24

"...which passes some test suites to prove that it does what it's supposed to do" my thoughts exactly. I agree with your conclusion! – Interphone 29/5, 2009 at 6:29

"Berger's theorem:..." is this a real theorem? Could you please provide a link to a detailed description and proof? – Interphone 29/5, 2009 at 6:35

The textbook/interview answer is "deterministic pre-emption". The system is guaranteed to transfer control within a bounded period of time if a higher priority process is ready to run (in the ready queue) or an interrupt is asserted (typically input external to the CPU/MCU).

Junitajunius answered 13/11, 2009 at 2:59 Comment(0)

They actually don't guarantee meeting deadlines; what they do that makes them truly RTOS is to provide the means to recognize and deal with deadline overruns. 'Hard' RT systems generally are those where missing a deadline is disastrous and some kind of shutdown is required, whereas a 'soft' RT system is one where continuing with degraded functionality makes sense. Either way an RTOS permits you to define responses to such overruns. Non RT OS's don't even detect overruns.
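A minimal sketch of "detect the overrun and call a handler you defined", in Python. The function name `run_with_deadline` and its parameters are hypothetical. This version only notices the overrun after the job returns, whereas a real RTOS would normally use a timer or watchdog to react while the job is still running.

```python
import time


def run_with_deadline(job, deadline_s, on_overrun, clock=time.monotonic):
    """Run job() and call on_overrun(elapsed) if it took longer than deadline_s."""
    start = clock()
    result = job()
    elapsed = clock() - start
    if elapsed > deadline_s:
        on_overrun(elapsed)   # application-defined response to the overrun
    return result


# Deterministic demo with a fake clock: the job appears to take 30 ms
# against a 20 ms deadline, so the handler is invoked.
overruns = []
fake_times = iter([0.0, 0.030])
result = run_with_deadline(lambda: "done", 0.020, overruns.append,
                           clock=lambda: next(fake_times))
print(result, overruns)
```

In a 'hard' system the handler might trigger a shutdown or fail-safe state; in a 'soft' system it might just log the miss and continue with degraded output.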

Carleecarleen answered 13/11, 2009 at 3:34 Comment(0)


Basically, you have to code each "task" in the RTOS such that they will terminate in a finite time.

Additionally your kernel would allocate specific amounts of time to each task, in an attempt to guarantee that certain things happened at certain times.

Note that this is not an easy thing to do, however. Imagine things like virtual function calls: in OO code it's very difficult to determine how long they will take. Also, an RTOS must be carefully coded with regard to priority. It may require that a high priority task is given the CPU within x milliseconds, which may be difficult to do depending on how your scheduler works.

Hydroscope answered 11/2, 2009 at 12:17 Comment(3)

"Basically, you have to code each "task" in the RTOS such that they will terminate in a finite time" - then it's the application that should be called realtime and not the OS. – Interphone 11/2, 2009 at 14:6

What happens when a task runs out of time? – Interphone 11/2, 2009 at 14:7

The task is forcibly preempted and restarted on its next time slice. A good RTOS would raise an error or notify that this had occurred. – Hydroscope 11/2, 2009 at 20:1
