| |||||||||||||||||||||||||||
|
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||
AbstractDesigning and developing real time software today is a sophisticated juggling act. Today’s systems employ more functionality than ever, elaborate GUIs, network connectivity, interprocessor communication and, at the same time, the time constraints must still be met. This paper explores some of the different types of RTOSs and the consequences of using them. Designing and developing real time software today is a sophisticated juggling act. This paper is meant to provide the reader with a general overview of Real-Time Operating Systems (RTOS) including some basic definitions, fundamental formulas, and basic taxonomy of RTOSs. This first part of the paper will explain exactly what a RTOS is, define some terminology, and discuss the difference between soft and hard real-time operating systems. The second section extends some basic concepts to include an explanation of interrupt latencies and context switch time and how they affect performance, an analysis of static vs. dynamic systems, a look at several different scheduling algorithms including implementation and general failure handling in RTOS. IntroductionThis paper is meant to provide the reader with a general overview of real-time operating systems (RTOS) including some basic definitions, fundamental formulas, and basic taxonomy of RTOSs. The paper is broken down into the general categories of:
Review of Vocabulary and ConceptsThis section will explain exactly what a RTOS is, define some terminology, and discuss the difference between soft and hard real-time operating systems. First of all, there is a common misconception that a RTOS must be fast. While this is usually the case, it is not always necessarily true. However, in all cases, the system must be predictable. In other words, the correctness of an answer is a function of time. The right answer at the wrong time is wrong. A good basic definition of a RTOS is the following: “A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints are not met, system failure is said to have occurred.” [1] Example: Consider the bottling equipment in a beverage plant. It’s not sufficient enough if the equipment attempts to cap bottles at regular time intervals. If it’s off by just a fraction of a second, then the bottle may not be capped properly and the beer will be spoiled. That is, the time at which the machine caps the bottle is a critical part of the process. Some definitions that may be helpful to the reader are:
Soft vs. Hard RTOSPractically speaking, RTOS fall into two groups, those that are able to respond in “hard” real time, and those which respond in “soft” real time where requirements are less severe. Simply stated, a hard-real time operating system must, without fail, provide a response to some kind of event in a specific time window. This response must be predictable and independent of other activities undertaken by the operating system. Often, hard RTOSs employ specific hardware and special device drivers. Examples of hard RTOSs include: Chimera II, QNX, and LynxOS. In contrast, a soft RTOS is one that has reduced constraints on “lateness” but still must operate quickly with fairly constant time constraints. It must be “good enough” to service events so that, on average, the response should be satisfied. Windows NT is a popular example of a soft RTOS. A simple way for the end-user to decide on whether a soft or hard RTOS is required is to ask the question, “What are the implications of a failure?”. If the answer is catastrophic, you need a hard RTOS, for everything else, a soft RTOS will probably do. RTOS Selection ConsiderationsThis section extends some basic concepts to include an explanation of interrupt latencies and context switch time and how they affect performance, an analysis of static vs. dynamic systems, a look at several different scheduling algorithms including implementation and general failure handling in RTOS. Interrupt Latency and Context Switch TimeTwo metrics which are frequently used to compare real-time operating systems are interrupt latency and context switch time. Interrupt latency is defined as the worst-case time that can pass between the assertion of an interrupt and the execution of the first instruction of the corresponding interrupt service routine (ISP). This latency depends on three things:
All RTOSs have some interrupt latency associated with them, and when combined with context switch time, provides an indication of the responsiveness of a system. Context switch time is defined as the amount of time it takes a RTOS to save the context of one task (save its pointers and registers to a control block) and load the context of another task from a second control block. Context switch time never shows up in isolation in a system, it is always overhead that results from a RTOS call. But be careful here, though context switch time and interrupt latency often provide some measure for comparison, frequently, the vendors publishing the data do not even include the time it takes the RTOS to decide which task to execute. Also, context switch times can very dramatically depend on whether the FPU registers are saved as well. Dynamic vs. Static SystemsThere are two further sub-classifications of RTOSs, dynamic and static. By dynamic, we mean reconfigurable. A dynamically reconfigurable system can change in time without the need to halt the system. Consider, as an example, a tactile sensor on the end of a robotic manipulator used to explore an object. The resolution of the sensor is 2n by 2m taxels where n and m can vary dynamically between 1 and 4. When the robot is exploring uninteresting parts of object such as the edge of a table, one would want to use low resolution for reduced computation. But when exploring more interesting parts, such as the rounded corner of the table, one would want to use the more demanding high resolution mode with more computational overhead. Such dynamic systems may have many sensors or actuators, only a subset of which may be used at any one time. Or, hardware could be used in a different configuration as in the example before. In any case, it is crucial that critical tasks do not fail, even if the tasks or frequency of tasks change in the system. This requires a dynamic scheduler. SchedulersBefore we discuss dynamic schedulers, a brief review of a static scheduler, the rate monotonic (RM) algorithm is in order. RM is a fixed priority or static algorithm. The algorithm basically says to:
By specifying the period and computational overhead time of each task, the behavior of the system can be categorized apriori. RM is not without problems. For instance, the schedulable bound is less than 100%. What does this mean? The CPU utilization of a given task is computed as the ratio of the worst-case computing time to the period. The total CPU utilization is computed as the sum:
For RM, the worst case schedulable bound for n tasks is:
and for 1, 2, 3, and 4 tasks, is found to be:
Thus, a set of tasks for which CPU utilization is less than 69% will always meet deadlines. All tasks can be guaranteed to meet deadlines if:
If: then the subset of highest priority tasks, S such that: will be guaranteed to meet all deadlines and will form the critical set. As a note, the worst case bound for RM is quite pessimistic, and it has been shown that, for the average case: Another problem with RM is that is does not support dynamically changing periods. For example: Take a set of three tasks, P1, P2, P3 of periods T1=30ms, T2=50ms, T3=100ms. Under RM, these tasks would have the following fixed priority assignment (from highest to lowest): P1, P2, P3. Now, suppose the period of P1 changes to T1=75ms. Under RM, this would require a reassignment of priorities to P2, P1, P3 which violates the condition that the priorities are static. This brings us to dynamic priority algorithms. Though many such algorithms exist, we will restrict our discussion to only a few: Earliest-Deadline-First Algorithm (EDF), Minimum-Laxity-First Algorithm (MLF), and Maximum-Urgency-First Algorithm (MUF). Earliest-Deadline First (EDF)As the name implies, this algorithm uses the deadline of a task as its priority. The task with the earliest deadline has the highest priority. The task with the latest deadline has the lowest priority. The advantage to this is that one can achieve a 100% CPU schedule bound and that it supports dynamic priorities. The disadvantages of EDF are:
Minimum-Laxity First (MLF)This algorithm assigns a laxity to each task The scheduler then selects task with the minimum laxity to execute next. Laxity is defined as:
Laxity is a measure of flexibility. A laxity of ti means that even if the task is delayed for ti time units, it will still meet its deadline. A laxity of 0 means that the task must be executed now or will fail to meet its deadline. This is the basis for the MUF algorithm that follows. Main difference between MLF and EDF is that MLF takes into consideration the execution time of a task. Like EDF, MLF has a schedule bound of 100% and there is no way to guarantee which task(s) will fail in transient overload. Maximum-Urgency-FirstA combination of fixed and dynamic priority scheduling (a.k.a. mixed priority). Each task is assigned an urgency which is defined as a combination of two fixed priorities and one dynamic priority. One of the fixed priorities, the criticality, has precedence over the dynamic priority. Meanwhile, the dynamic priority has precedence over the other fixed priority, called the user priority. The dynamic priority is inversely proportional to the laxity of a task. MUF consists of two parts. The assignment of the criticality and user priority (done apriori), and the actions of the MUF scheduler done at run-time. Assigning criticality and user priority:
Note that static priorities are assigned once and do not change during execution. The dynamic priority is assigned at run-time, inversely proportional to the laxity. Before its cycle, each task must specify its desired start time, deadline time, and worst-case execution time. Whenever a task becomes ready to run, a reschedule operation is performed. The MUF scheduler is used to determine which task is to be performed next, using the following algorithm:
The optional assignment of user priorities assure we never reach step 4, and thus provides a deterministic scheduling algorithm. MUF provides for many advantages over EDF or RM.
There are three types of errors MUF can detect.
The first case is the standard notion of a missed deadline. The second case detects bad worst-case estimates. The third case allows the scheduler to make the most of CPU time (it will not start executing a task if it knows it will not complete it) and thus provides early detection of missed deadlines. ImplementationOne possible problem with the MUF approach is the computational overhead associated with it. The solution is to encode the algorithm into a single urgency value as shown below:
For such an encoding scheme, the ranges are:
This scheme will work as long as: This efficient encoding scheme then allows the MUF scheduler to calculate only a single dynamic priority for each task, then select the task with the highest urgency. RTOS Failure HandlingThe use of deadline failure handlers is recommended for all tasks in a system for several reasons.
So how do failure handlers work? Whenever a task fails to meet a deadline, a failure handler is invoked. The handler should be able to be programmed to execute at the same or different priority as a failing task to prevent problems with priority inversion. Possible actions by the handler include:
All modern RTOSs should provide some type of failure handling. ConclusionDesigning and developing real time software today is a sophisticated juggling act. Today’s systems employ more functionality than ever, elaborate GUIs, network connectivity, interprocessor communication and, at the same time, the time constraints must still be met. The consequences of failure range from severe to the unthinkable. This paper has explored some of the different types of RTOSs and the consequences of using them, with the hope that the reader will be in a better position to select the appropriate system. [1] This definition is located in the real-time FAQ in the Internet’s comp.realtime newsgroup. About Chandrta/Chandrasekar
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
All Topics, MFC / C++, C#, ASP.NET, .NET >> Design and Strategy >> Unedited Reader Contributions
Updated: 28 Oct 2004 1:52 |
Article content copyright Chandrta/Chandrasekar, 2004 everything else Copyright © CodeProject, 1999-2005. Advertise on The Code Project | Privacy |
| MSDN Communities | ASPAlliance • Developer Fusion • DevGuru • Programmers Heaven • Planet Source Code • Resource Index • Tek-Tips Forums • What is XML? • VisualBuilder • ZVON • Search Us! |