C boolean invalid values handling [duplicate]
I'm working on a safety-critical embedded C project, and there's a discussion about detecting memory corruption (e.g. from buffer overflows) in boolean variables. As everyone knows, in C the "boolean" type is actually an N-bit integer, which means it has potentially 2^N - 2 invalid values. E.g. if you declare FALSE as 0 and TRUE as 1 (by macros, constants or enums), then any value <0 (in the case of a signed type) or >1 can be treated as a consequence of memory corruption (or a bug).

So theoretically it should be possible to construct fault-capturing code blocks like this:

if (b == TRUE)       { /* Good, do something               */ }
else if (b == FALSE) { /* Good, but don't do anything      */ }
else                 { /* Memory corruption. Deal with it. */ }

Or do it with switch-case. Such checks are mandatory for state variables and other enum types, but doing the same for booleans certainly adds a lot of code, and my question is: is it worth the effort?
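For reference, a minimal sketch of the switch-case variant. The names check_flag and check_result_t are hypothetical and not from the project, and the sketch assumes the flag is stored in a plain unsigned integer type rather than _Bool, so that values other than 0 and 1 are well-defined and can reach the default branch.

```c
#include <stdint.h>

#define FALSE 0u
#define TRUE  1u

/* Hypothetical result codes, introduced only for this sketch. */
typedef enum { CHECK_FALSE, CHECK_TRUE, CHECK_CORRUPT } check_result_t;

/* b is deliberately a plain unsigned integer, not _Bool, so that
   out-of-range values are representable and can be detected. */
check_result_t check_flag(uint8_t b)
{
    switch (b) {
    case TRUE:
        return CHECK_TRUE;    /* Good, do something */
    case FALSE:
        return CHECK_FALSE;   /* Good, but don't do anything */
    default:
        return CHECK_CORRUPT; /* Memory corruption (or a bug). Deal with it. */
    }
}
```

Note that this only catches values outside {0, 1}. A corruption that happens to write exactly 0 or 1 passes unnoticed.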

Gaia answered 18/5, 2020 at 14:4 Comment(16)
Considering that memory corruption could also overwrite b with precisely 0 or 1, I doubt this is really worth it. There are better ways of trying to detect memory overwrites that don't require programmers to add code to every single check they perform. - Excogitate
How do you detect corruption in variables of other types, e.g., int? Wouldn't they be just as critical, or even more so? - Easel
"What is the benefit of terminating if … else if constructs with an else clause?" might answer the question. I gave an example with bool in my answer there. All safety-critical (and MISRA-C compatible) programs need to implement defensive programming. That includes terminating every else if chain with else, as well as terminating every switch with default. - Gluck
A better question might be why you are still using C90 in safety-related software. It contains a lot of hazards that C99 solved: "implicit int", "the struct hack", reliable integer division etc. - Gluck
It's close to impossible to do this using a purely software approach in a single controller. A critical embedded project will have redundant hardware, error-correcting memory, hardware watchdogs and similar higher-level supervisors. And if you used stdbool.h here instead, an optimizing compiler would simply optimize away the third path in this case, and this would likely happen with other "impossible" checks you try to implement. - Caryopsis
There is no reason in the C standard to believe the code shown would suffice to detect invalid bit patterns if b is declared as a _Bool. The C standard does not say what happens if, when reading a _Bool object, the bits in memory are other than those used for the values 0 or 1. To detect invalid bit patterns, you should examine the contents of memory using a pointer to unsigned char. If b is some integer type other than _Bool, then that code might work (although the C standard still permits padding bits and trap representations). - Senate
In some old codebases, before a bool (_Bool) type was standardized and widely used, regular integers were used as booleans. Every time I have to work in one of those I hate it. What is the type of b? If it is a bool or _Bool, you're not going to get far trying to do what you're doing. You get into some weird behavior pretty quickly. This example: godbolt.org/z/Chcd7M returns the value 56 from a function returning bool. - Balefire
If the b in your example code has type _Bool (or bool), it would be interesting to see which of your three branches is taken if it has a value other than 0 or 1. The behavior is probably undefined. So if you really need to check for things like this, it is probably better to avoid using _Bool (or bool) as the type of such objects altogether. - Staminody
I would say @Gluck answered my question. I know about the MISRA else-if termination rule, which means it's an automotive safety topic, but I couldn't find whether MISRA says anything about an undetermined state of a boolean (I guess not). Anyway, the linked answer to the other question gives me confidence that this scenario has been used out there. - Gaia
@FredLarson - integer values can have range checks. E.g. a PWM duty cycle range can be 0 to 100 (in percent). - Gaia
@Groo - there's one MCU, but it's a safety MCU with ECC, WD and other features. But those features don't protect against corruption caused internally by SW. An MPU does protect, but it has its own downsides. Anyway, this boolean thing is just one potential option to help detect corruption. - Gaia
@Excogitate - true, it won't be an effective countermeasure on its own, but having plenty of run-time sanity checks, asserts and other (hardware) features increases the likelihood of capturing corruption early. On the other hand, increased code complexity also increases the likelihood of bugs, so there's probably some balance point... - Gaia
@MikkL. Safety-related design is all about minimizing the probability of failures. So you should have defensive programming and ECC and safety MCU and wdog... and so on. Many safety features overlap each other, which is great, because if one of them isn't working for whatever reason, another will. - Gluck
Btw it actually doesn't matter if the bool type is a home-made enum or the type defined by the language. See this: #52164370. My evil, intentional UB access of the _Bool in that example could as well have been caused by runaway code or other memory corruption. - Gluck
@ThomasJager: the example you have shown is the result of undefined behavior, and as such is a successful demonstration of purposefully confusing the compiler. You are only allowed to alias a pointer with char* for reading the contents of the memory, not for writing arbitrary values into memory and then using the previously aliased object afterwards, especially when purposefully using values not representable in the aliased type. You may as well ask why these two functions return different results. - Caryopsis
@Groo That was the point. - Balefire
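Regarding the comment above about examining the contents of memory through a pointer to unsigned char: a minimal sketch of what that could look like for a stdbool.h flag. The helper name bool_storage_is_valid is hypothetical. It assumes that this implementation always stores false and true with the same byte patterns it produces for the two local constants, which holds on common targets but is not guaranteed by the C standard. The function never reads the flag as a bool, so a corrupted object is not evaluated through its own type.

```c
#include <stdbool.h>
#include <string.h>

/* Returns 1 if the storage of *flag holds exactly the byte pattern this
   implementation uses for false or for true, 0 otherwise.
   memcmp compares the objects byte by byte as unsigned char, so *flag
   is never read through the bool type. */
int bool_storage_is_valid(const bool *flag)
{
    const bool f = false;
    const bool t = true;

    return memcmp(flag, &f, sizeof(bool)) == 0 ||
           memcmp(flag, &t, sizeof(bool)) == 0;
}
```

As with the plain integer check, this cannot notice a corruption that leaves a valid false or true pattern behind.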

It depends on the safety class you are trying to reach. The above example is not very safe, considering that memory corruption could also mean a change in bit 0, which would turn TRUE into FALSE or vice versa.

Therefore I have more often seen wrappers used to protect critical variables, such as storing each variable in a struct consisting of the variable itself and its complement as a copy.

struct tag_intvar{
    int variable;
    int complement;
};

and then working with getter/setter functions to provide atomic access and perform consistency checking/handling.

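int isconsistent(const struct tag_intvar *var); /* defined below */

/* Writes val and its complement, but only if the stored pair is still intact. */
int setintvalue(struct tag_intvar *var, int val){
    if(isconsistent(var)){
        var->variable = val;
        var->complement = ~val;
        return TRUE;
    }
    //... inconsistent... handler
    return FALSE;
}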

while

int isconsistent(const struct tag_intvar *var){
   return (var->variable == ~var->complement)?TRUE:FALSE;
}
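The matching read side is not shown above, so here is a minimal, self-contained sketch of how it might look. The names initintvalue, getintvalue, ENTER_CRITICAL and EXIT_CRITICAL are hypothetical. The two macros stand in for whatever interrupt disable/enable mechanism the target provides and are no-ops here. An unconditional init function is assumed because a setter that first checks consistency cannot be used on a freshly allocated, never-written pair.

```c
#define TRUE  1
#define FALSE 0

/* Hypothetical placeholders: map these to the target's interrupt
   disable/enable intrinsics. They do nothing in this sketch. */
#define ENTER_CRITICAL() ((void)0)
#define EXIT_CRITICAL()  ((void)0)

struct tag_intvar {
    int variable;
    int complement;
};

/* Unconditional first write, establishing a consistent pair. */
void initintvalue(struct tag_intvar *var, int val)
{
    ENTER_CRITICAL();
    var->variable = val;
    var->complement = ~val;
    EXIT_CRITICAL();
}

/* Copies the stored value to *out and returns TRUE if value and
   complement agree; otherwise returns FALSE and leaves *out untouched. */
int getintvalue(const struct tag_intvar *var, int *out)
{
    int ok;

    ENTER_CRITICAL();
    ok = (var->variable == ~var->complement) ? TRUE : FALSE;
    if (ok) {
        *out = var->variable;
    }
    EXIT_CRITICAL();

    return ok;
}
```

Because the getter re-checks the pair on every read, a single bit flip in either member is reported instead of being silently used, whichever of the two fields was hit.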
Civies answered 18/5, 2020 at 14:34 Comment(5)
And then... the memory gets corrupted after line 4 in setintvalue. - Caryopsis
Yes, there are several levels of safety. Maybe Mikk can clarify the safety class he is looking for. This approach comes from the low end (white goods, like dishwashers), which is not as demanding as medical, automotive, aviation or even space. - Civies
@Civies - In fact I was thinking about something like that, but those getters/setters should probably be called from an uninterruptible section (interrupts disabled). On the other hand, it would be quite a large update to the existing code. - Gaia
Yes, it needs to be as atomic as possible, so disable interrupts for lines 3 and 4. But as @Groo mentioned, the disturbance could still happen between lines 3 and 4. That would then be recognized by the getter, which has to check as well. - Civies
The corruption could take place in setintvalue(); however, the likelihood of it happening at exactly this location is low. So you have significantly reduced the probability of a bit flip. Bit flips can often occur when higher-power devices are used. Often the probability of an error is important in these standards. Safety is different in this aspect from security (and reliability). - Nodular