Odd Behavior From Splint Bounds Checking
C


Any splint experts out there? I'm trying to use splint to statically analyze a large C project. I'm seeing an excessive number of bounds-checking errors that are obviously not bounds errors. I wrote a small test program to isolate the problem and noticed some really strange warnings when I ran splint on it. I have three examples. Here is the first:

int arr[3];

int main(void)
{
    int i;
    int var;

    arr[3] = 0; // (1) warning with +bounds, no warning with +likely-bounds

    return 0;
}

The arr[3] assignment generates a warning when using +bounds as I would expect, but does nothing when I use +likely-bounds. What does +likely-bounds even do? It seems to not work. The second example:

int arr[3];

int main(void)
{
    int i;
    int var;

    for (i = 0; i < 3; i++)
        var = arr[i]; // (2) warning, even though I'm within the bounds.

    return 0;
}

In this example splint complains that I'm reading outside the bounds of the array ("A memory read references memory beyond the allocated storage.") for var = arr[i], even though I'm obviously not. This should be a warning because the values in the array are not initialized, but that's not the warning I get. Initializing the last value in the array clears the error (but initializing the first or second doesn't). Am I doing something wrong? In the third example:

int arr[3];

int main(void)
{
    int i;
    int var;

    arr[3] = 0; // warning

    for (i = 0; i < 4; i++)
        var = arr[i]; // (3) no warning because arr[3] = 0 statement.

    return 0;
}

A warning is generated for arr[3] = 0, but not var = arr[i], even though it's obvious that the loop goes outside the bounds of the array. It looks like writing to the end of an array expands how large splint thinks the array is. How is that possible?

In short my questions are:

  1. What does the likely-bounds flag do?
  2. Is there any way that I can make splint give me legitimate errors that relate to going out of bounds?
  3. Is there any way to make splint not increase the size of arrays that are accessed past their bounds? Right now splint is reporting more than 750 warnings and I don't have time to verify each warning one by one.
Cobbs answered 22/11, 2011 at 23:9 Comment(2)
"This should be a warning because the values in array are not initialized" - Actually, static variables (including arrays) are initialized to zero by default, unless you explicitly initialize them otherwise. - Intact
Also, I applaud you for including SSCCEs, not enough people do. - Intact

Up front: I do not know 'splint', but I know the techniques quite well from using PC Lint intensively, and discussing several issues with its makers.

That said:

  • In your first example, arr[3] is probably flagged only with +bounds because the element one past the last is a special case: it is allowed to create and use a pointer to the element one past the last, but it is not allowed to dereference such a pointer. Therefore, in syntax checkers (QA-C as well) it happens quite frequently that such warnings are less severe for N+1. Did you try arr[4]? My guess is that +likely-bounds will be sufficient for that.
  • The second example is probably caused by a somewhat confused 'splint'. I have seen similar errors in early versions of PC Lint and QA-C, since "value tracking" is far from easy. However, I cannot tell why splint is complaining.
  • In your third example, 'splint' is correctly complaining about the assignment to arr[3], but for value tracking purposes it then assumes arr[3] to be valid and refrains from complaining about the loop. I'd guess you could assign to arr[100] and let the loop run until 100 without complaint!
Markusmarl answered 26/11, 2011 at 15:29 Comment(10)
I ran the test on the first example with +likely-bounds, with arr[3] changed to arr[100], and did not get any warnings. It has been several days and I haven't seen any solutions to the problem. It looks like splint doesn't have the ability to do useful bounds checking at this time. Does anyone know of any free tools that can be used for this purpose? - Cobbs
I just know that it took Gimpel several bug-fix releases to get the value tracking under control and useful. It's not free, though, but affordable: less than US$300 for a single seat, several years of updates included. It runs on Windows, but can analyze anything, from Linux to AIX and Solaris. - Markusmarl
Also, I'm not really clear on what you are saying regarding example 1. "It is allowed to create and use a pointer to the element one past the last". When you say "it", do you mean the compiler or splint? Are you suggesting that dereferencing an array one past the end is not actually an error, and this is how compilers compile code? - Cobbs
No, not at all. It's just that you may perform pointer calculations using a pointer one past the last element of a bounded array. That once was intended for copying like for (i=0;i<N; i++) {*tgt++ = *src++; } and similar. This snippet creates pointers one past the array bounds. And in systems with memory protection using narrow boundaries it might produce an exception. Dereferencing is still not allowed. But still, for a compiler (as well as a syntax/semantics checker) this is a special case to be handled differently, both from elements [0, N-1] and from elements [N+1, +>. - Markusmarl
I do not have a clue what you are suggesting or implying. The example you provided doesn't tell me anything: assuming tgt and src both point at an array of N items from the start, neither reads nor writes past the last element. And the last statement is even more confusing, "and from elements [N+1, +>". What do elements 2 past the array and on have to do with this discussion? It sounds like you know something about static analysis that I don't, though. From my perspective it is absurd to not flag an error for 1 past the array, but it seems common for static checkers to ignore this. - Cobbs
Is there common verbiage used for telling a static analysis tool to check for one past the last element? Do any tools have a flag for this caveat? - Cobbs
C99 Standard (I use n1256.pdf), 6.5.6 sub 8 page 83: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated." Common verbiage? Not that I know of... - Markusmarl
And my numbering may be confusing, but if you have int arr[7];, the regular elements have indexes 0-6, index 7 cannot be used, but a pointer to it can validly be used in pointer arithmetic as long as the result lies between 0-6 inclusive, and pointers to indexes 8 and above produce undefined behavior, even if they are only used for pointer arithmetic with a valid result. E.g: *((arr + 7) - 1) is OK, *((arr + 8) - 2) may be compiled correctly, but the behavior is undefined. - Markusmarl
let us continue this discussion in chat - Markusmarl
Ohh.. I understand what you are saying now. Still very odd static analysis tools don't catch arr[N] = value. I also tried CppCheck and it too did not catch arr[N] = value. - Cobbs
