How could reading numbers using sscanf crash?
R

3

11

Cppcheck has detected a potential problem in code like this:

float a, b, c;
int count = sscanf(data, "%f,%f,%f", &a, &b, &c);

It says that: "scanf without field width limits can crash with huge data". How is that possible? Is that a known bug in some sscanf implementations? I understand that the numbers may overflow (numerically), but how could the program crash? Is that a false positive in cppcheck?

I have found a similar question: scanf Cppcheck warning, but the answer is not completely satisfying. The answer mentions type safety, but that should not be an issue here.

Rhomb answered 15/2, 2012 at 11:53 Comment(9)
Try sscanf_s instead. Like plain scanf, sscanf is not overflow safe. - Heshvan
@guitarflow: The problem is that I don't see where it could overflow. - Rhomb
@Heshvan Or don’t. sscanf_s isn’t portable and also not actually safe, despite what the name suggests and Microsoft claims. - Habitat
en.wikipedia.org/wiki/Format_string_attack is also important to pay attention to. Buffer overflows aren't the only vulnerabilities in scanf. If you allow the user to supply the format string, they can use %x to print arbitrary memory locations and %n to write to them, among other things. - Synchronism
@synthesizerpatel: As you can see, format is a string literal here, so that is not a problem. - Rhomb
Yes, in this particular case perhaps. But that's the problem with static code analysis: you get a lot of false positives. I try to avoid scanf just because whoever inherits my code might not know where the boundaries of safety are. But that's just me. - Synchronism
@KonradRudolph Really? I knew that it isn't portable but I didn't know about the potential danger. What makes it unsafe? - Heshvan
@JurajBlaho Check this out: crasseux.com/books/ctutorial/sscanf.html#sscanf - Heshvan
@Heshvan Well, it presupposes that you already know the buffer size, but this is usually the issue in the first place. sscanf_s doesn’t actually check (and cannot check) whether the buffer size is correct. So it protects only insofar as it makes the buffer size explicit. A far superior method is preventing buffer overflows in the first place, and C++ makes this trivial. (Also, at least one of the “safe” commands – but I don’t remember which – had a buffer overflow bug. Oh the irony.) - Habitat
S
7

I am a Cppcheck developer.

Yes, this is a weird crash. "Huge data" here means millions of digits.

If you use the --verbose flag, Cppcheck will actually write out a little example program that usually crashes on Linux computers.

Here is an example that crashes with a segmentation fault on my Ubuntu 11.10 computer:

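#include <stdio.h>

#define HUGE_SIZE 100000000

int main()
{
    int value;                          // receives the parsed number
    char *data = new char[HUGE_SIZE];   // roughly 100 MB (C++: uses new[])
    for (int i = 0; i < HUGE_SIZE; ++i) // fill the buffer with the digit '1'
        data[i] = '1';
    data[HUGE_SIZE-1] = 0;              // null-terminate the string
    sscanf(data, "%i", &value);         // no field width on %i
    delete [] data;
    return 0;
}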

For your information, I don't get a crash when I run this example in Visual Studio.

I compiled it with g++ version 4.6.1.
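For reference, here is a minimal sketch of what "field width limits" look like when applied to the format string from the question. The helper name parse_three_floats and the width of 31 are assumptions chosen for illustration; they are not something Cppcheck prescribes.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the original post): parse "a,b,c" with an
// explicit field width on every conversion. The width 31 is an arbitrary
// assumption; choose one that suits the input you expect.
int parse_three_floats(const char *data, float *a, float *b, float *c)
{
    assert(data != nullptr);
    // Each %31f reads at most 31 characters for one number.
    return std::sscanf(data, "%31f,%31f,%31f", a, b, c);
}
```

With a width in place, an over-long field is cut off after 31 characters, so the comma in the format no longer matches and sscanf returns fewer than 3 conversions. Checking the return value against 3 therefore also rejects absurdly long numbers.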

Subshrub answered 15/2, 2012 at 14:49 Comment(3)
The question remains: why does it crash? I don't see any reason, since the code to parse the number could be something like: for each digit in data: result*=10; result+=digit. How could that crash? Why has it not been fixed? - Rhomb
I primarily wanted to answer "Is that a false positive in Cppcheck?". It is a weird crash, so it's easy to think so. I can't answer why, technically, it crashes. It has been a known and widespread problem for years. I agree that your code can't crash, so obviously that is not how the data is parsed. - Selah
Yes, I understand. Thanks for at least a partial answer. I gave you +1. - Rhomb
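The digit-by-digit loop described in the first comment of this thread can be written out roughly as follows. This is only an illustration of that idea, not how glibc implements sscanf; the function name parse_digits_saturating and the clamping to INT_MAX are assumptions added so that the sketch never performs signed overflow.

```cpp
#include <cassert>
#include <climits>

// Illustrative only: result = result * 10 + digit for each leading digit,
// with a check added so the multiplication can never overflow an int.
// Hypothetical name; this is NOT the library's implementation.
int parse_digits_saturating(const char *text)
{
    assert(text != nullptr);
    int result = 0;
    for (; *text >= '0' && *text <= '9'; ++text) {
        int digit = *text - '0';
        // result * 10 + digit would exceed INT_MAX: clamp and stop.
        if (result > (INT_MAX - digit) / 10)
            return INT_MAX;
        result = result * 10 + digit;
    }
    return result;
}
```

A loop like this uses constant memory however long the input is, which is consistent with the reply that the real parser evidently works differently.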
U
4

The segmentation fault seems to be a bug in glibc.

I've just tested this with a similar program, which crashes on Ubuntu 10.04 but works on Ubuntu 12.04.

Daniel Marjamäki said his program crashes on 11.10, so I believe the bug was fixed somewhere in between.

Uxoricide answered 12/6, 2012 at 14:34 Comment(0)
T
1

OK, consider this code:

#include <stdio.h>

int main(int argc, char *argv[]) {
    // I put a lot more 9's there; this is just to get the point across.
    const char* data = "9999999999999999999999999.9999999999999999999999";
    float a;
    int count = sscanf(data, "%f", &a);
    printf("%f", a);
    return 0;
}

The output of this program is "inf"; there is no crash, even though I put a huge number of 9's there. So I suspect Cppcheck is just plain wrong about this.
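If the numeric overflow itself matters (the "inf" above), one option is to parse with strtof, which reports out-of-range values through errno. This is a sketch under assumptions: the helper name parse_float_checked is hypothetical, and it swaps sscanf for strtof rather than showing anything Cppcheck recommends.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse one float with strtof and reject range errors.
// Returns true only if a number was read and no range error was reported.
bool parse_float_checked(const char *text, float *out)
{
    assert(text != nullptr && out != nullptr);
    char *end = nullptr;
    errno = 0;
    float value = std::strtof(text, &end);
    if (end == text)       // nothing was consumed: not a number
        return false;
    if (errno == ERANGE)   // overflow (and, on many systems, underflow)
        return false;
    *out = value;
    return true;
}
```

A caller would then treat a false return as bad input instead of silently continuing with an infinite value.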

Towpath answered 15/2, 2012 at 12:48 Comment(4)
Which compilers did you check this with? - Equiponderance
Compiled with just g++. Why, did you get a different result with another compiler? - Towpath
Not yet, but I feel that the conclusion "Cppcheck is just plain wrong" may be a bit premature when testing on just one compiler. (I can only test with VC++ 2005 where I'm sitting now, sorry.) - Equiponderance
Have you tried the sample posted by Daniel Marjamäki with your compiler? - Rhomb
