scanf Cppcheck warning

Cppcheck shows the following warning for scanf:

Message: scanf without field width limits can crash with huge input data. To fix this error message add a field width specifier:
    %s => %20s
    %i => %3i

Sample program that can crash:

#include <stdio.h>
int main()
{
    int a;
    scanf("%i", &a);
    return 0;
}

To make it crash:
perl -e 'print "5"x2100000' | ./a.out

I cannot crash this program typing "huge input data". What exactly should I type to get this crash? I also don't understand the meaning of the last line in this warning:

perl -e ...

Hervey answered 11/8, 2011 at 7:43 Comment(4)
"Press any key to continue." "Where's the any key??"Extramural
@Dave: ????? Your comment looks like spam :(Hervey
What? No. By the wording of your question, it looked like you were misinterpreting the phrase "huge input data" -- it's not something you type in, it's a property of the input. It's the same scenario as the classic any key joke, which I was using as a metaphor for your problem.Extramural
@Dave: Well, now I see, my question looks funny... Only one person really answered it. Maybe this is a "Where's the huge input data" problem :)Hervey
D
6

The last line is an example command that demonstrates the crash with the sample program. It makes perl print the character "5" 2,100,000 times and pipes that output to the stdin of the program "a.out" (which is meant to be the compiled sample program).
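
If perl isn't at hand, the same kind of input can be produced from C. This is only a rough sketch; write_fives is a made-up helper name, not something from Cppcheck or the original warning:

```c
#include <stdio.h>

/* Hypothetical helper: writes `count` copies of the character '5' to `out`.
   Returns how many characters were actually written. */
size_t write_fives(FILE *out, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (putc('5', out) == EOF)
            break; /* stop on a write error */
    }
    return i;
}
```

Calling write_fives(stdout, 2100000) from a small main() and piping that program's output into ./a.out should feed the sample program the same stream of digits as the perl one-liner.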

First of all, scanf() should be used for testing only, not in real-world programs, because there are several situations it won't handle gracefully. For example, if you ask for "%i" and the user enters "12345abc", the "abc" stays in stdin and may be consumed by the following reads without giving the user a chance to correct it.

Regarding this issue: scanf() knows it should read an integer value, but it doesn't know how long the input can be. The pointer could point to a 16-bit, 32-bit, or 64-bit integer, or something even bigger (which it isn't aware of). Functions with a variable number of arguments (declared with ...) don't know the exact data types of the arguments passed, so they have to rely on the format string (which is why the format tags aren't optional, unlike in C#, where you just number them, e.g. "{0} {1} {2}"). Without a given length, scanf() has to assume one, which might be platform dependent as well (making the function even more unsafe to use).
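
To illustrate the fix the warning suggests, here is a minimal sketch with field widths. The helper name read_word_and_number and the width of 9 for the number are my own assumptions (9 characters always fit into an int of at least 32 bits); the %20s comes straight from the warning text. It uses sscanf() so it can be tried on fixed strings, but the same format string works with scanf():

```c
#include <stdio.h>

/* Hypothetical helper: extracts one word (at most 20 characters) and one
   decimal number (at most 9 characters) from `text`.
   `word` must have room for 20 characters plus the terminating '\0'.
   Returns 1 if both fields were converted, 0 otherwise. */
int read_word_and_number(const char *text, char word[21], int *number)
{
    /* %20s stores at most 20 characters and then appends '\0'.
       %9d consumes at most 9 characters, sign included, so the result
       cannot overflow an int that is at least 32 bits wide (assumption). */
    return sscanf(text, "%20s %9d", word, number) == 2;
}
```

With the widths in place, an over-long field is simply cut off: for "x 12345678901" the number comes back as 123456789 and the remaining digits are left unread.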

In general, consider it possibly harmful and a starting point for buffer overflow attacks. If you'd like to secure and optimize your program, start by replacing it with alternatives.
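
One commonly used alternative in plain C is to read a whole line with fgets() and convert it with strtol(), which reports overflow through errno. The following is just a sketch under my own assumptions: parse_int and read_int_line are made-up names, and the 64-byte line buffer is an arbitrary choice.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts `text` to an int.
   Returns 1 on success, 0 on empty, malformed or out-of-range input. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text)
        return 0; /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0; /* does not fit into an int */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;    /* allow trailing whitespace */
    if (*end != '\0')
        return 0; /* trailing junk, e.g. "12345abc" */
    *out = (int)value;
    return 1;
}

/* Hypothetical helper: reads one line from `in` and parses it as an int.
   Lines that do not fit into the buffer are discarded and rejected. */
int read_int_line(FILE *in, int *out)
{
    char line[64]; /* arbitrary limit chosen for this sketch */

    if (fgets(line, sizeof line, in) == NULL)
        return 0; /* EOF or read error */
    if (strchr(line, '\n') == NULL && !feof(in)) {
        int c;
        /* Line was longer than the buffer: skip the rest of it. */
        while ((c = getc(in)) != EOF && c != '\n')
            ;
        return 0;
    }
    return parse_int(line, out);
}
```

Calling read_int_line(stdin, &a) instead of scanf("%i", &a) should reject both "12345abc" and a line made of millions of digits instead of storing a bogus value.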

Doublecheck answered 11/8, 2011 at 7:57 Comment(9)
Scanf knows exactly how large the pointer can be: %d is sizeof(int)*8-bit, %ld is sizeof(long)*8-bit, and so forth. All of these are known at compile-time.Extramural
@Dave: int n[SIZE]; memset(n, 0, sizeof(n)*sizeof(int));Hervey
I don't understand why this answer is accepted. it doesn't answer the question. is the answer that scanf() assumes the number of input string digits defines the bit-width of the integer it's reading into (cppreference would disagree with that)? or is it that some scanf() implementations have internal overflows?Clegg
@Clegg While my answer got slightly off-topic, that's not how functions with variable arguments work. The argument number and type are essentially "lost" and have to be reconstructed based on the format string (or through some other mechanism depending on the function). I'm not 100% sure on what the standard dictates, but with inputs longer than what the variables can accept you're probably entering undefined behavior land.Doublecheck
if so, it seems like a specification defect. as Dave pointed out, the format string contains the information about variable widths, and it seems it would be possible for scanf() to make sure not to overflow them.Clegg
This answer would be vaguely useful if any of the alternatives were mentioned. As it is, it's not.Glorious
@CodeAbominator Then how about either editing the answer – if you can – or post your own? Just complaining about things not mentioned won't help anyone. Feel free to post an alternative and more complete answer and then mention the original creator of the question in a comment to consider swapping the accepted answer.Doublecheck
@Doublecheck the problem is that I'm looking for answers, and all I have found is "don't do the only thing that works". Or, almost as useless "handroll your own alternative to sscanf and hope for the best". I'm reluctant to suggest that, but you're right, it does have the advantage of being something people can actually do, rather than being dead end.Glorious
@CodeAbominator There's no magical solution that works for everyone. A huge part of the problem is potential mismatches between the format string and the memory pointed to in the parameters. For example, iostream avoids this thanks to knowing and considering the actual types. You can still use scanf(), the solution is in the warning message, it's just a lot harder to use it right.Doublecheck
R
0

I tried running the perl expression against the C program and it did crash here on Linux (segmentation fault).

Rip answered 11/8, 2011 at 7:51 Comment(0)
S
0

Using scanf (or fscanf and sscanf) in real-world applications is usually not recommended, because it is not safe and tends to open the door to buffer overruns when incorrect input data is supplied. Many commonly used C++ libraries (Qt, the runtime libraries for Microsoft Visual C++, etc.) offer much more secure ways to input numbers. You can probably find secure alternatives for the "pure" C language too.

Siloum answered 11/8, 2011 at 7:52 Comment(0)
