Why is printf with a single argument (without conversion specifiers) deprecated?
K

11

108

In a book that I'm reading, it's written that printf with a single argument (without conversion specifiers) is deprecated. It recommends replacing

printf("Hello World!");

with

puts("Hello World!");

or

printf("%s", "Hello World!");

Can someone tell me why printf("Hello World!"); is wrong? It is written in the book that it contains vulnerabilities. What are these vulnerabilities?

Keloid answered 8/7, 2015 at 11:6 Comment(7)
Note: printf("Hello World!") is not the same as puts("Hello World!"). puts() appends a '\n'. Instead compare printf("abc") to fputs("abc", stdout)Berberidaceous
You are aware that puts adds a newline? man7.org/linux/man-pages/man3/puts.3.htmlPassifloraceous
by the way, printf is also about 50% slower than fputs.Forlorn
What's that book? I don't think printf is deprecated in the same way that for example gets is deprecated in C99, so you may consider editing your question to be more precise.Harborage
It sounds like the book you're reading is not very good - a good book should not just say something like this is "deprecated" (that's factually false unless the author is using the word to describe their own opinion) and should explain what usage is actually invalid and dangerous rather than showing safe/valid code as an example of something you "shouldn't do".Mauretta
Can you identify the book?Flaunt
Please specify the title of the book, author and page reference. Thx.Bivins
M
131

printf("Hello World!"); is IMHO not vulnerable but consider this:

const char *str;
...
printf(str);

If str happens to point to a string containing conversion specifications such as %s, your program will exhibit undefined behaviour (typically a crash), whereas puts(str) will just display the string as is, followed by a newline.

Example:

printf("%s");   //undefined behaviour (mostly crash)
puts("%s");     // displays "%s\n"
Milden answered 8/7, 2015 at 11:13 Comment(16)
Further to causing the program to crash, there are many other exploits possible with format strings. See here for more info: en.wikipedia.org/wiki/Uncontrolled_format_stringNellnella
Another reason is that puts will be presumably faster.Anonym
puts( "%s" ); would actually display %s\n as puts() appends a newline.Actable
@black: puts is "presumably" faster, and this is probably another reason people recommend it, but it is not actually faster. I just printed "Hello, world!" 1,000,000 times, both ways. With printf it took 0.92 seconds. With puts it took 0.93 seconds. There are things to worry about when it comes to efficiency, but printf vs. puts is not one of them.Reduce
@SteveSummit, that's because gcc compiles printf to puts.Calandracalandria
@KonstantinWeitz: But (a) I was not using gcc, and (b) it doesn't matter why the claim "puts is faster" is false, it's still false.Reduce
@SteveSummit, your experiment does not provide evidence that "puts is the same speed as printf", it provides evidence that a program that calls "printf with a literal format string and no format arguments compiled to assembly runs as fast as an equivalent program using puts, for some unspecified compiler". I just wanted to clarify that.Calandracalandria
@KonstantinWeitz: The claim I provided evidence for was (the opposite of) the claim user black was making. I am just trying to clarify that programmers should not be worried about calling puts for this reason. (But if you wanted to argue about it: I would be surprised if you could find any modern compiler for any modern machine where puts is significantly faster than printf under any circumstances.)Reduce
@SteveSummit Just for the heck of it, I've tested the performance of printf() vs fputs(). It turns out that fputs() really is slightly faster. You can refer to my answer for details. But then it shouldn't really be the bottleneck anyway.Lucy
@SteveSummit Your test was very likely flawed due to the bottleneck of your output being the screen buffer or file buffer. Pipe the output to /dev/null and see the performance difference.Tamah
No, I did not print it 1,000,000 times to the screen. I piped it into tail, which may or may not perturb the result in the same way as redirecting to /dev/null. But in any case: I would not agree that such a test is "flawed". If anything, the test with output redirected to /dev/null is flawed, because it is unrealistic. The interesting claim is not "printf is about the same speed as puts", but rather, "printf is about the same speed as puts in practice". (And again, all we're arguing about is whether "use puts instead of printf because it's faster" is good advice.)Reduce
I can imagine the difference being optimised away when a constant string does not contain %. And I can imagine that outputting a string can be slightly faster than outputting a string, followed by a newline.Oldie
@SteveSummit "Hello World" is not a good test. First, given the same string, they don't do the same thing (puts appends a \n, which may incur further overhead). Second, printf gets slower with longer format strings as it has to go through it searching for %, unlike puts. Finally, did you check the assembly whether printf got optimized to puts (likely if you did printf("Hello World\n"))?Anonym
@black: This is a pointless argument. I should not have implied that printf is always just as fast, but you should not have implied that puts is always faster. (By the way, the argument that printf has to scan the whole string looking for % characters is not convincing, because puts has to scan the whole string -- looking for the terminating `\0' -- also.)Reduce
gcc automatically converts printf to puts when there is only a single argument, the format string doesn't contain any %-field, and it's terminated with '\n'. No need to activate optimizations for that. Just look at the assembly code produced by gcc -S.Hawthorn
Steve Summit's highly upvoted comment is very misleading and the people who gave the correct explanation are barely upvoted compared to the people who doubled down on a bogus argument. fputs is faster than printf. Yes, a good compiler effectively optimizes appropriate printf calls to the same assembly as fputs calls so that in practice it doesn't really matter. The comment clearly doesn't make the proper distinction between the language implementation and the compiler in a way that suggests the author was unaware of the difference and is unqualified to be answering these questions.Westley
M
78

printf("Hello world");

is fine and has no security vulnerability.

The problem lies with:

printf(p);

where p is a pointer to an input that is controlled by the user. It is prone to format string attacks: a user can insert conversion specifications to take control of the program, e.g., %x to dump memory or %n to overwrite memory.

Note that puts("Hello world") is not equivalent in behavior to printf("Hello world") but to printf("Hello world\n"). Compilers usually are smart enough to optimize the latter call to replace it with puts.

Mirabel answered 8/7, 2015 at 11:27 Comment(2)
Of course printf(p,x) would be just as problematic if the user has control over p. So the problem is not the use of printf with just one argument but rather with a user-controlled format string.Pianissimo
@HagenvonEitzen That's technically true, but few would deliberately use a user-provided format string. When people write printf(p), it's because they don't realize that it's a format string, they just think they're printing a literal.Glycol
S
34

Further to the other answers, printf("Hello world! I am 50% happy today") is an easy bug to make, potentially causing all manner of nasty memory problems (it's UB!).

It's just simpler, easier and more robust to "require" programmers to be absolutely clear when they want a verbatim string and nothing else.

And that's what printf("%s", "Hello world! I am 50% happy today") gets you. It's entirely foolproof.

(Steve, of course printf("He has %d cherries\n", ncherries) is absolutely not the same thing; in this case, the programmer is not in "verbatim string" mindset; she is in "format string" mindset.)

Sock answered 8/7, 2015 at 13:14 Comment(5)
This is not worth an argument, and I understand what you're saying about the verbatim vs. format string mindset, but, well, not everybody thinks that way, which is one reason one-size-fits-all rules can rankle. Saying "never print constant strings with printf" is just about exactly like saying "always write if(NULL == p)". These rules may be useful for some programmers, but not all. And in both cases (mismatched printf formats and Yoda conditionals), modern compilers warn about mistakes anyway, so the artificial rules are even less important.Reduce
@Steve If there are exactly zero upsides to using something, but quite a few downsides, then yes there's really no reason to use it. Yoda conditions on the other hand do have the downside that they make the code harder to read (you'd intuitively say "if p is zero" not "if zero is p").Bidding
@Bidding printf("%s", "hello") is going to be slower than printf("hello"), so there is a downside. A small one, because IO is almost always way slower than such simple formatting, but a downside.Elusion
@Yakk I doubt that would be slowerOlatha
gcc -Wall -W -Werror will prevent bad consequences from such mistakes.Lecia
L
18

I'll just add a bit of information regarding the vulnerability part here.

It's said to be vulnerable because of the printf format string vulnerability. In your example, where the string is hardcoded, it's harmless (even if hardcoding strings like this is never fully recommended). But specifying the parameters' types is a good habit to get into.

If someone supplies format specifiers where your printf expects a regular string (say, if you print the program's input directly), printf will read whatever it finds on the stack in place of the missing arguments.

This was (and still is) widely used to exploit programs, for example to explore the stack for hidden information or to bypass authentication.

Example (C):

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf(argv[argc - 1]); // DANGEROUS: uses the last command-line argument as the format string
    return 0;
}

If I pass "%08x %08x %08x %08x %08x\n" as the argument to this program, the call is equivalent to

printf ("%08x %08x %08x %08x %08x\n"); 

This instructs the printf-function to retrieve five parameters from the stack and display them as 8-digit padded hexadecimal numbers. So a possible output may look like:

40012980 080628c4 bffff7a4 00000005 08059c04

See this for a more complete explanation and other examples.

Logic answered 8/7, 2015 at 13:43 Comment(0)
R
14

This is misguided advice. Yes, if you have a run-time string to print,

printf(str);

is quite dangerous, and you should always use

printf("%s", str);

instead, because in general you can never know whether str might contain a % sign. However, if you have a compile-time constant string, there's nothing whatsoever wrong with

printf("Hello, world!\n");

(Among other things, that is the most classic C program ever, literally from the C programming book of Genesis. So anyone deprecating that usage is being rather heretical, and I for one would be somewhat offended!)

Reduce answered 8/7, 2015 at 12:23 Comment(5)
"because printf's first argument is always a constant string": I am not exactly sure what you mean by that.Sparkle
As I said, "He has %d cherries\n" is a constant string, meaning that it is a compile-time constant. But, to be fair, the author's advice was not "don't pass constant strings as printf's first argument", it was "don't pass strings without % as printf's first argument."Reduce
"literally from the C programming book of Genesis. Anyone deprecating that usage is being quite offensively heretical" - you haven't actually read K&R in recent years. There's a ton of advice and coding styles in there that are not just deprecated but plain bad practice these days.Bidding
@Voo: Well, let's just say that not everything that's considered bad practice is actually bad practice. (The advice to "never use plain int" springs to mind.)Reduce
@Steve I've no idea where you heard that one, but that's certainly not the kind of bad (bad?) practice we're talking about there. Don't misunderstand me, for the time the code was perfectly fine, but you really don't want to look at k&r for much but as a historical note these days. "It's in k&r" just isn't an indicator of good quality these days, that's allBidding
C
13

Calling printf with literal format strings is safe and efficient, and there exist tools to automatically warn you if your invocation of printf with user provided format strings is unsafe.

The most severe attacks on printf take advantage of the %n format specifier. In contrast to all other format specifiers, e.g. %d, %n actually writes a value to a memory address provided in one of the format arguments. This means that an attacker can overwrite memory and thus potentially take control of your program. Wikipedia provides more detail.

If you call printf with a literal format string, an attacker cannot sneak a %n into your format string, and you are thus safe. In fact, when the literal contains no conversion specifications and ends with a newline, gcc will change your call to printf into a call to puts, so there literally isn't any difference (test this by running gcc -O3 -S).

If you call printf with a user-provided format string, an attacker can potentially sneak a %n into your format string, and take control of your program. Your compiler will usually warn you that this is unsafe, see -Wformat-security. There are also more advanced tools that ensure that an invocation of printf is safe even with user-provided format strings, and they might even check that you pass the right number and type of arguments to printf. For example, for Java there is Google's Error Prone and the Checker Framework.

Calandracalandria answered 8/7, 2015 at 16:41 Comment(0)
C
9

A rather nasty aspect of printf is that even on platforms where the stray memory reads could only cause limited (and acceptable) harm, one of the formatting characters, %n, causes the next argument to be interpreted as a pointer to a writable integer, and causes the number of characters output thus far to be stored to the variable identified thereby. I've never used that feature myself, and sometimes I use lightweight printf-style methods which I've written to include only the features I actually use (and don't include that one or anything similar) but feeding standard printf functions strings received from untrustworthy sources may expose security vulnerabilities beyond the ability to read arbitrary storage.

Coachwhip answered 8/7, 2015 at 16:5 Comment(0)
L
8

Since no one has mentioned, I'd add a note regarding their performance.

Under normal circumstances, assuming no compiler optimisations are used (i.e. printf() actually calls printf() and not fputs()), I would expect printf() to perform less efficiently, especially for long strings. This is because printf() has to parse the string to check if there are any conversion specifiers.

To confirm this, I have run some tests. The testing is performed on Ubuntu 14.04, with gcc 4.8.4. My machine uses an Intel i5 cpu. The program being tested is as follows:

#include <stdio.h>
int main() {
    int count = 10000000;
    while(count--) {
        // either
        printf("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM");
        // or
        fputs("qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM", stdout);
    }
    fflush(stdout);
    return 0;
}

Both are compiled with gcc -Wall -O0. Time is measured using time ./a.out > /dev/null. The following is the result of a typical run (I've run them five times, all results are within 0.002 seconds).

For the printf() variant:

real    0m0.416s
user    0m0.384s
sys     0m0.033s

For the fputs() variant:

real    0m0.297s
user    0m0.265s
sys     0m0.032s

This effect is amplified if you have a very long string.

#include <stdio.h>
#define STR "qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM"
#define STR2 STR STR
#define STR4 STR2 STR2
#define STR8 STR4 STR4
#define STR16 STR8 STR8
#define STR32 STR16 STR16
#define STR64 STR32 STR32
#define STR128 STR64 STR64
#define STR256 STR128 STR128
#define STR512 STR256 STR256
#define STR1024 STR512 STR512
int main() {
    int count = 10000000;
    while(count--) {
        // either
        printf(STR1024);
        // or
        fputs(STR1024, stdout);
    }
    fflush(stdout);
    return 0;
}

For the printf() variant (ran three times, real plus/minus 1.5s):

real    0m39.259s
user    0m34.445s
sys     0m4.839s

For the fputs() variant (ran three times, real plus/minus 0.2s):

real    0m12.726s
user    0m8.152s
sys     0m4.581s

Note: After inspecting the assembly generated by gcc, I realised that gcc optimises the fputs() call to an fwrite() call, even with -O0. (The printf() call remains unchanged.) I am not sure whether this will invalidate my test, as the compiler calculates the string length for fwrite() at compile-time.

Lucy answered 9/7, 2015 at 4:30 Comment(4)
It won't invalidate your test, as fputs() is often used with string constants and that optimisation opportunity is part of the point you wanted to make. That said, adding a test run with a dynamically generated string with fputs() and fprintf() would be a nice supplemental data point.Pincus
@PatrickSchlüter Testing with dynamically generated strings seems to defeat the purpose of this question though... OP seems to be interested in only string literals to be printed.Lucy
He doesn't state it explicitly, even if his example uses string literals. In fact, I think his confusion about the book's advice is a result of the use of string literals in the example. With string literals, the book's advice is somewhat dubious; with dynamic strings it is good advice.Pincus
/dev/null sort of makes this a toy, in that usually when generating formatted output, your goal is for the output to go somewhere, not be discarded. Once you add in "actually not discarding the data" time, how do they compare?Elusion
M
7
printf("Hello World\n")

automatically compiles to the equivalent

puts("Hello World")

You can check this by disassembling your executable:

push rbp
mov rbp,rsp
mov edi,str.Helloworld!
call dword imp.puts
mov eax,0x0
pop rbp
ret

using

char *variable;
... 
printf(variable)

will lead to security issues, don't ever use printf that way!

so your book is actually correct, using printf with one variable is deprecated but you can still use printf("my string\n") because it will automatically become puts

Mokpo answered 8/7, 2015 at 11:28 Comment(2)
This behaviour actually depends entirely on the compiler.Milden
This is misleading. You state A compiles to B, but in reality you mean A and B compile to C.Sparkle
J
7

For gcc it is possible to enable specific warnings for checking printf() and scanf().

The gcc documentation states:

-Wformat is included in -Wall. For more control over some aspects of format checking, the options -Wformat-y2k, -Wno-format-extra-args, -Wno-format-zero-length, -Wformat-nonliteral, -Wformat-security, and -Wformat=2 are available, but are not included in -Wall.

The -Wformat which is enabled within the -Wall option does not enable several special warnings that help to find these cases:

  • -Wformat-nonliteral will warn if you do not pass a string literal as the format string.
  • -Wformat-security will warn if you pass a string that might contain a dangerous construct. It's a subset of -Wformat-nonliteral.

I have to admit that enabling -Wformat-security revealed several bugs we had in our codebase: the logging module, the error handling module and the XML output module all had some functions that could do undefined things if they had been called with % characters in their parameters. For info, our codebase is now around 20 years old, and even though we were aware of these kinds of problems, we were extremely surprised, when we enabled these warnings, by how many of these bugs were still in the codebase.

Jurisprudent answered 9/7, 2015 at 8:5 Comment(0)
M
1

Beside the other well-explained answers with any side-concerns covered, I would like to give a precise and concise answer to the provided question.


Why is printf with a single argument (without conversion specifiers) deprecated?

A printf function call with a single argument is in general not deprecated, and it has no vulnerabilities when used properly, as you should always code.

C users all over the world, from beginner to expert, use printf that way to output a simple text phrase to the console.

Furthermore, one has to distinguish whether this one and only argument is a string literal or a pointer to a string, which is valid but less common. For the latter, of course, inconvenient output or any kind of undefined behavior can occur when the pointer is not set properly to point to a valid string, but these things can also occur when multiple arguments are given and the format specifiers do not match the respective arguments.

Of course, it is also not right and proper for the string provided as the one and only argument to contain any format or conversion specifiers, since there are no further arguments to convert.

That said, giving a simple string literal like "Hello World!" as the only argument, without any format specifiers inside that string, as in your question:

printf("Hello World!");

is not deprecated or "bad practice" at all, nor does it have any vulnerabilities.

In fact, many C programmers began, and still begin, to learn C, or even programming in general, with that Hello World program and this printf statement as the first of its kind.

That would not be the case if it were deprecated.

In a book that I'm reading, it's written that printf with a single argument (without conversion specifiers) is deprecated.

Well, then I would focus on the book or the author. If an author really makes such, in my opinion, incorrect assertions, and even teaches them without explaining why (if the assertions in the book really are as you describe them), I would consider it a bad book. A good book, by contrast, explains why certain programming methods or functions should be avoided.

As I said above, using printf with only one argument (a string literal) and without any format specifiers is by no means deprecated or considered "bad practice".

You should ask the author what he meant by that or, even better, ask him to clarify or correct the relevant section in the next edition or printing.

Mossback answered 14/2, 2020 at 16:50 Comment(1)
You might add that printf("Hello World!"); is not equivalent to puts("Hello World!"); anyway, which tells something about the author of the recommendation.Lecia
