Do you use the TR 24731 'safe' functions? [closed]
S

5

90

The ISO C committee (ISO/IEC JTC1/SC22/WG14) has published TR 24731-1 and is working on TR 24731-2:

TR 24731-1: Extensions to the C Library Part I: Bounds-checking interfaces

WG14 is working on a TR on safer C library functions. This TR is oriented towards modifying existing programs, often by adding an extra parameter with the buffer length. The latest draft is in document N1225. A rationale is in document N1173. This is to become a Technical Report type 2.

TR 24731-2: Extensions to the C Library - Part II: Dynamic allocation functions

WG14 is working on a TR on safer C library functions. This TR is oriented towards new programs using dynamic allocation instead of an extra parameter for the buffer length. The latest draft is in document N1337. This is to become a Technical Report type 2.

Questions

  • Do you use a library or compiler with support for the TR24731-1 functions?
  • If so, which compiler or library and on which platform(s)?
  • Did you uncover any bugs as a result of fixing your code to use these functions?
  • Which functions provide the most value?
  • Are there any that provide no value or negative value?
  • Are you planning to use the library in the future?
  • Are you tracking the TR24731-2 work at all?
Swingeing answered 16/12, 2008 at 22:21 Comment(11)
This is probably a dumb question, but why don't they just add strlen to the code, and not worry about changing the function definition?Transponder
@MarcusJ: Hmmm — I would need clarification on what you mean about 'add strlen() to the code'. There are definitely times when strlen() is not the right answer, such as when passing a buffer to an I/O function (such as gets_s()). But maybe you can elaborate on what you're thinking of?Swingeing
Why change the function definition to add the size of the buffer, when that function can just call strlen on the buffer itself, and store the size in its own internal variable? That way the function definition doesn't change.Transponder
@MarcusJ: Because there are functions where that cannot work reliably. For example, gets() — there's no requirement that the buffer be initialized with a maximum length string before it is used. Ditto sprintf(), strcpy(). I don't think you could append anything with strcat() under the rules you're hypothesizing. It is a non-starter.Swingeing
Why couldn't you use realloc? I mean I've never written my own standard library, so I'm not experienced enough to talk about this stuff, but I don't understand why it's impossible.Transponder
@MarcusJ: You can't use realloc() because the functions that need protection don't allocate. The strcpy() function, for example, doesn't do memory allocation; you can't sanely modify it to do memory allocation, even if you have garbage collection, because people don't generally use the return value but use the value passed as the first argument to strcpy() in further operations. Similar problems arise with gets() and strcat(). Those at least return a char * that might point to reallocated space (not that there's a guarantee that the arguments were allocated). […continued…]Swingeing
[…continuation…] The problem is worse with functions such as sprintf() which don't return a char *; there is no way for them to tell the calling code that they've 'reallocated' the memory where the result was placed. Note that one of the reasons why TR 24731-2 did not make it into C11 was that they would be the first functions to explicitly do memory allocation — other than malloc() et al. Please take time to study what the functions do, what the Annex K / TR 24731-1 functions do, the rationales for why they do it, and so on. There are some sound reasons for the decisions made.Swingeing
@JonathanLeffler: If the target of a "string" pointer were a header or descriptors (looking at the byte there could distinguish those cases) rather than the first byte of text, then it would be possible to have string functions which could write to bounds-checked fixed-size buffers or dynamically-sized buffers automatically. The only major obstacle to doing that in C (but alas it's a big one) is the lack of any nice way to handle string literals. Otherwise, a function with a descriptor for a dynamically-sized string would be able to use an indirect function pointer in the descriptor to...Nodal
...attempt to adjust the length of the string identified thereby and update the length and address in the descriptor, without the caller having to know or care if the string got relocated.Nodal
Hmm, this question doesn't really fit Stack Overflow nowadays ;)Pullover
@AnttiHaapala: possibly not (though I think SO is getting a bit too strict these days). I'd want to argue for at least a historical status for it (historical lock). It could be rephrased along the lines of 'Are the TR24731 (Annex K) functions usable?', but … . In particular, I believe the information in my answer is useful to C programmers, and should be hosted somewhere in the C section of SO. Once upon a time, it might have been incorporated into 'docs' — that won't happen now.Swingeing
A
73

I have been a vocal critic of these TRs since their inception (when it was a single TR) and would never use them in any of my software. They mask symptoms instead of addressing causes, and it is my opinion that, if anything, they will have a negative impact on software design, as they provide a false sense of security instead of promoting existing practices that can accomplish the same goals much more effectively. I am not alone; in fact, I am not aware of a single major proponent outside of the committee developing these TRs.

I use glibc and as such know that I will be spared having to deal with this nonsense, as Ulrich Drepper, lead maintainer for glibc, said about the topic:

The proposed safe(r) ISO C library fails to address to issue completely. ... Proposing to make the life of a programmer even harder is not going to help. But this is exactly what is proposed. ... They all require more work to be done or are just plain silly.

He goes on to detail problems with a number of the proposed functions and has elsewhere indicated that glibc would never support this.

The Austin Group (responsible for maintaining POSIX) provided a very critical review of the TR, their comments and the committee responses available here. The Austin Group review does a very good job detailing many of the problems with the TR so I won't go into individual details here.

So the bottom line is: I don't use an implementation that supports or will support this, I don't plan on ever using these functions, and I see no positive value in the TR. I personally believe that the only reason the TR is still alive in any form is that it is being pushed hard by Microsoft, which has recently proved very capable of getting things rammed through standards committees despite widespread opposition. If these functions are ever standardized, I don't think they will ever become widely used, as the proposal has been around for a few years now and has failed to garner any real community support.

Alva answered 16/12, 2008 at 22:59 Comment(6)
Citing Ulrich Drepper's opinion as any kind of authority is a good way to shoot down your argument on the spot, regardless of any other redeeming circumstances.Subtreasury
@Pavel, I cited Drepper as an authority for glibc. Despite whatever personal issues you may have with him, he is the lead maintainer of glibc and pretty much decides what will and won't be included in glibc, like it or not. I didn't rest my case against the TR on his opinion at all. Your comment appears to be based on strong personal animosity against one individual, and if that blinds you to the bigger picture, that is a fault you should work on.Alva
+1. People that don't know what they're doing should be using VB, not C :-)Semiweekly
I work for a large company (>60,000 employees, mostly engineering) where use of this library is now a standard requirement for all new code. I agree it provides a false sense of security for the uninformed, but a modicum of additional security is better than none.Canzone
For the win: Multiple libupnp buffer overflows. Those safer functions that you don't like would have stopped most of them. Good job on peddling bad advice :)Cryosurgery
^^^ Using programmers who know how to program properly would have stopped all of them...Copyread
S
37

Direct answer to question

I like Robert's answer, but I also have some views on the questions I raised.

  • Do you use a library or compiler with support for the TR24731-1 functions?

    No, I don't.

  • If so, which compiler or library and on which platform(s)?

    I believe the functions are provided by MS Visual Studio (MS VC++ 2008 Edition, for example), and there are warnings to encourage you to use them.

  • Did you uncover any bugs as a result of fixing your code to use these functions?

    Not yet. And I don't expect to uncover many in my code. Some of the other code I work with - maybe. But I've yet to be convinced.

  • Which functions provide the most value?

    I like the fact that the printf_s() family of functions do not accept the '%n' format specifier.

  • Are there any that provide no value or negative value?

    The tmpfile_s() and tmpnam_s() functions are a horrible disappointment. They really needed to work more like mkstemp() which both creates the file and opens it to ensure there is no TOCTOU (time-of-check, time-of-use) vulnerability. As it stands, those two provide very little value.

    I also think that strerrorlen_s() provides very little value.

  • Are you planning to use the library in the future?

    I am in two minds about it. I started work on a library that would implement the capabilities of TR 24731 over a standard C library, but got caught by the amount of unit testing needed to demonstrate that it is working correctly. I'm not sure whether to continue that. I have some code that I want to port to Windows (mainly out of a perverse desire to provide support on all platforms - it's been working on Unix derivatives for a couple of decades now). Unfortunately, to get it to compile without warnings from the MSVC compilers, I have to plaster the code with stuff to prevent MSVC wittering about me using the perfectly reliable (when carefully used) standard C library functions. And that is not appetizing. It is bad enough that I have to deal with most of two decades worth of a system that has developed over that period; having to deal with someone's idea of fun (making people adopt TR 24731 when they don't need to) is annoying. That was partly why I started the library development - to allow me to use the same interfaces on Unix and Windows. But I'm not sure what I'll do from here.

  • Are you tracking the TR24731-2 work at all?

    I'd not been tracking it until I went to the standards site while collecting the data for the question. The asprintf() and vasprintf() functions are probably valuable; I'd use those. I'm not certain about the memory stream I/O functions. Having strdup() standardized at the C level would be a huge step forward. This seems less controversial to me than the part 1 (bounds checking) interfaces.
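As a rough illustration of the mkstemp() point made above about tmpfile_s() and tmpnam_s(): the sketch below creates and opens a temporary file in a single step, so there is no gap between choosing a name and opening the file. It assumes a POSIX system; the helper name `open_temp_file` and the `/tmp/example-XXXXXX` template are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any standard): build a template in `path`
 * and let mkstemp() both create and open the file atomically.
 * On success, returns an open file descriptor and leaves the generated
 * name in `path`; returns -1 on failure.
 */
static int open_temp_file(char *path, size_t pathsize)
{
    assert(path != NULL);
    int n = snprintf(path, pathsize, "/tmp/example-XXXXXX");
    if (n < 0 || (size_t)n >= pathsize)
        return -1;          /* buffer too small to hold the template */
    return mkstemp(path);   /* replaces XXXXXX and opens the new file */
}
```

The caller owns both the descriptor and the file, so it should close() the former and unlink() the latter when done.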

Overall, I'm not convinced by part 1 'Bounds-Checking Interfaces'. The material in the draft of part 2 'Dynamic Allocation Functions' is better.

If it were up to me, I'd move somewhat along the lines of part 1, but I'd also revise the interfaces in the C99 standard C library that return a char * to the start of the string (e.g. strcpy() and strcat()) so that instead of returning a pointer to the start, they'd return a pointer to the null byte at the end of the new string. This would make some common idioms (such as repeatedly concatenating strings onto the end of another) more efficient because it would make it trivial to avoid the quadratic behaviour exhibited by code that repeatedly uses strcat(). The replacements would all ensure null-termination of output strings, like the TR24731 versions do. I'm not wholly averse to the idea of the checking interface, nor to the exception handling functions. It's a tricky business.
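To make the "return a pointer to the null byte" idea concrete, here is a minimal sketch. The function name `copy_to_end`, its end-pointer parameter, and its silent-truncation policy are all assumptions for illustration; nothing like it exists in C99 or TR 24731-1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a standard function): copy src into the space
 * [dst, end), always null-terminating, and return a pointer to the
 * terminating null byte so the next call can append without rescanning.
 * Output that does not fit is silently truncated.
 */
static char *copy_to_end(char *dst, const char *end, const char *src)
{
    assert(dst != NULL && end != NULL && src != NULL && dst < end);
    size_t space = (size_t)(end - dst) - 1;   /* room excluding the null byte */
    size_t len = strlen(src);
    if (len > space)
        len = space;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}
```

Chained calls such as `p = copy_to_end(buf, end, "abc"); p = copy_to_end(p, end, "def");` each cost time proportional to the piece being appended, not to the whole string built so far, which is how the quadratic behaviour of repeated strcat() is avoided.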


Microsoft's implementation is not the same as the standard specification

Update (2011-05-08)

See also this question. Sadly, and fatally to the usefulness of the TR24731 functions, the definitions of some of the functions differ between the Microsoft implementation and the standard, rendering them useless (to me). My answer there cites vsnprintf_s().

For example, TR 24731-1 says the interface to vsnprintf_s() is:

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdarg.h>
#include <stdio.h>
int vsnprintf_s(char * restrict s, rsize_t n,
                const char * restrict format, va_list arg);

Unfortunately, MSDN says the interface to vsnprintf_s() is:

int vsnprintf_s(
   char *buffer,
   size_t sizeOfBuffer,
   size_t count,
   const char *format,
   va_list argptr 
);

Parameters

  • buffer - Storage location for output.
  • sizeOfBuffer - The size of the buffer for output.
  • count - Maximum number of characters to write (not including the terminating null), or _TRUNCATE.
  • format - Format specification.
  • argptr - Pointer to list of arguments.

Note that this is not simply a matter of type mapping: the number of fixed arguments is different, and therefore irreconcilable. It is also unclear to me (and presumably to the standards committee too) what benefit there is to having both 'sizeOfBuffer' and 'count'; it looks like the same information twice (or, at least, code will commonly be written with the same value for both parameters).
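One way to see what the mismatch costs in practice is a wrapper that has to branch on the platform. The sketch below is an assumption-laden illustration, not a recommendation: the name `bounded_format` and its 0 / -1 return convention are invented here, and on non-Microsoft platforms it falls back on C99 vsnprintf() on the assumption that an Annex K vsnprintf_s() is not available.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: format into buf, always null-terminating.
 * Returns 0 if the output fit, -1 if it was truncated or an error occurred.
 */
static int bounded_format(char *buf, size_t size, const char *format, ...)
{
    assert(buf != NULL && size > 0 && format != NULL);
    va_list args;
    va_start(args, format);
#if defined(_MSC_VER)
    /* Microsoft: five arguments; _TRUNCATE requests truncation, reported as -1. */
    int n = vsnprintf_s(buf, size, _TRUNCATE, format, args);
    int status = (n < 0) ? -1 : 0;
#else
    /* Elsewhere: C99 vsnprintf() returns the length that was needed. */
    int n = vsnprintf(buf, size, format, args);
    int status = (n < 0 || (size_t)n >= size) ? -1 : 0;
    if (n < 0)
        buf[0] = '\0';      /* leave a valid empty string on error */
#endif
    va_end(args);
    return status;
}
```

The conditional compilation is exactly the kind of code the 'safe' functions were supposed to make unnecessary.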

Similarly, there are also problems with scanf_s() and its relatives. Microsoft says that the type of the buffer length parameter is unsigned (explicitly stating 'The size parameter is of type unsigned, not size_t'). In contrast, in Annex K, the size parameter is of type rsize_t, which is the restricted variant of size_t (rsize_t is another name for size_t, but RSIZE_MAX is smaller than SIZE_MAX). So, again, the code calling scanf_s() would have to be written differently for Microsoft C and Standard C.

Originally, I was planning to use the 'safe' functions as a way of getting some code to compile cleanly on Windows as well as Unix, without needing to write conditional code. Since this is defeated because the Microsoft and ISO functions are not always the same, it is pretty much time to give up.


Changes in Microsoft's vsnprintf() in Visual Studio 2015

In the Visual Studio 2015 documentation for vsnprintf(), it notes that the interface has changed:

Beginning with the UCRT in Visual Studio 2015 and Windows 10, vsnprintf is no longer identical to _vsnprintf. The vsnprintf function complies with the C99 standard; _vnsprintf is retained for backward compatibility.

However, the Microsoft interface for vsnprintf_s() has not changed.


Other examples of differences between Microsoft and Annex K

The C11 standard variant of localtime_s() is defined in ISO/IEC 9899:2011 Annex K.3.8.2.4 as:

struct tm *localtime_s(const time_t * restrict timer,
                       struct tm * restrict result);

compared with the MSDN variant of localtime_s() defined as:

errno_t localtime_s(struct tm* _tm, const time_t *time);

and the POSIX variant localtime_r() defined as:

struct tm *localtime_r(const time_t *restrict timer,
                       struct tm *restrict result);

The C11 standard and POSIX functions are equivalent apart from name. The Microsoft function is different in interface even though it shares a name with the C11 standard.
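A small wrapper can hide the difference between the Microsoft and POSIX variants. This is only a sketch: the name `portable_localtime` is hypothetical, and it assumes that `_WIN32` identifies the Microsoft runtime and that everything else provides POSIX localtime_r().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical wrapper (illustrative name): convert *timer to local time
 * in *result. Returns result on success, NULL on failure.
 */
static struct tm *portable_localtime(const time_t *timer, struct tm *result)
{
    assert(timer != NULL && result != NULL);
#if defined(_WIN32)
    /* Microsoft: arguments reversed, returns errno_t (0 on success). */
    return localtime_s(result, timer) == 0 ? result : NULL;
#else
    /* POSIX: same argument order and return type as Annex K localtime_s(). */
    return localtime_r(timer, result);
#endif
}
```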

Another example of differences is Microsoft's strtok_s() and Annex K's strtok_s():

char *strtok_s(char *strToken, const char *strDelimit, char **context); 

vs:

char *strtok_s(char * restrict s1, rsize_t * restrict s1max, const char * restrict s2, char ** restrict ptr);

Note that the Microsoft variant has 3 arguments whereas the Annex K variant has 4. This means that the argument list to Microsoft's strtok_s() is compatible with POSIX's strtok_r() — so calls to these are effectively interchangeable if you change the function name (e.g. by a macro) — but the Standard C (Annex K) version is different from both with the extra argument.
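The macro trick mentioned above might look like the following sketch. The helper `split_tokens` is a made-up example, and the use of `_MSC_VER` to detect the Microsoft compiler is an assumption about the build environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_MSC_VER)
/* Microsoft's three-argument strtok_s() has the same shape as POSIX strtok_r(). */
#define strtok_r strtok_s
#endif

/*
 * Illustrative helper: split `text` in place on any character in `delims`,
 * storing up to `max` token pointers in `tokens`; returns the number stored.
 */
static size_t split_tokens(char *text, const char *delims,
                           char **tokens, size_t max)
{
    assert(text != NULL && delims != NULL && tokens != NULL);
    size_t count = 0;
    char *context = NULL;
    for (char *tok = strtok_r(text, delims, &context);
         tok != NULL && count < max;
         tok = strtok_r(NULL, delims, &context))
        tokens[count++] = tok;
    return count;
}
```

The same macro would not work against an Annex K strtok_s(), because the extra `rsize_t *` argument has no counterpart in the other two interfaces.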

The question Different declarations of qsort_r() on Mac and Linux has an answer that also discusses qsort_s() as defined by Microsoft and qsort_s() as defined by TR24731-1 — again, the interfaces are different.


ISO/IEC 9899:2011 — C11 Standard

The C11 standard (December 2010 Draft; you could at one time obtain a PDF copy of the definitive standard, ISO/IEC 9899:2011, from the ANSI web store for 30 USD) does have the TR24731-1 functions in it as an optional part of the standard. They are defined in Annex K (Bounds-checking Interfaces), which is 'normative' rather than 'informative', but it is optional.

The C11 standard does not have the TR24731-2 functions in it — which is sad because the vasprintf() function and its relatives could be really useful.

Quick summary:

  • C11 contains TR24731-1
  • C11 does not contain TR24731-2
  • C18 is the same as C11 w.r.t TR24731.

Proposal to remove Annex K from the successor to C11

Deduplicator pointed out in a comment to another question that there is a proposal before the ISO C standard committee (ISO/IEC JTC1/SC22/WG14) to remove Annex K.

It contains references to some of the extant implementations of the Annex K functions — none of them widely used (but you can find them via the document if you are interested).

The document ends with the recommendation:

Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed.

I support that recommendation.

The C18 standard did not alter the status of Annex K. There is a paper N2336 advocating for making some changes to Annex K, repairing its defects rather than removing it altogether.

Swingeing answered 17/12, 2008 at 7:43 Comment(3)
Well, if MS contradicts the standard, it will be MS that needs to change, not the standard...Hypoblast
I'd like to think so too, but they have an installed base and won't break backwards compatibility, so in practice it continues to mean that MS won't support a C standard more modern than C89 (C90) — sadly.Swingeing
You could try using the clang compiler on windows clang.llvm.org/get_started.html . It supports C17, and works rather seamlessly with visual studio tools.Psalter
H
9

Ok, now a stand for TR24731-2:

Yes, I've used asprintf()/vasprintf() ever since I've seen them in glibc, and yes I am a very strong advocate of them.

Why?
Because they deliver precisely what I need over and over again: A powerful, flexible, safe and (relatively) easy to use way to format any text into a freshly allocated string.
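For readers without glibc, the behaviour being praised can be approximated with nothing but C99. The sketch below is a stand-in, not glibc's implementation: `format_alloc` is a hypothetical name, and unlike asprintf() it returns the string directly rather than through a pointer argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for asprintf(), built only on C99 vsnprintf():
 * returns a malloc()ed, null-terminated string, or NULL on failure.
 * The caller must free() the result.
 */
static char *format_alloc(const char *format, ...)
{
    assert(format != NULL);
    va_list args, copy;
    va_start(args, format);
    va_copy(copy, args);

    /* First pass: measure the output without writing anything. */
    int len = vsnprintf(NULL, 0, format, args);
    va_end(args);

    char *result = NULL;
    if (len >= 0 && (result = malloc((size_t)len + 1)) != NULL)
        vsnprintf(result, (size_t)len + 1, format, copy);   /* second pass */
    va_end(copy);
    return result;
}
```

A call such as `char *s = format_alloc("%s-%d", name, id);` never needs a guess at the buffer size, which is the appeal of the asprintf() family.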

I am also much in favor of the memstream functions: Like asprintf(), open_memstream() (not fmemopen()!!!) allocates a sufficiently large buffer for you and gives you a FILE* to do your printing, so your printing functions can be entirely ignorant of whether they are printing into a string or a file, and you can simply forget about how much space you will need.
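A short sketch of the open_memstream() pattern described above, assuming a POSIX.1-2008 system. The functions `print_report` and `report_to_string` are invented for the example; the point is that the printing code takes a FILE * and does not know whether it is writing to a file or to memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Prints a small report to any stream: a file, stdout, or a memory stream. */
static void print_report(FILE *out, const char *name, int count)
{
    fprintf(out, "%s: %d item%s\n", name, count, count == 1 ? "" : "s");
}

/*
 * Illustrative helper: capture the report in a malloc()ed string using
 * POSIX open_memstream(). Returns NULL on failure; the caller must free()
 * the result.
 */
static char *report_to_string(const char *name, int count)
{
    assert(name != NULL);
    char *buffer = NULL;
    size_t size = 0;
    FILE *out = open_memstream(&buffer, &size);
    if (out == NULL)
        return NULL;
    print_report(out, name, count);
    /* buffer and size are only guaranteed valid after fflush() or fclose(). */
    if (fclose(out) != 0) {
        free(buffer);
        return NULL;
    }
    return buffer;
}
```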

Hypoblast answered 27/6, 2013 at 21:42 Comment(4)
Thank you for the feedback. TR24731-2 is, sadly, not part of the C2011 standard, but is generally a useful set of functions. I have reservations about the fmemopen() function in POSIX too. The open_memstream() function is interesting. There are some gotchas in using it, I suspect, since you pass pointers to the buffer pointer and the size variable. But, on the whole, TR24731-2 is good.Swingeing
What I'd rather see than vasprintf would be a "general" vformat function that accepts an int(*func)(void*,size_t,char const*) and a void* in addition to the usual vprintf arguments, and invokes the supplied function for each "span" of characters to be output [returning early if the function returns a non-zero value]. One could synthesize an auto-allocating sprintf out of such a function, but the general version would be compatible with custom allocators as well.Nodal
@Nodal Along those lines, there have been a couple of stdio variants that let you specify your own read/write functions. So, for example, you can do something like FILE *fp = ffunopen(myreadfunc, mywritefunc), and then call fprintf, or any stdio function, and have your callback(s) called. I've also seen at least one implementation of a fmemopen variant that had auto-allocation built in -- again, meaning that you could get auto allocation for any sequence of output calls, not just *printf.Goulder
@SteveSummit: Having FILE* contain a pointer to a function table would be really useful, but my main point was that library routines shouldn't be reliant upon malloc but instead allow user code to manage memory with whatever means would be most appropriate for the intended usage case.Nodal
M
6

Do you use a library or compiler with support for the TR24731-1 functions? If so, which compiler or library and on which platform(s)?

Yes, Visual Studio 2005 & 2008 (for Win32 development obviously).

Did you uncover any bugs as a result of fixing your code to use these functions?

Sort of.... I wrote my own library of safe functions (only about 15 that we use frequently) that would be used on multiple platforms -- Linux, Windows, VxWorks, INtime, RTX, and uItron. The reasons for creating the safe functions were:

  • We had encountered a large number of bugs due to improper use of the standard C functions.
  • I was not satisfied with the information passed into or returned from the TR functions, or in some cases, their POSIX alternatives.

Once the functions were written, more bugs were discovered. So yes, there was value in using the functions.

Which functions provide the most value?

Safer versions of vsnprintf, strncpy, strncat.

Are there any that provide no value or negative value?

fopen_s and similar functions add very little value for me personally. I'm OK if fopen returns NULL. You should always check the return value of the function. If someone ignores the return value of fopen, what is going to make them check the return value of fopen_s? I understand that fopen_s will return more specific error information which can be useful in some contexts. But for what I'm working on, this doesn't matter.
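For comparison, checking plain fopen() takes only a few lines. This is a sketch with a hypothetical helper name, `open_or_report`; it relies on POSIX behaviour for errno, since ISO C by itself does not promise that fopen() sets it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative helper: open a file and report the reason on failure.
 * Returns the stream, or NULL if the file could not be opened.
 */
static FILE *open_or_report(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    errno = 0;
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        /* POSIX sets errno here; ISO C alone does not guarantee it. */
        fprintf(stderr, "cannot open '%s': %s\n", path,
                errno != 0 ? strerror(errno) : "unknown error");
    }
    return fp;
}
```

The caller still has to test the result for NULL, which is the same discipline fopen_s() would demand for its error code.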

Are you planning to use the library in the future?

We are using it now -- inside our own "safe" library.

Are you tracking the TR24731-2 work at all?

No.

Mediative answered 15/5, 2009 at 16:10 Comment(0)
B
5

No, these functions are absolutely useless and serve no purpose other than to encourage code to be written so it only compiles on Windows.

snprintf is perfectly safe (when implemented correctly) so snprintf_s is pointless. strcat_s will destroy data if the buffer is overflowed (by clearing the concatenated-to string). There are many many other examples of complete ignorance of how things work.
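The claim about snprintf() rests on checking its return value, which a C99-conforming implementation sets to the length the full output would have needed. A minimal sketch, with `format_pair` as a made-up helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Illustrative helper: format "name=value" into dst. Returns true if the
 * whole string fit, false if it was truncated or a formatting error
 * occurred. On truncation, dst still holds a null-terminated prefix.
 */
static bool format_pair(char *dst, size_t size, const char *name, int value)
{
    assert(dst != NULL && size > 0 && name != NULL);
    int needed = snprintf(dst, size, "%s=%d", name, value);
    return needed >= 0 && (size_t)needed < size;
}
```

Unlike the strcat_s() behaviour criticised above, truncation here leaves the part of the output that did fit intact and lets the caller decide what to do next.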

The real useful functions are the BSD strlcpy and strlcat. But both Microsoft and Drepper have rejected these for their own selfish reasons, to the annoyance of C programmers everywhere.

Bishop answered 11/2, 2014 at 19:2 Comment(2)
Thanks for the input. I'm not sure 'complete ignorance' is appropriate, but I agree that the new functions are not always as much of an improvement as they could have been.Swingeing
I would think an strlcat-style function could have more efficiently accepted a pointer to the end of the destination buffer and returned a pointer to the trailing null byte of the destination. That would have allowed the return value from one call to be passed to another to concatenate multiple values onto a string without having to re-scan the destination string each time.Nodal
