Is it Undefined Behaviour to cast away the constness of a function parameter?

Imagine I have this C function (and the corresponding prototype in a header file)

void clearstring(const char *data) {
    char *dst = (char *)data;
    *dst = 0;
}

Is there Undefined Behaviour in the above code, casting the const away, or is it just a terribly bad programming practice?

Suppose there are no const-qualified objects used

char name[] = "pmg";
clearstring(name);
Freewheeling answered 31/1, 2012 at 11:54 Comment(6)
If the cast isn't UB, I think it should be :)Freewheeling
you certainly have your foot squarely in the shotgun sights!Rambunctious
@pmg: if the cast itself were UB, then there would be little point in the language permitting it - it's easy enough for a compiler to detect that const has been removed in a cast, the same way it detects that char *dst = data; is illegal. Obviously there are some pointless things that the standard permits for historical reasons, but I claim that this is not one of them :-)Makedamakefast
Does this answer your question? Is const_cast safe?Calends
@user202729: by the gist of it, yes. It says approximately the same as answers here, but with a flavor of C++.Freewheeling
Sorry, wrong language, retracted.Calends

The attempt to write to *dst is UB if the caller passes you a pointer to a const object, or a pointer to a string literal.

But if the caller passes you a pointer to data that in fact is mutable, then behavior is defined. Creating a const char* that points to a modifiable char doesn't make that char immutable.

So:

char c;
clearstring(&c);    // OK, sets c to 0
char *p = malloc(100);
if (p) {
    clearstring(p); // OK, p now points to an empty string
    free(p);
}
const char d = 0;
clearstring(&d);    // UB
clearstring("foo"); // UB

That is, your function is extremely ill-advised, because it is so easy for a caller to cause UB. But it is in fact possible to use it with defined behavior.
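
As a rough sketch of a less error-prone design (the name clearstring_mutable is hypothetical and not from the question), the parameter can be declared as a plain char * so that the signature itself says the function writes through the pointer:

```c
/* Hypothetical alternative to clearstring: the parameter is a plain
   char *, so the signature states that the function modifies *data. */
void clearstring_mutable(char *data)
{
    *data = '\0';   /* defined as long as data points to a modifiable char */
}
```

With this signature, passing the address of a const char (such as &d above) violates a constraint and the compiler must issue a diagnostic, so that mistake is caught at compile time. A string literal argument still compiles, because string literals have type char[] in C; GCC's -Wwrite-strings option can be used to get a warning for that case.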

Makedamakefast answered 31/1, 2012 at 11:59 Comment(3)
+1: (im)mutability is an inherent property of the object itself, regardless of the qualification of the pointer used to access it...Prong
Is this UB because of C99 6.6 §9 or because of C99 6.7.3 §5?Dysphasia
@Lundin: the latter (and 6.4.5/6 rather than 6.7.3/5 in the case of the string literal, since string literals are not const objects in C). Address constants have nothing to do with this.Makedamakefast

Consider a function like strstr which, if given a pointer to a part of an object containing a string, will return a pointer to a possibly-different part of the same object. If the function is passed a pointer to a read-only area of memory, it will return a pointer to a read-only area of memory; likewise if it is given a pointer to a writable area, it will return a pointer to a writable area.

There is no way in C to have a function return a const char * when given a const char *, and return an ordinary char * when given an ordinary char *. In order to be compatible with the way strstr worked before the idea of a const char * was added to the language, it has to convert a const-qualified pointer into a non-const-qualified pointer. While it's true that as a library function strstr might be entitled to do such a cast even if user code could not, the same pattern comes up often enough in user code that it would be impractical to forbid it.
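
To illustrate how the same pattern looks in user code, here is a minimal sketch. The helper skip_spaces_impl and the macro skip_spaces are hypothetical names invented for this example. The function casts the qualifier away on return the way strstr does, and the macro assumes a C11 compiler with _Generic so that the result type follows the argument type:

```c
/* Hypothetical helper: returns a pointer to the first non-space
   character of s. It accepts const char * so that it also works on
   read-only strings, and casts the qualifier away on return. */
static char *skip_spaces_impl(const char *s)
{
    while (*s == ' ')
        s++;
    return (char *)s;   /* the cast itself is well-defined */
}

/* C11 _Generic wrapper: a const char * argument yields a const char *
   result, and a char * argument yields a char * result. */
#define skip_spaces(s) _Generic((s),                      \
    const char *: (const char *)skip_spaces_impl(s),      \
    char *:       skip_spaces_impl(s))
```

Writing through the pointer returned by skip_spaces_impl is only defined when the original object is modifiable. The wrapper lets the compiler diagnose an attempt to store the result for a const char * argument in a plain char *, but it cannot help with string literals, which have type char[] in C and therefore select the char * branch.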

Coact answered 24/4, 2015 at 3:0 Comment(8)
As of C11, C has the _Generic keyword which can be used to define functions which are generic over a type, and as of C23, strstr has been defined to return a pointer which matches the constness of its input.Liber
@JM0: Is it practical to write a function prototype that would allow a function that passes a received argument to strstr, and then returns the output of strstr to the caller, to advertise such semantics when processed by C23, but which would still be compatible with existing compilers?Coact
The paper which introduced the change, N3020, has an example of some macro magic that works on compilers today and makes strstr generic over const.Liber
@JM0: Interesting. I still think there should be a qualifier to indicate that a returned pointer will be based upon a passed-in pointer, so that a compiler given int *q = someFunction(p);, where p is a restrict-qualified pointer, could know whether the return value is always, never, or sometimes based upon p, and whether the function call may have persisted any pointer based upon p other than q.Coact
@JM0: Being able to express such information in a function signature would be especially useful in cases where functions receive and invoke pointers to outside functions, since even a "whole program optimize" compiler would not in general be able to look inside all of the functions that might be invoked via function pointer.Coact
Yeah, the paper introduces the types QChar, QVoid, and QWchar_t to express that, but they're only documentation conventions and aren't actually part of the language. If you want it to actually be part of the language, you're going to have to take it up with the committee. Though good news, compilers can still warn if the return value doesn't match: example.Liber
IMHO, the Committee's priority should be to define meaningful categories of conformance, and recognize more different categories of implementations for different purposes. Back when the compiler marketplace was dominated by vendors that needed their products to appeal to the programmers that would be using them, it made sense for the Standard to waive jurisdiction over anything that might be controversial, treating such things as "quality of implementation" issues, but such an approach breaks down if the only way for a program to have a broad audience is to work around the quirks...Coact
...of a compiler that is deliberately incompatible with code written for better quality compilers, except when processing code in gratuitously inefficient fashion.Coact
