Does C strict aliasing make untyped static memory pools impossible?

WG14 member Jens Gustedt says in a blog post on strict aliasing rules:

Character arrays must not be reinterpreted as objects of other types.

Is that, in fact, true? (I guess the corresponding language in the standard is the part saying that if an object has a declared type, then that type is its effective type.) If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

I know TeX ignores most of Pascal’s type system and treats everything as an array of words because of a similar issue, but I hoped that if I ever found myself in a similar situation in (malloc-less) C, I could just declare a maximally aligned char array and keep using structs the usual way. I also fail to see what the point of _Alignas could possibly be in such a world, except as a standardized device for expressing non-standard requirements (similar to volatile).
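To make the question concrete, here is a minimal sketch (not from the post; names and sizes are illustrative) of the kind of allocator being asked about: a bump allocator handing out pieces of a maximally aligned static char array. Under a strict reading of C11 6.5p6–7, the array's declared type (char) is its effective type, so accessing the returned storage as a struct would be undefined behavior — which is exactly the point in dispute.

```c
#include <stddef.h>

/* Static, maximally aligned backing store for the hypothetical allocator. */
_Alignas(max_align_t) static char pool[4096];
static size_t pool_used;

/* Bump allocator: rounds each request up so the next block stays aligned. */
static void *pool_alloc(size_t size)
{
    size_t align = _Alignof(max_align_t);
    size = (size + align - 1) / align * align;
    if (size > sizeof pool - pool_used)
        return NULL;
    void *p = &pool[pool_used];
    pool_used += size;
    return p;
}
```

Whether code may then write a struct through the returned pointer is precisely what the question (and the answers below) debate.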

Foucault answered 22/7, 2021 at 12:33 Comment(9)
Violating strict aliasing invalidates entire huge portions of optimization techniques, such as but not limited to type-based alias analysis and points-to analysis. I've noted violators of strict aliasing rules often reply with, "But it works." They really should be saying, "I haven't observed it to fail. Yet. Yes, my standards for code are that low." I suspect there's a lack of understanding over what they risk - unexplained failures. Invoking undefined behavior is dangerous. Full stop.Burton
@AndrewHenle I’m not advocating against strict aliasing as such (while I haven’t implemented the corresponding optimizations, I’ve read both papers describing and code implementing them). But in type-based alias analysis as in everything else, a choice of axioms is always a choice, and it’s entirely possible for it to be dumb, the Good Omens argument (“But it izz written!”) notwithstanding. One example is the C11 memory model, which prohibits optimizations it was meant to permit. (cont.)Foucault
(cont.) That said, I asked this question simply because I found the standard’s language on strict aliasing extremely confusing: I’d previously read it a half-dozen times and still did not even suspect the point Gustedt makes before I read his blog post.Foucault
1) Yes, it's true. 2) No, you can't write a portable C implementation that can use a character array as a generic memory pool - not even if you align it.Anticholinergic
@AlexShpilkin, I think the paper you linked is a bit shaky. Enough so to inspire me to pose a question about it (or at least about one small part).Implacable
Also see what is the strict aliasing ruleDardani
@JohnBollinger ... And I’ll gladly watch the answers there, thanks :) My understanding of concurrency extends to more or less following the proofs and no farther, so my judgment as to how good the paper is very limited. (I’ve seen LLVM and Go people mention it, but papers which are flawed but useful, cited and not corrected anywhere in print do exist, and I’m not that immersed in the folklore.)Foucault
@AlexShpilkin: What useful optimizations would be inhibited by requiring that a compiler given something like unsigned get_float_bits(float *fp) { return *(unsigned*)fp; } must recognize the possibility that the function might observe the stored representation of a float? A compiler which treated character types like all others, but made even a modest effort to acknowledge places where cross-type pointer derivation was obvious, could perform more useful optimizations, while being compatible with more programs, than one that is willfully blind to cross-type derivation.Shick
@AlexShpilkin: Recognizing pointer derivations would also eliminate the need for the Effective Type rule, which can't be upheld in all corner cases without making what should be easy optimizations much more difficult (which is probably why clang and gcc fail to handle all such corner cases correctly).Shick

The rules on aliasing are specified in section 6.5p7 of the C standard:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

  88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

Note that this list allows any object to be accessed via a char *, but not the reverse, i.e. an object declared as an array of one or more characters can't be accessed as an lvalue of some other type.
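The asymmetry can be shown in a short illustrative sketch (not from the answer): reading an int's bytes through an unsigned char * is permitted by the final bullet, while the reverse access — reading a char array through an int * — is the undefined case, even though the pointer conversion itself compiles.

```c
/* Permitted: inspecting an int's object representation via a character type. */
static unsigned char first_byte(const int *p)
{
    const unsigned char *bytes = (const unsigned char *)p;  /* OK per 6.5p7 */
    return bytes[0];
}

/* Not permitted: accessing a char array as an int. The conversion compiles;
 * the commented-out access is what the answer says is undefined, because the
 * array's declared (hence effective) type is char. */
static void reverse_direction(void)
{
    _Alignas(int) char buf[sizeof(int)] = {0};
    int *ip = (int *)buf;   /* the conversion alone is fine... */
    /* *ip = 42;               ...but this access would be UB */
    (void)ip;
}
```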

This also means that malloc can't be implemented in a standard-compliant way, since standard C provides no means of creating storage that has no declared (and hence no effective) type. However, malloc is considered part of the implementation, and can therefore take advantage of its knowledge of implementation internals to return a pointer to a block of memory that a compliant program can use.

Erst answered 22/7, 2021 at 12:51 Comment(23)
This was quite an achievement for the standards body. Now there is one less systems language available.Freeze
It seems like the all-out rush to convert C to Ada is proceeding apace.Thong
What do you mean "now", @mevets? Leaving aside the question of its consequences for system programming, the strict aliasing rule has been in C, in much the same form, since the original ANSI standardization in 1989.Implacable
In fact, a vestigial form of the strict aliasing rule can be seen as far back as the first edition of K&R (1978), which observes "You should also note the implication in the declaration that a pointer is constrained to point to a particular kind of object" (section 5.1; emphasis added). Contrast "constrained to" with "interpreted to".Implacable
I think that is a pretty gratuitous interpretation of that. It was far more likely to indicate that struct stat *s = ..; s->st_size = s->tm_year; was no longer permitted as earlier editions had. You correctly point out that the standards body's fetish about aliases started early.Freeze
@Freeze I wouldn’t go as far as calling it a fetish—aliasing is the bane of the optimizer for a language with unrestricted pointers, and that’s probably why old-school Fortran (which doesn’t have them) was on top of the Benchmarks Game for years (before the Game threw out all the implementations but one for every language, which is when I stopped keeping track of it). Aliasing is also what makes C (not C++) const useless without restrict. It’s just that C’s weak type system is incapable of saying much about it, and optimizers had a decade of lower-hanging fruit to get through first.Foucault
@AlexShpilkin: There has never been any reason why a quality compiler, given something like float *p; *(unsigned*)p |= 0x80000000; should be blind to the possibility that it might affect the value of an object of type float. The authors of the Standard would have thought that sufficiently mind-numbingly obvious that there was no need to waste ink on it.Shick
@AlexShpilkin: Trying to specify the exact circumstances when compilers should notice such things would have been difficult, especially since many combinations of constructs would be processed correctly by nearly all existing compilers, even though individual constructs might not be handled by every individual compiler. The main problems arise with compilers that use the Standard as an excuse to be deliberately obtuse.Shick
@AlexShpilkin: Speaking of restrict, am I the only one who finds the Standard's definition of "based upon" to be needlessly difficult for humans and compilers to process correctly, and full of silly, nonsensical, and unworkable corner cases? A cleaner definition would recognize that ptr+intval and ptr-intval are transitively based upon ptr, regardless of how intval is computed, and that the value of a restrict-qualified pointer should be based upon itself even if it known to be coincidentally equal to some other pointer that isn't based upon it.Shick
@AlexShpilkin: "old-school Fortran (which doesn’t have them) was on top of the Benchmarks Game for years" — That seems to be a myth. For a few tasks there were Fortran programs "on top" but there were other tasks where C & C++ programs were "on top". For example web.archive.org/web/20080905134250/http://…Zebrass
@igouy: An assumption that a program won't do X may facilitate optimizations in cases where a program would have no reason to do X, but at best counter-productive in cases where X would otherwise be the most efficient way of accomplishing the task at hand. When the Committee decides not to mandate that all implementations support some action X, that means compiler writers that know its customers won't need to do X need not support that action, but does not imply any judgment as to when customers might need support for such action, nor invite implementations to ignore such needs.Shick
@igouy: There are many situations where an implementation that offers stronger behavioral guarantees than mandated by the Standard, given code that exploits those guarantees, could easily generate more efficient machine code than could be produced from any strictly conforming C program, especially if the behavioral guarantees would leave some aspects of behavior Unspecified. For example, if a programmer needs to compute x*y/z in a manner that will at worst yield an Unspecified value in case of overflow, an implementation that can freely generate code equivalent to (long long)x*y/z...Shick
...or (int)((unsigned)x*y)/z, depending upon which is more efficient in any particular context, could likely produce more efficient machine code--sometimes much more efficient--than would have been possible if the programmer had written either of the latter expressions.Shick
@Zebrass Myth implies transmission, so at worst I’m creating and not reproducing one here :) And it’s not like I can’t be wrong about a page I occasionally visited ten years ago (as I watched Haskell rise through the ranks in 2010–12 or thereabouts). But I also seem to remember that there was significant variance across benchmarks for all languages, so “there existed a benchmark where the leader was different” is not a valid refutation. (C was the overall leader for the date in 2008 that you linked the archive for, as well, but for a random date in 2011 I just tried it wasn’t, marginally.)Foucault
@Shick I can’t read the second of your comments addressed to me other than as answering the first one, so apparently I fail at reading comprehension, please rephrase :) As to the third one ... (cont.)Foucault
(cont.) I find many of the memory-model parts of the standard overly obtuse, to be honest, combining the worst aspects of mathematical (miss a corner case and you lose) and normative (written in stone, no chance of a later article or review article improving on the definitions) literature. And, well, memory models are genuinely difficult, but lexers and macro processors aren’t, and still the specification of the preprocessing algorithm (excuse me, translation phases 1 through 6) is supremely vague and difficult to convert into code. (cont.)Foucault
@AlexShpilkin: IMHO, the Standard badly needs some meaningful definitions of conformance. It pretends to be normative, but requirements can only be normative if they would make the difference between something being conforming and non-conforming. Under what circumstances could anything an otherwise-conforming implementation (translator) T might do in response to a Conforming C Program (source-text) S render it non-conforming? There is one very narrow case where that could happen, but the Standard's requirements are effectively non-normative in all other cases.Shick
(cont.) That said, AFAIK there are three ways of thinking that can be used for memory issues: memory typing (there is no or almost no memory, only discrete objects, which have type; what is restricted is which pointers can be used to access which objects); pointer typing (there is amorphous memory and typed pointers into it; what is restricted is which pointers can access the same memory); and provenance (pointers are abstract values that have a history—or, in a parallel implementation, several—of being computed from other values; what is restricted is which histories yield valid ones) (cont.)Foucault
@AlexShpilkin: Perhaps we should chat? I think the right way to handle memory issues is to describe them in terms of loads, stores, sequencing, and program structure. Loads and stores of different types are generally unsequenced, but various actions and constructs may establish ordering relationships between them (e.g. conversion of a T* to a U* is sequenced after any preceding actions on the T, and before any actions using the U*, and with a couple of caveats, actions using the U* within that context will be sequenced before the first succeeding action on the T).Shick
(cont.) All of these are good, but each is better at expressing some behaviours and worse at expressing others. The C standard (and, from what I’ve seen, the C++ one as well) somehow manages to awkwardly straddle all of them due to historical accumulation, which makes a (fully compatible) rephrasing of it in any single language a largely impossible project. Thus we are stuck.Foucault
@Shick I’m honestly unsure I can say anything about this that hasn’t been said in other places by more competent people. (How difficult those places are to find is another matter.) I have neither a degree in this stuff nor the practical experience of applying it in a compiler backend. But if anyone wants to cut these comments out and chuck them into chat, feel free, that’s probably a good idea. [Musings: What about container_of (prohibited in constexpr C++)? BIBOP allocators and other “is object in this region of memory” checks (unimplementable in strictly conforming C)? etc.]Foucault
@AlexShpilkin: What would be necessary to "unstick" things would be to recognize that while it would be impossible to define any category of non-toy programs that could be usefully processed by all implementations, without forbidding implementations that could usefully process many programs, a Standard could still do something useful: allow programs to specify any features or guarantees they need that aren't universally supportable, and require that implementations must either process the programs as defined or reject them entirely. Any other action would be non-conforming.Shick
Let us continue this discussion in chat.Shick

The wording “Character arrays must not be reinterpreted as objects of other types” is imprecise. A correct statement is that if you reinterpret a character array as an object of another type (except as allowed by C 2018 6.5 7), the C standard does not define the behavior.

As always, if we want to accomplish a task, and the C standard does not define the behavior we want, we can look to other things to define the behavior we want.

If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

Such an allocator is unimplementable in strictly conforming C, which is C code that does not rely on an unspecified, undefined, or implementation-defined behavior (and does not exceed any minimum implementation limit). It is entirely possible to write such an allocator in conforming C, which is C with extensions. Quite simply, one could put the memory allocation routines in one source file and compile them with a switch that supports aliasing memory as different types. (This is an extension, such as GCC’s -fno-strict-aliasing switch.) Then, when compiling other source files with common compilers, the compiler is blind to the effective type of the memory in the memory allocation source file, so it cannot be affected by the fact that the memory allocation routines use character arrays. (This is another extension, albeit the behavior arises implicitly from our understanding of how compilers and linkers are designed.)

Fimbriation answered 22/7, 2021 at 13:46 Comment(0)

WG14 member Jens Gustedt says in a blog post on strict aliasing rules:

Character arrays must not be reinterpreted as objects of other types.

Is that, in fact, true?

Sort of. The language specifications do not actually forbid such reinterpretation via pointer manipulation, but they do specify that accessing a character array or part of one as if it were an object of non-character type produces undefined behavior. If we take avoiding undefined behavior to be of paramount importance then Jens' "must not" follows.

However, "undefined behavior" means that the language specifications do not define the behavior, neither that of the access itself nor that of the entire program that exercises such an access. A program that performs such an access does not conform strictly to the language specifications, but its behavior might nevertheless be perfectly well defined when used with some particular C implementation, perhaps in combination with some other measure, such as specific compilation options. That same program might fail spectacularly -- or very subtly -- with a different C implementation, but that might not be a relevant consideration in some cases.

(I guess the corresponding language in the standard is the part saying that if an object has a declared type, then that type is its effective type.)

Yes.

If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

As I take you to mean the question, yes. If you declare a large array of some type and hand out pointers (in)to that array, then undefined behavior arises from using those pointers to access regions of the array as if they had types inconsistent with the array's declared type, or where the pointer used for access is not correctly aligned for accessing members of the array as their declared type.

On the other hand, you can write such an allocator to manage access to a pool of a specific type of objects, such that the behavior of accessing allocated objects according to compatible types is well defined.
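A minimal sketch of such a typed pool (names illustrative, not from the answer): because the pool's declared element type matches the access type, every access stays within 6.5p7 and the allocator is strictly conforming.

```c
#include <stddef.h>

struct node { struct node *next; int value; };

/* A pool of one specific type. Accesses through struct node* use the
 * objects' declared type, so no aliasing rule is ever violated. */
static struct node node_pool[64];
static size_t nodes_used;

static struct node *node_alloc(void)
{
    if (nodes_used >= sizeof node_pool / sizeof node_pool[0])
        return NULL;
    return &node_pool[nodes_used++];
}
```

The obvious cost is one pool per object type, rather than one untyped pool serving all of them.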

I hoped that if I ever found myself in a similar situation in (malloc-less) C, I could just declare a maximally aligned char array and keep using structs the usual way.

You might be able to do so. That's a question of what your specific C implementation affords, above and beyond the language specifications.

I also fail to see what the point of _Alignas could possibly be in such a world, except as a standardized device for expressing non-standard requirements (similar to volatile).

I take you to be supposing that the role of _Alignas is to ensure correct alignment for pointer-based aliasing. Since such aliased access produces undefined behavior, why should one care about such alignment considerations?

Maybe one shouldn't. Certainly I find _Alignas rarely, if ever, to be of any genuine use in my own programming, and I generally write for hosted environments that, therefore, do provide malloc() and afford objects without declared types. But if you are relying on characteristics of your particular C implementation then you may find that _Alignas serves a useful purpose for you.
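One concrete use consistent with the "expressing non-standard requirements" reading (this sketch is illustrative, not from the answer): over-aligning an object for some external constraint such as a cache line, a DMA engine, or an aligned SIMD load. The value 64 here is purely hypothetical.

```c
#include <stdint.h>

/* Over-align a buffer for a hypothetical external requirement (e.g. a
 * 64-byte cache line or DMA alignment constraint). */
_Alignas(64) static unsigned char io_buffer[256];

static int buffer_is_aligned(void)
{
    return (uintptr_t)io_buffer % 64 == 0;
}
```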

Implacable answered 22/7, 2021 at 14:33 Comment(1)
Even if an environment provides a means of creating objects without declared type, the question of whether a pointer identifies "an object with a declared type" cannot be practically answered in the general case, and even when using objects created via malloc() neither clang nor gcc will reliably allow storage to be safely used as different types within its lifetime.Shick

The Standard clearly allows implementations which are intended to be suitable for tasks requiring static memory pools to extend the semantics of the language to support them, and allows "conforming" (but not strictly conforming) programs to exploit such extensions. In fact, the vast majority of C implementations can be configured to support such tasks in mutually-compatible fashion. The Standard does not require that implementations or configurations which are not intended to be usable for such purposes support such constructs. Implementations which don't support the constructs necessary to accommodate static memory pools would, almost by definition, be unsuitable for tasks requiring static memory pools, but the Standard makes no attempt to require that all implementations be suitable for all purposes.

Consequently, when writing rules about type-based aliasing, the authors of the Standard did not exercise anything near the level of care that would have been appropriate if they intended such rules to serve as a boundary between programs that should be expected to work and programs that shouldn't. It may seem odd that C99 rules which have never been even remotely satisfactory, as evidenced by the confusion and controversy surrounding them for the last 20 years, have remained unchanged, but there's a simple reason for that: changing the rules would require reaching a consensus as to what they're supposed to say, and it would be impossible to write a single set of rules, suitable for all purposes, to distinguish between operations that should or should not be regarded as meaningful, since the question of whether an implementation should be expected to process a construct meaningfully depends upon the purposes for which it is designed and configured.

When the Standard characterizes an action as "undefined behavior", or as violating a constraint, it means nothing more nor less than that the Standard itself imposes no requirements about how implementations process code in the relevant situation. The Standard makes no attempt to distinguish actions which are clearly erroneous from those which may not be portable to every conceivable implementation but should be expected to behave identically on 99% of them. Nor does the Standard make much effort to consider all of the corner cases where an action which would generally invoke UB might (and perhaps should) be processed in the same meaningful way by all implementations.

Code which expects to do "weird" things with memory should be processed using configurations that make allowance for that, even if the code is strictly conforming. Handling all of the tricky corner cases in the rules as written would require foregoing optimizations that would often be useful, and both clang and gcc will ignore such corner cases rather than forego the optimizations. The question of whether a piece of code will be processed meaningfully thus depends much more strongly on compiler configuration than upon whether the code jumps through all the hoops given in the Standard.

Shick answered 22/7, 2021 at 22:19 Comment(0)

The Standard, at least as interpreted by clang and gcc in -fstrict-aliasing mode, does not allow for storage which has been used as one type to be reliably reused as a different type within its lifetime, regardless of where the storage came from, and even if such storage was originally obtained via malloc. The inability to reliably form untyped memory pools of static duration flows generally from the inability to have reliable memory pools of any duration.

According to paragraph 6 of N1570 6.5:

The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

Footnote 87 reads:

87) Allocated objects have no declared type.

but it doesn't specify whether or not there might be other cases where objects would or would not be regarded as having a declared type. Given something like:

#include <stdlib.h>
#include <string.h>

extern int x;
extern short s;
extern int *unknownExternalFunction(int *p1, int *p2);

void demo(void)
{
    int *alloc = malloc(sizeof (int));
    int *p = unknownExternalFunction(&x, alloc);
    memcpy(p, &s, sizeof (short));
}

it would seem like *p should have a declared type in cases where unknownExternalFunction happens to return its first argument, but not in cases where it returns the second. Thus, the effective type of the storage at p should be int in the first scenario, but short in the second. From what I can tell, with the rules as written, a compiler would clearly be entitled to generate code that checks whether the returned address is coincidentally equal to &x, and selects among versions of the downstream code optimized for scenarios where the memcpy does or doesn't alter the value. It turns out, though, that there's an even more interesting subtlety.

Although the notion of "modify" used elsewhere in the Standard indicates that even an action which writes an object with the value it already holds "modifies" it, the meaning here is different, and an object need not be regarded as modified by a sequence of actions that leave it holding the same bit pattern as it held before.

struct s { int x; };
struct t { int x; };
int test(void*p, void *q)
{
    struct s *ps = p;
    ps->x = 1;

    struct t *qt = q;
    qt->x = 2;

    struct t *pt = p;
    int temp = pt->x;
    // Start of sequence that writes to *p
    // but leaves bit pattern unchanged
    ps->x = 49;
    ps->x = temp;
    // End of sequence that writes to *p
    // but leaves bit pattern unchanged
    return ps->x;
}

If p and q identify the same storage with no declared type, the way clang and gcc interpret the -> operator, this code will set the effective type of the storage to struct s and then to struct t. It will then use type struct t to read the storage (perfectly legitimate). Although it will then use type struct s to write the storage twice, both clang and gcc will recognize that the two writes, taken together, leave the storage holding the same bit pattern as it held before the first write. Because the two writes, taken together, do not modify the storage, they do not alter the effective type of the storage, and the behavior of the code would be undefined in the scenario where p and q are equal.

Shick answered 27/7, 2021 at 16:22 Comment(2)
Why do you think that the word "modify" means something in this paragraph that is different from its meaning throughout the rest of the standard? The definition of "access" says "NOTE 2 "Modify’’ includes the case where the new value being stored is the same as the previous value." so I don't see why the type of an object should depend on the bit pattern it holds (Unless your point was that the standard is needlessly confusing in which case I would agree)Gandhi
@Niautanor: One of the following statements is true: (1) the authors of clang and gcc are not particularly interested in correctly handling all corner cases mandated by the Standard, or (2) the authors of clang and gcc have adopted a rather severely twisted interpretation of the Standard. I think it's pretty obvious that the authors of the Standard never intended to invite some of the "optimizations" that clang and gcc perform, but a description of the Standard that's not reliably consistent with the way clang and gcc actually behave doesn't seem terribly useful.Shick
