Is using the result of new char[] or malloc, cast to float *, UB (strict aliasing violation)?

Asked 26/10, 2017 at 17:51 Answered 26/10, 2017 at 22:24

Solved c++ malloc language-lawyer strict-aliasing object-lifetime

Which of these functions has UB (specifically, which violates the strict aliasing rule)?

#include <cstdlib>  // malloc
#include <new>      // placement new, operator new
#include <vector>

void a() {
    std::vector<char> v(sizeof(float));
    float *f = reinterpret_cast<float *>(v.data());
    *f = 42;
}

void b() {
    char *a = new char[sizeof(float)];
    float *f = reinterpret_cast<float *>(a);
    *f = 42;
}

void c() {
    char *a = new char[sizeof(float)];
    float *f = new(a) float;
    *f = 42;
}

void d() {
    char *a = (char*)malloc(sizeof(float));
    float *f = reinterpret_cast<float *>(a);
    *f = 42;
}

void e() {
    char *a = (char*)operator new(sizeof(float));
    float *f = reinterpret_cast<float *>(a);
    *f = 42;
}

I ask this because of this question.

I think that d doesn't have UB (or else malloc would be useless in C++). Because of this, it seems logical that b, c and e don't have it either. Am I wrong somewhere? Maybe b is UB, but c is not?

Bellamy answered 26/10, 2017 at 17:51 Comment(11)

aliasing is usually a problem only when you continue to use both pointers after the alias. I wouldn't think any of those generate a warning. – Bleeder 26/10, 2017 at 17:59

IIRC, the char types are exempt from the strict aliasing issue. Alignment is still a problem, though. – Hollands 26/10, 2017 at 18:6

I think @GarrGodfrey is pointing out that all modern compilers behave consistently and predictably in the scenarios in the OP, and that strict aliasing violations only really become an issue when load-hit-stores might come into play. But relying on this would not be future-proof in any way. – Kampong 26/10, 2017 at 18:7

@molbdnilo: you can use char * to examine any value. The opposite is not necessarily true. – Bellamy 26/10, 2017 at 18:7

related: #46909605 – Middleclass 26/10, 2017 at 18:8

@Bellamy Oh, right. So IRI, then. – Hollands 26/10, 2017 at 18:9

I believe they are all invalid: everything except c() is invalid because you need to use placement new to create an object, not just reinterpret_cast, and all of a() through c() are invalid due to alignment. I’m not that confident about either of those, though. A valid version would be void f() { char *a = (char*)malloc(sizeof(float)); float *f = new(a) float; *f = 42; }; both malloc and the standard library’s operator new are guaranteed to have enough alignment. – Aronow 26/10, 2017 at 18:19

"else malloc would be useless in C++" malloc is useless. Std C++ is broken and useless! (in a strict interpretation, if you refuse the evidence showing that some chapters are garbage and dismissable) – Mordancy 26/10, 2017 at 21:3

@curiousguy: Maybe it's the language barrier, but I don't quite understand you. Do you mean that "malloc is useless" is too harsh? Re-reading my question, yes, I agree on that. Not useless, but very cumbersome to use. Maybe I should delete that paragraph, as the question stands by itself without it. – Bellamy 26/10, 2017 at 21:22

@Bellamy you mean that a char expression can be used to examine any value. The char expression might be obtained by dereferencing a char * expression, although there are other means. A char * expression cannot be used directly to examine any value. – Immeasurable 26/10, 2017 at 22:27

@Bellamy Either you dismiss some std sections as being BS, or you have to accept that no historically accepted C/C++ code using malloc is valid. C/C++ is dead! – Mordancy 27/10, 2017 at 1:42

Preamble: storage and objects are different concepts in C++. Storage refers to memory space, and objects are entities with lifetimes, that may be created and destroyed within a piece of storage. Storage may be re-used for hosting multiple objects over time. All objects require storage, but there can be storage with no objects in it.
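A minimal sketch of that distinction, assuming a local unsigned char array as the storage (the function name reuse_storage is hypothetical): the same bytes host an int first and a float afterwards.

```cpp
#include <new>

// Hypothetical helper: one piece of storage hosts two objects, one after the other.
float reuse_storage() {
    // Storage only: sized and aligned for either an int or a float.
    alignas(int) alignas(float) unsigned char
        storage[sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float)];

    int *i = new (storage) int(7);         // an int object now lives in the storage
    int seen = *i;                         // fine: *i refers to a live int

    float *f = new (storage) float(1.5f);  // reuses the storage; the int's lifetime ends
    // From here on, reading *i would be undefined; only *f refers to a live object.
    return *f + static_cast<float>(seen);
}
```

The storage itself never changes; only which object (if any) lives in it does.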

c is correct. Placement-new is one of the valid methods of creating an object in storage (C++14 [intro.object]/1), even if there were pre-existing objects in that storage. The old objects are implicitly destroyed by the re-use of the storage, and this is perfectly fine so long as they did not have non-trivial destructors ([basic.life]/4). new(a) float; creates an object of type float and dynamic storage duration within the existing storage ([expr.new]/1).

d and e are undefined by omission in the current object model rules: the effect of accessing memory via a glvalue expression is only defined when that expression refers to an object; and not for when the expression refers to storage containing no objects. (Note: please do not leave non-constructive comments regarding the obvious inadequacy of the existing definitions).

This does not mean "malloc is useless"; the effect of malloc and operator new is to obtain storage. Then you can create objects in the storage and use those objects. This is in fact exactly how standard allocators, and the new expression, work.
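A rough sketch of that allocate-then-construct sequence, using std::string so that the explicit destruction step matters (build_and_measure and the alias String are hypothetical names; real allocators add error handling and exception safety):

```cpp
#include <cstddef>
#include <new>
#include <string>

using String = std::string;  // alias so the destructor can be named simply

// Hypothetical helper mirroring the steps an allocator-aware container performs.
std::size_t build_and_measure() {
    // 1. Obtain raw storage; no String object exists yet.
    void *raw = ::operator new(sizeof(String));

    // 2. Create an object in that storage with placement new.
    String *s = new (raw) String("storage, then object");
    std::size_t n = s->size();

    // 3. End the object's lifetime explicitly, then release the storage.
    s->~String();
    ::operator delete(raw);
    return n;
}
```

Replacing ::operator new and ::operator delete with malloc and free would follow the same three steps.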

a and b are strict aliasing violations: a glvalue of type float is used to access objects of incompatible type char. ([basic.lval]/10)
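If the goal is only to get a float's bytes into and out of a char buffer, one commonly used way to stay clear of this rule is std::memcpy, since no float glvalue ever accesses the char objects. A sketch (round_trip_through_chars is a hypothetical name):

```cpp
#include <cstring>

// Hypothetical helper: stores a float's bytes in a char buffer and reads them back.
float round_trip_through_chars(float value) {
    char *a = new char[sizeof(float)];

    // Writes the char objects byte by byte; the buffer is never accessed as a float.
    std::memcpy(a, &value, sizeof value);

    float out;
    std::memcpy(&out, a, sizeof out);  // copy the bytes back into a real float object
    delete[] a;
    return out;
}
```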

There is a proposal which would make all of the cases well-defined (other than the alignment issue with a, mentioned below): under this proposal, using *f implicitly creates an object of that type in the location, with some caveats.
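For reference, a later revision of this proposal (P0593) was adopted for C++20, where malloc is among the operations that implicitly create objects of implicit-lifetime types such as float. A sketch of case d under that assumption (d_cpp20 is a hypothetical name; under the C++14 rules discussed above the same code remains undefined by omission):

```cpp
#include <cstdlib>

// Assumes C++20 rules: malloc implicitly creates a float object here as needed.
float d_cpp20() {
    void *p = std::malloc(sizeof(float));
    if (p == nullptr) return 0.0f;       // allocation failed; nothing to write to

    float *f = static_cast<float *>(p);
    *f = 42;                             // well-defined only under the newer rules
    float result = *f;

    std::free(p);
    return result;
}
```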

Note: There is no alignment problem in cases b through e, because the new-expression for a char array and ::operator new are guaranteed to return storage suitably aligned for any object of that size with a fundamental alignment ([new.delete.single]/1), and malloc gives the same guarantee.

However, in the case of std::vector<char>, even though the standard specifies that ::operator new be called to obtain storage, the standard doesn't require that the first vector element be placed in the first byte of that storage; e.g. the vector could decide to allocate 3 extra bytes on the front and use those for some book-keeping.
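One way to sidestep that theoretical problem is to over-allocate slightly and let std::align pick a suitably aligned position inside the vector's buffer before using placement new. A sketch under that assumption (a_aligned is a hypothetical name):

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical variant of a(): finds an aligned slot inside the vector's buffer.
float a_aligned() {
    // Extra bytes so that an aligned float always fits somewhere in the buffer.
    std::vector<char> v(sizeof(float) + alignof(float) - 1);

    void *p = v.data();
    std::size_t space = v.size();
    // std::align advances p to the next suitably aligned address, or returns nullptr.
    if (std::align(alignof(float), sizeof(float), p, space) == nullptr)
        return 0.0f;  // not expected with the padding above

    float *f = new (p) float(42);  // create the float object, as in case c
    return *f;
}
```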

Immeasurable answered 26/10, 2017 at 22:24 Comment(6)

@T.C. thanks, updated my answer according to your comment. Are there any existing implementations that do add padding? – Immeasurable 26/10, 2017 at 22:43

I don't think so (an implementation could conceivably stash size/capacity in the allocated block or something, but that seems unlikely to be beneficial), but we are in language-lawyer land :) – Handal 26/10, 2017 at 22:46

a and b are also OK under P0593R1's rules (modulo the theoretical alignment issue with a). A float object would spring into existence just in time for the write to be well-defined. – Handal 26/10, 2017 at 22:49

@Handal I'm not clear about that from the proposal: it says objects can spring into existence where there were no "real" objects but in the (a), (b) cases the storage does include real char objects already. The example in section 1.2 of the proposal only works in space that was allocated by an external API (and therefore could be storage with no real objects yet). But the example in 2.3 does allow an int to spring where there were real chars, so it seems to me that further clarification will be required. (E.g. if you spring an int on a short buffer, can you still access the shorts?) – Immeasurable 26/10, 2017 at 22:58

The proposed rule, which is in 2.2, is quite simple. Implicit-lifetime objects are sprung into existence as necessary to give the program defined behavior. In your int/short example, you'd be able to write to the shorts (springing up new short objects just prior to the write) but not read from them (because any new short object sprung up would have indeterminate values). – Handal 26/10, 2017 at 23:2

So char* pBuffer = new char[16]; float* pFloat = (float*)pBuffer; *pFloat = 7; is a strict aliasing violation? But char* pBuffer = new char[16]; float* pFloat = (float*)pBuffer; new (pFloat) float; *pFloat = 7; is OK? In other words, aliasing the char buffer with a float pointer is a violation, but after using placement new at that memory position I can access it through a float pointer? – Petrolatum 20/8, 2021 at 12:49

Even though it was a discussion between the OP and me that spawned this question, I'll still put my interpretation here.

I believe that all of these save for c() contain strict aliasing violations as formally defined by the standard.

I base this on section 1.8.1 of the standard:

... An object is created by a definition (3.1), by a new-expression (5.3.4) or by the implementation (12.2) when needed. ...

reinterpret_cast<>ing memory does not fall under any of these cases.

Kampong answered 26/10, 2017 at 18:22 Comment(4)

reinterpret_cast never creates a new object, but that doesn't mean that the resulting pointer or reference is always unsafe to use. If the result is of a type that is allowed to alias the original object, it's fine. For example, you can reinterpret an unsigned int as an int. No int object was ever created, but using the result of the cast is well defined. You are legally using an int despite no int ever being created. The question here is whether you can use the array of chars as a float. Type aliasing – Jelly 26/10, 2017 at 18:30

I believe your conclusion is correct, but I don't believe the passage you provided explains why. – Jelly 26/10, 2017 at 18:30

Well, I was going to point to 3.10.10, but that only applies if you can establish that there is no float object currently residing at that memory location. So I still feel the core reason is 1.8.1. – Kampong 26/10, 2017 at 18:32

@FrançoisAndrieux The newest C++ standard does not say that signed and unsigned types can be aliased, only that similar values have the same representation. So we can expect that, in the near future, compilers will no longer consider that a pointer to unsigned int may alias a pointer to int. – Halfdan 26/10, 2017 at 19:2

From cppreference:

Type aliasing

Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

AliasedType and DynamicType are similar.

AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.

AliasedType is std::byte, (since C++17)char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.

Informally, two types are similar if, after stripping away cv-qualifications at every level (but excluding anything inside a function type), they are the same type.

For example: [...some examples...]

Also cppreference:

a glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function;

The above is relevant for all examples except (c). The types are neither similar nor signed/unsigned variants of each other. Also, the AliasedType (the type you cast to) is none of char, unsigned char or std::byte. Hence all of them (except c) exhibit undefined behaviour.
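For contrast, the direction the quoted rule does permit is examining an existing float through unsigned char. A small sketch (nonzero_bytes is a hypothetical name):

```cpp
#include <cstddef>

// Hypothetical helper: counts the non-zero bytes in a float's object representation.
std::size_t nonzero_bytes(const float &value) {
    // Permitted: AliasedType is unsigned char, DynamicType is float.
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&value);

    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        if (bytes[i] != 0) ++count;
    }
    return count;
}
```

The examples in the question go the other way (a float glvalue over char storage), which is not on the list.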

Disclaimer: First of all cppreference is not an official reference, but only the standard is. Secondly, unfortunately I am not even 100% certain if my interpretation of what I read on cppreference is correct.

Fluency answered 26/10, 2017 at 18:30 Comment(6)

Case c has no reinterpret_cast. – Jelly 26/10, 2017 at 18:31

@FrançoisAndrieux is new(a) float; not a glvalue? I am not sure... I don't mention reinterpret_cast – Fluency 26/10, 2017 at 18:33

@tobi303 See my answer for this, section 1.8.1 establishes that a new expression creates the object, making it a glvalue. – Kampong 26/10, 2017 at 18:36

You mention reinterpret_cast in the URL in the first line of your answer, and the first block quote cites its documentation. – Jelly 26/10, 2017 at 18:36

@FrançoisAndrieux strict aliasing is explained there and I think the explanation is more general and not only refers to reinterpret casting. – Fluency 26/10, 2017 at 18:38

In (c), new(a) float; creates an object whose DynamicType is float – Immeasurable 26/10, 2017 at 22:38
