Based on your comments, it seems you want to perform a bona fide conversion -- that is, to produce a distinct, new, separate value of a different type. This is a very different thing than a reinterpretation, such as the lead-in to your question suggests you wanted. In particular, you posit variables declared like this:
uint8x16_t a;
uint8x8x2_t b;
// code to set the value of a ...
and you want to know how to set the value of b so that it is in some sense equivalent to the value of a.
Speaking to the C language:
The strict aliasing rule (C2011 6.5/7) says,
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
- a type compatible with the effective type of the object, [...]
- an aggregate or union type that includes one of the aforementioned types among its members [...], or
- a character type.
(Emphasis added. Other enumerated options involve differently-qualified and differently-signed versions of the effective type of the object or compatible types; these are not relevant here.)
Note that these provisions never interfere with accessing a's value, including the member value, via variable a, and similarly for b. But don't overlook the usage of the term "effective type" -- this is where things can get bollixed up under slightly different circumstances. More on that later.
Using a union
C certainly permits you to perform a conversion via an intermediate union, or you could rely on b being a union member in the first place so as to remove the "intermediate" part:
union {
uint8x16_t x1;
uint8x8x2_t x2;
} temp;
temp.x1 = a;
b = temp.x2;
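For readers who want to try the union approach without an ARM toolchain, here is a minimal sketch. The types vec16_t and vec8x2_t and the helper make_ramp() are hypothetical stand-ins of my own for uint8x16_t and uint8x8x2_t, chosen only so that the code compiles without arm_neon.h; the real NEON types may be laid out differently.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: portable stand-ins for the NEON types, not the real ones. */
typedef struct { uint8_t lane[16]; } vec16_t;   /* stands in for uint8x16_t  */
typedef struct { uint8_t val[2][8]; } vec8x2_t; /* stands in for uint8x8x2_t */

static_assert(sizeof(vec16_t) == sizeof(vec8x2_t),
              "both views must cover the same 16 bytes");

/* Hypothetical helper: builds a value whose lanes are 0, 1, ..., 15. */
vec16_t make_ramp(void)
{
    vec16_t a;
    for (int i = 0; i < 16; i++) {
        a.lane[i] = (uint8_t)i;
    }
    return a;
}

/* Convert by storing through one union member and reading the other. */
vec8x2_t split_via_union(vec16_t a)
{
    union {
        vec16_t  x1;
        vec8x2_t x2;
    } temp;

    temp.x1 = a;     /* store as the 16-lane type          */
    return temp.x2;  /* read back as the two 8-lane halves */
}
```

With these stand-ins, split_via_union(make_ramp()) yields bytes 0 through 7 in val[0] and bytes 8 through 15 in val[1].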
Using a typecast pointer (to produce UB)
However, although it's not so uncommon to see it, C does not permit you to type-pun via a pointer:
// UNDEFINED BEHAVIOR - strict-aliasing violation
b = *(uint8x8x2_t *)&a;
// DON'T DO THAT
There, you are accessing the value of a, whose effective type is uint8x16_t, via an lvalue of type uint8x8x2_t. Note that it is not the cast that is forbidden, nor even, I'd argue, the dereferencing -- it is reading the dereferenced value so as to apply the side effect of the = operator.
Using memcpy()
Now, what about memcpy()? This is where it gets interesting. C permits the stored values of a and b to be accessed via lvalues of character type, and although memcpy()'s parameters are declared to have type void *, this is the only plausible interpretation of how it works. Certainly its description characterizes it as copying characters. There is therefore nothing wrong with performing a
memcpy(&b, &a, sizeof a);
Having done so, you may freely access the value of b via variable b, as already mentioned. There are aspects of doing so that could be problematic in a more general context, but there's no UB here.
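A self-contained sketch of that memcpy() conversion between two objects with declared types follows. As an assumption for illustration, vec16_t and vec8x2_t are hypothetical stand-ins for uint8x16_t and uint8x8x2_t so that the code builds without arm_neon.h, and make_ramp() is a helper invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: portable stand-ins for the NEON types, not the real ones. */
typedef struct { uint8_t lane[16]; } vec16_t;   /* stands in for uint8x16_t  */
typedef struct { uint8_t val[2][8]; } vec8x2_t; /* stands in for uint8x8x2_t */

static_assert(sizeof(vec16_t) == sizeof(vec8x2_t),
              "memcpy must not read or write past either object");

/* Hypothetical helper: builds a value whose lanes are 0, 1, ..., 15. */
vec16_t make_ramp(void)
{
    vec16_t a;
    for (int i = 0; i < 16; i++) {
        a.lane[i] = (uint8_t)i;
    }
    return a;
}

/* Convert by copying the object representation byte by byte. */
vec8x2_t split_via_memcpy(vec16_t a)
{
    vec8x2_t b;                /* b has a declared type, so its effective */
    memcpy(&b, &a, sizeof a);  /* type is unaffected by the copy          */
    return b;
}
```

The static_assert is there because the copy is only sensible when both types have the same size; whether a compiler elides the copy is a quality-of-implementation matter that this sketch does not address.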
However, contrast this with the superficially similar situation in which you want to put the converted value into dynamically-allocated space:
uint8x8x2_t *c = malloc(sizeof(*c));
memcpy(c, &a, sizeof a);
What could be wrong with that? Nothing is wrong with it, as far as it goes, but here you have UB if you afterward try to access the value of *c. Why? Because the memory to which c points does not have a declared type, therefore its effective type is the effective type of whatever was last stored in it (if that has an effective type), including if that value was copied into it via memcpy() (C2011 6.5/6). As a result, the object to which c points has effective type uint8x16_t after the copy, whereas the expression *c has type uint8x8x2_t; the strict aliasing rule says that accessing that object via that lvalue produces UB.
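One way to sidestep the problem, as I read C2011 6.5/6, is to convert into an object with a declared type first and then store into the allocated space by ordinary assignment, since a store through a non-character lvalue gives the allocated object that lvalue's type as its effective type. The sketch below makes that assumption explicit; vec16_t, vec8x2_t, make_ramp(), split_to_heap() and lane_from_heap() are hypothetical names introduced for illustration, with the two struct types standing in for uint8x16_t and uint8x8x2_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: portable stand-ins for the NEON types, not the real ones. */
typedef struct { uint8_t lane[16]; } vec16_t;   /* stands in for uint8x16_t  */
typedef struct { uint8_t val[2][8]; } vec8x2_t; /* stands in for uint8x8x2_t */

static_assert(sizeof(vec16_t) == sizeof(vec8x2_t),
              "both views must cover the same 16 bytes");

/* Hypothetical helper: builds a value whose lanes are 0, 1, ..., 15. */
vec16_t make_ramp(void)
{
    vec16_t a;
    for (int i = 0; i < 16; i++) {
        a.lane[i] = (uint8_t)i;
    }
    return a;
}

/* Returns a malloc'd vec8x2_t holding the bytes of a, or NULL on failure. */
vec8x2_t *split_to_heap(vec16_t a)
{
    vec8x2_t b;
    vec8x2_t *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    memcpy(&b, &a, sizeof a);  /* b has a declared type: still vec8x2_t     */
    *c = b;                    /* store via a vec8x2_t lvalue, so *c should */
                               /* take vec8x2_t as its effective type       */
    return c;
}

/* Reads one lane back through the pointer; returns -1 if allocation failed. */
int lane_from_heap(vec16_t a, int half, int lane)
{
    vec8x2_t *c = split_to_heap(a);
    if (c == NULL) {
        return -1;
    }
    int value = c->val[half][lane];
    free(c);
    return value;
}
```

The extra copy through b is the price of keeping the effective type of the allocated object unambiguous; a union-typed temporary would serve the same purpose.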
std::memcpy. Any other (C++) way requires a compiler extension which you would have to research. – Oys
memcpying from/to different datatypes is not different than an assignment with a cast: it invokes UB (there are few exceptions for char types). – Convulsion
"memcpying from/to different datatypes is not different than an assignment with a cast" That's not true. memcpy has defined behaviour, see discussion here. – Unaccomplished
[...] char * and then memcpy. – Oys
[...] x2 forms are split over two registers. So it makes sense that you can't just reinterpret it. I'm not sure how it's actually supposed to be done other than a load/store, which seems silly. – Robin
[...] C (tagged) but I think in C both union and cast are defined behaviour - someone else needs to answer this part. – Oys
[...] C "If the member used to access the contents of a union is not the same as the member last used to store a value, the object representation of the value that was stored is reinterpreted as an object representation of the new type (this is known as type punning)." source: en.cppreference.com/w/c/language/union - I don't have the standard to hand. – Oys
[...] memcpy does not change that (see my comment ^). I'd actually be surprised if it is different in C++, but afaik this language does not allow aliasing via union, so maybe it does allow it. However, in C aliasing is only allowed via union. This is guaranteed by the standard. Not via pointer. And modern compilers can exploit this. gcc e.g. is well known to strictly follow the standard without additional guarantees. That's why "old-style" embedded programmers not used to highly optimizing compilers tend to run into problems with gcc. – Convulsion
[...] memcpy, casting any pointer to (void*) is always valid, I do not understand at which step you see strict aliasing getting broken. – Unaccomplished
[...] void * are problematic, as they tell the compiler "shut up, I know what I'm doing". So you better really know, otherwise you invoke UB. – Convulsion
[...] memcpy implies. – Unaccomplished
uint8x16_t a; uint8x8x2_t b; /*Placeholder, some code to assign something to a*/ memcpy(&b,&a,sizeof a); The only conversions happening here are from uint8x16_t* to void* and from uint8x8x2_t* to void*. – Unaccomplished
[...] T* can be cast to char* or unsigned char*, to allow examinations of an object's raw bytes. This is specified in [basic.lval/10.8]. std::memcpy() aliases both pointers as unsigned char*, which is a well-defined operation. 2) Both types are TriviallyCopyable, preventing other UB. – Alaniz
[...] memcpy() can be used to convert a value to a different type without violating strict aliasing rules. – Alaniz
[...] uint8x16_t* is not cast to uint8x8x2_t*, nor is a uint8x8x2_t* cast to uint8x16_t*; rather, each is cast to unsigned char* and then back to its original type, which is entirely valid. If this was invalid, then any operation that modifies an object's raw bytes through a char* or unsigned char* would be equally invalid, because modifying an object's raw bytes through an unsigned char* is the operation memcpy() performs. – Alaniz
[...] memcpy, but the destination type. I explicitly talk about C, not C++. To repeat the obvious: they are different languages and in C aliasing via memcpy is not allowed. However, as you both think you know better and prefer picking raisins from the standard instead of working through the whole chain of evidence, this discussion is pointless. Please don't ping me for this again. – Convulsion
[...] struct int32x4_t {int32_t val[4];}; and struct int64x2_t {int64_t val[2];};. – Unaccomplished
[...] T* to char* and reading data from it is UB. 2) Nowhere does it say that converting a T* to char* and writing data to it is UB. 3) Nowhere does it say that writing data from one char* to another char* is UB. 4) The question isn't solely about C, but about both C and C++. – Alaniz
[...] memcpy() explicitly applies to "an object having no declared type", which is an allocated object; as both objects here have declared types, it does not apply. – Alaniz
[...] uint8x16_t* being cast to uint8x8x2_t*, and at no point is a uint8x8x2_t* being cast to uint8x16_t*; this doesn't even happen inside memcpy(), since it takes both pointers as void*. According to both the C and C++ standards, converting T* to char* (albeit indirectly, in this case) is well-defined, writing data from one char* to another char* is well-defined, and both languages permit the use of memcpy() to convert a value of type U into a value of type V. – Alaniz
[...] std::memcpy may be used to convert the values." on the CPP page, and "Where strict aliasing prohibits examining the same memory as values of two different types, memcpy may be used to convert the values." on the C page. The reason for this is that memcpy() examines both as unsigned char*, which is legal. – Alaniz
[...] union. The reason for using NEON is to speed up code and I would not rely on memcpy being optimised out. A union is more likely. It also would make the intention more clear. – Convulsion
[...] T* through a char*, I consider this to mean both read and write access are allowed. As this permission to access an object through a char* is intended to allow access to the object's raw byte representation, allowing write access thus means that it's permissible to write a byte sequence to the object which results in a valid object of its type. – Alaniz
[...] memcpy(); as such, it has no need to mention memcpy() in relation to objects with declared types, because their effective type is stated in the first sentence to be their declared type. It can be read as an if-else statement: If an object has a declared type, then its effective type is its declared type. Else, ... if a value is copied into the object with memcpy(), then its effective type is that value's effective type. – Alaniz
[...] memcpy() over a union: While using memcpy() is valid in both C and C++, type punning through a union is only guaranteed to be valid in C; in C++, it's a mess, and may or may not be UB depending on exactly how you use the union. Considering that the question asks about both languages, memcpy() is thus the safer option. – Alaniz