This is a quote from the C11 Standard:
6.5 Expressions
...
6 The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
Does this imply that memcpy cannot be used for type punning this way:
double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);
Why would it not give the same output as:
union { double d; uint64_t i; } u;
u.d = 1234.5678;
printf("the representation of %g is %08"PRIX64"\n", u.d, u.i);
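To compare the two snippets side by side, here is a minimal sketch that wraps each one in a function. The names bits_via_memcpy and bits_via_union are hypothetical, chosen only for illustration, and the code assumes sizeof(double) == sizeof(uint64_t), which it checks at compile time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of d into a uint64_t with a declared type. */
uint64_t bits_via_memcpy(double d) {
    uint64_t bits;
    static_assert(sizeof bits == sizeof d, "double and uint64_t must have the same size");
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Hypothetical helper: store through one union member, read through the other. */
uint64_t bits_via_union(double d) {
    union { double d; uint64_t i; } u;
    u.d = d;
    return u.i;
}
```

Both helpers only reinterpret the same bytes, so I would expect them to return identical values for any given double. Whether the standard guarantees that is what the question asks.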
What if I use my own version of memcpy, written using character types:
void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++) { d[i] = s[i]; }
    return dst;
}
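As a sketch of how this byte-wise copy would be exercised, the block below repeats my_memcpy so that it compiles on its own and adds a hypothetical helper, bits_via_my_memcpy, that uses it the same way as the first example. It assumes sizeof(double) == sizeof(uint64_t).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Byte-wise copy through unsigned char lvalues, as in the question. */
void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++) { d[i] = s[i]; }
    return dst;
}

/* Hypothetical helper: pun a double into a uint64_t using my_memcpy. */
uint64_t bits_via_my_memcpy(double d) {
    uint64_t bits;
    static_assert(sizeof bits == sizeof d, "double and uint64_t must have the same size");
    my_memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Here the destination has a declared type and every access inside the loop goes through a character type, so only the final read of bits as a uint64_t is of interest.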
EDIT: EOF commented that the part about memcpy() in paragraph 6 doesn't apply in this situation, since uint64_t bits has a declared type. I agree, but unfortunately this does not help answer the question of whether memcpy can be used for type punning; it just makes paragraph 6 irrelevant for assessing the validity of the above examples.
Here is another attempt at type punning with memcpy that I believe would be covered by paragraph 6:
double d = 1234.5678;
void *p = malloc(sizeof(double));
if (p != NULL) {
    uint64_t *pbits = memcpy(p, &d, sizeof(double));
    uint64_t bits = *pbits;
    printf("the representation of %g is %08"PRIX64"\n", d, bits);
}
Assuming sizeof(double) == sizeof(uint64_t), does the above code have defined behavior under paragraphs 6 and 7?
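For contrast, here is a sketch of a variant that never reads the allocated object through a uint64_t lvalue: it copies the bytes out again with a second memcpy into an object that has a declared type. This does not settle whether *pbits above is defined. The helper name bits_via_heap_copy is hypothetical, and the size assumption is checked at compile time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t),
              "double and uint64_t must have the same size");

/* Hypothetical helper: returns 0 and stores the representation of d in *out,
   or returns -1 if the allocation fails. */
int bits_via_heap_copy(double d, uint64_t *out) {
    void *p = malloc(sizeof d);
    if (p == NULL) {
        return -1;
    }
    memcpy(p, &d, sizeof d);       /* allocated object now holds the bytes of a double */
    memcpy(out, p, sizeof *out);   /* copy into an object with declared type uint64_t */
    free(p);
    return 0;
}
```

The difference from the example above is that the final read goes through *out, whose declared type is uint64_t, rather than through a pointer into the allocated storage.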
EDIT: Some answers point to the potential for undefined behavior coming from reading a trap representation. This is not relevant as the C Standard explicitly excludes this possibility:
7.20.1.1 Exact-width integer types
1 The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.
2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits.
3 These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.
Type uint64_t has exactly 64 value bits and no padding bits, thus there cannot be any trap representations.
... memcpy() the value is the same since you assign the double then do a memcpy() to the uint64_t so the value copied is of type double. Similarly with the union you assign a value to the double part of the union so the value in the memory area is of type double and you access it via the uint64_t. Either way there is no type conversion of the actual value. So no conversion from double to uint64_t. memcpy() copies the specified number of bytes. Why do you think the two examples would be different? – Meiny
... union version ensures alignment restrictions of both fields are met by the starting address of u. The memcpy version doesn't guarantee that bits is positioned on a double boundary. It could fail if double alignment was more restrictive than uint64_t, e.g. with a bus error when printf attempts to compute the double string representation of bits. – Randolphrandom
... double as a uint64_t. I am perplexed by the wording of paragraph 6, especially regarding the copy done via memcpy. I thought it should be safe, but other savvy C experts differ. – Emission
... memcpy is explicitly safe to copy between non-aligned blocks and the values passed to printf are read from their effective types. – Emission
... union, using memcpy() copies a value from a memory location whose value is of a particular type and puts it in a different location and the value is still the same type. However, when accessed through an lvalue of a different type, the value is retrieved as the type of the lvalue and not the original type. – Meiny
... memcpy can be used for type punning. – Emission
... memcpy() safe for type punning. And actually I have seen a lot of embedded code that depends on it, with memcpy() used to divide up struct objects with offset address calculations and really weird stuff. The structs were defined with pragma for byte alignment so it all works. That was not C11 but older C98 and some of it old K&R C. This might actually be an improvement in the spec to actually specify the behavior that has been canonized as the de facto behavior for years. – Meiny
... union v memcpy() should sizeof double != sizeof uint64_t - but I take their equal size is taken as a given. – Bobbobb
... memcpy() the number of bytes specified is the sizeof the destination so the max number of bytes copied depends on the size of the destination. If the source type is different in a union you will still have the same issue of interpretation of the bytes when you access the bytes using a type different from the type used to store those bytes. – Meiny
... memcpy is safe. But for a compiler pushing a uint64_t on the stack for printf to know it should be aligned on a double boundary, it would have to inspect the format string, push every varargs argument at the lcm of all alignment boundaries, or dynamically check each printf argument for alignment and copy as needed on the fly. All that seems very un-C-like. But as I said, I'm not an expert. – Randolphrandom
The part about memcpy() in paragraph 6 doesn't apply in this situation, since uint64_t bits has a declared type. – Carduaceous
... memcpy is his preferred way of doing it and enables the majority of compilers to generate optimal object code. – Physique
uint64_t bits = *pi; --> uint64_t bits = *pbits; – Bobbobb
... int* is cast to a short* then any cached int values which might be identified by that pointer must be flushed. Code reordering complicates things, but the proper remedy for that would be... – Preponderate
... restrict. I really doubt that most of the people voting for the language in the standard would expect that quality compilers would not recognize that something like ((uint16_t*)someUint32Ptr)[IS_BIG_ENDIAN]; might modify a value of type uint32_t [and might be the most efficient way to clear the lower half of it], or would think that optimization is furthered by requiring programmers to write code that would force a compiler to assume a pointer might alias almost anything, anywhere, of any type. – Preponderate