Casting pointers on embedded devices

Asked 25/12, 2012 at 16:50 Answered 26/12, 2012 at 2:14

I encountered a strange problem when casting and modifying pointers on a 32-bit embedded system (a redbee econotag running contiki OS, to be specific).

uint32_t array[2];
array[0] = 0x76543210;
array[1] = 0xfedcba98;

uint8_t* point = ((uint8_t*)array)+1;

printf("%08x \n", *(uint32_t*)point );

output on my computer:

98765432

output on embedded device:

10765432

My computer behaves as I expect, but the embedded device seems to wrap around when it reaches the end of the word. Why does this happen?

Roybal answered 25/12, 2012 at 16:50 Comment(0)

Your target, the "redbee econotag", is described as an ARM7, which implements the ARMv4 architecture. ARMv4 doesn't support unaligned memory access the way ARMv7 or an Intel machine does.

Quoting from ARM's documentation:

On ARMv4 and ARMv5 architectures, and on the ARMv6 architecture depending on how it is configured, care needs to be taken when accessing unaligned data in memory, lest unexpected results are returned. For example, when a conventional pointer is used to read a word in C or C++ source code, the ARM compiler generates assembly language code that reads the word using an LDR instruction. This works as expected when the address is a multiple of four, for example if it lies on a word boundary. However, if the address is not a multiple of four, the LDR returns a rotated result rather than performing a true unaligned word load. Generally, this rotation is not what the programmer expects.
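
To make the rotation concrete, here is a minimal sketch that reproduces the arithmetic on any host. The name simulated_armv4_ldr is hypothetical, and the function only models the documented behaviour (load the enclosing aligned word, then rotate right by 8 bits per byte of misalignment); it is not what the hardware literally executes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models the value an ARMv4 LDR is documented to
 * return for a possibly unaligned address. 'base' must point to
 * word-aligned storage; 'offset' is the byte offset being loaded from. */
uint32_t simulated_armv4_ldr(const uint8_t *base, size_t offset)
{
    uint32_t word;
    size_t aligned = offset & ~(size_t)3;       /* round down to a word boundary */
    unsigned rot = (unsigned)(offset & 3) * 8;  /* 8 bits per byte of misalignment */

    memcpy(&word, base + aligned, sizeof word); /* the aligned word load */
    if (rot == 0)
        return word;                            /* aligned: no rotation */
    return (word >> rot) | (word << (32 - rot)); /* rotate right */
}
```

For the array in the question, an offset of 1 rotates 0x76543210 right by 8 bits, which gives 0x10765432, the value printed on the device.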

Screak answered 26/12, 2012 at 2:14 Comment(0)

With this code, you break the strict aliasing rule: the object pointed to by point is accessed by an lvalue expression that has type uint32_t.

C11 (n1570), § 6.5 Expressions
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

This leads to undefined behavior, so anything can happen.

C11 (n1570), § 4. Conformance
If a "shall" or "shall not" requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined.

Patrica answered 25/12, 2012 at 16:53 Comment(1)

This answer points out another thing wrong with the above code (albeit something most compilers would let you get away with), but it is not the cause of the described behaviour. The cause is the casting of a byte-aligned value into a 4-byte-aligned pointer - as correctly identified in auselen's answer. – Gametogenesis 27/12, 2012 at 9:43

Because of the +1 you do an unaligned access of a 32-bit value, i.e. the address is not a multiple of four.

x86 tolerates unaligned accesses, possibly at a small performance cost, because its roots go all the way back to 8-bit machines.

ARM requires alignment (as do many other processors), so 32-bit values should be placed at addresses that are a multiple of four bytes. Various bad things may happen if this isn't the case (wrong values, faults). For the array the compiler takes care of that, but when you explicitly cast the pointer you force it to violate alignment.
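
One common way to avoid forcing an unaligned load is to copy the bytes into a properly aligned variable. This is a minimal sketch, assuming the bytes are stored in the machine's native byte order; read_u32_unaligned is a hypothetical name, not something from the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit value in native byte order from any
 * address, aligned or not. memcpy copies byte by byte as far as the
 * language is concerned, so no misaligned uint32_t access is performed
 * and the aliasing rules are respected. */
uint32_t read_u32_unaligned(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

With this, printf("%08x \n", (unsigned)read_u32_unaligned(point)); should print 98765432 on any little-endian target, including the ARM7 board. Compilers typically reduce the memcpy to a plain load on targets that allow unaligned access.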

Genevieve answered 25/12, 2012 at 21:34 Comment(0)

printf("%08x \n", *(uint32_t*)point );

The * expression in this statement invokes undefined behavior: it violates the aliasing rules and may perform an unaligned access.

Rebozo answered 25/12, 2012 at 16:53 Comment(0)

EDIT: please note that the body of this answer is made irrelevant by the comments it prompted

The theory of the other answers is fine, but probably doesn't help you. The actual problem is that you wrote:

uint8_t* point = ((uint8_t*)array)+1;

When you should have written something such as

uint8_t* point = (uint8_t*)(array+1);

because you need to increment the pointer as a pointer to an appropriate type (so that the increment operation will add the size of an element), before you cast it to something else.

But one might ask if you really intend to have a byte pointer to a 32-bit value. Perhaps you do intend to access it in bytewise fashion (beware that the byte ordering will vary between systems!). Or perhaps you really intended for point to be a pointer to a 32-bit value which is in turn a pointer to an 8-bit value somewhere else...

Cutis answered 25/12, 2012 at 20:59 Comment(6)

I receive a char* that points to 32-bit values for AES registers. The char* can't be of another type as it points to an area in memory which has a variable number of bytes in front of the AES blocks. As this is a low-power embedded device, using the least possible amount of memory and energy (operations) is favorable. – Roybal 25/12, 2012 at 21:37

While there may be more optimal solutions for specific cases, the most straightforward thing that comes to mind would be to perform a bytewise copy to an aligned location. However, it really depends on whether the unaligned source data is stored in the system's endian order. If not, you'll probably need to access the data in bytewise fashion and then arithmetically recombine it into a 32-bit word. – Cutis 26/12, 2012 at 15:1

The right way™ to do this is to shift the bytes one by one into the correct positions in a 32-bit value. Dirty tricks like pointer casts or using union for type conversion will get you straight into portability hell. – Genevieve 26/12, 2012 at 18:47

@Genevieve - please re-read what was suggested, as only proper access and no such "dirty tricks" were involved. How one handles portability depends on what is known about the source endianness - if it is known absolutely (network protocols, etc.), you can use an explicit combination to form the destination value. But if it is only known to be the same as the destination, you are better off with a bytewise copy. – Cutis 26/12, 2012 at 19:15

@Chris Stratton As you write, "bytewise copy to an aligned location" will bite you when the endianness changes, so don't do it that way. "access the data in bytewise fashion and then arithmetically recombine it into a 32-bit word" is the only clean solution, as the compiler takes care of all the dirty details. – Genevieve 26/12, 2012 at 20:3

@starblue, you are mistaken, because you neglect to consider that the appropriate answer depends on what is known. Bytewise access and arithmetic recombination will not work if the source endianness is not known absolutely. In the case where it is known only to be the same as the destination (i.e., the source data was native endian within the same system, but has been misaligned), then bytewise access on both ends is the right answer - and one that is used both literally and figuratively in countless places within systems. In short, don't let the fact that the data is misaligned distract you. – Cutis 26/12, 2012 at 20:24
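
For readers following the comment thread, the "arithmetic recombination" approach could be sketched as below. It only applies when the byte order of the source data is known absolutely; load_le32 and load_be32 are hypothetical names chosen for illustration. When the data is merely known to be in the machine's native order, a bytewise copy into an aligned variable is the other option discussed above.

```c
#include <stdint.h>

/* Hypothetical helper: builds a 32-bit value from four bytes stored
 * least-significant byte first. Works at any alignment and on a host of
 * any endianness, because only single bytes are read. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: same idea for data stored most-significant byte
 * first (for example, network byte order). */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

The casts to uint32_t before shifting matter: without them each byte would be promoted to int, and shifting a value of 0x80 or more left by 24 bits would overflow a 32-bit int.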
