Fix for dereferencing type-punned pointer will break strict-aliasing

Asked 11/1, 2012 at 18:28 Answered 17/5, 2022 at 16:44

I'm trying to fix two warnings when compiling a specific program using GCC. The warnings are:

warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

and the two culprits are:

unsigned int received_size = ntohl (*((unsigned int*)dcc->incoming_buf));

and

*((unsigned int*)dcc->outgoing_buf) = htonl (dcc->file_confirm_offset);

incoming_buf and outgoing_buf are defined as follows:

char                    incoming_buf[LIBIRC_DCC_BUFFER_SIZE];

char                    outgoing_buf[LIBIRC_DCC_BUFFER_SIZE];

This seems subtly different than the other examples of that warning I've been examining. I would prefer to fix the problem rather than disable strict-aliasing checks.

There have been many suggestions to use a union - what might be a suitable union for this case?

Beograd answered 11/1, 2012 at 18:28 Comment(3)

Interesting... strict-aliasing shouldn't apply to char*. Or am I missing something? – Inedited 11/1, 2012 at 18:30

@Inedited Yes, what you are missing is there is no aliasing violation when an object of type T1 is accessed with a lvalue of type T2 and T2 is char, but when T1 is char and T2 is not of a signed/unsigned variant of char, there is an aliasing violation. – Corny 11/1, 2012 at 19:8

@Mysticial: You got it the wrong way round! – Gamma 11/1, 2012 at 19:14

First off, let's examine why you get the aliasing violation warnings.

Aliasing rules simply say that you can only access an object through its own type, its signed / unsigned variant type, or through a character type (char, signed char, unsigned char).

C says violating aliasing rules invokes undefined behavior (so don't!).

In this line of your program:

unsigned int received_size = ntohl (*((unsigned int*)dcc->incoming_buf));

although the elements of the incoming_buf array are of type char, you are accessing them as unsigned int. Indeed the result of the dereference operator in the expression *((unsigned int*)dcc->incoming_buf) is of unsigned int type.

This is a violation of the aliasing rules, because you only have the right to access elements of incoming_buf array through (see rules summary above!) char, signed char or unsigned char.

Notice you have exactly the same aliasing issue in your second culprit:

*((unsigned int*)dcc->outgoing_buf) = htonl (dcc->file_confirm_offset);

You access the char elements of outgoing_buf through unsigned int, so it's an aliasing violation.

Proposed solution

To fix your issue, you could try to have the elements of your arrays directly defined in the type you want to access:

unsigned int incoming_buf[LIBIRC_DCC_BUFFER_SIZE / sizeof (unsigned int)];
unsigned int outgoing_buf[LIBIRC_DCC_BUFFER_SIZE / sizeof (unsigned int)];

(By the way the width of unsigned int is implementation defined, so you should consider using uint32_t if your program assumes unsigned int is 32-bit).

This way you could store unsigned int objects in your array without violating the aliasing rules by accessing the element through the type char, like this:

*((char *) outgoing_buf) =  expr_of_type_char;

char_lvalue = *((char *) incoming_buf);

EDIT:

I've entirely reworked my answer, in particular I explain why the program gets the aliasing warnings from the compiler.

Corny answered 11/1, 2012 at 19:15 Comment(1)

the warning can silenced by double casting through void *. see my answer – Rawhide 17/5, 2022 at 16:46

To fix the problem, don't pun and alias! The only "correct" way to read a type T is to allocate a type T and populate its representation if needed:

uint32_t n;
memcpy(&n, dcc->incoming_buf, 4);

In short: If you want an integer, you need to make an integer. There's no way to cheat around that in a language-condoned way.

The only pointer conversion which you are allowed (for purposes of I/O, generally) is to treat the address of an existing variable of type T as a char*, or rather, as the pointer to the first element of an array of chars of size sizeof(T).

Gamma answered 11/1, 2012 at 19:14 Comment(4)

I'm not sure sizeof(uint32_t) is guaranteed to be 4, so you may need to adjust your memcpy – Copse 24/8, 2020 at 15:27

@OmarL: correct. uint32_t is guaranteed to have exactly 32 value bits and no padding bits, but sizeof(uint32_t) can be 2 if unsigned char has 16 bits and even 1 if unsigned char has 32 bits. The example should be amended as memcpy(&n, dcc->incoming_buf, sizeof(n)); – Yearly 17/5, 2022 at 17:19

@Yearly I just tried to find an example of architecture or configuration with a char size of 16 or 32 but I failed. Examples ? – Girlfriend 14/6 at 10:4

@dargaud: some DSPs do not have byte addressable memory. See https://mcmap.net/q/14101/-what-platforms-have-something-other-than-8-bit-char – Yearly 15/6 at 22:20

union
{
    const unsigned int * int_val_p;
    const char* buf;
} xyz;

xyz.buf = dcc->incoming_buf;
unsigned int received_size = ntohl(*(xyz.int_val_p));

Simplified explanation 1. c++ standard states that you should attempt to align data yourself, g++ goes an extra mile to generate warnings on the subject. 2. you should only attempt it if you completely understand the data alignment on your architecture/system and inside your code (for example the code above is a sure thing on Intel 32/64 ; alignment 1; Win/Linux/Bsd/Mac) 3. the only practical reason to use the code above is to avoid compiler warnings , WHEN and IF you know what you are doing

Overvalue answered 6/9, 2016 at 17:30 Comment(0)

If I may, IMHO, for this case, the problem is the design of the ntohl and htonl and related function APIs. They should not have been written as numeric argument with numeric return. (and yes, I understand the macro optimization point) They should have been designed as the 'n' side being a pointer to a buffer. When this is done, the whole problem goes away and the routine is accurate whichever endian the host is. For example (with no attempt to optimize):

inline void safe_htonl(unsigned char *netside, unsigned long value) {
    netside[3] = value & 0xFF;
    netside[2] = (value >> 8) & 0xFF;
    netside[1] = (value >> 16) & 0xFF;
    netside[0] = (value >> 24) & 0xFF;
};

Ern answered 16/3, 2018 at 18:10 Comment(1)

If the Standard were to include a standard set of big-endian and little-endian "fetch" and "stuff" routines, such routines could usefully have behavior defined in terms of octets even on machines where CHAR_BIT isn't eight, thus enhancing the portability of networking code on such machines. – Lipread 21/2, 2019 at 0:23

If you have reasons that do not allow you to change type of source object (like it was in my case), and you are absolutely confident that the code is correct and it does what intended to do with that char array, to avoid warnings you may do the following:

unsigned int* buf = (unsigned int*)dcc->incoming_buf;
unsigned int received_size = ntohl (*buf);

Fukuoka answered 27/2, 2019 at 11:53 Comment(1)

Casting away warnings is bad. This code does nothing about violating strict aliasing and can also violate alignment restrictions. – Weight 20/4, 2021 at 14:42

I recently upgraded a project from GCC 6 to GCC 9, and started seeing this warning. The project is on a 32-bit microcontroller, and I had created a struct to access the individual bytes of a 32-bit machine register:

struct TCC_WEXCTRL_t
{
    byte    OTMX;
    byte    DTIEN;
    byte    DTLS;
    byte    DTHS;
};

and then coded:

((TCC_WEXCTRL_t *)&TCC0->WEXCTRL)->DTLS = PwmLoDeadTime;

which produced the warning in the new compiler. I found I could eliminate the warning by combining my struct in a union with the original type:

union TCC_WEXCTRL_t
{
    TCC_WEXCTRL_Type std;
    struct  
    {
        byte    OTMX;
        byte    DTIEN;
        byte    DTLS;
        byte    DTHS;
    };    
};

where TCC_WEXCTRL_Type is the type of the WEXCTRL member as provided in the manufacturer's header files.

I'm not sure if this is considered a fully compliant fix, or if GCC is just failing to catch it. If this hadn't worked (or gets caught in another GCC upgrade), I would move on to using a union of the pointer types, as described on this thread by Real Name.

Guertin answered 13/10, 2020 at 1:15 Comment(0)

-1

If you are certain you know what you are doing, do this:

void *tmp = dcc->incoming_buf;
unsigned int received_size = ntohl (*((unsigned int*) tmp));

or just:

unsigned int received_size = ntohl (*((unsigned int*) ((void *) dcc->incoming_buf)));

Rawhide answered 17/5, 2022 at 16:44 Comment(1)

I'm finding that the g++ compiler with args std=c++11 and -Wall is still producing this warning for me even when double casting via (void*). Honestly if the compiler maintainers persist in telling people off for taking advantage of byte-level optimisations that make lower level languages like C attractive in the first place, they're really just degrading the value of using languages like C in the first place. Another solution is to not use -Wall ;) – Albertson 14/1, 2023 at 13:38

-3

Cast pointer to unsigned and then back to pointer.

unsigned int received_size = ntohl (*((unsigned *)((unsigned) dcc->incoming_buf)) );

Olsewski answered 10/7, 2015 at 13:20 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags