Calling a free() wrapper: dereferencing type-punned pointer will break strict-aliasing rules
Asked Answered
R

3

5

I've tried to read up on the other questions here on SO with similar titles, but they are all a tiny bit too complex for me to be able to apply the solution (or even explanation) to my own issue, which seems to be of a simpler nature.

In my case, I have a wrapper around free() which sets the pointer to NULL after freeing it:

void myfree(void **ptr)
{
    free(*ptr);
    *ptr = NULL;
}

In the project I'm working on, it is called like this:

myfree((void **)&a);

This makes gcc (4.2.1 on OpenBSD) emit the warning "dereferencing type-punned pointer will break strict-aliasing rules" if I crank up the optimization level to -O3 and add -Wall (not otherwise).

Calling myfree() the following way does not make the compiler emit that warning:

myfree((void *)&a);

And so I wonder if we ought to change the way we call myfree() to this instead.

I believe that I'm invoking undefined behaviour with the first way of calling myfree(), but I haven't been able to wrap my head around why. Also, on all compilers that I have access to (clang and gcc), on all systems (OpenBSD, Mac OS X and Linux), this is the only compiler and system that actually gives me that warning (and I know emitting warnings is a nice optional).

Printing the value of the pointer before, inside and after the call to myfree(), with both ways of calling it, gives me identical results (but that may not mean anything if it's undefined behaviour):

#include <stdio.h>
#include <stdlib.h>

void myfree(void **ptr)
{
    printf("(in myfree) ptr = %p\n", *ptr);
    free(*ptr);
    *ptr = NULL;
}

int main(void)
{
    int *a, *b;

    a = malloc(100 * sizeof *a);
    b = malloc(100 * sizeof *b);

    printf("(before myfree) a = %p\n", (void *)a);
    printf("(before myfree) b = %p\n", (void *)b);

    myfree((void **)&a);  /* line 21 */
    myfree((void *)&b);

    printf("(after myfree) a = %p\n", (void *)a);
    printf("(after myfree) b = %p\n", (void *)b);

    return EXIT_SUCCESS;
}

Compiling and running it:

$ cc -O3 -Wall free-test.c
free-test.c: In function 'main':
free-test.c:21: warning: dereferencing type-punned pointer will break strict-aliasing rules

$ ./a.out
(before myfree) a = 0x15f8fcf1d600
(before myfree) b = 0x15f876b27200
(in myfree) ptr = 0x15f8fcf1d600
(in myfree) ptr = 0x15f876b27200
(after myfree) a = 0x0
(after myfree) b = 0x0

I'd like to understand what is wrong with the first call to myfree() and I'd like to know if the second call is correct. Thanks.

Refund answered 25/7, 2016 at 13:53 Comment(4)
I would not recommend always setting pointers to NULL after freeing, as you wont see double free, (which shouldn't happend)Tinctorial
@mou Well, that's a valid point. The counter argument is that not setting it to NULL may hide accesses to freed memory.Refund
yes but valgrind & stuff should tell us, as we are talking about the heap memTinctorial
@mou Calling free(NULL) is harmless, dereferencing NULL is not. Thus I believe the reasoning is (this is in a collaborative project that's been running for a number of years) it's better to be told about referencing freed memory than to catch a double free() call. I've been thinking about using the Boehm GC for memory leak detection, and/or Valgrind, but haven't come around to do that just yet as I'm fairly new in the project.Refund
C
9

Since a is an int* and not a void*, &a cannot be converted to a pointer to a void*. (Suppose void* were wider than a pointer to an integer, something which the C standard allows.) As a result, neither of your alternatives -- myfree((void**)a) and myfree((void*)a) -- is correct. (Casting to void* is not a strict aliasing issue. But it still leads to undefined behaviour.)

A better solution (imho) is to force the user to insert a visible assignment:

void* myfree(void* p) {
    free(p);
    return 0;
}

a = myfree(a);

With clang and gcc, you can use an attribute to indicate that the return value of my_free must be used, so that the compiler will warn you if you forget the assignment. Or you could use a macro:

#define myfree(a) (a = myfree(a))
Cheryle answered 25/7, 2016 at 14:14 Comment(5)
Does your first sentence mean that both calls are incorrect (due to ptr being of type void **)?Refund
@kusalananda: yes. I should add that to the answer.Cheryle
We're trying to stay as C99 as possible, so using compiler-specific attributes would be a no-no. Need to think this over a bit and discuss with collaborators.Refund
@kusalanada: the usual approach to attributes is to #define away __attribute__ on implementations which don't support it. (It doesn't change program behaviour; it's just a warning.) I've been avoiding saying this, but personally I dislike the idiom you are trying to implement, especially when the pointer being freed is itself inside a block of memory about to be freed. If you want to debug double-free or use-after-free, use valgrind. But many people disagree with me.Cheryle
I definitely appreciate the comments. This is old code which is still being developed (a bioinformatics application), and I'm cleaning bits of it up. It's not me trying to implement something, it's me trying to make it correct.Refund
R
4

Here's a suggestion that:

  1. Does not violate the strict aliasing rule.
  2. Makes the call more natural.
void* myfree(void *ptr)
{
    free(ptr);
    return NULL;
}

#define MYFREE(ptr) ptr = myfree(ptr);

you can use the macro simply as:

int* a = malloc(sizeof(int)*10);

...

MYFREE(a);
Resemble answered 25/7, 2016 at 14:15 Comment(0)
W
2

There are basically a few ways to have a function work with and modify a pointer in a fashion agnostic to the pointer's target type:

  1. Pass the pointer into the function as void* and return it as void*, applying appropriate conversions in both directions at the call site. This approach has the disadvantage of tying up the function's return value, precluding its use for other purposes, and also precludes the possibility of performing the pointer update within a lock.

  2. Pass a pointer to function which accepts two void*, casts one of them into a pointer of the appropriate type and the other to a double-indirect pointer of that type, and possibly a second function that can read a passed-in pointer as a void*, and use those functions to read and write the pointer in question. This should be 100% portable, but likely very inefficient.

  3. Use pointer variables and fields of type void* elsewhere and cast them to real pointer types whenever they're actually used, thus allowing pointers of type void** to be used to modify the pointer variables.

  4. Use memcpy to read or modify pointers of unknown type, given double-indirect pointers which identify them.

  5. Document that code is written in a dialect of C, popular in the 1990s, which treated "void**" as a double-indirect pointer to any type, and use compilers and/or settings that support that dialect. The C Standard allows for implementations to use different representations for pointers to things of different types, and because those implementations couldn't support a universal double-indirect pointer type, and because implementations which could easily allow void** to be used that way already did so before the Standard was written, there was no perceived need for the Standard to describe that behavior.

The ability to have a universal double-indirect pointer type was and is extremely useful on the 90%+ of implementations that could (and did) readily support it, and the authors of the Standard certainly knew that, but the authors were far less interested in describing behaviors that sensible compiler writers would support anyway, than in mandating behaviors which would be on the whole beneficial even on platforms where they could not be cheaply supported (e.g. mandating that even on a platform whose unsigned math instructions wrap mod 65535, a compiler must generate whatever code is needed to make calculations wrap mod 65536). I'm not sure why modern compiler writers fail to recognize that.

Perhaps if programmers start overtly writing for sane dialects of C, the maintainers of standards might recognize that such dialects have value. [Note that from an aliasing perspective, treating void** as a universal double-indirect pointer will have far less severe performance costs than forcing programmers to use any of the alternatives 2-4 above; any claims by compiler writers that treating void** as a universal double-indirect pointer would kill performance should thus be treated skeptically].

Willie answered 27/7, 2016 at 15:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.