Is it legal C to obtain the pointer to a struct from the pointer to its 2nd member?

I'm wondering if the line preceded by the comment "Is this legal C?" (in the function dumpverts() at the bottom) is legal C or not:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

struct  stvertex 
    {
    double  x;
    double  y;
    char    tag;
    };
    
struct  stmesh
    {
    size_t      nverts;
    struct stvertex verts[]; /* flexible array member */
    };
    

void    dumpverts(struct stvertex *ptr);

int main(int argc, char **argv)
    {
    size_t f;
    size_t usr_nverts=5; /* this would come from the GUI */
    
    struct stmesh *m = malloc(sizeof(struct stmesh) + usr_nverts*sizeof(struct stvertex));
    if(m==NULL) return EXIT_FAILURE;
    
    m->nverts=usr_nverts;
    for(f=0;f<m->nverts;f++)
        {
        m->verts[f].x = f*10.0; /* dumb values just for testing */
        m->verts[f].y = f*7.0;
        m->verts[f].tag = 'V';
        }
    
    dumpverts( &(m->verts[0]) );
    
    return EXIT_SUCCESS;
    }


void    dumpverts(struct stvertex *ptr) /* Here is where the juice is */
    {
    size_t f;
    
    /* Is this legal C? */
    struct stmesh   *themesh = (struct stmesh *)((char *)ptr - offsetof(struct stmesh, verts));
    
    for(f=0;f<themesh->nverts;f++)
        {
        printf("v[%zu] = (%g,%g) '%c'\n", f, themesh->verts[f].x, themesh->verts[f].y, themesh->verts[f].tag);
        }
    fflush(stdout);
    }

I tend to believe it's legal, but I'm not 100% sure if the strict aliasing rule would permit the cast from char * to struct stmesh * like the interesting line in the dumpverts() function body is doing.

Basically, that line is obtaining the pointer to the struct stmesh from the pointer to its second member. I don't see any alignment-related potential issues, because the memory for the whole struct stmesh came from malloc(), so the beginning of the struct is "suitably aligned". But I'm not sure about the strict aliasing rule, as I said.

If it breaks strict aliasing, can it be made compliant without changing the prototype of the dumpverts() function?

If you wonder what I want this for, it's mainly for learning where the limits of offsetof() are. Yes, I know dumpverts() should be receiving a pointer to struct stmesh instead. But I'm wondering whether obtaining the struct stmesh pointer programmatically would be possible in a legal way.

Soembawa answered 24/4, 2022 at 23:45 Comment(7)
C’s aliasing rules are entirely irrelevant here. The aliasing rules say that if something is a foo, then you will only access it as a foo or certain other allowed types. If you have somehow calculated a pointer to a struct stmesh and access it as a struct stmesh, then the aliasing rules are satisfied. The only question is whether the pointer arithmetic is defined to produce a result that points to the struct stmesh. - Debor
First: get rid of the casts; they are not needed. - Gorski
You really ought to adopt a conventional coding style. Your brace placement isn't in one of the two ways that some 99% of all other C programmers use. Also you are inconsistently swaying between K&R style and brace on its own line style. - Sato
@Gorski I removed the only two casts that, in my opinion, could be removed: the casts from int to double (because there's a multiplication where the int is promoted to double anyway). Anyway, I sometimes prefer to write (some) superfluous casts when I want to use them as a comment of what I intentionally want to do. - Soembawa
@EricPostpischil You were right: My original pointer arithmetic was calculating the distance from the nverts field to the verts field (that's why I was subtracting the offsets). This was because my first code actually obtained nverts directly, rather than the pointer to the struct. However, later I realized it was better to obtain the pointer to the struct. So, just the subtraction of the verts offset is necessary. I modified the code accordingly. Thanks! - Soembawa
@Sato You were right: My indentation style is Whitesmiths. It's a conventional one, and very well established (it's also supported by most code style beautification tools). You can argue about whether you like it or not, but it's a standard in many businesses (especially commercial ones). You were right that my structs didn't follow Whitesmiths. I modified the code accordingly, and it's now 100% Whitesmiths. - Soembawa
I fully agree, Lundin, not a place for style wars. And, for the same reason, saying that everything apart from Allman and K&R is non-standard implies opening the can of worms, because that's a sentence based on personal preference, not backed by any source I know of. You know, I could run astyle on code before posting it, but I stick to the opinion in the K&R book: consistency is better than the indentation choice. - Soembawa

Yes, it's valid. You can convert any non-function pointer to and from char *: there's an explicit part of the standard allowing that:

C17, section 6.3.2.3, clause 7:

When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

The reason this is allowed is exactly so you can do tricks like the one you're showing. Note, however, that this is only valid if the pointer comes from a struct stmesh in the first place (even if you don't have that struct in scope when you're doing that).
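
As a minimal sketch of that trick, the step back from verts[0] to the enclosing struct can be isolated in one function. The helper names mesh_from_verts and mesh_alloc are made up for illustration and are not part of the question, and the sketch assumes the pointer really does come from verts[0] of a malloc'ed struct stmesh:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct stvertex { double x; double y; char tag; };

struct stmesh
    {
    size_t nverts;
    struct stvertex verts[]; /* flexible array member */
    };

/* Hypothetical helper: recover the enclosing struct stmesh from a pointer
   to its verts[0] element. The caller must guarantee that ptr really
   points at verts[0] of a struct stmesh. */
static struct stmesh *mesh_from_verts(struct stvertex *ptr)
    {
    char *bytes;

    assert(ptr != NULL);
    bytes = (char *)ptr;                     /* view the object as bytes */
    bytes -= offsetof(struct stmesh, verts); /* step back to the start of the struct */
    return (struct stmesh *)bytes;
    }

/* Hypothetical helper: allocate a mesh with room for n vertices. */
static struct stmesh *mesh_alloc(size_t n)
    {
    struct stmesh *m = malloc(sizeof *m + n * sizeof m->verts[0]);

    if (m != NULL)
        m->nverts = n;
    return m;
    }
```

With something like this, dumpverts() could begin with struct stmesh *themesh = mesh_from_verts(ptr); and keep its prototype unchanged.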

Sidenote: you don't need offsetof(struct stmesh, nverts) at all in your example. It's guaranteed to be zero. Section 6.7.2.1, clause 15:

A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
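
A small sketch of that guarantee, reusing the struct from the question (mesh_from_nverts is a hypothetical helper, not something from the original code): since nverts is the initial member, a pointer to it converts straight back to the struct, with no offsetof() arithmetic involved.

```c
#include <assert.h>
#include <stddef.h>

struct stvertex { double x; double y; char tag; };

struct stmesh
    {
    size_t nverts;           /* initial member: always at offset zero */
    struct stvertex verts[]; /* flexible array member */
    };

/* Hypothetical helper: recover the struct from a pointer to its initial
   member. The caller must guarantee that p points at the nverts member of
   a struct stmesh. */
static struct stmesh *mesh_from_nverts(size_t *p)
    {
    assert(p != NULL);
    return (struct stmesh *)p; /* no byte arithmetic needed */
    }
```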

Maegan answered 24/4, 2022 at 23:54 Comment(3)
"You can convert any non-function pointer to and from char *" No, only to a character pointer. The other way around is not well-defined and can cause alignment problems or pointer aliasing violations. - Sato
"Yes, it's valid. You can convert any non-function pointer to and from char *: there's an explicit part of the standard allowing that" The author is asking not about the validity of converting to char *, but about the validity of subtracting from the result of such a cast. Your quote from the standard speaks only about increments, not decrements/subtractions. - Gondolier
After giving this some thought, I don't think it is well-defined at all. Also the quoted parts here don't seem relevant. I've posted a different answer. - Sato

Pedantically, there is nothing in the C standard explicitly stating that the code is well-defined. I'd say that it's somewhere between questionable and undefined behavior.

  • Strict aliasing concerns: not a problem. To de-reference some address through a pointer to struct is fine as far as strict aliasing goes, as long as what's actually stored at that location is of the correct effective type (C17 6.5 §6 and §7).

  • Character pointer conversion: questionable. Any type in C may be inspected byte by byte through the use of a character pointer. This is in line with "Strict aliasing" C17 6.5 §7 and also the pointer conversion rules in C17 6.3.2.3, emphasis mine:

    A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

    Your pointer does not point to the lowest addressed byte in the surrounding struct type. Nor do you use successive increments. Alignment is another issue but I don't think it will be a problem in your case.

  • Pointer arithmetic: questionable. Pointer arithmetic is defined by the additive operators C17 6.5.6, which strictly speaking only allow pointer arithmetic on array types, where a single struct variable may be regarded as an array of 1 such struct item. To make sense of the previously quoted 6.3.2.3 in terms of pointer arithmetic, I think it must be interpreted as a character array of sizeof(the_struct) bytes. Decreasing a character pointer pointing into the middle of a struct is not covered by the rules of pointer arithmetic - strictly speaking it falls under §8 "...otherwise, the behavior is undefined".

  • Initial struct member/initial common sequence rules: do not apply. There's a special rule allowing us to convert between a struct pointer and a pointer to its first member (C17 6.7.2.1 §15) but that does not apply here. There is also a special rule for the "common initial sequence" of two structs in a union, which does not apply here either.
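
For contrast with the backwards step in the question, the uncontroversial direction from the character pointer bullet can be sketched like this: start at the lowest addressed byte and only move forward, staying within the size of the object. The helper name vertex_copy_bytes is hypothetical and only serves the illustration:

```c
#include <assert.h>
#include <stddef.h>

struct stvertex { double x; double y; char tag; };

/* Hypothetical helper: copy one vertex byte by byte through character
   pointers, walking forward from the lowest addressed byte and never
   leaving the sizeof(struct stvertex) bytes of the object. */
static void vertex_copy_bytes(struct stvertex *dst, const struct stvertex *src)
    {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < sizeof *src; i++)
        d[i] = s[i];
    }
```

This is essentially what memcpy() does, and it stays inside what the quoted 6.3.2.3 text describes. The question's code differs in that it moves the character pointer backwards, out of the stvertex object it started from.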


This might be a more well-defined version:

#include <stdint.h> /* for uintptr_t */
...
dumpverts( (uintptr_t) &(m->verts[0]) );
...
void dumpverts (uintptr_t ptr) 
{
  struct stmesh* themesh = (struct stmesh *)(ptr - offsetof(struct stmesh, verts));
  /* rest of the function body unchanged */
}

This is plain integer arithmetic. Your only concerns here are alignment and strict aliasing, which should be ok. Integer to/from pointer conversions with uintptr_t are otherwise fine (implementation-defined), C17 6.3.2.3 §5 and §6.
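
Filled out into something compilable, the idea might look like the sketch below. The helper names mesh_from_address and mesh_alloc are hypothetical, uintptr_t is an optional type that this assumes the implementation provides, and the result of the conversions is implementation-defined, so treat it as an illustration rather than a guarantee:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct stvertex { double x; double y; char tag; };

struct stmesh
    {
    size_t nverts;
    struct stvertex verts[]; /* flexible array member */
    };

/* Hypothetical helper: addr is assumed to be the address of verts[0] of a
   struct stmesh, already converted to uintptr_t by the caller. */
static struct stmesh *mesh_from_address(uintptr_t addr)
    {
    assert(addr >= offsetof(struct stmesh, verts));
    /* plain unsigned integer subtraction, then back to a pointer */
    return (struct stmesh *)(addr - offsetof(struct stmesh, verts));
    }

/* Hypothetical helper: allocate a mesh with room for n vertices. */
static struct stmesh *mesh_alloc(size_t n)
    {
    struct stmesh *m = malloc(sizeof *m + n * sizeof m->verts[0]);

    if (m != NULL)
        m->nverts = n;
    return m;
    }
```

A caller would pass (uintptr_t)&m->verts[0], as in the call shown above.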

Sato answered 25/4, 2022 at 10:57 Comment(5)
"This is plain integer arithmetic. Your only concerns here are alignment and strict aliasing, which should be ok. Integer to/from pointer conversions with uintptr_t are otherwise fine (impl.defined)" When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic. - Gondolier
According to this other answer on SO, pointer arithmetic should always be performed with char * pointers, and never with uintptr_t: https://mcmap.net/q/1095089/-c-why-cast-to-uintptr_t-vs-char-when-doing-pointer-arithmetic - Soembawa
@LanguageLawyer That's rather a (poor) quality of implementation issue of gcc then. Of course we need to be able to convert between integers and pointers (as allowed by 6.3.2.3) or we can't write C for use on computers. Because anything low-level and hardware related may rely on address calculations being performed as plain integers. In the ISA, ADD etc instructions don't really care if you provide index registers or data as inputs. The "magical and fantastic" attributes of pointers in C don't exist in a sane ISA, where everything is just numbers. - Sato
@Soembawa If you are concerned about portability to wildly exotic systems and you are willing to spend significant effort in ensuring such portability, sure... Do go and tell your boss that you have implemented portability to Cray-1 computers from 1975 instead of working on the purpose of your application and see how happy they will be about it. - Sato
I think GCC relies on these pointer cast restrictions well before it lowers an intermediate representation to CPU instructions. - Gondolier
