Are all pointers derived from pointers to structure types the same?

The Question

Whether all pointers derived from pointers to structure types are the same is not easy to answer. I find the question significant for the following two reasons.

A. The lack of a pointer to pointer to 'any' incomplete or object type limits convenient function interfaces, such as:

int allocate(ANY_TYPE  **p,
             size_t    s);

int main(void)
{
    int *p;
    int r = allocate(&p, sizeof *p);
}

[Complete code sample]

The existing pointer to 'any' incomplete or object type is explicitly described as:

C99 / C11 §6.3.2.3 p1:

A pointer to void may be converted to or from a pointer to any incomplete or object type. [...]

A pointer derived from the existing pointer to 'any' incomplete or object type, that is, a pointer to pointer to void, is strictly a pointer to pointer to void. It is not required to be convertible to or from a pointer derived from a pointer to any other incomplete or object type.
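
One conforming way around this limitation, sketched here under the assumption that the caller is willing to go through a temporary, is to hand the function a genuine pointer to pointer to void and convert the result with an ordinary assignment. The names 'allocate_any' and 'make_int' are hypothetical and not part of the original sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: stores a fresh allocation of s bytes in *p.
   Returns 0 on success and -1 on failure. */
int allocate_any(void **p, size_t s)
{
    void *mem = malloc(s);

    if (mem == NULL)
        return -1;
    *p = mem;                /* *p really is a 'void *' object */
    return 0;
}

/* Hypothetical caller: obtains an int through a 'void *' temporary. */
int *make_int(int value)
{
    void *tmp = NULL;
    int *ip;

    if (allocate_any(&tmp, sizeof *ip) != 0)
        return NULL;
    assert(tmp != NULL);
    ip = tmp;                /* implicit conversion, C11 6.3.2.3 p1 */
    *ip = value;
    return ip;
}
```

The cost is the extra temporary at every call site, which is exactly the inconvenience the question is about.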


B. It is not uncommon for programmers, knowingly or unknowingly, to rely on conventions built on assumptions about the generalization of pointers that the standard does not require, based on experience with their specific implementations. Examples are assuming that pointers are convertible to one another, that they are representable as integers, or that they share a common property: object size, representation, or alignment.


The words of the standard

According to C99 §6.2.5 p27 / C11 §6.2.5 p28:

[...] All pointers to structure types shall have the same representation and alignment requirements as each other. [...]

Followed by C99 TC3 Footnote 39 / C11 Footnote 48:

The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

The standard chose the words "All pointers to structure types" rather than "A pointer to a structure type", but it doesn't explicitly specify whether the requirement applies to a recursive derivation of such pointers. On the other occasions where the standard mentions special properties of pointers, it doesn't mention recursive pointer derivation either, which means that type derivation either applies or doesn't, without this being stated explicitly.

And although the phrasing "All pointers to" is used only twice when referring to types (for structure and union types), as opposed to the more explicit "A pointer to" used throughout the standard, we can't conclude from it whether the requirement applies to a recursive derivation of such pointers.
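
To make the quoted requirement concrete, here is a minimal sketch assuming a C11 compiler. The structure names are hypothetical. Equal size and alignment follow from "same representation and alignment requirements", so the static assertions must hold on any conforming implementation; the function below them only observes the second level of derivation, which is the part the quoted sentence leaves open.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

struct small { char c; };
struct large { double d[64]; };
struct opaque;                      /* never completed in this file */

/* Follows from C11 6.2.5 p28: all pointers to structure types. */
static_assert(sizeof(struct small *) == sizeof(struct large *),
              "struct pointers share a size");
static_assert(sizeof(struct small *) == sizeof(struct opaque *),
              "pointers to incomplete structs as well");
static_assert(_Alignof(struct small *) == _Alignof(struct large *),
              "struct pointers share an alignment");

/* Observation only: reports whether the pointers one level up happen
   to have equal sizes on this implementation. */
int second_level_sizes_match(void)
{
    return sizeof(struct small **) == sizeof(struct large **);
}
```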

Guerra answered 19/6, 2014 at 10:32 Comment(5)
Whatever the answer to your main question is, allocate is not a convenient interface by any stretch of imagination. - Cham
Furthermore, it looks like your example has nothing to do with the question. If it had TYPE1 in a declaration and TYPE2 at the call site, it could be considered a legitimate example of pointers to different structs having the same representation. But as it stands now, your example only has pointers to the same struct type, and pointers to these pointers. - Cham
@WouterHuysentruit I prefer answers that explain it all. If that becomes long, a ## tl;dr section makes sense. But remember that both questions and answers on SO are also supposed to be an example for others and help people with similar (even if not exact same) questions. - Usance
Fair enough, upvote for the effort. Keep it up :) - Douty
@n.m. if this worked it'd be a nicer form of realloc than the current one (reallocate the pointer in-place and use return value to check whether it succeeded or not) - currently it's a bit cumbersome to write code that recovers from realloc failure - Hudspeth

Background

The assumption that the standard implicitly requires all pointers to structure types (complete, incomplete, compatible, and incompatible) to have the same representation and alignment requirements dates back to C89, many years before the standard required it explicitly. The reasoning behind it was the compatibility of incomplete types in separate translation units. According to the C standards committee, the original intent was to allow an incomplete type to be compatible with its completed variation, but the actual words of the standard did not say so. This was amended in the second Technical Corrigendum to C89, which made the original assumption concrete.


Compatibility and Incomplete Types

Reading the rules on compatibility and incomplete types (thanks to Matt McNabb) gives further insight into the original C89 assumption.

Pointer derivation of object and incomplete types

C99 / C11 §6.2.5 p1:

Types are partitioned into object types, function types, and incomplete types.

C99 / C11 §6.2.5 p20:

A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type.

C99 / C11 §6.2.5 p22:

A structure or union type of unknown content is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or union tag with its defining content later in the same scope.

This means that pointers may be derived from both object types and incomplete types. The standard doesn't say outright that an incomplete type need not ever be completed, but the committee has responded on this matter in the past, stating that the lack of a prohibition is sufficient and that no positive statement is needed.

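In the following sample, 'p' is a pointer to the incomplete type 'struct never_completed', and that type is never completed:

#include <stdlib.h>

int main(void)
{
    struct never_completed *p;   /* the type is never completed */
    p = malloc(1024);
    free(p);                     /* free(NULL) is harmless */
}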

[Complete code sample]

Compatible types of separate translation units

C99 / C11 §6.7.2.3 p4:

All declarations of structure, union or enumerated types that have the same scope and use the same tag declare the same type.

C99 / C11 §6.2.7 p1:

Two types have compatible type if their types are the same. Two structure types declared in separate translation units are compatible if their tags (are) the same tag. [trimmed quote] [...]

This paragraph is highly significant. To summarize it: two structure types declared in separate translation units are compatible if they use the same tag, and if both of them are completed, their members have to be the same as well (according to the specified requirements).

Compatibility of pointers

C99 §6.7.5.1 p2 / C11 §6.7.6.1 p2:

For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.

If the standard mandates that two structure types are, under the specified conditions, compatible across separate translation units whether they are incomplete or complete, then the pointers derived from these structure types are compatible as well.

C99 / C11 §6.2.5 p20:

Any number of derived types can be constructed from the object, function, and incomplete types

These methods of constructing derived types can be applied recursively.

And because pointer derivation is recursive, pointers derived from pointers to compatible structure types are compatible with each other as well.

Representation of compatible types

C99 §6.2.5 p27 / C11 §6.2.5 p28:

pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements.

C99 / C11 §6.3 p2:

Conversion of an operand value to a compatible type causes no change to the value or the representation.

C99 / C11 §6.2.5 p26:

The qualified or unqualified versions of a type are distinct types that belong to the same type category and have the same representation and alignment requirements.

This means that a conforming implementation can't choose the representation and alignment requirements of pointers derived from structure types based on whether the type is incomplete or complete. A separate translation unit might hold a compatible type, which has to share the same representation and alignment requirements, so the implementation must make the same choice for both the incomplete and the complete variation of the same structure type.

The following pointer to pointer to incomplete 'struct complete_incomplete':

struct complete_incomplete **p;

Is compatible and shares the same representation and alignment requirements as the following pointer to pointer to complete 'struct complete_incomplete':

struct complete_incomplete { int i; } **p;
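
A small sketch of the same idea inside a single translation unit, assuming a C11 compiler: the pointer sizes are recorded while the structure type is still incomplete and compared again after it has been completed. Because both declarations name the same type, the sizes cannot differ. The enumeration constants and the function name are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

struct complete_incomplete;                     /* incomplete here */

/* Sizes observed while the structure type is still incomplete. */
enum {
    SIZE_BEFORE  = sizeof(struct complete_incomplete *),
    SIZE2_BEFORE = sizeof(struct complete_incomplete **)
};

struct complete_incomplete { int i; };          /* completed here */

static_assert(SIZE_BEFORE == sizeof(struct complete_incomplete *),
              "completion does not change the pointer type");

/* Returns 1 if completing the type left both pointer sizes unchanged. */
int sizes_survive_completion(void)
{
    return SIZE_BEFORE  == sizeof(struct complete_incomplete *)
        && SIZE2_BEFORE == sizeof(struct complete_incomplete **);
}
```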


C89 related

As for the premise concerning C89, defect report #059 of June 1993 asked:

Both sections do not explicitly require that an incomplete type eventually must be completed, nor do they explicitly allow incomplete types to remain incomplete for the whole compilation unit. Since this feature is of importance for the declaration of true opaque data types, it deserves clarification.

Considering mutual referential structures defined and implemented in different compilation units makes the idea of an opaque data type a natural extension of an incomplete data type.

The response of the committee was:

Opaque data types were considered, and endorsed, by the Committee when drafting the C Standard.
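
For readers unfamiliar with the term, an opaque data type can be sketched as follows. The 'counter' names are hypothetical, and the two halves would normally live in a header and in a separate implementation file; they are shown in one file only so that the sketch is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what a hypothetical header "counter.h" would expose --- */
struct counter;                          /* opaque: no members visible */
struct counter *counter_new(int start);
int  counter_next(struct counter *c);
void counter_free(struct counter *c);

/* --- what only the implementation file would contain --- */
struct counter { int value; };

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);

    if (c != NULL)
        c->value = start;
    return c;
}

/* Returns the current value and then advances the counter. */
int counter_next(struct counter *c)
{
    assert(c != NULL);
    return c->value++;
}

void counter_free(struct counter *c)
{
    free(c);
}
```

Client code that sees only the first half can hold and pass 'struct counter *' values without the type ever being completed in its translation unit.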


Compatibility versus Interchangeability

We have covered the representation and alignment requirements of recursively derived pointers to structure types. We now face a matter raised by a non-normative footnote, 'interchangeability':

C99 TC3 §6.2.5 p27 Footnote 39 / C11 §6.2.5 p28 Footnote 48:

The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

The standard says that the notes, footnotes, and examples are non-normative and are "for information only".

C99 FOREWORD p6 / C11 FOREWORD p8:

[...] this foreword, the introduction, notes, footnotes, and examples are also for information only.

It's unfortunate that this confusing footnote was never changed. At best, the footnote is specifically about the types named by the text that refers to it, so phrasing it as if the properties of "representation and alignment requirements" stand apart from those specific types makes it easy to read as a general rule for all types that share a representation and alignment. If the footnote is interpreted without the context of specific types, then the normative text of the standard clearly doesn't imply it, even without debating the interpretation of the term 'interchangeable'.

Compatibility of pointers to structure types

C99 / C11 §6.7.2.3 p4:

All declarations of structure, union or enumerated types that have the same scope and use the same tag declare the same type.

C99 / C11 §6.2.7 p1:

Two types have compatible type if their types are the same.

C99 §6.7.5.1 p2 / C11 §6.7.6.1 p2:

For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.

This leads to the obvious conclusion: different structure types are indeed different types, and because they are different, they are incompatible. Therefore two pointers to two different, incompatible types are incompatible as well, regardless of their representation and alignment requirements.
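
One way to see this incompatibility at compile time, sketched under the assumption of a C11 compiler: a generic selection may not list two compatible types among its associations, so the macro below is only accepted because 'struct s1 *' and 'struct s2 *' are incompatible. The macro and function names are hypothetical.

```c
#include <assert.h>

struct s1 { int i; };
struct s2 { int i; };      /* same members, different tag */

/* C11 6.5.1.1 p2 forbids two associations with compatible types, so
   this selection is valid only because the pointer types differ. */
#define WHICH(p) _Generic((p),   \
        struct s1 *: 1,          \
        struct s2 *: 2)

/* Encodes the selections for one pointer of each type as two digits. */
int distinct_types_demo(void)
{
    struct s1 *p1 = 0;
    struct s2 *p2 = 0;

    assert(WHICH(p1) != WHICH(p2));
    return WHICH(p1) * 10 + WHICH(p2);
}
```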

Effective types

C99 / C11 §6.5 p7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

a type compatible with the effective type of the object

C99 / C11 §6.5 p6:

The effective type of an object for an access to its stored value is the declared type of the object, if any.

Incompatible pointers are not 'interchangeable' as arguments to functions, nor as return values from functions. Implicit conversions and the specified special cases are the exceptions, and these types are not part of any such exception. Even if we add an unrealistic requirement to said 'interchangeability' and say that an explicit conversion is needed to make it applicable, accessing the stored value of an object through an lvalue whose type is incompatible with the object's effective type still breaks the effective type rules. Making it a reality would need a new property that the standard currently doesn't have. Sharing the same representation and alignment requirements, and being convertible, is therefore simply not enough.

This leaves us with being interchangeable 'as members of unions'. Although the pointers are indeed interchangeable as members of a union, this bears no special significance.
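
What interchangeability as union members looks like in practice can be sketched as below. This is an illustration only: it relies on the shared representation of pointers to structure types and on reading a union member other than the one last stored, and the union and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct s1 { int i; };
struct generic;                         /* incomplete, never completed */

union struct_ptr {
    struct s1      *as_s1;
    struct generic *as_generic;
};

/* Stores the allocation through the 'generic' member.
   Returns 0 on success and -1 on allocation failure. */
int allocate_via_union(union struct_ptr *u, size_t s)
{
    assert(u != NULL);
    u->as_generic = malloc(s);
    return u->as_generic == NULL ? -1 : 0;
}

/* Reads the pointer back through the other member and uses it. */
int union_demo(void)
{
    union struct_ptr u;
    int result;

    if (allocate_via_union(&u, sizeof *u.as_s1) != 0)
        return -1;
    u.as_s1->i = 7;          /* same bytes, viewed as 'struct s1 *' */
    result = u.as_s1->i;
    free(u.as_s1);
    return result;
}
```

Note that the caller has to declare the union itself, so every participating pointer type must be listed up front.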

Official interpretations

1. The first 'official' interpretation belongs to a member of the C standards committee. His reading of "are meant to imply interchangeability" is that the footnote doesn't state that such interchangeability exists, but rather suggests it.

As much as I would like it to become a reality, I wouldn't consider an implementation conforming if it followed a suggestion from a non-normative footnote, let alone an unreasonably vague one, while contradicting normative text. It follows that a program that depends on such a 'suggestion' is not strictly conforming.

2. The second 'official' interpretation belongs to a member of, or contributor to, the C standards committee. By his interpretation the footnote doesn't introduce a suggestion, and because the (normative) text of the standard doesn't imply it, he considers it a defect in the standard. He even suggested changing the effective type rules to address the matter.

3. The third 'official' interpretation is from defect report #070 of December 1993. It asked, within the context of C89, whether a program that passes an 'unsigned int' argument where the type 'int' is expected, to a function with a non-prototype declarator, introduces undefined behavior.

In C89 there's the very same footnote, with the same implied interchangeability as arguments to functions, attached to:

C89 §3.1.2.5 p2:

The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same.

The committee responded that they encourage implementors to allow this interchangeability to work, but since it's not a requirement, the program is not strictly conforming.


The following code sample is not strictly conforming. The type of '&s1' and the type 'struct generic **' share the same representation and alignment requirements, but they are nevertheless incompatible. According to the effective type rules, we are accessing the stored value of the object 's1' through an lvalue of an incompatible type, pointer to 'struct generic', while its declared type, and therefore its effective type, is pointer to 'struct s1'. To overcome this limitation we could have used the pointers as members of a union, but that convention defeats the goal of being generic.

int allocate_struct(void    *p,
                    size_t  s)
{
    struct generic **p2 = p;
    if ((*p2 = malloc(s)) == NULL)
        return -1;
    
    return 0;
}

int main(void)
{
    struct s1 { int i; } *s1;

    if (allocate_struct(&s1, sizeof *s1) != 0)
        return EXIT_FAILURE;
}

[Complete code sample]


The following code sample is strictly conforming. To overcome both issues, effective types and being generic, we take advantage of: 1. a pointer to void, 2. the representation and alignment requirements of all pointers to structs, and 3. accessing the pointer's byte representation 'generically', using memcpy to copy the representation without affecting its effective type.

int allocate_struct(void    *pv,
                    size_t  s)
{
    struct generic *pgs;

    if ((pgs = malloc(s)) == NULL)
        return -1;
    
    memcpy(pv, &pgs, sizeof pgs);
    return 0;
}

int main(void)
{
    struct s1 { int i; } *s1;

    if (allocate_struct(&s1, sizeof *s1) != 0)
        return EXIT_FAILURE;
}

[Complete code sample]
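
The same technique can be applied in the other direction. The following is a sketch of a hypothetical counterpart, 'release_struct', which copies the stored representation out with memcpy, frees the allocation, and copies a null pointer's representation back. It assumes, in line with the reasoning above, that a null pointer has the same representation in every pointer to structure type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counterpart to allocate_struct: pv must point to an
   object of some pointer-to-structure type. */
void release_struct(void *pv)
{
    struct generic *pgs;

    assert(pv != NULL);
    memcpy(&pgs, pv, sizeof pgs);   /* copy the stored representation out */
    free(pgs);
    pgs = NULL;
    memcpy(pv, &pgs, sizeof pgs);   /* copy a null representation back */
}

/* Returns 1 if the caller's pointer reads as null afterwards,
   or -1 if the initial allocation failed. */
int release_demo(void)
{
    struct s1 { int i; } *s1 = malloc(sizeof *s1);

    if (s1 == NULL)
        return -1;
    release_struct(&s1);
    return s1 == NULL;
}
```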


The Conclusion

The conclusion is that a conforming implementation must use the same representation and alignment requirements, respectively, for all recursively derived pointers to structure types, whether they are incomplete or complete, and whether they are compatible or incompatible. Whether the types are compatible or incompatible is significant, but the mere possibility of a compatible type means they must share the fundamental properties of representation and alignment. It would have been preferable if we could access pointers that share representation and alignment directly, but unfortunately the current effective type rules do not permit it.

Guerra answered 19/6, 2014 at 10:32 Comment(3)
Like we have array of integers, array of pointer etc, we can also have array of structure variables. And to make the use of array of structure variables efficient, we use pointers of structure type. We can also have pointer to a single structure variable, but it is mostly used with array of structure variables. - Panettone
The language defined by K&R made it possible to have a function accept an array of pointers to structures and operate upon the array elements without having to worry about what kind of structures they identified. A cast would be required when passing an array to such a function, but otherwise things would work. C89 did not require that implementations continue to support such ability, but since that ability was useful implementations supported it until someone decided the language would be made "better" by removing features which, while useful, weren't mandated by the Standard. - Astronomical
Would the version of allocate_struct that uses memcpy be guaranteed to work if the target were in allocated storage? By my reading of the Standard, following the memcpy the effective type of the storage holding the pointer in question would be struct generic*, - Astronomical

My answer is "no."

There is no wording in any standard of C that I'm aware of which suggests otherwise. The fact that all pointers to structure types have the same representation and alignment requirements has no bearing on any derived type.

This makes complete sense and any other reality would seem to be inconsistent. Consider the alternative:

Let's call the alignment and representation requirements for pointers to structure types "A". Suppose that any "recursively derived type" shares the requirements "A".

Let's call the alignment and representation requirements for pointers to union types "B". Suppose that any "recursively derived type" shares the requirements "B".

Let's suppose that "A" and "B" are not the same[1]. Furthermore, let's suppose that they cannot be satisfied at the same time. (A 4-byte representation and an 8-byte representation, for example.)

Now derive a type from both:

  1. A type with requirements "A"
  2. A type with requirements "B"

Now you have a type whose requirements are impossible to satisfy, because it must satisfy "A" and "B", but they cannot both be satisfied at once.

Perhaps you're thinking of derived types as having a flat lineage all the way back to a single ancestor, but that's not so. Derived types can have many ancestors. The standard definition of "derived types" discusses this.
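
As a small illustration of a derived type with more than one ancestor, here is a sketch assuming a C11 compiler. The type names are hypothetical. Both the structure type and the function type below are derived from a pointer to a structure type and a pointer to a union type at once.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

struct s;                      /* incomplete structure type */
union u;                       /* incomplete union type */

/* A structure type derived from both kinds of pointer. */
struct both {
    struct s *ps;
    union u  *pu;
};

/* A function type, and a pointer to it, derived from both as well. */
typedef struct s *mixed_fn(union u *);
typedef mixed_fn *mixed_fn_ptr;

/* Each member keeps its own requirements; the structure simply has
   to provide room for both of them. */
static_assert(sizeof(struct both) >= sizeof(struct s *) + sizeof(union u *),
              "room for both members");

int both_holds_each_member(void)
{
    return sizeof(struct both) >= sizeof(struct s *) + sizeof(union u *);
}
```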

[1] While it might seem unreasonable, unlikely and silly, it's allowed.

Wyon answered 24/12, 2015 at 6:15 Comment(2)
"Suppose that any "recursively derived type" shares the requirements "A"." - what's your justification for assuming that? A conforming implementation could have struct X ** with different size/rep to struct X *. - Hudspeth
M. M: Please reconsider what the word "suppose" means. The purpose of the thought-exercise was to argue against the possibility of such a requirement by demonstrating its inconsistency. We're on the same team. - Wyon
