What does this C syntax mean?
Asked Answered
C

6

6

This is from a 'magic' array library that I'm using.

void
sort(magic_list *l, int (*compare)(const void **a, const void **b))
{
    qsort(l->list, l->num_used, sizeof(void*),
         (int (*)(const void *,const void *))compare);
}

My question is: what on earth is the last argument to qsort doing?

(int (*)(const void *, const void*))compare) 

qsort takes int (*comp_fn)(const void *,const void *) as it's comparator argument, but this sort function takes a comparator with double pointers. Somehow, the line above converts the double pointer version to a single pointer version. Can someone help explain?

Claymore answered 8/9, 2011 at 20:44 Comment(3)
That C syntax means undefined behavior.Goodale
How the heck is someone supposed to make this work? If I were shown the prototype of that sort function and asked to write a compare function for it, I'd cast the arguments to int ** and double dereference them to get to the value, which would, in all likelihood, crash the program. Or give incorrect results.Permeance
something looks strange. the compare function might end up doing (**a > **b) but qsort will call compare with only pointers to elements. so it may dereference it one time too much. or maybe the elements in the array are pointers. and sort is sorting pointers. in that case a typedef would have been nice.Beetroot
J
2

On most hardware you can assume that pointers all look the same at the hardware level. For example, in a system with flat 64bit addressing pointers will always be a 64bit integer quantity. The same is true of pointers to pointers or pointers to pointers to pointers to pointers.

Therefore, whatever method is used to invoke a function with two pointers will work with any function that takes two pointers. The specific type of the pointers doesn't matter.

qsort treats pointers generically, as though each is opaque. So it doesn't know or care how they're dereferenced. It knows what order they're currently in and uses the compare argument to work out what order they should be in.

The library you're using presumably keeps lists of pointers to pointers about. It has a compare function that can compare two pointers to pointers. So it casts that across to pass to qsort. It's just syntactically nicer than, e.g.

qsort(l->list, l->num_used, sizeof(void*), compare);

/* elsewhere */

int compare(const void *ptr1, const void *ptr2)
{
    // these are really pointers to pointers, so cast them across
    const void **real_ptr1 = (const void **)ptr1;
    const void **real_ptr2 = (const void **)ptr2;

    // do whatever with real_ptr1 and 2 here, e.g.
    return (*real_ptr2)->sort_key - (*real_ptr1)->sort_key;
}
Josiah answered 8/9, 2011 at 20:53 Comment(9)
There are lots of things that happen to "look the same at hardware level" at some specific platform. In general, however, this is not true. On top of that, the language does not care about things looking "the same at hardware level". Language clearly states that this is a hack leading to undefined behavior.Function
The language specification doesn't state anything of the sort — it's not the place of language specifications to make value judgments about code. If the C specification clearly stated that was "a hack leading to undefined behaviour" then the C specification would explicitly rule out the possibility of anyone creating a strict superset of the language.Josiah
"it's not the place of language specifications to make value judgments about code" Yes, it is! Language specifications are designed to divide language implementations (and the specific details of those implementations) into two categories: conformant (or "good") and non-conformant (or "bad"). Non-conformant code is sometimes useful (for when you can rely on that particular implementation), but in general is bad, and should not be recommended. Anyway, -1 for "The specific type of the pointers doesn't matter." It clearly does.Lymanlymann
No, a language specification defines somethings and leaves other things undefined, either implicitly or explicitly. That allows, for example, Objective-C to be a strict superset of C — it follows all the C rules and adds some new ideas where C leaves things undefined. But per your claim, that means that every time someone writes an Objective-C method call, that's a hack per the C99 specification rather than simply a concept it doesn't define.Josiah
@Josiah - It means it's outside of the C language specification. The fact that Objective-C is a strict superset of C (or that C++ is almost one) doesn't make Objective-C (or C++) code any more applicable to C. You wouldn't put a few lines of Perl in the middle of a Java application and expect any implementation to support that behavior.Lymanlymann
@Tommy: Yes, the language specification does state that explicitly. I provided the link to the exact place in the standard where it is stated. I don't understand the "superset" comment. Quite the opposite, it is the freedom provided by "undefined", "unspecified" and "implementation-defined" behaviors in C that allows one to create C superset languages in the first place. A superset is perfectly free to "define the undefined" so to say. Defined behavior is just one possible form of undefined behavior.Function
So you should have said: "Language clearly states that this leads to undefined behaviour", not "Language clearly states that this is a hack [...]". The latter would explicitly preclude strict supersets of the language — a non-trivial distinction that should be appreciated by someone precise enough to obey the tight wording of language specifications. Please feel free to have the last word since I'll otherwise just be repeating myself. I think my answer is correct to the question set, though it's a case of "the library code quoted assumes this undefined behaviour is reliable for this reason"Josiah
@Tommy: A hack takes place when someone relies on a specific manifestation of undefined behavior in a C program. So, that is a folded form of two-part statement: 1. Language says it's UB. 2 Relying on specific form of UB in the code is a hack. That way within the context on the original code one can draw the line from "language" to "hack". That's what I meant, no more no less.Function
@AndreyT: I agree with your two-stage test. Your original statement collapsed it to a one-stage test, where the logic in step 2 is written in the specification relied upon in step 1. That was overreaching. To put it another way: I agree with your point (2), but the C specification doesn't require me to.Josiah
F
8

That's exactly what the cast you quoted does: it converts a pointer of type

int (*)(const void **, const void **)

to a pointer of type

int (*)(const void *, const void *)

The latter is what is expected by qsort.

Thing like this are encountered rather often in bad quality code. For example, when someone wants to sort an array of ints, they often write a comparison function that accepts pointers to int *

int compare_ints(const int *a, const int *b) {
  return (*a > *b) - (*a < *b);
}

and when the time comes to actually call qsort they forcefully cast it to the proper type to suppress the compiler's complaints

qsort(array, n, sizeof *array, (int (*)(const void *,const void *)) compare_ints);

This is a "hack", which leads to undefined behavior. It is, obviously, a bad practice. What you see in your example is just a less direct version of the same "hack".

The proper approach in such cases would be to declare the comparison function as

int compare_ints(const void *a, const void *b) {
  int a = *(const int *) a;
  int b = *(const int *) b;
  return (a > b) - (a < b);
}

and then use it without any casts

qsort(array, n, sizeof *array, compare_ints);

In general, if one expects their comparison functions to be used as comparators in qsort (and similar functions), one should implemnent them with const void * parameters.

Function answered 8/9, 2011 at 20:52 Comment(28)
Can you comment a bit more on why that gives undefined behaviour? It sounds like there may be a subtlety here that I've missed and which renders answers including mine inaccurate.Josiah
That's not undefined behavior in C, is it? And is it in C++? I'm interested, would you shed some light on it?Morey
@K-ballo: That is undefined behavior in C. Type int(const void*, const void*) is not compatible with type int(const int*, const int*), which means that calling the function through a forcefully converted pointer leads to undefined behavior.Function
@Tommy: Converting one function pointer type to another and then calling the function through the converted value is only allowed when the function types are compatible. The full definition of compatibility is given in 6.7.5.3/15. And the types above is not compatible, since const int * is not compatible with const void *.Function
IMHO there is nothing wrong with it. The program wants to sort an array of pointers. The cast could have been put inside the callback function: (a void*) can be cast to any pointer, including (void**). This is more-or-less a member function, containing the glue logic (including casts) between the std library and the stuff that is inside the magic_list thing.Beerbohm
@wildplasser: I think AndreyT is making the point that whether casting any pointer to/from void * is permissible is irrelevant (though it's probably what leads to the confusion) — what's happening here is that the code is implicitly assuming a pointer cast is a no-op, which the specification doesn't mandate.Josiah
@wildplasser: There's no casting of void * to void ** in the original code. The original code casts int (*)(const int*, const int*) to int (*)(const void*, const void*) which is an entirely different thing, not even remotely related to "casting void * to void **".Function
@Tommy: Not really. If we ignore the "undefined behavior" for a second, the cast of int (*)(const int*, const int*) to int (*)(const void*, const void*) eventually boils down to memory reinterpretation of const void * object as const int * object. Memory reinterpretation is not a cast. Casting int * to void * (and vice versa) is allowed by the language. However, memory-reinterpreting void * as int * is not allowed by the language. It will "work" in this case because physical representations of these pointers are the same, but the language does not guarantee it.Function
@AndreyT: wildplasser clearly refers to the original question, not to your example, as you're not sorting an array of pointers. The original program seems to assume that where you'd properly have to do a void * to void ** cast to call the custom compare function, a no-op will do the same job.Josiah
qsort is a nasty beast. It accepts one base pointer, an element size, and the count of elements plus a compare function needed to compare two elements, passed to it by pointers. There is a contract between the caller and qsort, that the compare function actually takes the elements of the given type. (we've all been bitten by that ;-) Given the (void*) pointers, the casting is never necessary, but it will become necessairy if some dereferencing has to take place. It is a programmer's choice to put the dereferencing at the right place. But it basically still is a matter of taste or style, IMHO.Beerbohm
@Tommy: Well, that doesn't change much in principle. That would simply mean that we are talking about int (*)(const void **, const void**) to int (*)(const void*, const void*) cast. The rest is the same. The language says it is UB.Function
@Tommy: indeed I was referring to the OP. What else can I do, I am not allowed to make stylistic remarks and pose them as answers ...Beerbohm
@wildplasser: The key point here is that when you implement it as shown in my example, you transform the void * pointer to the specific T * pointer (for some type T) by using a conversion (with or without a cast - does not matter). But when you cast the function type as in the original code, the conversion of function parameters does not take place anymore. The function parameters are reinterpreted in the original code. There's a world of difference between a conversion and reinterpretation.Function
@wildplasser: For example, you can do this float f = 5; int i = f; and the language guarantees that i will receive the value of 5. This is a conversion at work. Now, you can also do that float f = 5; int i = *(int *) &f;. This is memory reinterpretation. I hope I don't need to explain to you that in this latter case i is very unlikely to become 5 :) This is exactly what the "hack" with the function pointer cast leads to. It performs memory reinterpretation of void * parameter value as void ** parameter.Function
@wildplasser: This will "work" in practice, since the physical representations and parameter frame memory layout are usually the same, but it is still a "hack" from the language point of view. From the language point of view doing this is exactly the same as doing float f = 5; int i = *(int *) &f; and expecting i to become 5. So no, it is not a "matter of personal preference". From the language point of view the original code is a hack, there can't be any debate about it. The personal preferences come into the picture only when you decide how "hackish" you want to get.Function
There is nothing wrong with it, it is just a bit clumsy. The actual compare function takes (* * thing) as arguments, and the cast from (void * *) to (void *) posted here is just here to silence the compiler. There still is some unchecked, invisible coordination needed between the actual arguments given to the compare function and what it expects.Beerbohm
No, sorry I don't chat. I comment here because it is about this topic, dammit!Beerbohm
There is no conversion involved. It is only about the callbacks's signature. and a (void * ) pointer is convertable to any pointer type. (as long as the caller and callee agree, what I called the "contract" of the qsort interface)Beerbohm
@wildplasser: Er.. I don't see what you are talking about. What "cast (void **) to (void *) posted here" are you talking about? As for the original function pointer cast, again, the fact that the behavior is undefined (i.e. there's a lot wrong with it) is really a hard fact explicitly spelled out in the language specification. It's not really debatable. What "There is no conversion involved... a (void * ) pointer is convertable to any pointer type..." is supposed to mean is not clear to me. It sounds self-contradictory. What good is "convertable" if "there is no conversion involved"? :)Function
So it winds down to that you think a function pointer for a function taking two (void * * ) cannot (legally) be cast to a function pointer taking two (void *) arguments. I think it can. We differ.Beerbohm
@wildplasser: Firstly, to be precise, it is not casing per se that causes the problem. It is an attempt to actually call the function through a casted pointer that leads to undefined behavior (i.e. it is illegal in that sense). Secondly, we don't get to "differ" in this case, since it is not a matter of opinion and not what I "think". What I'm saying is just some fairly basic facts from the language specification. You don't get to "differ" with facts from the language specification. You appeared simply not to know these facts. Now you do (or so I hope).Function
@wildplasser: The comments under the OP also offer a link to a good extensive explanation of the issue #560081Function
Folks can you move this to a chat room please. Comments are not intended for extended discussion.Aleutian
IMO the arguments (void *) and (void * *) are comparable. (void *) and (some_type *) are not WRT function pointer arguments. @Kev: no. This discussion belongs here.Beerbohm
@wildplasser: The term is compatible. And no, void * is not compatible with void **.Function
@Beerbohm - nope it doesn't, it now belongs on chat because that's what you are now doing. Chatting.Aleutian
Sorry, I hit the return. Bothered by the CMS, and the lognazis.Beerbohm
Sorry folks but I'm going to step in and lock this post now. You've both got plenty rep to carry this on in one of our nice cosy chat rooms. Thanks.Aleutian
M
3

The last argument to qsort is casting a function pointer taking double pointers, to one taking single pointers that qsort will accept. It's simply a cast.

Morey answered 8/9, 2011 at 20:49 Comment(5)
Seems odd, though; why not cast it in sort's declaration?Tricornered
@Tom because the one in sort's declaration allows you to change the pointers so you can use them in the cast, that's why they're double pointersAorangi
@Tom To cast it in sort's declaration would burden the user of the function with the warning about the cast. Tony, good point, I'd overlooked the (in)convenience of just dereferencing with &Saks
I would guess that cast has to do with the fact that sort is using qsort as an implementation detail. If it were to use the single pointer version in the declaration of sort, then its consumers would have to be the ones doing the cast when calling it.Morey
@Tom_Zych, because the caller may be using a different type of args(the current arg type in sort()) which if directly passed to qsort() will generate a warning/error. which would mean fix every caller with a cast. that means ugly code. maybe the compare() function is part of legacy code. or too many callers.Beetroot
J
2

On most hardware you can assume that pointers all look the same at the hardware level. For example, in a system with flat 64bit addressing pointers will always be a 64bit integer quantity. The same is true of pointers to pointers or pointers to pointers to pointers to pointers.

Therefore, whatever method is used to invoke a function with two pointers will work with any function that takes two pointers. The specific type of the pointers doesn't matter.

qsort treats pointers generically, as though each is opaque. So it doesn't know or care how they're dereferenced. It knows what order they're currently in and uses the compare argument to work out what order they should be in.

The library you're using presumably keeps lists of pointers to pointers about. It has a compare function that can compare two pointers to pointers. So it casts that across to pass to qsort. It's just syntactically nicer than, e.g.

qsort(l->list, l->num_used, sizeof(void*), compare);

/* elsewhere */

int compare(const void *ptr1, const void *ptr2)
{
    // these are really pointers to pointers, so cast them across
    const void **real_ptr1 = (const void **)ptr1;
    const void **real_ptr2 = (const void **)ptr2;

    // do whatever with real_ptr1 and 2 here, e.g.
    return (*real_ptr2)->sort_key - (*real_ptr1)->sort_key;
}
Josiah answered 8/9, 2011 at 20:53 Comment(9)
There are lots of things that happen to "look the same at hardware level" at some specific platform. In general, however, this is not true. On top of that, the language does not care about things looking "the same at hardware level". Language clearly states that this is a hack leading to undefined behavior.Function
The language specification doesn't state anything of the sort — it's not the place of language specifications to make value judgments about code. If the C specification clearly stated that was "a hack leading to undefined behaviour" then the C specification would explicitly rule out the possibility of anyone creating a strict superset of the language.Josiah
"it's not the place of language specifications to make value judgments about code" Yes, it is! Language specifications are designed to divide language implementations (and the specific details of those implementations) into two categories: conformant (or "good") and non-conformant (or "bad"). Non-conformant code is sometimes useful (for when you can rely on that particular implementation), but in general is bad, and should not be recommended. Anyway, -1 for "The specific type of the pointers doesn't matter." It clearly does.Lymanlymann
No, a language specification defines somethings and leaves other things undefined, either implicitly or explicitly. That allows, for example, Objective-C to be a strict superset of C — it follows all the C rules and adds some new ideas where C leaves things undefined. But per your claim, that means that every time someone writes an Objective-C method call, that's a hack per the C99 specification rather than simply a concept it doesn't define.Josiah
@Josiah - It means it's outside of the C language specification. The fact that Objective-C is a strict superset of C (or that C++ is almost one) doesn't make Objective-C (or C++) code any more applicable to C. You wouldn't put a few lines of Perl in the middle of a Java application and expect any implementation to support that behavior.Lymanlymann
@Tommy: Yes, the language specification does state that explicitly. I provided the link to the exact place in the standard where it is stated. I don't understand the "superset" comment. Quite the opposite, it is the freedom provided by "undefined", "unspecified" and "implementation-defined" behaviors in C that allows one to create C superset languages in the first place. A superset is perfectly free to "define the undefined" so to say. Defined behavior is just one possible form of undefined behavior.Function
So you should have said: "Language clearly states that this leads to undefined behaviour", not "Language clearly states that this is a hack [...]". The latter would explicitly preclude strict supersets of the language — a non-trivial distinction that should be appreciated by someone precise enough to obey the tight wording of language specifications. Please feel free to have the last word since I'll otherwise just be repeating myself. I think my answer is correct to the question set, though it's a case of "the library code quoted assumes this undefined behaviour is reliable for this reason"Josiah
@Tommy: A hack takes place when someone relies on a specific manifestation of undefined behavior in a C program. So, that is a folded form of two-part statement: 1. Language says it's UB. 2 Relying on specific form of UB in the code is a hack. That way within the context on the original code one can draw the line from "language" to "hack". That's what I meant, no more no less.Function
@AndreyT: I agree with your two-stage test. Your original statement collapsed it to a one-stage test, where the logic in step 2 is written in the specification relied upon in step 1. That was overreaching. To put it another way: I agree with your point (2), but the C specification doesn't require me to.Josiah
O
1

It is casting a function pointer. I imagine that the reason is so that compare can be applied to the pointers that are dereferenced rather than whatever they are pointing to.

Opportuna answered 8/9, 2011 at 20:51 Comment(0)
A
0

(int (*)(const void *,const void *))compare is a C style cast to cast the function pointer compare to a function pointer with two const void * args.

Aorangi answered 8/9, 2011 at 20:51 Comment(0)
S
0

The last argument is a function pointer. It specifies that it takes a pointer to a function that returns an int and takes two const void ** arguments.

Snout answered 8/9, 2011 at 20:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.