Problem trying to use the C qsort function
Asked Answered
M

3

14
#include <stdio.h>
#include <stdlib.h>

float values[] = { 4, 1, 10, 9, 2, 5, -1, -9, -2,10000,-0.05,-3,-1.1 };

int compare (const void * a, const void * b)
{
    return ( (int) (*(float*)a - *(float*)b) );
}

int main ()
{

    int i;

    qsort (values, 13, sizeof(float), compare);

    for (i = 0; i < 13; i++)
    {
        printf ("%f ",values[ i ]);
    }
    putchar('\n');

    return 0;
}

The result is:

-9.000000 -3.000000 -2.000000 -1.000000 -1.100000 -0.050000 1.000000 2.000000 4.000000 5.000000 9.000000 10.000000 10000.000000

It's wrong because the order of -1 and -1.1 is changed. I believe it is happening because my "compare" function.

How can I fix this?

Thanks

Merilyn answered 7/10, 2010 at 22:47 Comment(1)
qsort works fine. Your call to qsort is broken.Recess
A
43

Your comparison function is broken. It says, for example, that -1.0 is equal (equivalent) to -1.1, since (int) ((-1.0) - (-1.1)) is zero. In other words, you yourself told qsort that the relative order of -1.0 and -1.1 does not matter. Why are you surprised that in the resultant ordering these values are not sorted?

In general, you should avoid comparing numerical values by subtracting one from another. It just doesn't work. For floating-point types it might produce imprecise results for quite a few different reasons, one of which you just observed yourself. For integer types it might overflow.

The generic idiom for comparing two numerical values a and b for qsort looks as (a > b) - (a < b). Remember it and use it. In your case that would be

int compare (const void * a, const void * b)
{
  float fa = *(const float*) a;
  float fb = *(const float*) b;
  return (fa > fb) - (fa < fb);
}

In C code it might make perfect sense to define a macro

#define COMPARE(a, b) (((a) > (b)) - ((a) < (b)))

and use it instead of spelling out the comparisons explicitly.

Aurore answered 7/10, 2010 at 22:56 Comment(9)
+1 There needs to be more plus ones and this needs to be accepted as the answer.Spiccato
return (fa > fb) - (fa < fb) is elegant, yet return (fa < fb) ? -1 : (fa > fb); may be faster. YMMV.Valonia
@chux: Why would it be faster?Aurore
In many embedded and other environments, FP operations are performance expensive. The (fa < fb) ? -1 : (fa > fb) approach performs 1 or 2 FP compares and (fa > fb) - (fa < fb) always performs 2. Given random order: 1.5 compares is faster than 2 on an operation that may dominate qsort(). Optimization, etc. may affect how well the each approach works. A good compiler may be able to only do 1 compare either way, identifying it can re-use the compare results. But doubtful (fa < fb) ? -1 : (fa > fb) would ever be slower.Valonia
@chux: Only a low quality compiler would perform two comparisons when doing (fa > fb) - (fa < fb). Most CPUs compare values by using a CPU instruction that sets certain CPU state flags. These state flags fully describe the result of the comparison. A single fa vs. fb comparison generates the flags that cover all relational comparisons between fa and fb. I.e. one comparison immediately gives you the answer to both fa > fb and fa < fb. All that is needed is to extract these results from the CPU flags and perform the subtraction.Aurore
@chux: Your (fa < fb) ? -1 : (fa > fb) is potentially branching. The fact that it is branching suggests that it might end up being much slower.Aurore
@AndreyT Glad you agree there could be a difference and that on some compilers two comparisons (vs. 1.5) are done. Concerning some "low quality compiler would perform two comparisons when doing", that I covered before in "A good compiler may be able to only do 1 compare either way". It depends on the platform, complier, data set, etc. as to which will be faster. Concerning "branching ... being much slower": it would be interesting to see a situation when the qsort() took 1/3 longer.Valonia
@AnT See my answer for an automated way to find such bugs.Mixon
why use float fa = *(const float*) a; instead of float fa = *(float*) a; ?Gerhard
J
1

By rounding the difference to the integer you lose the precision.

EDIT:

Modify the compare function to

return (*(float*)a >= *(float*)b) ? 1 : -1;

Edit for AndreyT: I don't think that returning only 1 or -1 will cause an infinite loop or incorrect ordering (it will just exchange equal values that didn't require it).

Having an explicit case for returning 0 will cost an additional float compatation, and they are rarely equal. So, the comparation for equallity could be omitted if the collision rate in the input data is small.

Jetport answered 7/10, 2010 at 22:51 Comment(3)
Will not work. This function will return -1 for equal values, meaning that for equal a and b comparing a to b will say that a < b, yet comparing b to a will say that b < a. qsort will not work correctly with such comparison function.Aurore
Yor edit didn't change anything, except that now equal values will always return 1. Standard qsort is designed for a comparator that is a tri-value function. It is not generally possible to reduce it to a two-value function, regardless of what you do. You have to return -1, 0, +1.Aurore
It is not unusual for a debug implementation of qsort to check the correctness of the comparison function. If your comparison function will return 1 for (a, b) comparison and at the same time return 1 for (b, a) comparison, such debug qsort implementation will typically abort immediately with an asserion failure. The non-debug implementation will simply produce undefined behavior.Aurore
M
0

To add to existing answer by @AnT, you can automatically verify your qsort callback via SortChecker:

$ LD_PRELOAD=$HOME/sortcheck-master/bin/libsortcheck.so ./a.out
a.out[7133]: qsort: comparison function is not transitive (comparison function 0x4005cd (/home/iuriig/a.out+0x4005cd), called from 0x400614 (/home/iuriig/a.out+0x400614), cmdline is "./a.out")
-9.000000 -3.000000 -2.000000 -1.000000 -1.100000 -0.050000 1.000000 2.000000 4.000000 5.000000 9.000000 10.000000 10000.000000

This warning says that compare reports x < y, y < z and not x < z for some inputs. To further debug this issue, run with

export SORTCHECK_OPTIONS=raise=1

and examine generated codedump.

Mixon answered 26/4, 2018 at 9:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.