Using qsort for character array in C
Asked Answered
H

3

9

I'm trying to use qsort to sort a character array. I can't see why this is not working. I have a pointer to the compare function as the man pages specifies. Can someone please tell me what's wrong? Thanks. My code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int cmpfunc( const void *a, const void *b) {
  return *(int*)a - *(int*)b;
}

void AlphabetSoup( char str[] ) {
  qsort(str, (size_t) strlen(str), (size_t) sizeof(char), cmpfunc);
  printf("%s\n", str);
}


int main() {
  char str1[] = "bcead";

  AlphabetSoup(str1);

  return 0;
}

outputs: dabce when I expect abcde.

Huskamp answered 18/4, 2014 at 4:52 Comment(0)
D
9

Simple error.

Use char* instead of int* in cmpfunc.

int cmpfunc( const void *a, const void *b) {
  return *(char*)a - *(char*)b;
}

When you use int*, instead of char*, the address pointed to by a is interpreted as an address to an int, not a char.

Your input has the following characters:

+---+---+---+---+---+
| b | c | e | a | d |
+---+---+---+---+---+

In hex, those are:

+----+----+----+----+----+
| 62 | 63 | 65 | 61 | 64 |
+----+----+----+----+----+
^    ^
|    |
a    b

If you treat the addresses pointed to a and b as int*, assuming an int takes 4 bytes in your system, *(int*)a can be either

0X62*2^24 + 0X63*2^16 + 0X65*2^8 + 0X61

or

0X62 + 0X63*2^8 + 0X65*2^16 + 0X61*2^24

depending on whether you have big endian system or a little endian system.

You can similarly compute what *(int*)b would be. As you can see, you are already running into unexpected number comparisons. By the time you start comparing the values that are at other byte locations of your input, you are also using memory that you are not supposed to use, and you are reaching the realms of undefined behavior.

Desimone answered 18/4, 2014 at 4:59 Comment(0)
S
3

You have at least two problems here.

First, you are trying to sort the contents of a statically defined literal, which the compiler is free to store in immutable ram.

Second, and most importantly, you are casting the void* in your comparison function to an int*. Assuming sizeof(int) == 4, and sizeof(char) == 1, you are effectively comparing characters 0-3 "as an integer" to characters 1-4 "as an integer".

In the case of sizeof(int) = 8 (i.e., 64 bit compilers), then you would be doing even worse. Cast the void* to type char* and you should be okay.

Semang answered 18/4, 2014 at 5:0 Comment(1)
str1 is initialized with "bcead\0" and has size 6. It is not in "immutable ram". Right about the int part.Troubadour
C
1

The problem is with the type cast operator in the the compare function comfunc.

int cmpfunc(const void *a, const void *b) {
  // error. casting to int * instead of char *
  return *(int*)a - *(int*)b; 
}

Casting the void pointer a to int * and then dereferencing it means it will read the sizeof(int) bytes from the start of the address contained in a. So the expression in the return statement is comparing the sizeof(int) number of bytes from the address in a with the sizeof(int) number of bytes from the address in b, instead of comparing the characters at the addresses contained in the pointers a and b. To illustrate this, I changed the compare function to

int cmpfunc(const void *a, const void *b) {
  printf("comparing %c and %c\n", *((char *)a), *((char *)b));
  printf("compare as int %d - %d = %d\n", *(int *)a, *(int *)b, *(int *)a - *(int *)b);
  printf("compare as char %d - %d = %d\n", *(char *)a, *(char *)b, *(char *)a - *(char *)b);
  return *(char *)a - *(char *)b;
}

And this is the output I get

comparing b and c
compare as int 1634034530 - 1684104547 = -50070017
compare as char 98 - 99 = -1
comparing a and d
compare as int 25697 - 100 = 25597
compare as char 97 - 100 = -3
comparing e and a
compare as int 6578533 - 25697 = 6552836

You can see the difference in the values read when comparison is done after typecasting to int *, and after typecasting to char *. The compare function should be changed to

int cmpfunc(const void *a, const void *b) {
      // typecast the void pointers to correct type
      return *(char *)a - *(char *)b; 
}

Also, you don't need to cast the result of strlen function and sizeof operator as they already return values of type size_t. Also, it is more readable and maintainable to use sizeof on the array element. You should simply call qsort as

qsort(str, strlen(str), sizeof str[0], cmpfunc);
Conclave answered 18/4, 2014 at 5:9 Comment(1)
Thank you for providing the solution to the problem, not just information about why (which is also helpful of course).Willson

© 2022 - 2024 — McMap. All rights reserved.