About qsort() in C, difference between ** buf and buf[][]
Asked Answered
P

2

1

When I use qsort() in the C on my Mac, these code works well, It can sort every lines in one file well.

int compare(const void *p, const void *q) {
    return strcmp(p,q);
}

void function_name(){
        char buf[1024][1024];
        int i=0;
        FILE * fp;
        if(!(fp=fopen(filename,"r"))){
            perror("Open error!");
            exit(0);
        }
        while(fgets(buf[i],1024,fp)){
            //printf("%s",buf[i]);
            i++;
        }
        qsort(buf, i, sizeof(buf[0]), compare);
    }

However, when I use malloc to assign the space, it sucks. Why is that? The code below shows I use malloc to make one two-dimension array. I print the content of buf before and after. It seems that lose all the information.

    int i=0;
    FILE * fp;
    char ** buf;
    buf = (char **)malloc(sizeof(char*)*1024);
    if(!(fp=fopen(filename,"r"))){
        perror("Open error!");
        exit(0);
    }
    buf[0]=(char *)malloc(sizeof(char)*1024);
    while(fgets(buf[i],1024,fp)){
        i++;
        buf[i]=(char *)malloc(sizeof(char)*1024);
    }
    for(int j=0;j<i;j++){
        printf("%s",buf[j]);
    }
    printf("hehe%ld\n",sizeof(char)*1024);

    qsort(buf, i, sizeof(char)*1024, compare);

    printf("hehe\n");
    for(int j=0;j<i;j++){
        printf("%s",buf[j]);
    }

output:

a
A
b
c
d
D
C

E
e
B
d
e
f
a
hehe1024
hehe
(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)(null)

Most important, how to fix my malloc version?

Pelvic answered 7/9, 2016 at 2:29 Comment(2)
Fundamental char buf[][] is an array of arrays of type char, char **buf is a pointer to a pointer to type char.(the values are not guaranteed to be contiguous in memory)Warder
so, I can't use malloc here?Pelvic
W
1

When you attempt to qsort buf declared as char **buf, you are using the wrong comparison function. As I noted in the comment, char buf[x][y] is an x array of character arrays of y chars, e.g. char (*)[y] when passed as a parameter. When you declare buf as char **buf;, you declare a pointer to a pointer to type char. In either case, you have a pointer to string, and you must dereference each value passed to qsort one additional level of indirection., e.g.

int cmpstrings (const void *a, const void *b) {
    return strcmp (*(char * const *)a, *(char * const *)b);
}

A short example using char **buf; would be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort string comparison - for pointer to pointer to char */
int cmpstrings (const void *a, const void *b) {
    return strcmp (*(char * const *)a, *(char * const *)b);
}

int main (void) {

    char *ap[] = { "This is a tale",
                    "of captian Jack Sparrow",
                    "a pirate so brave",
                    "on the seven seas." },
        **buf = NULL;
    int n = sizeof ap/sizeof *ap;    

    if (!(buf = malloc (n * sizeof *buf))) { /* allocate pointers */
        fprintf (stderr, "error: virtual memory exhausted.\n");
        return 1;
    }

    for (int i = 0; i < n; i++)
        buf[i] = strdup (ap[i]);    /* allocate/copy strings */

    qsort (buf, n, sizeof *buf, cmpstrings);

    for (int i = 0; i < n; i++) {   /* print and free */
        printf ("buf[%d] : %s\n", i, buf[i]);
        free (buf[i]);
    }
    free (buf);

    return 0;
}

Example

$ ./bin/qsortptp
buf[0] : This is a tale
buf[1] : a pirate so brave
buf[2] : of captian Jack Sparrow
buf[3] : on the seven seas.
Warder answered 7/9, 2016 at 2:57 Comment(2)
note: it'd also be possible to pass ap directly to qsort. buf is unnecessary but presumably provided to be illustrative to OPBatho
Yes, that's how I would do it, but I wanted to fully provide an example of allocating and sorting on the pointer to pointer to char. Good point.Warder
D
1

qsort, memcpy and similar functions assumes that the pointer passed points at an array. The very definition of an array is x number of items allocated contiguously, in adjacent memory cells. For example char array [x][y];.

There is wide-spread confusion about the use of pointer-to-pointers. There's an old trick you can use to declare a look-up table, where each item points to an array of variable length, which goes like this:

char** ptr = malloc(x * sizeof(char*));
for(int i=0; i<x; i++)
{
  ptr[i] = malloc(y * sizeof(char));
}

This allows you to access the look-up table with array-like syntax, ptr[i][j]. But that does not mean that this is an array, there is no relation whatsoever beween a pointer-to-pointer and a 2D array! This is rather x number of segments, where each segment allocated at any place on the heap.

The only reason why you would ever use the above pointer-to-pointer trick is when you must have variable length for each item in the look-up table. If you don't need that, then the above method is plain bad and always incorrect - it is much slower than a real array, both in terms of allocation overhead, heap fragmentation and poor data cache use. And as you discovered, it can't even be used as an array... because it isn't one.

Now there are unfortunately numerous incompetent would-be gurus all over the world incorrectly teaching the above as the way to allocate a multi-dimensional array dynamically. Which is complete nonsense. It is not a multi-dimensional array and can't be used as one. See this for an explanation of how you actually should allocate multi-dimensional arrays dynamically.

Drainage answered 7/9, 2016 at 7:38 Comment(1)
Well, I'd point out that qsort swapping items of 1024 bytes is going to be slower on any practical architecture than swapping pointers...Butcher

© 2022 - 2024 — McMap. All rights reserved.