I have not been able to get locale-dependent functions such as strcoll() to work in C. I am wondering whether I am doing something wrong and/or how to get this to work. Here is a sample program from this book: Prinz, Peter, and Tony Crawford. 2016. C in a Nutshell, 2nd edn., p. 574. Beijing-Boston-Farnham-Sebastopol-Tokyo: O'Reilly. ISBN-13: 978-1-491-90475-6.
#include <stdio.h>
#include <string.h>
#include <locale.h>

int main(void) {
    char *samples[] = { "curso", "churro" };

    setlocale(LC_COLLATE, "es_ES.UTF-8");
    int result = strcoll(samples[0], samples[1]);

    if (result == 0) {
        printf("The strings \"%s\" and \"%s\" are "
               "alphabetically equivalent.\n",
               samples[0], samples[1]);
    } else if (result < 0) {
        printf("The string \"%s\" comes before \"%s\" "
               "alphabetically.\n",
               samples[0], samples[1]);
    } else if (result > 0) {
        printf("The string \"%s\" comes after \"%s\" "
               "alphabetically.\n",
               samples[0], samples[1]);
    }
    return(0);
}
The book says that "curso" should come BEFORE "churro", because in Spanish "ch" is considered a separate letter for purposes of alphabetization. However, when I run this program it prints that "curso" comes AFTER "churro". I do not know Spanish, but I have tested this program with several other languages that I do know, and the result is always that of strcmp(), a strictly numerical comparison.
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
$ locale -a | grep es_ES.utf8
es_ES.utf8
I am aware of this question: "Getting locale functions to work in glibc". Its author says that locale-dependent functions such as strcoll() perform poorly in glibc, and that he was writing his own modifications of it.
Am I missing something? Does this simply not work?
Comments:

[...] 'The string "curso" comes before "churro" alphabetically.' but on Ubuntu it says 'The string "curso" comes after "churro" alphabetically.' I made sure to have es_ES.UTF-8 installed on both systems. – Sheugh

[Check the return value of] setlocale(). If it returns NULL, it means that "es_ES.UTF-8" was not honored, and the locale is left unchanged. – Kaplan

ch has not been considered a single letter since 1994. See rae.es/dpd/abecedario: "at the X Congress of the Association of Academies of the Spanish Language, held in 1994, it was agreed to adopt the universal Latin alphabetical order, in which ch and ll are not considered independent letters" (translated from Spanish). – Snowmobile

[The standard collation does not consider] ch and ll special, while the traditional one does. Linux/glibc implements the standard collation. You can check that your locale collation is working with accented characters. – Snowmobile