Locale environment variables: difference between C and C.UTF-8

What is the difference between locales whose names end in UTF-8 and those that don't? In particular, between it_IT and it_IT.UTF-8 and, the case that interests me most, between C and C.UTF-8. For example, which of C and C.UTF-8 should I put in the LC_ALL variable?

Here is the list printed when I run the locale -a command, to show more clearly what I am asking about.

C
C.utf8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IL
en_IL.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
it_CH.utf8
it_IT.utf8
POSIX
Puritanical answered 29/7, 2022 at 16:4 Comment(1)
What should you use? That depends on what you need, which you have not said. Long story short: if you only use ASCII characters, the C locale (ASCII charset) is guaranteed to give the expected results on any common system, because ASCII is a subset of most charsets. But if you want to be able to use non-ASCII characters (éèçà...), you have to use a non-ASCII charset, and UTF-8 is more or less universal and is nowadays a de facto standard.
Lisk

I'd recommend using a UTF-8 locale, which is more versatile.

For example, in Git Bash:

LC_ALL=C grep -P hello /dev/null
# output :
# grep: -P supports only unibyte and UTF-8 locales

LC_ALL=C.UTF-8 grep -P hello /dev/null
# No output
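
A UTF-8 variant of the C locale is not guaranteed to be installed on every system, and `locale -a` may list it as `C.utf8` (as in the question) rather than `C.UTF-8`. As a minimal sketch, a script could check for it before setting LC_ALL and fall back to plain C otherwise. The helper name `pick_locale` is hypothetical, chosen only for this illustration.

```shell
# pick_locale (hypothetical helper): print C.UTF-8 if `locale -a` lists a
# UTF-8 variant of the C locale, otherwise print plain C.
pick_locale() {
    # Match either spelling, C.utf8 or C.UTF-8, case-insensitively.
    if locale -a 2>/dev/null | grep -qiE '^C\.utf-?8$'; then
        echo C.UTF-8
    else
        echo C
    fi
}

LC_ALL=$(pick_locale)
export LC_ALL
echo "Using LC_ALL=$LC_ALL"
```

If the `locale` command itself is missing, the check fails quietly and the sketch falls back to C.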
Sassoon answered 4/8, 2022 at 11:11 Comment(0)

The main difference between locales whose names end in UTF-8 and those that don't is the character encoding. The UTF-8 variants encode text as UTF-8, which is an encoding of Unicode, a character set that can represent a wide range of characters from different scripts. This makes for a more internationalized environment, because text in many languages can be displayed and processed correctly.

To get Unicode support together with Italian conventions, LC_ALL should be set to "it_IT.UTF-8".
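
One way to see the effect of the encoding is to count bytes and characters for the same input. The following is only a sketch: it assumes a `wc` that honours LC_ALL for `-m`, and the character count under C.UTF-8 is only meaningful if that locale is actually installed (otherwise the tool would normally fall back to C behaviour). The variable names are illustrative.

```shell
# "è" is one character but two bytes in UTF-8 (0xC3 0xA8),
# written here as octal escapes so the script itself stays pure ASCII.
e_grave='\303\250'

# Byte count: does not depend on the locale.
bytes=$(printf "$e_grave" | wc -c | tr -d ' ')

# Character count: depends on the encoding implied by LC_ALL.
chars_c=$(printf "$e_grave" | LC_ALL=C wc -m | tr -d ' ')
chars_utf8=$(printf "$e_grave" | LC_ALL=C.UTF-8 wc -m | tr -d ' ')

echo "bytes=$bytes chars(C)=$chars_c chars(C.UTF-8)=$chars_utf8"
```

When the two character counts differ, the C locale is treating each byte as a separate character, while the UTF-8 locale decodes the two bytes as a single one.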

Gibun answered 5/8, 2022 at 0:0 Comment(0)
