I am trying to use Unicode variable names in g++, but it does not appear to work.
Does g++ not support Unicode variable names? Or is there some subset of Unicode (from which I'm not testing in)?
You have to specify the -fextended-identifiers flag when compiling. You also have to write non-ASCII characters as \uXXXX or \UXXXXXXXX (at least in GCC, these denote Unicode code points).
Identifiers (variable names, class names, etc.) in g++ cannot contain raw UTF-8, UTF-16, or otherwise encoded characters. They have to match this grammar:
identifier:
    nondigit
    identifier nondigit
    identifier digit
A nondigit is
nondigit: one of
    universal-character-name
    _ a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
And a universal-character-name is
universal-character-name:
    \UXXXXXXXX
    \uXXXX
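To make the grammar concrete, here is a minimal sketch of a checker for it. The function names (ucn_length, is_ascii_nondigit, is_identifier) are made up for illustration and are not part of GCC. It only implements the simplified grammar quoted above, so it does not check which code points the standard actually permits in identifiers.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Length of a universal character name starting at s[pos]:
// 6 for backslash-u plus 4 hex digits, 10 for backslash-U plus 8 hex digits,
// or 0 if no well-formed name starts there.
inline std::size_t ucn_length(const std::string& s, std::size_t pos) {
    if (pos + 1 >= s.size() || s[pos] != '\\') return 0;
    std::size_t digits = 0;
    if (s[pos + 1] == 'u') digits = 4;
    else if (s[pos + 1] == 'U') digits = 8;
    else return 0;
    if (pos + 2 + digits > s.size()) return 0;
    for (std::size_t i = 0; i < digits; ++i) {
        if (!std::isxdigit(static_cast<unsigned char>(s[pos + 2 + i]))) return 0;
    }
    return 2 + digits;
}

// The ASCII part of "nondigit": underscore or a basic Latin letter.
inline bool is_ascii_nondigit(char c) {
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// True if s matches the quoted grammar: a nondigit first, then any mix of
// nondigits and digits. Raw non-ASCII bytes are rejected.
inline bool is_identifier(const std::string& s) {
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (std::size_t n = ucn_length(s, pos)) {
            pos += n;                       // universal character name
        } else if (is_ascii_nondigit(s[pos])) {
            ++pos;                          // plain letter or underscore
        } else if (pos > 0 && s[pos] >= '0' && s[pos] <= '9') {
            ++pos;                          // digit, but not in first position
        } else {
            return false;
        }
    }
    return !s.empty();
}
```

With this sketch, the text h\u00F8yde is accepted, while the same name typed with a raw UTF-8 character is rejected.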
Thus, if you save your source file as UTF-8, you cannot have a variable like:
int høyde = 10;
It has to be written like:
int h\u00F8yde = 10;
(which, in my opinion, defeats the purpose, so just stick with a-z).
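For completeness, a small sketch of the escaped spelling in context. The function name double_height and the file name in the comment are hypothetical. The build command assumes a g++ that still needs the flag, and the flag should be harmless on compilers that enable it by default.

```cpp
// Assumed build command (the file name is hypothetical):
//   g++ -fextended-identifiers ucn_demo.cpp
// The identifier below is spelled with a universal character name in
// place of the non-ASCII letter.
inline int double_height() {
    int h\u00F8yde = 10;     // the variable from the example above
    return h\u00F8yde * 2;   // the same identifier must be escaped on every use
}
```

Every use of the name has to repeat the escape, which is what makes this spelling so awkward in practice.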
-fextended-identifiers -finput-charset=UTF-8. (For reference, MSVC++ also does fine, either with -utf-8 or with a BOM in the source.) See also: https://mcmap.net/q/261447/-128515-and-other-unicode-characters-in-identifiers-not-allowed-by-g
A one-line patch to the C++ preprocessor allows UTF-8 input. Details for GCC are given at "UTF-8 Identifiers in GCC".
However, since the preprocessor is shared, the same patch should work for g++ as well. In particular, the patch needed as of gcc-5.2 is:
diff -cNr gcc-5.2.0/libcpp/charset.c gcc-5.2.0-ejo/libcpp/charset.c
Output:
*** gcc-5.2.0/libcpp/charset.c Mon Jan 5 04:33:28 2015
--- gcc-5.2.0-ejo/libcpp/charset.c Wed Aug 12 14:34:23 2015
***************
*** 1711,1717 ****
struct _cpp_strbuf to;
unsigned char *buffer;
! input_cset = init_iconv_desc (pfile, SOURCE_CHARSET, input_charset);
if (input_cset.func == convert_no_conversion)
{
to.text = input;
--- 1711,1717 ----
struct _cpp_strbuf to;
unsigned char *buffer;
! input_cset = init_iconv_desc (pfile, "C99", input_charset);
if (input_cset.func == convert_no_conversion)
{
to.text = input;
Note that for the above patch to work, a recent version of iconv that supports C99 conversions needs to be installed. Run iconv --list to verify this. Otherwise, you can install a new version of iconv along with GCC as described in the link above.
Change the configure command to
../gcc-5.2.0/configure -v --disable-multilib \
--with-libiconv-prefix=/usr/local/gcc-5.2 \
--prefix=/usr/local/gcc-5.2 \
--enable-languages="c,c++"
if you are building for x86 and want to include the C++ compiler as well.
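To illustrate what the patch relies on: it asks iconv to convert the input to the "C99" encoding, and (assuming that encoding behaves as GNU libiconv describes it) non-ASCII characters come out as universal character names that the preprocessor already understands. The sketch below mimics that rewrite in plain C++. The function name utf8_to_ucn is made up, the input is assumed to be well-formed UTF-8, and the case of the hex digits may differ from what iconv emits.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Rewrite every non-ASCII UTF-8 sequence as a universal character name.
// Assumption: the input is well-formed UTF-8; malformed bytes are not diagnosed.
inline std::string utf8_to_ucn(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        if (lead < 0x80) {            // ASCII passes through unchanged
            out += in[i++];
            continue;
        }
        // Number of continuation bytes implied by the lead byte.
        int extra = (lead >= 0xF0) ? 3 : (lead >= 0xE0) ? 2 : 1;
        // Payload bits of the lead byte: 5, 4, or 3 bits.
        std::uint32_t cp = lead & (0x3F >> extra);
        ++i;
        for (int k = 0; k < extra && i < in.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        }
        char buf[11];                 // backslash, U, 8 hex digits, NUL
        if (cp < 0x10000) {
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
        } else {
            std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(cp));
        }
        out += buf;
    }
    return out;
}
```

Feeding it the UTF-8 spelling of the variable from the first answer yields the escaped spelling, which is why the unmodified lexer can then accept the file.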