"cout" fails to work with Chinese characters
4

3

My code is as simple as this:

#include <iostream>
using namespace std;
//Some codes here...
bool somefunction(){
    cout<<"单元格";
    return false;
}

and this is what I got:

error C2143: syntax error: missing ';' before 'return';
error C2001: newline in constant;

Moreover, if I change "单元格" to an English string like "cell", it works perfectly.

Leyla answered 18/9, 2012 at 1:1 Comment(3)
I think your compiler just doesn't understand what encoding your source file is in. - Sakmar
What are you expecting to happen exactly? Are you expecting a multibyte string? Or are you expecting the compiler to output a UTF-8 string? - Startle
Try std::wcout instead. - Enedina
5

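The compiler errors suggest that your compiler is misreading the non-ASCII characters in the source file, most likely because it assumes a different encoding than the one the file was saved in. One workaround is to escape the characters, use a wide-character literal, and print with wcout: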

wcout << L"\x5355\x5143\x683c";

If you need to output characters in a specific encoding (e.g. gb2312), use that encoding in the string literal:

cout << "\xb5\xa5\xd4\xaa\xb8\xf1"; // string encoded with GB2312
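The point of the escapes is that the source file then contains only ASCII, so the compiler's guess about the file encoding no longer matters. Here is a minimal sketch of the same three characters in three encodings. The function names are made up for illustration, and the UTF-8 bytes are my own addition, not something the question asked for.

```cpp
#include <string>

// Hypothetical helpers: each returns "单元格" spelled purely with ASCII
// escape sequences, so the source file itself has no non-ASCII bytes.

// GB2312: two bytes per character (6 bytes total).
std::string cell_gb2312() { return "\xb5\xa5\xd4\xaa\xb8\xf1"; }

// UTF-8: three bytes per character (9 bytes total).
std::string cell_utf8() { return "\xe5\x8d\x95\xe5\x85\x83\xe6\xa0\xbc"; }

// Wide literal: one wchar_t per character, holding U+5355 U+5143 U+683C.
std::wstring cell_wide() { return L"\x5355\x5143\x683c"; }
```

Which one actually displays correctly depends on what the console expects, e.g. cout << cell_gb2312() only looks right on a console using a GB2312/GBK code page.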
Maladjusted answered 18/9, 2012 at 1:7 Comment(0)
1

To work with non-English character sets you should use std::wcout to print wide characters, like so:

#include <iostream>
using namespace std;
//Some codes here...
bool somefunction(){
  wcout<< L"单元格";
  return false;
}

And be sure not to mix both cout and wcout in the same program.

Enedina answered 18/9, 2012 at 1:6 Comment(2)
Not as far as I'm aware of :) Just the time of night at this part of the globe.. hehe - Enedina
It does not really matter if you use cout or wcout in this case. The code is failing to compile as the compiler is choking on that string as written (i.e. in the given character set). - Courage
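To illustrate how a compiler can "choke" on such a string, here is a rough sketch. It assumes (this is a guess, the question does not say) that the file was saved as UTF-8 without a BOM and that the compiler scans it with a GBK-style double-byte decoder, where a lead byte consumes the byte after it. The helper name is hypothetical and the decoder is deliberately simplified; real compilers may behave differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified model of a double-byte (GBK-style) scanner:
// any byte in 0x81..0xFE is treated as a lead byte that swallows the
// next byte, whatever it is. Returns true if a '"' is still seen as a
// character of its own, i.e. the string literal would be terminated.
bool closing_quote_survives(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (b >= 0x81 && b <= 0xFE) {
            i += 2;  // lead byte plus the byte it consumes
        } else {
            if (b == '"') return true;  // quote seen on its own
            i += 1;
        }
    }
    return false;  // the quote was swallowed as a trail byte
}
```

Under this model the nine UTF-8 bytes of "单元格" pair up so that the last one (0xBC) swallows the closing quote, which would match "newline in constant" and the missing ';' before 'return'. The six GB2312 bytes pair up evenly and the quote survives.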
0

Use wcout and Unicode literals (L"单元格"). This is good practice even if you're only dealing with English characters. Also use wstring.

Edit: Actually another problem may be that you're storing the file in a non-Unicode encoding, so the characters are lost. Tell your editor to store the file as Unicode.

Another problem may be that the console (or wcout) doesn't display Unicode characters correctly. If you display them in a message box (with MessageBoxW) they are displayed correctly.

Dachau answered 18/9, 2012 at 1:8 Comment(1)
Wide character strings (L"") are not necessarily Unicode. Depends on the OS, the compiler etc. - Centimeter
0

You should always save your source code as UTF-8 with BOM.
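If you are not sure how your editor saved the file, you can check for the BOM yourself. A minimal sketch, assuming you just want to test whether a file starts with the UTF-8 byte order mark (bytes EF BB BF); has_utf8_bom is a made-up name, not a library function.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the file at `path` begins with
// the UTF-8 byte order mark EF BB BF. Returns false if the file cannot
// be opened or is shorter than three bytes.
bool has_utf8_bom(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    return in.gcount() == 3 &&
           static_cast<unsigned char>(head[0]) == 0xEF &&
           static_cast<unsigned char>(head[1]) == 0xBB &&
           static_cast<unsigned char>(head[2]) == 0xBF;
}
```

For example, has_utf8_bom("main.cpp") (the file name is only a placeholder) tells you whether the compiler has a BOM to detect the encoding from.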

Gokey answered 30/4, 2016 at 9:57 Comment(0)