I struggled with the same problem for a long time, but I believe I have finally found a solution, even if it is not a very pretty one.
As far as I can tell, the problem actually consists of two issues:
- The Windows console code page is incorrect by default
- System.out uses an incorrect encoding by default
Adjust code page from Java
The first issue can be observed by opening cmd or powershell and running chcp:
Active code page: 850.
This should be 65001 for UTF-8, which can be set using chcp 65001. However, this only works if you can run a command in the shell your program runs in, or if you edit the registry Autorun field (neither is a great option imo).
And no, you can't run Runtime.getRuntime().exec("chcp.com 65001"), because that doesn't affect the calling console, only the one created by running the command.
My suggestion is to use the native Windows function SetConsoleOutputCP(), which lets you change the code page from within your application, and only for your application. I simply used JNA, but it would probably be cleaner to write a small native C wrapper so that you only pull in the one function:
Kernel32.INSTANCE.SetConsoleOutputCP(65001)
Change encoding of System.out
I found this issue when printing System.getProperties():
...
stdout.encoding=Cp1252
...
This differs from the actual console encoding (for me at least, which was 850) and is not UTF-8 either (mind you, this was tested using Java 21, which apparently uses UTF-8 by default, but clearly not everywhere).
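To check this on your own machine without scrolling through the full property dump, something like the following minimal sketch may help. The class and method names are made up for illustration, and the selection of property keys is my own: as far as I know, stdout.encoding, stderr.encoding and native.encoding only exist on newer JDKs, so the sketch reports a placeholder when a key is missing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
public class EncodingProperties {
    // Collects a few encoding-related system properties. A property that the
    // running JDK does not define is reported as "<not set>".
    public static Map<String, String> encodingProperties() {
        Map<String, String> result = new LinkedHashMap<>();
        String[] keys = {"stdout.encoding", "stderr.encoding", "native.encoding", "file.encoding"};
        for (String key : keys) {
            result.put(key, System.getProperty(key, "<not set>"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Print in the same key=value form that System.getProperties() uses
        encodingProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If stdout.encoding differs from what chcp reports, you are most likely looking at the mismatch described above.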
Again, this could probably be fixed by adding some startup parameter to set these properties, but you may as well create your own print stream:
new PrintStream(System.out, true, StandardCharsets.UTF_8)
which you can then install as the global System.out using System.setOut().
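To see what the charset argument of PrintStream actually changes, here is a small sketch that prints the same character through two PrintStreams and compares the raw bytes. The class and helper names are hypothetical, and ISO_8859_1 is only a stand-in for a legacy single-byte encoding such as Cp1252 (chosen because every JVM is guaranteed to support it):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
public class PrintStreamEncodingDemo {
    // Prints the text through a PrintStream using the given charset and
    // returns the raw bytes that reach the underlying stream.
    public static byte[] encodeViaPrintStream(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, charset)) {
            out.print(text);
        }
        return sink.toByteArray();
    }

    // Formats bytes as space-separated upper-case hex, e.g. "C3 A4".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "\u00E4"; // the character 'ä'
        byte[] utf8 = encodeViaPrintStream(text, StandardCharsets.UTF_8);
        byte[] latin1 = encodeViaPrintStream(text, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8:      " + hex(utf8));   // C3 A4
        System.out.println("ISO-8859-1: " + hex(latin1)); // E4
    }
}
```

A console set to code page 65001 expects the two-byte form, so a stream that emits the single byte E4 (or a console that interprets C3 A4 under a different code page) produces garbled output. That is why both the code page and the stream encoding have to agree.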
Putting it all together
This is my suggestion for fixing System.out and System.err in a platform-independent manner:
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import com.sun.jna.platform.win32.Kernel32;
import com.sun.jna.platform.win32.Kernel32Util;

public static void fixSystemOutEncoding() {
    if (
        System.console() == null || // No interactive terminal connected (maybe you still want to do it there?)
        !System.getProperty("os.name").toLowerCase().contains("win") // Not on Windows
    ) {
        // No console or not Windows -> nothing to do
        return;
    }
    try {
        // Set console code page to 65001 = UTF-8
        if (Kernel32.INSTANCE.SetConsoleOutputCP(65001)) {
            // Replace System.out and System.err with PrintStreams using UTF-8
            System.setOut(new PrintStream(System.out, true, StandardCharsets.UTF_8));
            System.setErr(new PrintStream(System.err, true, StandardCharsets.UTF_8));
        } else {
            // SetConsoleOutputCP() failed, throw exception with error message,
            // handle it in catch (you may want to do something else here or
            // just ignore it)
            throw new RuntimeException(Kernel32Util.getLastErrorMessage());
        }
    } catch (Throwable t) {
        // Something went wrong, probably with the native library.
        // Probably just ignore it and deal with UTF-8 not being available.
    }
}
Requires JNA and JNA Platform (net.java.dev.jna:jna-platform).
WriteConsoleW, but it obviously only works on Windows, it needs careful handling of detecting whether you're actually talking to the Windows console, some other console, or a file or pipe, and you can't call it in pure Java (you need JNA). – Cuneiform