Why do I need to escape unicode in java source files? - McMap

Why do I need to escape unicode in java source files?

Asked 27/6, 2012 at 13:4 Answered 6/7, 2013 at 10:50

Solved java unicode eclipse-rcp unicode-escapes

Please note that I'm not asking how but why. I don't know whether this is an RCP-specific problem or something inherent to Java.

My java source files are encoded in UTF-8.

If I define my literal strings like this :

    new Language("fr", "Français"),
    new Language("zh", "中文")

It works as I expect when I use the string in the application by launching it from Eclipse as an Eclipse application :

enter image description here

But it fails when I launch the .exe built by the "Eclipse Product Export Wizard":

enter image description here

The solution I use is to escape the chars like this :

    new Language("fr", "Fran\u00e7ais"), // Français
    new Language("zh", "\u4e2d\u6587") // 中文

There is no problem in doing this (all my other strings are in properties files, only the languages names are hardcoded) but I'd like to understand.

I thought the compiler had to convert the Java string literals when building the bytecode. So why is the Unicode escaping necessary? Is it wrong to use high-range Unicode chars in Java source files? What exactly happens to those chars at compilation, and how does that differ from the handling of escaped chars? Is the problem just related to the RCP cache?

Conlon answered 27/6, 2012 at 13:4 Comment(14)

It appears that the Eclipse Product Export Wizard is not interpreting your files as UTF-8. Perhaps you need to run Eclipse's JVM with the encoding set to UTF-8 (-Dfile.encoding=UTF8 in eclipse.ini)? – Dogged 27/6, 2012 at 13:11

While this does not really explain why it happens it does suggest an alternative solution and indicates that the export wizard for whatever reason doesn't seem to honor the project's encoding properly: #6891579 – Kilah 27/6, 2012 at 13:15

To confirm @Matt Ball's explanation, which I think is correct, try setting the following option in the wizard: "Use class files compiled in the workspace" – Belmonte 27/6, 2012 at 13:19

@Jiddo: it does explain why it happens: "not interpreting your files as UTF-8", so it's interpreting them as another encoding incompatible with UTF-8. – Deakin 27/6, 2012 at 13:20

@MattBall It works. Please post it as an answer. But I'd like to understand why Eclipse doesn't know which encoding to use when exporting, even though UTF-8 is the encoding defined in preferences/general/workspace and it knows how to compile the files. At the very least, an option in the export wizard or the .plugin file seems to be needed. – Klina 27/6, 2012 at 13:24

@brunoconde Could you please specify where this option is? – Klina 27/6, 2012 at 13:25

@Deakin Indeed. What I meant was that it didn't explain why it is not interpreting your files as UTF-8, which I interpreted as what the question was about. Sorry about the confusion. – Kilah 27/6, 2012 at 13:26

@dystroy it is the "Export wizard" > "Options" tab – Belmonte 27/6, 2012 at 13:31

@brunoconde I use the "Eclipse Product Export Wizard" from the .product file. I don't have tabs :\ – Klina 27/6, 2012 at 13:37

@dystroy, sorry, I have a plugin environment, not RCP. It seems the RCP wizard doesn't have this option. – Belmonte 27/6, 2012 at 13:45

OK, thanks for your help. Your observation points to a need similar to the one I was referring to. – Klina 27/6, 2012 at 13:46

@Jiddo: it's not interpreting the files as UTF-8 because that's not their encoding when imported into/created in Eclipse. – Deakin 27/6, 2012 at 14:7

@Deakin those files are generally considered by Eclipse as UTF-8, according to the correct display. This is due to the preference set in preferences/general/workspace. – Klina 27/6, 2012 at 14:9

@dystroy it's probably just a bug in Eclipse's Product Export Wizard. Such things are disturbingly common in a lot of tools. Many developers just don't understand or test encoding issues. – Dialectical 27/6, 2012 at 16:50

It appears that the Eclipse Product Export Wizard is not interpreting your files as UTF-8. Perhaps you need to run Eclipse's JVM with the encoding set to UTF-8 (-Dfile.encoding=UTF8 in eclipse.ini)?
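To illustrate what "not interpreting your files as UTF-8" means in practice, here is a minimal sketch. It assumes the export compiler fell back to a single-byte charset; ISO-8859-1 is used here only because every JVM ships it, and the actual fallback on a given machine may differ (for example windows-1252). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Simulates a compiler reading UTF-8 source bytes with the wrong charset:
    // encode the text as UTF-8 (what is on disk), then decode it as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Written with an escape so this file compiles the same way under any -encoding.
        String original = "Fran\u00e7ais";
        String garbled = misdecode(original);

        // U+00E7 is two bytes in UTF-8 (C3 A7); read one byte per char, it becomes two chars.
        System.out.println(original.length()); // 8
        System.out.println(garbled.length());  // 9
        System.out.println(garbled.equals("Fran\u00c3\u00a7ais")); // true
    }
}
```

This is also why the `\uXXXX` escapes survive the export: they are pure ASCII, so they decode identically under UTF-8 and under any ASCII-compatible single-byte charset.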

(Copypasta'd at OP's request)

Dogged answered 27/6, 2012 at 13:26 Comment(0)

When you export a plug-in, it is compiled through a process separate from the normal build process within the IDE. There is a known bug in which that export build process (PDE.Build) disregards the text encoding configured in the IDE.

The export can be made to work properly by specifying the text encoding in the build.properties file of your plugin:

    javacDefaultEncoding.. = UTF-8
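One way to check whether an exported build really decoded the literals correctly is to dump their code points at startup. The following is only a sketch; the class and method names are hypothetical, and in a real check you would pass the unescaped literal from your own source instead of the escaped one used here.

```java
class LiteralCheck {
    // Renders each code point of the string as U+XXXX, separated by spaces.
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // A correctly compiled literal for the Chinese language name has exactly two code points.
        System.out.println(describe("\u4e2d\u6587")); // U+4E2D U+6587
    }
}
```

If the source was compiled with the wrong encoding, the output for the same literal contains more code points than expected, typically in the U+0080 to U+00FF range.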

Panthia answered 6/7, 2013 at 10:50 Comment(0)
