I'm pretty new, so don't be too harsh :)
Question (tl;dr)
I'm facing a problem passing a Unicode String from an embedded javax.swing.JApplet in a web page to the JavaScript part. I'm not sure whether this is a bug or a misunderstanding of the technologies involved:
Problem
I want to pass a Unicode string from a Java applet to JavaScript, but the String gets messed up. Strangely, the problem doesn't occur in Internet Explorer 10, but it does in Chrome (v26) and Firefox (v20). I haven't tested other browsers, though.
The returned String seems to be okay, except for the last Unicode character. The result in the JavaScript debugger and on the web page would be:
- abc → abc
- 表示 → 表��
- ま → ま
- ウォッチリスト → ウォッチリス��
- アップロード → アップロー��
- ホ → ��
- ホ → ホ (Not deterministic)
- アップロードabc → アップロードabc
The string seems to get corrupted at the last bytes. If it ends with an ASCII character, the string is okay. Additionally, the problem doesn't occur with every combination, and also not every time (I'm not sure about this). Therefore I suspect a bug, and I'm afraid I might be posting an invalid question.
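To illustrate the symptom (not to claim what the plug-in does internally), here is a minimal sketch that encodes a string as UTF-8, drops the final byte, and decodes the rest. The class and method names (`TruncationDemo`, `dropLastUtf8Byte`, `hasReplacementChar`) are made up for this example. It assumes the damage really is a lost trailing byte, which is only a guess based on the output above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the applet: simulates a lost trailing UTF-8 byte.
final class TruncationDemo {
    /** Encodes s as UTF-8, drops the final byte, and decodes the remainder. */
    static String dropLastUtf8Byte(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] cut = Arrays.copyOf(utf8, Math.max(0, utf8.length - 1));
        // This String constructor replaces malformed input with U+FFFD.
        return new String(cut, StandardCharsets.UTF_8);
    }

    /** True if s contains the Unicode replacement character U+FFFD. */
    static boolean hasReplacementChar(String s) {
        return s.indexOf('\uFFFD') >= 0;
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc",                                      // ASCII only
            "\u30db",                                   // ホ
            "\u30a2\u30c3\u30d7\u30ed\u30fc\u30c9abc"   // アップロードabc
        };
        for (String s : samples) {
            String damaged = dropLastUtf8Byte(s);
            System.out.println(s + " -> replacement char present: "
                    + hasReplacementChar(damaged));
        }
    }
}
```

With a lost trailing byte, a string ending in a multi-byte character decodes with a replacement character at the end, while a string ending in ASCII just loses that last ASCII character. That would at least be consistent with only the non-ASCII endings looking broken.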
Test Set Up
A minimal setup includes an applet that returns some Unicode (UTF-8) strings:
/* TestApplet.java */
import javax.swing.*;

public class TestApplet extends JApplet {
    private String[] testStrings = {
        "abc",      // OK (because ASCII only)
        "表示",     // Error on last character
        "表示",     // Error on last character
        "ホーム ",  // OK (because of *space* after ム)
        "アップロード", ... };

    public TestApplet() {...} // Applet-specific stuff
    ...

    public int getLength() { return testStrings.length; }

    // Must be public so that JavaScript can call it.
    public String getTestString(int i) {
        return testStrings[i]; // Accessor instead of built-in array functionality, because of IE.
    }
}
The corresponding web page with JavaScript could look like this:
/* test.html */
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
    <span id="output"></span>
    <applet id="applet" archive="test.jar" code="TestApplet"></applet>
    <script type="text/javascript" charset="utf-8">
        // The applet and the output node need distinct ids.
        var applet = document.getElementById("applet");
        var node = document.getElementById("output");
        for (var i = 0; i < applet.getLength(); i++) {
            var text = applet.getTestString(i);
            var paragraphNode = document.createElement("p");
            paragraphNode.innerHTML = text;
            node.appendChild(paragraphNode);
        }
    </script>
</body>
</html>
Environment
I'm working on Windows 7 32-bit with the current Java version 1.7.0_21, using the "Next Generation Java Plug-in 10.21.2 for Mozilla browsers". I had some problems with my operating system locale, but I tried several regional settings (English, Japanese, Chinese).
In case of a corrupt String, Chrome shows invalid characters (e.g. ��). Firefox, on the other hand, drops the string completely if it would end with ��.
Internet Explorer manages to display the strings correctly.
Solutions?
I can imagine several workarounds, including escaping/unescaping and adding a "final char" which is then removed via JavaScript. Actually, I'm planning to write against Android's WebKit, and I haven't tested it there.
Since I would like to continue testing in Chrome (because of its WebKit technology and convenience), I hope there is a trivial solution to the problem that I might have overlooked.
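For reference, the two workarounds mentioned above could be sketched on the applet side roughly like this. Everything here is hypothetical: the class `AppletStringWorkarounds`, its methods, and the choice of "." as the final char are assumptions for illustration, and I have not verified that either approach avoids the corruption in Chrome or Firefox.

```java
// Hypothetical helpers for the applet side; names and sentinel are assumptions.
final class AppletStringWorkarounds {
    /** Final char appended to every string; the JavaScript side would strip it. */
    static final String SENTINEL = ".";

    private AppletStringWorkarounds() {}

    /**
     * Returns an ASCII-only copy of s: every char outside printable ASCII,
     * plus the double quote and the backslash, becomes a backslash-u escape.
     */
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean plain = c >= 0x20 && c < 0x7F && c != '"' && c != '\\';
            if (plain) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    /** Reverses escapeNonAscii; mirrors what the JavaScript side would have to do. */
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(s.charAt(i++));
            }
        }
        return sb.toString();
    }

    /** Appends the ASCII sentinel so the string never ends in a non-ASCII char. */
    static String withSentinel(String s) {
        return s + SENTINEL;
    }

    /** Removes the sentinel again if it is present. */
    static String stripSentinel(String s) {
        return s.endsWith(SENTINEL) ? s.substring(0, s.length() - SENTINEL.length()) : s;
    }
}
```

On the JavaScript side, the escaped form could be decoded with something like JSON.parse('"' + text + '"'), and the sentinel could be removed with text.slice(0, -1). The escaping variant sends only ASCII across the bridge, so it avoids depending on how the last multi-byte character is handled, at the cost of longer strings.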
javac and/or jar uses UTF8 encoding - if you don't specify it, it uses the machine default (which could be a problem) – Brodeur
getTestString(int i). The JavaScript call is applet.getTestSring(i). – Viceroy
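Following up on the encoding comment: one way to rule out a compile-time encoding problem is to dump the UTF-16 code units of a string literal and compare them with the same text written as escapes. This is a minimal sketch; `LiteralEncodingCheck` and `codeUnits` are names made up for this example.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic, not part of the applet.
final class LiteralEncodingCheck {
    /** Formats each UTF-16 code unit of s as U+XXXX, separated by spaces. */
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Runtime default; javac falls back to the platform default when no -encoding is given.
        System.out.println("default charset: " + Charset.defaultCharset());

        String literal = "表示";           // typed directly, depends on the compiler's source encoding
        String escaped = "\u8868\u793A";   // same text, independent of the source encoding

        System.out.println("literal: " + codeUnits(literal));
        System.out.println("escaped: " + codeUnits(escaped));
        System.out.println("literal matches escapes: " + literal.equals(escaped));
    }
}
```

If the two lines of code units differ, the source file was probably compiled with the wrong encoding, and compiling with javac -encoding UTF-8 would be the first thing to try. If they match, the literals inside the applet are fine and the corruption happens later, somewhere between the applet and the browser.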