Trying to show arabic characters in Java

I am trying to show Arabic characters in a Java applet, but I always get question marks ('?????').

I tried many solutions with no success:

I am using Windows 7 in a Spanish-language environment.

Some solutions work when running inside NetBeans, but they do not work outside that environment. Here is the NetBeans project with sources and the .jar.

This is the simple code I am using:

package javaapplication4;

import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.swing.JApplet;
import javax.swing.JOptionPane;

public class JavaApplication4 extends JApplet {

    @Override
    public void init() {
        try {
            String str1 = new String("تعطي يونيكود رقما فريدا لكل حرف".getBytes(), "UTF-8");
            JOptionPane.showMessageDialog(rootPane, str1);

            String str2 = new String("تعطي يونيكود رقما فر");
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            os.write(str2.getBytes());
            JOptionPane.showMessageDialog(rootPane, os.toString("UTF-8"));
        } catch (Exception ex) {
            JOptionPane.showMessageDialog(rootPane, ex.toString());
        }
    }
}

Any idea what is happening?

Crave answered 21/2, 2013 at 9:38 Comment(4)
For better help sooner, post an SSCCE. - Destruction
@Jayamohan Question marks: ????? ???? ????? - Crave
Neither of those methods does much: a String is a String. Encoding only matters if you are reading bytes from a file and need them to become chars. - Encode
Your code contains Arabic, not Hebrew, characters, though it probably makes no difference in your case. - Bilski
H
2

My original answer was wrong: getBytes() produces a byte array using the platform's default charset, which NetBeans sets to UTF-8.

Correct answer: do not use ByteArrayOutputStream and new String(byte[], Charset) at all. Only use Strings. That should work fine.

EDIT: See the comments for the actual problem and an explanation of why a complete solution is not always possible.
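
To illustrate why the byte round trip in the question is harmful, here is a minimal sketch. The class and method names are hypothetical, and ISO-8859-1 is only an assumed stand-in for a Western default charset such as cp1252. The text is written with Unicode escapes so the example does not depend on how the source file is encoded, and it prints booleans so it does not depend on the console either.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Encodes the text with one charset and decodes the bytes with another,
    // which is what getBytes() followed by new String(bytes, "UTF-8") does
    // when the platform default charset is not UTF-8.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // First word of the Arabic sentence from the question, as escapes.
        String arabic = "\u062a\u0639\u0637\u064a";

        // Assumption: ISO-8859-1 plays the role of a Western default charset.
        // It cannot represent Arabic, so every character is replaced by '?'.
        String lossy = roundTrip(arabic, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // With the same charset on both sides the text survives.
        String safe = roundTrip(arabic, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println(lossy.equals("????")); // true
        System.out.println(safe.equals(arabic));  // true
    }
}
```

The damage happens at the encoding step, so no choice of decoding charset can bring the characters back afterwards.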

Hyla answered 21/2, 2013 at 11:39 Comment(5)
Well, it works if I do JOptionPane.showMessageDialog(str), but it does not work if I do System.out.println(str). - Crave
Well, then that is your actual problem. System.out also uses the default system encoding. On Windows, the default system encoding is unfortunately not the same as the encoding the command line (cmd) uses. You need to wrap the stream, for example with new PrintStream(System.out, true, "Actual Command Line Encoding"), to get this to work. Depending on the system, there may be no way at all to print Arabic characters, since the Windows command line, for example, does not use a Unicode encoding by default. You might change your system language (in the Windows settings) to one with an Arabic encoding, but that is probably not what you want. - Hyla
The NetBeans console window uses UTF-8, and since your code prints UTF-8, it works there. - Hyla
Anyway, if you really need to convert a String to byte[], you have to call str.getBytes("UTF-8"), naming the charset explicitly to avoid the default one. - Crave
I am facing the same problem, but I am using WinForms. Taking input from a text field works fine, but when I set data in the text field I get question marks. What should I do? - Shaper
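
The console point from the comments above can be sketched roughly as follows. This is only an illustration: the class name, the helper method, and the idea of passing the console charset as a command-line argument are assumptions, and whether the characters actually render still depends on the console's code page and font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncoding {

    // Wraps a stream in a PrintStream that encodes text with the named charset
    // instead of the platform default.
    public static PrintStream withEncoding(OutputStream out, String charsetName) {
        try {
            return new PrintStream(out, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical convention: the first argument names the console charset.
        // UTF-8 is only a fallback guess, not what cmd uses by default.
        String consoleCharset = args.length > 0 ? args[0] : "UTF-8";

        PrintStream console = withEncoding(System.out, consoleCharset);
        console.println("\u062a\u0639\u0637\u064a");
    }
}
```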
B
4

The easiest solution is to use strings normally and change the default text file encoding in your workspace, for example in Eclipse:

Window --> Preferences --> General --> Workspace --> Text file encoding

Change the encoding to UTF-8.

There is no magic here.

Burbage answered 12/5, 2015 at 14:18 Comment(1)
It works in my case. - Inhumane
H
1

os.toString(...) is the wrong method. It assumes that the bytes inside the ByteArrayOutputStream are UTF-8, which is not correct since Java uses UTF-16. The output of the method, on the other hand, is a valid Java String, which is again UTF-16.

So you take an array that contains UTF-16 characters, interpret it as UTF-8, and convert it to UTF-16. There you have your problem.

EDIT: same problem with the line:

new String("تعطي يونيكود رقما فريدا لكل حرف".getBytes(), "UTF-8");

getBytes() produces UTF-16 [THIS IS WRONG, SEE MY OTHER ANSWER], and you use it to create a String that interprets the array as UTF-8.

Hyla answered 21/2, 2013 at 9:58 Comment(4)
If I do new String("...", "UTF-16") I get strange Asian symbols: "諙裙蛙諙菙裘꼠". In fact, if I do new String("...", "UTF-8") in NetBeans it works (as I explained in the post), but it does not work outside NetBeans. - Crave
All right, my mistake. From the Javadoc of getBytes(): "Encodes this String into a sequence of bytes using the platform's default charset", which NetBeans sets to UTF-8. How about not using getBytes() and ByteArrayOutputStream at all? It should work fine without them. - Hyla
When I read Arabic text from text files and print it out, it appears like this: الأسÙالأسود. How can I get the text back to Arabic characters? - Travis
@SGaber I have the same problem. Did you find a solution? - Dion
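
Garbled output like the text quoted in the comment above typically appears when UTF-8 bytes from a file are decoded with a single-byte Western charset. The following is a minimal sketch of reading a file with an explicit charset. The class and method names are hypothetical, the temporary file exists only for the demonstration, and ISO-8859-1 is an assumed stand-in for a default such as cp1252.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadArabicFile {

    // Writes the text to a temporary file using one charset, then reads it
    // back using another, to show the effect of a mismatch.
    public static String writeThenRead(String text, Charset writeWith, Charset readWith) {
        try {
            Path file = Files.createTempFile("arabic", ".txt");
            try {
                Files.write(file, text.getBytes(writeWith));
                // Always name the charset when turning file bytes into a String.
                return new String(Files.readAllBytes(file), readWith);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String arabic = "\u062a\u0639\u0637\u064a";

        String correct = writeThenRead(arabic, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String garbled = writeThenRead(arabic, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(correct.equals(arabic)); // true
        System.out.println(garbled.equals(arabic)); // false
    }
}
```

If the file really is UTF-8, reading it with StandardCharsets.UTF_8 is usually all that is needed; the file's actual encoding has to be known or checked first.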
P
1

If your source code is encoded in UTF-8, you must pass the -encoding option when compiling (javac -encoding UTF-8). Otherwise the compiler will use the system's default encoding, which is probably cp1252 in your case (Windows 7, Spanish) and does not support Arabic.

You should remove all the conversions to bytes; they can only make matters worse. This is how it should work:

String str1 = "تعطي يونيكود رقما فريدا لكل حرف";
JOptionPane.showMessageDialog(rootPane, str1);

If you can't set compiler options you can use escape codes to encode the characters in ASCII. The native2ascii command line tool can do this conversion for you. For example, the code generated for the above two lines would be:

String str1 = "\u062a\u0639\u0637\u064a \u064a\u0648\u0646\u064a\u0643\u0648\u062f \u0631\u0642\u0645\u0627 \u0641\u0631\u064a\u062f\u0627 \u0644\u0643\u0644 \u062d\u0631\u0641";
JOptionPane.showMessageDialog(rootPane, str1);
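
For illustration, the escaping step can also be sketched in plain Java. This is not the native2ascii tool itself, only a hypothetical helper that produces the same style of escapes for characters outside ASCII. It handles the text one UTF-16 char at a time, which is enough for Arabic.

```java
public class UnicodeEscapes {

    // Replaces every non-ASCII char with a backslash-u escape (four lowercase
    // hex digits) and leaves ASCII characters unchanged.
    public static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input uses escapes itself so that this file stays pure ASCII.
        System.out.println(toEscapes("\u062a\u0639\u0637\u064a ok"));
    }
}
```

The output is ASCII only, so it can be pasted into a source file regardless of the encoding the compiler assumes.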
Pub answered 21/2, 2013 at 11:55 Comment(0)
P
1

Change the encoding to UTF-8 via Edit --> Set Encoding in Eclipse.

Polyandrist answered 18/10, 2016 at 16:30 Comment(0)
S
0

Here is a useful snippet for showing Arabic in a JOptionPane message. I use a JLabel to work around the problem. Try this:

JLabel jlbl1 = new JLabel("الملف غير موجود");
JOptionPane.showMessageDialog(null, jlbl1, "تنبيه", JOptionPane.ERROR_MESSAGE);
Surprise answered 21/2, 2018 at 15:23 Comment(0)
A
0

I faced this issue on Windows Server 2016: my Arabic characters were displayed as "???" in NetBeans.

Solution:

  1. Go to Control Panel -> Region.
  2. Open Administrative tab.
  3. Click "Change system locale" and select Arabic.
  4. Restart the server.
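
To check what the JVM actually picked up before and after such a change, a small diagnostic along these lines may help. It is only a sketch with a hypothetical class name, and sun.jnu.encoding is an implementation-specific property that may be missing on some JVMs (it then prints as null).

```java
import java.nio.charset.Charset;

public class EncodingInfo {

    // Collects the charset-related settings the running JVM reports.
    public static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding")
                + ", sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it both inside the IDE and from the command line shows whether the two environments use different defaults.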
Argentite answered 19/3, 2018 at 10:50 Comment(0)
