org.apache.commons.codec.DecoderException: Odd number of characters

Asked 12/1, 2015 at 14:46 Answered 21/6, 2023 at 0:35

I am sending a hex string as a URL parameter and trying to convert it back into a string on the server side. The user's input string is converted with the following JavaScript encoding function:

function encode(string) {
    var number = "";
    var length = string.trim().length;
    string = string.trim();
    for (var i = 0; i < length; i++) {
        number += string.charCodeAt(i).toString(16);
    }
    return number;
}

Now I'm trying to parse the hex string 419 for the Russian character Й in Java code as follows:

byte[] bytes = "".getBytes();
     
try {
    bytes = Hex.decodeHex(hex.toCharArray());
    sb.append(new String(bytes,"UTF-8"));
} catch (DecoderException e) {      
    e.printStackTrace(); // Here it gives error 'Odd number of characters'
} catch (UnsupportedEncodingException e) {           
    e.printStackTrace();
}

but it gives the following error:

"org.apache.commons.codec.DecoderException: Odd number of characters."

How can this be resolved? Many Russian characters have a three-digit hex code, and because of this the result of .toCharArray() cannot be decoded.

Veronicaveronika answered 12/1, 2015 at 14:46 Comment(1)

Have you tried encodeURI() on the JS side and then reading the parameter normally on the Java side? – Damascus 13/12, 2023 at 10:8
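
A minimal sketch of what the comment above suggests for the client side, assuming the value travels as a single query parameter. The parameter name "text" and the function name buildQuery are made up for illustration. It uses encodeURIComponent rather than encodeURI, because encodeURI leaves characters such as & and = unescaped, which matters inside a parameter value. Both functions percent-encode the UTF-8 bytes of the string, so no custom hex format is needed.

```javascript
// Percent-encode the user input as UTF-8 for use in a URL query parameter.
// "text" is a hypothetical parameter name, not taken from the question.
function buildQuery(input) {
    return "text=" + encodeURIComponent(input.trim());
}

console.log(buildQuery("Й")); // text=%D0%99
```

On the server the parameter would then be read through the usual URL decoding with UTF-8 as the charset.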

Use Base64 instead

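import android.util.Base64 // assumed: the (bytes, flags) calls below match android.util.Base64
import javax.crypto.KeyGenerator
import javax.crypto.spec.SecretKeySpec

val aes = KeyGenerator.getInstance("AES")
aes.init(128)
val secretKeySpec = aes.generateKey()
val base64 = Base64.encodeToString(secretKeySpec.encoded, 0) // flags 0 = Base64.DEFAULT
val bytes = Base64.decode(base64, 0)
SecretKeySpec(bytes, 0, bytes.size, "AES") == secretKeySpec // key rebuilt from the decoded bytes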

Oaks answered 7/2, 2020 at 12:19 Comment(0)

In the case you mentioned, Й is U+0419, and most Cyrillic characters start with a leading 0. This suggests that prepending a 0 to odd-length character arrays before decoding would help.

Testing the JavaScript shows that this is safe only for one-letter strings: Ѓ (U+0403) returned 403 and Ѕ (U+0405) returned 405, but ЃЅ returned 403405 instead of 04030405 or 4030405. That is even worse, because the result has an even length, so it would not trigger the exception and could decode to something completely different.

This question about padding with leading zeros may help with the JavaScript part.
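
A minimal sketch of that padding idea, assuming every UTF-16 code unit is written as exactly four hex digits (the function name encodeFixedWidth is made up here). Note that the output is then the UTF-16BE byte sequence in hex, so the Java side would have to build the string from the decoded bytes as UTF-16BE rather than UTF-8.

```javascript
// Encode each UTF-16 code unit as exactly four hex digits, so the output
// is unambiguous and always has an even length.
// encodeFixedWidth is a hypothetical name used only for this sketch.
function encodeFixedWidth(string) {
    var trimmed = string.trim();
    var hex = "";
    for (var i = 0; i < trimmed.length; i++) {
        hex += trimmed.charCodeAt(i).toString(16).padStart(4, "0");
    }
    return hex;
}

console.log(encodeFixedWidth("Й"));  // 0419
console.log(encodeFixedWidth("ЃЅ")); // 04030405
```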

Bung answered 7/6, 2020 at 1:43 Comment(0)

Hi, you can use Unicode escape encoding. In your case, the character Й is converted to \u0419 on the client side. Then on the server side you can use Java like this:

import org.apache.commons.lang.StringEscapeUtils;

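// The client sends the escape sequence for "Й" as six literal characters,
// so the backslash must be doubled in a Java string literal.
String hex = "\\u0419";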
String unescapeJava = StringEscapeUtils.unescapeJava(hex);

System.out.println("unescapeJava => " + unescapeJava);

Xenogamy answered 18/3, 2023 at 3:43 Comment(0)

The problem is in this line:

number += string.charCodeAt(i).toString(16);

When the loop reaches the character 'Й', string.charCodeAt(i) returns 1049 in decimal, which becomes '419' when converted to hex (base 16), and you append that directly. Appending the char code happens to be correct for codes below 0x80 (as long as each one produces two hex digits), but it is not correct beyond that point. What you want to append in this case is the two-byte encoding of the character according to the UTF-8 specification. Wikipedia has a good summary and a few examples of how to properly encode text in UTF-8: https://en.wikipedia.org/wiki/UTF-8

This link explains how to get a UTF-8 byte array from a string in JS: How to convert UTF8 string to byte array?
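
As a rough sketch of that approach, assuming an environment that provides TextEncoder (modern browsers and Node.js do). The function name encodeUtf8Hex is made up for illustration. Each UTF-8 byte is written as two hex digits, so the output always has an even length and should be what Hex.decodeHex followed by new String(bytes, "UTF-8") in the question expects.

```javascript
// Hex-encode the UTF-8 bytes of a string, two hex digits per byte.
// encodeUtf8Hex is a hypothetical name used only for this sketch.
function encodeUtf8Hex(string) {
    var bytes = new TextEncoder().encode(string.trim()); // Uint8Array of UTF-8 bytes
    var hex = "";
    for (var i = 0; i < bytes.length; i++) {
        hex += bytes[i].toString(16).padStart(2, "0");
    }
    return hex;
}

console.log(encodeUtf8Hex("Й"));  // d099
console.log(encodeUtf8Hex("ЃЅ")); // d083d085
```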

Tunicate answered 21/6, 2023 at 0:35 Comment(0)

-3

Instead of

    sb.append(new String(bytes,"UTF-8"));

Try this

    sb.append(new String(bytes,"Windows-1251"));

Neocolonialism answered 3/4, 2017 at 23:53 Comment(0)
