Encoding::UndefinedConversionError

I keep getting an Encoding::UndefinedConversionError ("\xC2" from ASCII-8BIT to UTF-8) every time I try to convert a hash into a JSON string. I tried every combination of .encode and .force_encoding with "UTF-8" and "ASCII-8BIT", chaining .encode with .force_encoding and the reverse, and swapping the parameters, but nothing worked, so I rescued the error like this:

begin
  menu.to_json
rescue Encoding::UndefinedConversionError
  puts $!.error_char.dump
  p $!.error_char.encoding
end

Here menu is a Sequel dataset.to_hash with content from a MySQL DB that uses the utf8_general_ci collation. The rescue block printed this:

"\xC2"

#<Encoding:ASCII-8BIT>

The encoding never changes, no matter which .encode/.force_encoding call I use. I've even tried to strip the byte with .gsub!(/\\\xC2/), without luck.
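One way to see which values trigger the error is to walk the hash and list every string that is tagged ASCII-8BIT and contains non-ASCII bytes. This is only a minimal sketch: the helper name binary_strings and the sample menu are made up for illustration and are not part of Sequel.

```ruby
# Hypothetical helper: returns [path, string] pairs for every binary-tagged
# string that contains bytes outside the ASCII range.
def binary_strings(obj, path = [])
  case obj
  when String
    suspicious = obj.encoding == Encoding::ASCII_8BIT && !obj.ascii_only?
    suspicious ? [[path, obj]] : []
  when Hash
    obj.flat_map { |key, value| binary_strings(value, path + [key]) }
  when Array
    obj.each_with_index.flat_map { |value, i| binary_strings(value, path + [i]) }
  else
    []
  end
end

# Made-up data standing in for the dataset hash.
menu = { 1 => { name: "Soup" }, 2 => { name: "Cr\xC2pe".b } }
offenders = binary_strings(menu)
offenders.each { |path, str| puts "#{path.inspect}: #{str.dump}" }
```

Each reported path points at a value that would need an explicit source encoding before it can be serialized.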

Any ideas?

Boarder answered 21/10, 2012 at 23:34 Comment(2)
1. Did you try this? menu.force_encoding("ISO-8859-1").encode("UTF-8") 2. Add a "# encoding: utf-8" magic comment at the top of all your .rb files. 3. Check your environment settings. What does $ echo $LC_CTYPE in your terminal say?Emasculate
Did step 1 fail with an error? Did step 2 work? For step 3, thegreyblog.blogspot.in/2012/02/… has the environment settings that your program must run with in case you want to avoid the issue.Emasculate
B
96
menu.to_s.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')

This worked perfectly, I had to replace some extra characters but there are no more errors.
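For reference, here is a minimal sketch of what these options do to a single binary string (the sample string is made up). When transcoding from ASCII-8BIT, a byte above 0x7F has no defined UTF-8 counterpart, so undef: :replace substitutes the replacement string for it:

```ruby
# A binary-tagged string containing the problematic byte.
raw = "caf\xC2 au lait".b

# Replace anything that cannot be converted with '?' instead of raising.
clean = raw.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')

puts clean
puts clean.encoding
```

Keep in mind that to_s on a hash returns Ruby's inspect format rather than JSON, so applying this to menu.to_s removes the error but does not by itself produce a JSON string.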

Boarder answered 3/1, 2013 at 6:21 Comment(2)
Fantastic solution - solved my problem dealing with strange types in SQL Server. Thank you!Higginbotham
Thanks! It works for me too, official ruby doc for future reference hereAccused
F
21

What do you expect for "\xC2"? Probably an Â.

With ASCII-8BIT you have binary data, and Ruby can't decide which character it should be.

You must first set the encoding with force_encoding.
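The difference between the two calls can be sketched like this (the sample bytes are chosen for illustration): force_encoding only changes the label on the bytes, while encode converts the bytes from the labeled encoding into another one.

```ruby
raw = "\xC2\xA0".b                      # two bytes tagged ASCII-8BIT

# Relabeling: same bytes, different interpretation.
relabeled = raw.dup.force_encoding('UTF-8')
p relabeled.bytes == raw.bytes          # the bytes are untouched
p relabeled.valid_encoding?             # C2 A0 is a well-formed UTF-8 sequence

# Relabel as ISO-8859-1, then transcode: every byte becomes its own character.
latin = raw.dup.force_encoding('ISO-8859-1').encode('UTF-8')
p latin.bytes
```

So the same two bytes end up as one character or as two, depending on which source encoding you declare.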

You may try the following code:

Encoding.list.each{|enc|
  begin
    print "%-10s\t" % [enc]
    print "\t\xC2".force_encoding(enc)
    print "\t\xC2".force_encoding(enc).encode('utf-8')
  rescue => err
    print "\t#{err}"
  end
  print "\n"
}

The result shows the possible values of your "\xC2" in the different encodings.

The result may depend on your output format, but I think you can make a good guess as to which encoding you have.

Once you have determined the encoding you need (probably cp1252), you can call:

menu.force_encoding('cp1252').to_json
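Since menu in the question is a Hash, and Hash does not define force_encoding, the call has to be applied to each string inside it. A rough sketch under that assumption follows; the helper name reinterpret and the sample data are hypothetical, and cp1252 is only the guess made above.

```ruby
require 'json'

# Hypothetical helper: recursively relabel binary strings as source_encoding
# and transcode them to UTF-8; everything else is returned unchanged.
def reinterpret(obj, source_encoding)
  case obj
  when String
    return obj unless obj.encoding == Encoding::ASCII_8BIT
    obj.dup.force_encoding(source_encoding).encode('UTF-8')
  when Hash
    obj.each_with_object({}) do |(key, value), result|
      result[reinterpret(key, source_encoding)] = reinterpret(value, source_encoding)
    end
  when Array
    obj.map { |value| reinterpret(value, source_encoding) }
  else
    obj
  end
end

# Made-up data standing in for the dataset hash.
menu = { 1 => { name: "Cr\xC2pe".b } }
json = reinterpret(menu, 'cp1252').to_json
puts json
```

If the discovery loop suggests a different encoding, pass that name instead of 'cp1252'.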

See also Kashyap's comment.

Fraley answered 22/10, 2012 at 9:31 Comment(2)
this is what I did: ´Encoding.list.each{|enc| begin print "%-10s\t" % [enc] print menu.to_json.force_encoding(enc) print menu.to_json.force_encoding(enc).encode('utf-8') rescue => err print "\t#{err}" end print "\n" }´ and this is what I've got for each result: ´SJIS-KDDI "\xC2" from ASCII-8BIT to UTF-8´Boarder
This discovery loop for all encodings is brilliant. Helped me solve my own variant on this problem. Thanks!Akene
R
12

If you don't care about losing the strange characters, you can blow them away:

str.force_encoding("ASCII-8BIT").encode('UTF-8', undef: :replace, replace: '')
Rafe answered 30/12, 2012 at 21:11 Comment(2)
Didn't work :( Encoding::UndefinedConversionError at /menu "\xC2" from ASCII-8BIT to UTF-8Boarder
menu.to_s.encode('UTF-8', {:invalid => :replace, :undef => :replace, :replace => '?'}) -> this worked! :DBoarder
W
10

Your auto-accepted solution doesn't work: there are indeed no more errors, but the result is NOT JSON.

I solved the problem using the oj gem; it now works fine. It is also faster than the standard JSON library.

Writing:

   menu_json = Oj.dump menu

Reading:

   menu2 = Oj.load menu_json

https://github.com/ohler55/oj for more details. I hope it will help.

Waspish answered 20/9, 2013 at 15:47 Comment(4)
The problem was the error, not the JSON part. Therefore, my auto-accepted answer works. Anyway, I'll upvote you for giving an alternative solution.Boarder
Well, I agree with you, there are no longer errors, but it's not a JSON string. I don't know what your purpose was, but I needed to load my JSON back, and I wanted a valid JSON string. Or maybe I have missed something in your proposed solution?Waspish
This question was only about the error, I'm not saying my answer is the best choice, clearly it isn't for your purpose, but solves the problem presented: the encoding error. The JSON I mention in my question is for contextualization purposes.Boarder
Thanks @Waspish ! Hopefully others googling for this error will find this solution...Rexanne
O
1

The :fallback option can be useful if you know which characters you want to replace:

"Text 🙂".encode("ASCII", "UTF-8", fallback: {"🙂" => ":)"})
#=> "Text :)"

From docs:

Sets the replacement string by the given object for undefined character. The object should be a Hash, a Proc, a Method, or an object which has [] method. Its key is an undefined character encoded in the source encoding of current transcoder. Its value can be any encoding until it can be converted into the destination encoding of the transcoder.
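As the quoted docs mention, the fallback can also be a Proc. The following is a small sketch of that variant; the lookup table and the '?' default are assumptions chosen for illustration. It avoids an error for characters that are missing from the table.

```ruby
# Known substitutions; anything else falls back to '?'.
table = { "\u{1F642}" => ":)", "\u00E9" => "e" }
fallback = ->(char) { table.fetch(char, "?") }

# "Café 🙂 ñ" written with escapes so the source file stays ASCII-only.
text = "Caf\u00E9 \u{1F642} \u00F1"
ascii = text.encode("ASCII", fallback: fallback)
puts ascii
```

With a plain Hash, a character that has no entry still raises Encoding::UndefinedConversionError, which is why the lambda supplies a default.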

Ortiz answered 29/1, 2020 at 12:38 Comment(0)
