MySQL changes UTF-8 to ASCII-8BIT

I have this scenario.

A movie title:

$ title = "La leyenda de Osaín"

With this encoding:

$ title.encoding.name
>> "UTF-8"

I then save it to the database.

$ movie = Movie.create!(:title => title)

Then I try to get the movie.

$ Movie.find(movie.id).title.encoding.name
>> "ASCII-8BIT"

$ Movie.find(movie.id).title
>> "La leyenda de Osa\xC3\xADn"

All other movies, whose titles do not contain special characters like í and û, work fine.

This is my database.yml file:

development:
  adapter: mysql
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8

I get the right data when I call force_encoding on the result.

$ Movie.find(movie.id).title.force_encoding("UTF-8")
>> "La leyenda de Osaín"
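The reason force_encoding helps here is that it only changes the encoding label on the string and leaves the bytes untouched. Here is a minimal sketch of that in plain Ruby, with no database involved. The variable names are made up for illustration, and the "raw" string only simulates what the adapter appears to return (correct UTF-8 bytes tagged as binary).

```ruby
# encoding: utf-8
# (the magic comment above matters on Ruby 1.9; later versions default to UTF-8)

# Simulated adapter result: valid UTF-8 bytes carrying the binary label.
# Encoding::BINARY is the same encoding that irb reports as ASCII-8BIT.
raw = "La leyenda de Osa\xC3\xADn".dup.force_encoding(Encoding::BINARY)

# force_encoding relabels the string; dup keeps the original for comparison.
fixed = raw.dup.force_encoding("UTF-8")

same_bytes  = (raw.bytes == fixed.bytes) # the byte sequence is identical
raw_chars   = raw.length                 # binary: every byte counts as one character
fixed_chars = fixed.length               # UTF-8: "í" is one character made of two bytes
```

Note that force_encoding modifies its receiver in place, so this is a workaround for each read rather than a fix for the connection itself.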

I'm using Rails 3.0.5.rc1 with MySQL 14.14.

Does anyone know what the problem might be?

Maritamaritain answered 25/2, 2011 at 21:18

I found a solution to my problem. Now I'm using the newer mysql2 gem.

I replaced gem "mysql" with gem "mysql2" inside the Gemfile.

Then I changed the database adapter inside the database.yml file.

From:

development:
  adapter: mysql
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8

To:

development:
  adapter: mysql2
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8

I think this is what made the difference in my case. Quoting the mysql2 README on GitHub:

[...]It also forces the use of UTF-8 [or binary] for the connection [and all strings in 1.9[...]
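To check that the switch had the intended effect, something like the following could be run against a few values. This is only a sketch: utf8_clean? is a hypothetical helper, not part of Rails or mysql2, and the two strings merely simulate what each adapter returned in my case.

```ruby
# encoding: utf-8

# Hypothetical helper: true when the string is tagged as UTF-8
# and its bytes form valid UTF-8.
def utf8_clean?(str)
  str.encoding == Encoding::UTF_8 && str.valid_encoding?
end

# Simulated values; no database connection is made here.
# Old adapter: correct bytes, but labelled binary (ASCII-8BIT).
from_old_adapter = "La leyenda de Osa\xC3\xADn".dup.force_encoding(Encoding::BINARY)
# mysql2: the same bytes, labelled UTF-8.
from_mysql2 = "La leyenda de Osa\xC3\xADn"

old_ok = utf8_clean?(from_old_adapter) # false: wrong label
new_ok = utf8_clean?(from_mysql2)      # true
```

In a Rails console the same check could be applied to Movie.find(movie.id).title.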

Maritamaritain answered 25/2, 2011 at 21:56

According to this link, Rails scaffolding creates varchar(255) columns in MySQL. The MySQL documentation says the following about varchar(255):

For example, a VARCHAR(255) column can hold a string with a maximum length of 255 characters. Assuming that the column uses the latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string.

My guess is that the column in the database doesn't support characters that are represented by more than one byte. This link has more information about common pitfalls in Rails when dealing with Unicode strings. More specifically, it says you need to create your database as utf8, like so:

CREATE DATABASE my_web_two_zero_development DEFAULT CHARSET utf8;
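To illustrate the "one byte per character" point from the quoted documentation, here is a small Ruby sketch of how the í from the title is represented in each character set. It only shows byte counts and is not a diagnosis of the question's database. ISO-8859-1 is used as a stand-in for MySQL's latin1, which is an assumption, since the two are close but not identical.

```ruby
# encoding: utf-8

char = "í"

utf8_bytes   = char.bytesize                      # two bytes in UTF-8
latin1_bytes = char.encode("ISO-8859-1").bytesize # one byte in ISO-8859-1
utf8_raw     = char.bytes                         # [0xC3, 0xAD], the \xC3\xAD seen in the question
```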
Copalite answered 25/2, 2011 at 21:32
