How to properly use UTF-8-encoded data from Schema inside Catalyst app?

Data defined inside the Catalyst app or in templates has the correct encoding and is displayed properly, but everything non-Latin-1 coming from the database is converted to ?. I suspect the problem is in the model class, which looks like this:

use strict;
use base 'Catalyst::Model::DBIC::Schema';

__PACKAGE__->config(
 schema_class => 'vhinnad::Schema::DB',

 connect_info => {
     dsn => 'dbi:mysql:test',
     user => 'user',
     password => 'password',
     {
         AutoCommit        => 1,
         RaiseError        => 1,
         mysql_enable_utf8 => 1,
     },
     'on_connect_do' => [
             'SET NAMES utf8',
     ],      
     }
);

1;

I see no flaws here, but something must be wrong. I also used my schema with test scripts, where the data was encoded correctly and the output was right, but inside the Catalyst app I cannot get the encoding right. Where might the problem be?

EDIT

For future reference, here is the solution: I had mixed the old and new styles in connect_info.

The old style is positional: (dsn, username, password, hashref_options, hashref_other_options)

The new style uses named keys (dsn => dsn, user => username, etc.), so the correct form is:

 connect_info => {
     dsn               => 'dbi:mysql:test',
     user              => 'user',
     password          => 'password',
     AutoCommit        => 1,
     RaiseError        => 1,
     mysql_enable_utf8 => 1,
     on_connect_do     => [
             'SET NAMES utf8',
     ],      
 }
Miscellaneous answered 30/10, 2012 at 12:11 Comment(0)

In a typical Catalyst setup with Catalyst::View::TT and Catalyst::Model::DBIC::Schema you'll need several things for UTF-8 to work:

  • add Catalyst::Plugin::Unicode::Encoding to your Catalyst app
  • add encoding => 'UTF-8' to your app config
  • add ENCODING => 'utf-8' to your TT view config
  • add <meta http-equiv="Content-type" content="text/html; charset=UTF-8"/> to the <head> section of your HTML to satisfy old versions of IE, which ignore the Content-Type: text/html; charset=utf-8 HTTP header set by Catalyst::Plugin::Unicode::Encoding
  • make sure your text editor saves your templates in UTF-8 if they include non-ASCII characters
  • configure your DBIC model according to DBIx::Class::Manual::Cookbook#Using Unicode
  • if you use Catalyst::Authentication::Store::LDAP, configure your LDAP stores to return UTF-8 by adding ldap_server_options => { raw => 'dn' }
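
A minimal sketch of the app and view items from this checklist, assuming a hypothetical application named MyApp with a TT view class MyApp::View::HTML (both names are placeholders, not taken from the question):

```perl
# lib/MyApp.pm (hypothetical application class)
package MyApp;
use Moose;
use namespace::autoclean;

# Loads Catalyst::Plugin::Unicode::Encoding. On Catalyst 5.90040 and
# later the plugin is part of core, so this entry can be dropped there.
use Catalyst qw/Unicode::Encoding/;
extends 'Catalyst';

# Encoding used for request parameters and the response body
__PACKAGE__->config( encoding => 'UTF-8' );
__PACKAGE__->setup;

# lib/MyApp/View/HTML.pm (hypothetical TT view class)
package MyApp::View::HTML;
use Moose;
extends 'Catalyst::View::TT';

# Tell Template Toolkit that the template files are stored as UTF-8
__PACKAGE__->config( ENCODING => 'utf-8' );

1;
```

In a real app the two packages live in separate files; they are shown together here only to keep the sketch short.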

According to Catalyst::Model::DBIC::Schema#connect_info:

The old arrayref style with hashrefs for DBI then DBIx::Class options is also supported.
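
For comparison, the old arrayref style for the connection in the question would look roughly like this. It is a sketch of the positional form described in that documentation, with the DBI attributes in the first hashref and the DBIx::Class options in the second:

```perl
# Old (positional) style: dsn, user, password, DBI attributes, DBIC options
connect_info => [
    'dbi:mysql:test',
    'user',
    'password',
    {   # hashref of DBI attributes
        AutoCommit        => 1,
        RaiseError        => 1,
        mysql_enable_utf8 => 1,
    },
    {   # hashref of DBIx::Class options
        on_connect_do => [ 'SET NAMES utf8' ],
    },
],
```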

But you are already using the 'new' style, so you shouldn't nest the DBI attributes:

connect_info => {
     dsn               => 'dbi:mysql:test',
     user              => 'user',
     password          => 'password',
     AutoCommit        => 1,
     RaiseError        => 1,
     mysql_enable_utf8 => 1,
     on_connect_do     => [
         'SET NAMES utf8',
     ],      
}
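
Why the nested hashref goes wrong can be seen in plain Perl, without Catalyst. The sketch below builds both hashes from the question and checks for the mysql_enable_utf8 key. In the mixed version the bare hashref is not a value for any named key; Perl stringifies it into a key of its own, so the option never appears at the top level. The variable names are made up for illustration, and how the model class then treats the stray keys is not shown here.

```perl
use strict;
use warnings;

# Mixed style as in the question: a bare hashref sits between key/value
# pairs, so the outer hash is built from an odd number of elements.
my $mixed = do {
    no warnings 'misc';    # silence "Odd number of elements in anonymous hash"
    +{
        dsn      => 'dbi:mysql:test',
        user     => 'user',
        password => 'password',
        { AutoCommit => 1, RaiseError => 1, mysql_enable_utf8 => 1 },
        'on_connect_do' => [ 'SET NAMES utf8' ],
    };
};

# Flat (new) style: every option is a top-level key.
my $flat = {
    dsn               => 'dbi:mysql:test',
    user              => 'user',
    password          => 'password',
    AutoCommit        => 1,
    RaiseError        => 1,
    mysql_enable_utf8 => 1,
    on_connect_do     => [ 'SET NAMES utf8' ],
};

# Report whether the UTF-8 option is visible as a top-level key.
for my $case ( [ 'mixed-style', $mixed ], [ 'flat-style', $flat ] ) {
    my ( $label, $info ) = @$case;
    printf "%s: mysql_enable_utf8 %s\n", $label,
        exists( $info->{mysql_enable_utf8} ) ? 'present' : 'missing';
}
```

In the mixed hash the stringified hashref becomes a key whose value is the string 'on_connect_do', so the on_connect_do statements are lost as well.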
Stereoscopic answered 30/10, 2012 at 21:18 Comment(1)
You can skip the Unicode::Encoding plug-in since Catalyst 5.90040; the plug-in is now included within Catalyst's core. - Sewellel

This advice assumes you have fairly up-to-date versions of DBIC and Catalyst.

  • This is not necessary: on_connect_do => [ 'SET NAMES utf8' ]
  • Ensure the table and column charsets are UTF-8 in your DB. Output can sometimes look right even when parts of the chain are broken, but the DB must be storing the character data as UTF-8 if you expect the entire chain to work.
  • Ensure you're using and configuring Catalyst::Plugin::Unicode::Encoding in your Catalyst app. It had fairly serious bugs in the not-too-distant past, so get the newest version.
Bushbuck answered 30/10, 2012 at 16:59 Comment(8)
I have freshly installed the most recent DBIC and Catalyst. I started from the point you mentioned, but when that did not work I added 'SET NAMES utf8' too. All the databases I use have held proper UTF-8-encoded data for years. I have enabled the Unicode::Encoding plugin too. - Miscellaneous
So, if you do show create table TableName you see something like CHARSET=utf8 and not Latin-1, right? And you have something like __PACKAGE__->config( encoding => "UTF-8" ) somewhere in your app? - Bushbuck
Yes to both questions. Like I said: every UTF-8 string defined inside the app comes out as it is supposed to. The schemas also work fine when I use them with other scripts. But I can't get the two to work together. - Miscellaneous
If you use Template::Toolkit and Catalyst::View::TT you have to configure ENCODING => 'utf-8' in your view class. Furthermore, you should put <meta http-equiv="Content-type" content="text/html; charset=UTF-8"/> in your HTML head. - Stereoscopic
Well, there is some disconnect somewhere then, or perhaps you have ancillary config/code biting you someplace? That setup does work as advertised. You're not using any deprecated stuff like the UTF8-columns DBIC thing? Your DBD is reasonably current? You aren't writing your own HTTP headers? Can you present a minimal MyApp-style version that shows the problem? (Oh, just saw @abraxxa's comment++, good thing to try.) - Bushbuck
@abraxxa: all this is also in place, and if I define any complex UTF-8 string manually inside the app, it is displayed correctly. But I left TT out of my question because I have the problem with pure JSON output, without templates involved. And I checked: the JSON response has the proper encoding in its headers too. - Miscellaneous
@w.k, Ah, now it becomes clearer. The Unicode::Encoding plugin doesn't operate on JSON (this is a known/debated issue). You'll have to do a bit more on your own. Give some details about your JSON view/generation and I (later) or someone (sooner) will help you. - Bushbuck
@Ashley: No, I am not using UTF8-columns. DBD::mysql is 4.020. I am not writing HTTP headers (and non-DB strings are OK). Showing the whole app seemed a bit complex, but I will think about how to put it together. - Miscellaneous
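
For the JSON case raised in the comments, one possible approach is to encode the response body to UTF-8 bytes explicitly. This is an assumption, since the thread never shows the JSON view in use. A minimal sketch with the core JSON::PP module and made-up row data, not tied to any particular Catalyst view:

```perl
use strict;
use warnings;
use JSON::PP ();    # core module since Perl 5.14

# Hypothetical row data: a decoded character string containing
# CYRILLIC CAPITAL LETTER ZHE (U+0416), which is outside Latin-1.
my $data = { name => "\x{416}" };

# ->utf8 makes encode() return UTF-8 encoded bytes instead of characters,
# so the result can be sent as a response body without further encoding.
my $json = JSON::PP->new->utf8->canonical;
my $json_bytes = $json->encode($data);

# Decoding the bytes with the same settings restores the characters.
my $round_trip = $json->decode($json_bytes);

printf "bytes: %d, characters in name: %d\n",
    length($json_bytes), length( $round_trip->{name} );
```

The point of the sketch is the split between characters inside the app and bytes on the wire: data from the database should arrive as decoded characters (which is what mysql_enable_utf8 is for), and exactly one layer should turn them into UTF-8 bytes on output.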
