The Problem
When using native PHP ODBC features (PDO_ODBC or the older odbc_* functions) with the Access ODBC driver, text is not returned as UTF-8, even though it is stored in the Access database as Unicode characters. So, for a sample table named "Teams"
Team
-----------------------
Boston Bruins
Canadiens de Montréal
Федерация хоккея России
the code
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'odbc:' .
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb;' .
'Uid=Admin;';
$db = new PDO($connStr);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT Team FROM Teams";
foreach ($db->query($sql) as $row) {
    $s = $row["Team"];
    echo $s . "<br/>\n";
}
?>
</body>
</html>
displays this in the browser
Boston Bruins
Canadiens de Montr�al
????????? ?????? ??????
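A quick way to confirm that the problem lies in the bytes PHP receives, and not in the browser, is to dump the raw bytes and test them for UTF-8 validity. The sketch below does not touch the database: it simulates the driver's output on the assumption that "é" arrives as the single byte 0xE9, and `describeBytes()` is a hypothetical helper name. It needs the mbstring extension.

```php
<?php
// Hypothetical helper: show a string's raw bytes and whether they form valid UTF-8.
function describeBytes(string $s): string
{
    $valid = mb_check_encoding($s, 'UTF-8') ? 'valid UTF-8' : 'NOT valid UTF-8';
    return bin2hex($s) . ' (' . $valid . ')';
}

// Assumption: this is what the ODBC driver hands back for "Montréal".
$fromDriver = "Montr\xE9al";
// The same word as proper UTF-8, where "é" is the two bytes C3 A9.
$asUtf8 = "Montr\xC3\xA9al";

echo describeBytes($fromDriver), "\n"; // 4d6f6e7472e9616c (NOT valid UTF-8)
echo describeBytes($asUtf8), "\n";     // 4d6f6e7472c3a9616c (valid UTF-8)
```

In the real page, calling the helper on `$row["Team"]` instead of the simulated string would show what the driver actually returned.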
The Easy but Incomplete Fixes
The text returned by Access ODBC actually matches the Windows-1252 character encoding for the characters in that character set, so simply changing the line
$s = $row["Team"];
to
$s = utf8_encode($row["Team"]);
will allow the second entry to be displayed correctly
Boston Bruins
Canadiens de Montréal
????????? ?????? ??????
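but the utf8_encode() function converts from ISO-8859-1, not Windows-1252, so characters that exist only in Windows-1252 (notably the Euro symbol '€', byte 0x80) are turned into invisible control characters and effectively disappear. utf8_encode() is also deprecated as of PHP 8.2. A better solution would be to use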
$s = mb_convert_encoding($row["Team"], "UTF-8", "Windows-1252");
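but that still wouldn't solve the problem with the third entry in our sample table: the Cyrillic characters have no Windows-1252 equivalent, so they have already been replaced with question marks by the time PHP receives the text.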
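The difference between the two conversions can be checked without a database by converting individual bytes and looking at the result in hex. This is a minimal sketch: `toUtf8Hex()` is a hypothetical helper, and the ISO-8859-1 conversion stands in for utf8_encode() so that the example also runs cleanly on PHP 8.2 and later. It needs the mbstring extension.

```php
<?php
// Hypothetical helper: convert a byte string to UTF-8 and return the result as hex.
function toUtf8Hex(string $bytes, string $sourceEncoding): string
{
    return bin2hex(mb_convert_encoding($bytes, 'UTF-8', $sourceEncoding));
}

$eAcute = "\xE9"; // "é" in both ISO-8859-1 and Windows-1252
$euro   = "\x80"; // "€" in Windows-1252, a control character in ISO-8859-1

echo toUtf8Hex($eAcute, 'ISO-8859-1'), "\n";   // c3a9
echo toUtf8Hex($eAcute, 'Windows-1252'), "\n"; // c3a9
echo toUtf8Hex($euro, 'ISO-8859-1'), "\n";     // c280   (U+0080, invisible control character)
echo toUtf8Hex($euro, 'Windows-1252'), "\n";   // e282ac (U+20AC, the Euro sign)

// A question mark is a plain 0x3F byte, so no conversion can turn it back into Cyrillic.
echo toUtf8Hex('??????', 'Windows-1252'), "\n"; // 3f3f3f3f3f3f
```

The last line is the reason neither quick fix helps with the Russian entry: the original characters are no longer present in the string at all.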
The Complete Fix
For full UTF-8 support we need to use COM (available only on Windows, through PHP's com_dotnet extension) with ADODB Connection and Recordset objects like so
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb';
$con = new COM("ADODB.Connection", NULL, CP_UTF8); // specify UTF-8 code page
$con->Open($connStr);
$rst = new COM("ADODB.Recordset");
$sql = "SELECT Team FROM Teams";
$rst->Open($sql, $con, 3, 3); // adOpenStatic, adLockOptimistic
while (!$rst->EOF) {
    $s = $rst->Fields("Team");
    echo $s . "<br/>\n";
    $rst->MoveNext(); // call MoveNext explicitly as a method
}
$rst->Close();
$con->Close();
?>
</body>
</html>
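If the values are needed for more than a single echo, the loop can be pulled into small functions. The following is only a sketch under stated assumptions: `collectColumn()` and `renderLines()` are hypothetical names, and `$rst` is assumed to be an open ADODB.Recordset like the one above, exposing `EOF`, `Fields($name)->Value` and `MoveNext()`. Escaping with htmlspecialchars() is an addition not present in the original code; it keeps characters such as `&` or `<` in the data from being interpreted as HTML.

```php
<?php
// Hypothetical helper: walk an ADO-style recordset and collect one column as strings.
// Assumes $rst exposes EOF, Fields($name)->Value and MoveNext(), as ADODB.Recordset does.
function collectColumn($rst, string $column): array
{
    $values = [];
    while (!$rst->EOF) {
        // Read the field's Value explicitly and cast it to a PHP string.
        $values[] = (string) $rst->Fields($column)->Value;
        $rst->MoveNext();
    }
    return $values;
}

// Hypothetical helper: HTML-escape each value and join the lines for output.
function renderLines(array $values): string
{
    $html = '';
    foreach ($values as $v) {
        $html .= htmlspecialchars($v, ENT_QUOTES, 'UTF-8') . "<br/>\n";
    }
    return $html;
}
```

With these in place, the while loop in the page could be replaced by `echo renderLines(collectColumn($rst, "Team"));`.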