Reading Unicode characters from an Access database via JDBC-ODBC

Asked 5/10, 2013 at 0:37 Answered 17/8, 2017 at 14:52

I have some non-standard characters in my Access 2010 database. When I read them via

Connection con = null;
try{
    Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    java.util.Properties prop = new java.util.Properties();
    prop.put("charSet", "UTF8");
    String database = "jdbc:odbc:Lb";
    con = DriverManager.getConnection(database, prop);
} catch (Exception ex) {
    System.out.println("Error");
}
Statement stm = con.createStatement();
ResultSet rs = stm.executeQuery("SELECT distinct forename, surname from PSN where isValid");

while (rs.next()) {
    String forename = rs.getString("forename");
}

I receive question marks (?) where the character should be. Why is this?

Antivenin answered 5/10, 2013 at 0:37 Comment(1)

possible duplicate of Java ODBC MS-Access Unicode character problems – Cysticercoid 9/7, 2014 at 9:39

I had question marks when the DB contained Polish characters. It was fixed when I set the character encoding to windows-1250.

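import java.io.File
import java.sql.{Connection, DriverManager}
import java.util.Properties

def establish(dbFile: File): Connection = {
  val fileName = dbFile.getAbsolutePath
  // `driver` is defined elsewhere (not shown in this post)
  val database = s"${driver}DBQ=${fileName.trim};DriverID=22;READONLY=true}"
  val props = new Properties()
  // Cp1250 is Java's name for windows-1250
  props.put("charSet", "Cp1250")
  DriverManager.getConnection(database, props)
}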

Rok answered 18/9, 2014 at 13:22 Comment(0)

I expect your JDBC driver to handle reading and writing characters to your database transparently. Java's internal string representation is UTF-16.

Java(UTF-16)         --JDBC--> Database(DbEncoding)
Database(DbEncoding) --JDBC--> Java(UTF-16)

Perhaps the problem is that you are trying to force reading them with UTF8 and the database uses another internal representation?

Also, how do you verify that you receive '?'?

If System.out is involved, you should take into consideration that this PrintStream converts in-memory Strings to the Charset that it uses. IIRC this Charset can be found with Charset.defaultCharset() and is a property of the JVM that runs the program.
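
To illustrate the point about the PrintStream, here is a minimal sketch. The class name ConsoleCharsetCheck and the helper printedBytes are made up for this example; it only shows that a PrintStream replaces characters its charset cannot represent, and says nothing about what the JDBC-ODBC bridge itself does.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

class ConsoleCharsetCheck {
    // Hypothetical helper: prints text through a PrintStream that uses the
    // named encoding and returns the raw bytes that were written.
    static byte[] printedBytes(String text, String encoding) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(sink, true, encoding);
            out.print(text);
            out.flush();
            return sink.toByteArray();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // The charset the JVM reports as its default.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // U+0141 (L with stroke) does not exist in ISO-8859-1, so the stream
        // writes the single byte 0x3F, which is '?'.
        System.out.println(printedBytes("\u0141", "ISO-8859-1").length);

        // In UTF-8 the same character is written as two bytes.
        System.out.println(printedBytes("\u0141", "UTF-8").length);
    }
}
```

If the first printed line names a charset that cannot represent your characters, the question marks may come from printing rather than from the database read.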

It is preferable to inspect the hexadecimal value of the char and look up a Unicode table to be sure that information has been lost while reading from the database.
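
A minimal sketch of such a check, assuming you can pass it the String returned by rs.getString(...). The class name CodePointDump and the method describe are made up for this example. The output is plain ASCII, so the console encoding cannot distort it.

```java
class CodePointDump {
    // Returns every code point of the text in U+XXXX notation, separated by spaces.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            // Advance by one or two chars, depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample input: one non-ASCII letter, two ASCII letters and a literal '?'.
        System.out.println(describe("\u0141uk?")); // prints: U+0141 U+0075 U+006B U+003F
    }
}
```

If describe(rs.getString("forename")) shows U+003F where the special character should be, the String itself already contains a question mark and the information was lost before printing. If it shows the expected code point (for example U+0141), the read worked and only the output is at fault.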

Hope this helps a bit.

Leading answered 5/10, 2013 at 5:49 Comment(0)

This is a long-standing interoperability issue between the Access ODBC driver and the JDBC-ODBC Bridge. Access stores Unicode characters using a variation of UTF-16LE encoding (not UTF-8) and the JDBC-ODBC bridge is unable to retrieve them.

(Note that this is not a problem with the Access ODBC driver per se because other tools like pyodbc for Python can retrieve the Unicode characters correctly. It is a compatibility issue between the JDBC-ODBC Bridge and the Access ODBC driver.)

A bug report was filed with Sun in November 2005 outlining the issue. That report was closed as "Won't Fix" in April 2013 with the comment

The bridge has been removed from Java SE 8 and is not supported

If you need to work with arbitrary Unicode characters in an Access database you should consider using UCanAccess. For more information, see

Manipulating an Access database from Java without ODBC
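
As a rough sketch, reading the same table through UCanAccess could look like the following. It assumes the UCanAccess JAR and its dependencies are on the classpath. The file path is a placeholder, the class and method names are made up for this example, and the query is copied from the question.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class UcanaccessRead {
    // Builds a UCanAccess JDBC URL: the prefix "jdbc:ucanaccess://" followed by the file path.
    static String urlFor(String dbPath) {
        return "jdbc:ucanaccess://" + dbPath;
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder path; pass the real .mdb/.accdb location as the first argument.
        String dbPath = args.length > 0 ? args[0] : "C:/data/Lb.accdb";

        // try-with-resources closes the result set, statement and connection in reverse order.
        try (Connection con = DriverManager.getConnection(urlFor(dbPath));
             Statement stm = con.createStatement();
             ResultSet rs = stm.executeQuery(
                     "SELECT DISTINCT forename, surname FROM PSN WHERE isValid")) {
            while (rs.next()) {
                System.out.println(rs.getString("forename") + " " + rs.getString("surname"));
            }
        }
    }
}
```

No ODBC data source and no charSet property are involved here, since the driver reads the database file directly.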

Cysticercoid answered 7/1, 2014 at 21:25 Comment(0)

It's neither "UTF8" nor "Cp1250"!

One must use ISO-8859-1:

java.util.Properties prop = new java.util.Properties();
prop.put("charSet", "ISO-8859-1");
// accessFileName holds the path to the .mdb/.accdb file
String connURL = "jdbc:odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=" + accessFileName + ";uid=''; pwd='';";
String sql = "SELECT * FROM enq_horaires;";
Connection con = DriverManager.getConnection(connURL, prop);
Statement stmt = con.createStatement();
ResultSet rset = stmt.executeQuery(sql);

Juxon answered 17/8, 2017 at 14:52 Comment(0)
