I'm trying to load localized strings from a unicode (UTF8-encoded) csv using SQL Loader into an oracle database. I've tried all sort of combinations but nothing seems to give me the result I'm looking for which is to have special greek characters like (Δ) not get converted to Δ or ¿.
My table definition looks like this:
CREATE TABLE "GLOBALIZATIONRESOURCE"
(
"RESOURCETYPE" VARCHAR2(255 CHAR) NOT NULL ENABLE,
"CULTURE" VARCHAR2(20 CHAR) NOT NULL ENABLE,
"KEY" VARCHAR2(128 CHAR) NOT NULL ENABLE,
"VALUE" VARCHAR2(2048 CHAR),
"DESCRIPTION" VARCHAR2(512 CHAR),
CONSTRAINT "PK_GLOBALIZATIONRESOURCE" PRIMARY KEY ("RESOURCETYPE","CULTURE","KEY") USING INDEX TABLESPACE REPSPACE_IX ENABLE
)
TABLESPACE REPSPACE;
I have tried the following configurations in my control file (and actually every permutation I could think of)
load data
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
"RESOURCETYPE" CHAR(255),
"CULTURE" CHAR(20),
"KEY" CHAR(128),
"VALUE" CHAR(2048),
"DESCRIPTION" CHAR(512)
)
load data
CHARACTERSET UTF8
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
"RESOURCETYPE" CHAR(255),
"CULTURE" CHAR(20),
"KEY" CHAR(128),
"VALUE" CHAR(2048),
"DESCRIPTION" CHAR(512)
)
load data
CHARACTERSET UTF16
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022'
TRAILING NULLCOLS
(
"RESOURCETYPE" CHAR(255),
"CULTURE" CHAR(20),
"KEY" CHAR(128),
"VALUE" CHAR(2048),
"DESCRIPTION" CHAR(512)
)
With the first two options, the unicode characters don't get encoded and just show up as upside down question marks.
If I choose last option, UTF16, then I get the following error even though all my data in my fields are much shorter than the length specified.
Field in data file exceeds maximum length
It seems as though every possible combination of ctl file configurations (even setting the byte order to little and big) doesn't work correctly. Can someone please give an example of a configuration (table structure and CTL file) that correctly loads unicode data from a csv? Any help would be greatly appreciated.
Note: I've already been to http://docs.oracle.com/cd/B19306_01/server.102/b14215/ldr_concepts.htm, http://docs.oracle.com/cd/B10501_01/server.920/a96652/ch10.htm and http://docs.oracle.com/cd/B10501_01/server.920/a96652/ch10.htm.
DUMP
function to determine whether the data in the database is actually properly encoded? – GambrillNVARCHAR2
data types (assuming your national character set supports Unicode), make sure that all your downstream applications supportNVARCHAR2
data types. Using the national character set rather than the database character set introduces a decent amount of complexity. – Gambrill