I am loading a data dump from an external source, and some strings contain \uXXXX
escape sequences in place of the actual UTF8 characters, like this one:
\u017D\u010F\u00E1r nad S\u00E1zavou
I can check the contents by using an E'' constant in psql, but I cannot find any function or operator that returns the proper value.
Is it possible to convert a string with Unicode escapes like this into normal UTF8 without using PL/pgSQL functions?
Use `E'\u017D\u010F\u00E1r nad S\u00E1zavou'` to have it properly interpreted; what else would you like to do? – Karikaria
I'd like to run `UPDATE table SET proper = somefunc('\u017D\u010F\u00E1r nad S\u00E1zavou') WHERE id=1;` and get the expected UTF8 string. – Harbour
`UPDATE table SET proper = E'\u017D\u010F\u00E1r nad S\u00E1zavou' WHERE id=1;` What do you get when you run `SHOW server_encoding;`? How about `SHOW client_encoding;`? – Karikaria
`UTF8`. The problem is that I cannot use literal constants. I need a table-wide query, like `UPDATE table SET proper = somefunc(badtext);`, for an 8M row table. Copy-paste is not an option. – Harbour
Where is `badtext` coming from? What encoding is it in? Is this in a disk file where you could just run an encoding conversion utility on it? Depending on context, one of the convert_* functions in this table might possibly help: postgresql.org/docs/9.1/static/… – Karikaria
The `\uXXXX` sequences are not the result of a bad encoding conversion. They're just there, literally. – Harbour
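If the dump is still available as a file, one option in the spirit of the comment about running a conversion utility on it is to decode the literal \uXXXX sequences before loading, outside the database. The following is only a minimal Python sketch under stated assumptions: the function name `decode_u_escapes` is hypothetical, every escape is exactly four hex digits, and no character is encoded as a UTF-16 surrogate pair. It does not answer the in-database part of the question.

```python
import re

# Matches a literal backslash, the letter 'u', and exactly four hex digits.
# Assumption: the dump never uses surrogate pairs for characters above U+FFFF.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace each literal \\uXXXX sequence with the character it names.

    Hypothetical helper for illustration; text without escapes is returned
    unchanged.
    """
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The sample string from the question, written as a raw string so the
# backslashes stay literal, exactly as they appear in the dump.
sample = r"\u017D\u010F\u00E1r nad S\u00E1zavou"
print(decode_u_escapes(sample))  # prints: Žďár nad Sázavou
```

Applied line by line to the dump file (read and written as UTF-8), this would leave ordinary text untouched and only rewrite the escape sequences, so the loaded column would already hold the proper values.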