Widestring to string conversion in Delphi 7

My app is a non-Unicode app written in Delphi 7.

I'd like to convert Unicode strings to ANSI with this function:

function convertU(ws : widestring) : string;
begin
  result := string(ws);
end;

I also use this code to set the code page used for the conversion:

initialization
  SetThreadLocale(GetSystemDefaultLCID);
  GetFormatSettings;

It works great in the VCL main thread but not in a TThread, where convertU returns question marks ('?').

Why doesn't it work in a TThread?

Hypesthesia answered 9/9, 2012 at 6:54 Comment(6)
First, you don't need a function or typecast to do this; a simple stringVar := wideStringVar; will work. Second, the problem is that not all WideChars are directly convertible to an AnsiString; some are more than one character in width, some have character values that are not representable in an AnsiChar, and some fonts don't contain all possible Unicode values. If you're seeing ?, it means you're displaying them, which could be the third problem: threads should not access GUI controls without using Synchronize. Since you didn't post the display code, it's hard to say if that's it. - Dynamometry
The question is: why do I get question marks when the function is used in a TThread? - Hypesthesia
I repeat: If you're seeing ?, you're displaying the text. You've provided zero code or information regarding how you're displaying it. - Dynamometry
I look at the returned values in the debugger. - Hypesthesia
Is a TThread aware of the system default LCID? - Hypesthesia
? is what you get when a conversion of a codepoint from Unicode to MBCS fails. Since you are talking about the contents of an AnsiString, and since ? is in the common ASCII range, we can rule out display bugs. - Sclerenchyma

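Calling SetThreadLocale() inside of an initialization block has no effect on TThread: the initialization block runs in the main thread, and SetThreadLocale() only changes the locale of the thread that calls it. If you want to set a thread's locale, you have to call SetThreadLocale() inside of the TThread.Execute() method.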
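
As a minimal sketch of that suggestion, the unit below sets the locale at the top of Execute. The unit and class names (LocaleThread, TConvertThread) are hypothetical and only serve the illustration; SetThreadLocale and GetSystemDefaultLCID are the same Windows calls used in the question.

```pascal
unit LocaleThread;

interface

uses
  Windows, Classes;

type
  // Hypothetical worker thread; the name is illustrative only.
  TConvertThread = class(TThread)
  protected
    procedure Execute; override;
  end;

implementation

procedure TConvertThread.Execute;
begin
  // Execute runs in the context of the new thread, so this call sets the
  // locale of the worker thread rather than that of the main thread.
  SetThreadLocale(GetSystemDefaultLCID);

  // ... WideString to string conversions done by this thread go here ...
end;

end.
```

Every TThread descendant that performs conversions would need the same call, since the setting is per thread.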

A better option is to not rely on SetThreadLocale() at all. Do your own conversion by calling WideCharToMultiByte() directly so you can specify the particular Ansi codepage to convert to.
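
A rough sketch of such a direct conversion follows. WideToAnsiCP is a hypothetical helper name, not an RTL function; the code page to pass (for example 1252, or the value returned by GetACP) is an assumption left to the caller.

```pascal
unit WideConv;

interface

uses
  Windows;

// Hypothetical helper (not part of the RTL): converts ws to an AnsiString
// using the code page passed in by the caller.
function WideToAnsiCP(const ws: WideString; CodePage: UINT): AnsiString;

implementation

function WideToAnsiCP(const ws: WideString; CodePage: UINT): AnsiString;
var
  Len: Integer;
begin
  Result := '';
  if ws = '' then
    Exit;
  // First call: query how many bytes the converted text needs.
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    nil, 0, nil, nil);
  if Len <= 0 then
    Exit;
  SetLength(Result, Len);
  // Second call: convert into the buffer that was just allocated.
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    PAnsiChar(Result), Len, nil, nil);
  // Return an empty string if the conversion itself failed.
  if Len <= 0 then
    Result := '';
end;

end.
```

Because the code page is an explicit argument, the result does not depend on which thread makes the call. Characters that do not exist in the target code page are still replaced by the default character, typically '?'.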

Instructions answered 9/9, 2012 at 7:31 Comment(4)
If you are the author of all code in your app, then calling WideCharToMultiByte explicitly is a viable option. If you include third-party code, then a thread-wide setting of the locale may be the best compromise. - Sclerenchyma
Yes, thanks. I use a mix of both now: WideCharToMultiByte AND SetThreadLocale(). - Hypesthesia
That is because I started with WideCharToMultiByte, but I think I will remove the calls to this function and only use SetThreadLocale for each thread. - Hypesthesia
AFAIK SetThreadLocale does not change the current system code page, so it won't affect the WideString to AnsiString conversion in Delphi 7, which relies on the GetACP API call. - Finsteraarhorn

AFAIK SetThreadLocale does not change the current system code page, so it won't affect the WideString to AnsiString conversion in Delphi 7, which relies on the GetACP API call, i.e. the system code page.

In Windows 7, for example, the system code page is set in the Control Panel under Region and Language / Administrative tab / Language for non-Unicode programs. Changing it requires a system restart.

Delphi 7 uses this system Code Page, supplying 0 to all conversion API calls. So AFAIR SetThreadLocale won't affect the widestring to ansistring conversion in Delphi 7. It will change the locale (e.g. date/time and currency formatting), not the code page used by the system for its Ansi <-> Unicode conversion.
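
To see how these values relate on a given machine, a small console program along the following lines can print them side by side. This is only a diagnostic sketch: the program name and the AnsiCodePageOf helper are made up for the example, and it does not show which of the values the Delphi 7 RTL actually uses.

```pascal
program ShowCodePages;

{$APPTYPE CONSOLE}

uses
  Windows;

// Hypothetical helper: returns the default ANSI code page of a locale as
// text, or an empty string if the query fails.
function AnsiCodePageOf(Locale: LCID): string;
var
  Buf: array[0..15] of Char;
begin
  if GetLocaleInfo(Locale, LOCALE_IDEFAULTANSICODEPAGE, Buf, Length(Buf)) > 0 then
    Result := Buf
  else
    Result := '';
end;

begin
  Writeln('GetACP (system code page):      ', GetACP);
  Writeln('System default LCID code page:  ', AnsiCodePageOf(GetSystemDefaultLCID));
  Writeln('User default LCID code page:    ', AnsiCodePageOf(GetUserDefaultLCID));
  Writeln('Current thread LCID code page:  ', AnsiCodePageOf(GetThreadLocale));
end.
```

The last line reports the locale of whichever thread runs it, so to inspect a worker thread the same query would have to be made from inside that thread's Execute method.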

Newer versions of Delphi have a SetMultiByteConversionCodePage() function, able to set the code page to be used for all AnsiString handling.

But API calls (i.e. all the ...A() functions in Windows.pas, which the unsuffixed ...() names map to in Delphi 7) will use this system code page. So you will have to convert to Unicode and call the ...W() wide API if you want to handle another code page. That is, the Delphi 7 VCL will work only with the system code page, not the value specified by SetThreadLocale.
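
As a small illustration of calling a ...W() function directly from Delphi 7, the sketch below passes WideStrings to MessageBoxW, which bypasses any Ansi conversion. The program name and the sample text are assumptions made for the example.

```pascal
program WideApiDemo;

uses
  Windows;

var
  Msg, Caption: WideString;
begin
  // The Greek capital omega is appended by code unit so that the source
  // file itself can stay ASCII-only.
  Msg := WideString('Omega: ') + WideChar($03A9);
  Caption := 'Wide API';
  // MessageBoxW takes PWideChar arguments, so no code page is involved.
  MessageBoxW(0, PWideChar(Msg), PWideChar(Caption), MB_OK);
end.
```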

Under Delphi 7, my advice is:

  • Use WideString everywhere, and specific "Wide" API calls - there are several sets of components for Delphi 7 which handle WideString;
  • Use your own types, with a dedicated charset, but you'll need an explicit conversion before using the VCL/RTL or "Ansi" API calls - e.g. MyString = type AnsiString (this is what we do in mORMot, by defining a custom RawUTF8 type for internal UTF-8 processing).
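
The second option could look roughly like the sketch below. It is not mORMot's implementation: TUtf8Text and the two helper functions are hypothetical names, and the conversion simply relies on the UTF8Encode and UTF8Decode routines of the System unit.

```pascal
unit Utf8Types;

interface

type
  // Hypothetical distinct string type that is meant to hold UTF-8 bytes.
  TUtf8Text = type AnsiString;

function WideToUtf8Text(const ws: WideString): TUtf8Text;
function Utf8TextToWide(const s: TUtf8Text): WideString;

implementation

function WideToUtf8Text(const ws: WideString): TUtf8Text;
begin
  // UTF8Encode (System unit) encodes a WideString as UTF-8.
  Result := TUtf8Text(UTF8Encode(ws));
end;

function Utf8TextToWide(const s: TUtf8Text): WideString;
begin
  // UTF8Decode (System unit) turns UTF-8 bytes back into a WideString.
  Result := UTF8Decode(UTF8String(s));
end;

end.
```

Since UTF-8 can represent every Unicode character, this round trip does not depend on the system or thread code page; the explicit conversion is only needed at the boundary with the VCL/RTL or "Ansi" API calls.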

This is much better handled with Delphi 2009 and up, since you can specify a code page for each AnsiString type, and properly handle conversion to/from Unicode, for API calls or VCL processing.
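
For comparison, a sketch of the Delphi 2009+ syntax is shown below. It will not compile in Delphi 7; the type name Win1252String and the sample text are assumptions made for the example.

```pascal
program CodePageStrings;

{$APPTYPE CONSOLE}

type
  // Hypothetical type name; the (1252) suffix attaches a code page to the
  // string type (Delphi 2009 and later only).
  Win1252String = type AnsiString(1252);

var
  U: string;            // UnicodeString in Delphi 2009 and later
  A: Win1252String;
begin
  U := 'caf' + Char($00E9);
  // The text is converted from UTF-16 to Windows-1252 on assignment.
  A := Win1252String(U);
  Writeln(Length(A));
end.
```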

Finsteraarhorn answered 9/9, 2012 at 12:15 Comment(1)
I just want to handle the default system code page. What I've noticed is that for Unicode<->Ansi conversion, the code page used is that of the DEFAULT user LCID and not the DEFAULT system LCID, and that's an issue. - Hypesthesia
