Delphi XE2 Dataset field type TStringField does not support Unicode?

I've been looking through the TDataset class and its string fields in Delphi XE2, and noticed that AsWideString returns a UnicodeString. However, it gets its value from the function TField.AsString: String, which in turn calls TField.AsAnsiString: AnsiString. So would any Unicode characters be lost? Also, the buffer passed to TDataset.GetFieldData is declared as an array of AnsiChar.

Am I understanding this correctly?

Untrue answered 27/2, 2012 at 1:58 Comment(1)
+1, since this behavior is IMHO a wrong VCL implementation. The naming is IMHO wrong, inconsistent with the rest of the VCL/RTL, and a source of a lot of confusion and misunderstanding. Your question makes perfect sense. - Stern

No. You should be examining the TWideStringField class, which is for Unicode fields, and the TStringField class, which is for non-Unicode strings. TField is just a base class, and TField.GetAsWideString is a virtual method with a fallback implementation that Unicode-aware descendants override.
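The pattern described above (a base-class getter with a fallback that a Unicode-aware descendant overrides) can be sketched roughly as follows. This is only an illustration, not the actual Data.DB source: TSketchField, TSketchWideField, FRaw, and Sample are hypothetical names invented for this example, and the ANSI narrowing in the base class is an assumption that mirrors the call chain described in the question.

```delphi
program FieldFallbackSketch;

{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a base field class; NOT the real Data.DB.TField.
  TSketchField = class
  protected
    FRaw: UnicodeString; // pretend this is the value supplied by the driver
    function GetAsAnsiString: AnsiString; virtual;
    function GetAsString: string; virtual;
    function GetAsWideString: UnicodeString; virtual;
  public
    constructor Create(const ARaw: UnicodeString);
    property AsString: string read GetAsString;
    property AsWideString: UnicodeString read GetAsWideString;
  end;

  // Hypothetical Unicode-aware descendant, playing the role of TWideStringField.
  TSketchWideField = class(TSketchField)
  protected
    function GetAsString: string; override;
    function GetAsWideString: UnicodeString; override;
  end;

constructor TSketchField.Create(const ARaw: UnicodeString);
begin
  inherited Create;
  FRaw := ARaw;
end;

function TSketchField.GetAsAnsiString: AnsiString;
begin
  // Narrowing to the system ANSI code page; characters outside it may be lost.
  Result := AnsiString(FRaw);
end;

function TSketchField.GetAsString: string;
begin
  // Base behavior: widen whatever survived the ANSI round trip.
  Result := string(GetAsAnsiString);
end;

function TSketchField.GetAsWideString: UnicodeString;
begin
  // Fallback implementation: just reuse the generic string getter.
  Result := GetAsString;
end;

function TSketchWideField.GetAsString: string;
begin
  Result := GetAsWideString;
end;

function TSketchWideField.GetAsWideString: UnicodeString;
begin
  // Override: return the Unicode value directly, with no ANSI detour.
  Result := FRaw;
end;

const
  // 'abc' followed by two CJK code points, written as character codes
  // so the example does not depend on the source file encoding.
  Sample: UnicodeString = 'abc'#$65E5#$672C;

var
  Base, Wide: TSketchField;
begin
  Base := TSketchField.Create(Sample);
  Wide := TSketchWideField.Create(Sample);
  try
    // Whether the base class round-trips depends on the system ANSI code page.
    Writeln('Base round-trips: ', Base.AsWideString = Sample);
    // The descendant never narrows, so its value is returned unchanged.
    Writeln('Wide round-trips: ', Wide.AsWideString = Sample);
  finally
    Wide.Free;
    Base.Free;
  end;
end.
```

Because Wide is declared as the base type, the second call shows that virtual dispatch picks the descendant's getter. On a system whose ANSI code page cannot represent the two CJK characters, the first comparison would be expected to fail, which is the data loss the question worries about when only the base fallback is read.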

Bibbie answered 27/2, 2012 at 2:14 Comment(9)
A nice side effect of having two string field classes: database migrations to Unicode require replacing every TStringField in the DFMs with TWideStringField (plus many other source code changes), where developers would expect a smooth transition. - Hike
@mjn, this approach lets you move your app to Unicode without changing the underlying database fields. TWideStringField has been around for years and is not related to Delphi's switch to Unicode. If the database field was already Unicode, you had to use TWideStringField even in, say, Delphi 5. You use TStringField if the database field is just an AnsiString. Delphi won't automagically change your data definition or the data in the database. - Seville
@Hike Actually no, since it normally depends on the database whether the value is Unicode or not. If your database was Unicode in the old Delphi version, the field was already a TWideStringField. If not, why should it be one in the newer version when the database still does not have Unicode? - Conflation
@Uwe I mean migrating the database to Unicode when Delphi 2009 is already in use: TWideStringField does not work for persistent ANSI fields. - Hike
Using TStringField for AnsiString and TWideStringField for string=UnicodeString just does not make sense at all. IMHO the naming is wrong, inconsistent with the rest of the VCL/RTL, and a source of a lot of confusion. - Stern
It's not as simple as blaming the VCL. Underlying database string types like char and varchar don't suddenly become Unicode, so neither should TStringField, which has always been associated with the char(x) and varchar(x) SQL types. What did anyone seriously expect a CAST to do when the underlying DB field type is still ANSI? Yes, conversion is harder in this area, but if they had made TStringField ambiguous, they would have had to require 90% of devs to switch to what, a newly created "TAnsiStringField"? That would have been much worse. - Showily
@WarrenP Isn't that what is expected of every programmer porting to a new version of Delphi? For example, I had to change one of my components in many places from String to AnsiString to make it work "as expected" in Delphi XE. - Apoplexy
You totally didn't get my point, did you? Yes, your code is yours and you have to fix it. But my point is that Unicode isn't something you can guess at or make any one-size-fits-all assumptions about. The DB area of Delphi has certain underlying realities to deal with, and if people think about those realities, they won't be surprised so often. - Showily
@WarrenP Agreed. That should not surprise us, yet the feeling of inconsistent naming is still surprising. - Apoplexy

Yes, you understood it correctly. It is the VCL and its documentation that are broken. Your confusion makes perfect sense!

In the Delphi 2009+ implementation, you have to use the AsString property for AnsiString and the AsWideString property for string=UnicodeString.

In fact, the As*String properties are defined as such:

property AsString: string read GetAsString write SetAsString;
property AsWideString: UnicodeString read GetAsWideString write SetAsWideString;
property AsAnsiString: AnsiString read GetAsAnsiString write SetAsAnsiString;

How on earth are we supposed to find out that AsString returns an AnsiString? It just does not make sense at all when compared to the rest of the VCL/RTL.

The implementation, which uses the TStringField class for AnsiString and TWideStringField for string=UnicodeString, is broken.

Furthermore, the documentation is also broken:

Data.DB.TField.AsString

Represents the field's value as a string (Delphi) or an AnsiString (C++).

This does not represent a string in Delphi, but an AnsiString! The fact that the property is declared with the plain string=UnicodeString type is thoroughly misleading.

From the database point of view, it is up to the DB driver to handle Unicode or work with a specific charset. But from the VCL point of view, in Delphi 2009+ you should only need to know about the string type, and be confident that AsString: String is Unicode-ready.

Stern answered 22/4, 2013 at 5:32 Comment(0)
