This is a very Delphi-specific question (maybe even Delphi 2007-specific). I am currently writing a simple StringPool class for interning strings. As a good little coder I also added unit tests and found something that baffled me.
This is the code for interning:
function TStringPool.Intern(const _s: string): string;
var
  Idx: Integer;
begin
  if FList.Find(_s, Idx) then
    Result := FList[Idx]
  else begin
    Result := _s;
    if FMakeStringsUnique then
      UniqueString(Result);
    FList.Add(Result);
  end;
end;
Nothing really fancy: FList is a sorted TStringList, so all the code does is look up the string in the list and, if it is already there, return the existing string. If it is not yet in the list, it first calls UniqueString (when FMakeStringsUnique is set) to ensure a reference count of 1 and then adds the string to the list. (I checked the reference count of Result and it is 3 after 'hallo' has been added twice, as expected.)
Now to the testing code:
procedure TestStringPool.TestUnique;
var
  s1: string;
  s2: string;
begin
  s1 := FPool.Intern('hallo');
  CheckEquals(2, GetStringReferenceCount(s1));
  s2 := s1;
  CheckEquals(3, GetStringReferenceCount(s1));
  CheckEquals(3, GetStringReferenceCount(s2));
  UniqueString(s2);
  CheckEquals(1, GetStringReferenceCount(s2));
  s2 := FPool.Intern(s2);
  CheckEquals(Integer(Pointer(s1)), Integer(Pointer(s2)));
  CheckEquals(3, GetStringReferenceCount(s2));
end;
This adds the string 'hallo' to the string pool twice and checks the string's reference count and also that s1 and s2 indeed point to the same string descriptor.
Every CheckEquals passes except the last one, which fails with the error "expected: <3> but was: <4>".
So, why is the reference count 4 here? I would have expected 3:
- s1
- s2
- and another one in the StringList
This is Delphi 2007 and the strings are therefore AnsiStrings.
Oh yes, the function GetStringReferenceCount is implemented as:
function GetStringReferenceCount(const _s: AnsiString): integer;
var
  ptr: PLongWord;
begin
  ptr := Pointer(_s);
  if ptr = nil then begin
    // special case: Empty strings are represented by NIL pointers
    Result := MaxInt;
  end else begin
    // The string descriptor contains the following two longwords
    // in front of the character data (offsets counted in longwords):
    // Offset -1: Length
    // Offset -2: Reference count
    Dec(ptr, 2);
    Result := ptr^;
  end;
end;
In the debugger the same can be evaluated as:
plongword(integer(pointer(s2))-8)^
Just to add to the answer from Serg (which seems to be 100% correct):
If I replace
s2 := FPool.Intern(s2);
with
s3 := FPool.Intern(s2);
s2 := '';
and then check the reference count of s3 (and s1), it is 3 as expected. The phenomenon occurs only because the result of FPool.Intern(s2) is assigned back to s2: s2 is both a parameter and the destination for the function result, so Delphi introduces a hidden string variable to receive the result. That hidden variable holds one additional reference until the enclosing routine exits.
Also, if I change the function to a procedure:
procedure TStringPool.Intern(var _s: string);
the reference count is 3 as expected because no hidden variable is required.
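To make the two observations above easier to reproduce outside the unit test, here is a minimal console sketch. It assumes Delphi 2007 (string = AnsiString, 32-bit string header). Pool, RefCount, InternFunc, InternProc and Tmp are hypothetical names made up for this illustration and are not taken from dzlib; Tmp is an explicit stand-in for the hidden variable the compiler creates, and the body of InternProc is only a guess at how a var-parameter version might look.

```pascal
program HiddenTempSketch;

{$APPTYPE CONSOLE}

uses
  Classes;

var
  Pool: TStringList; // sorted list, standing in for TStringPool.FList

// Reads the reference count from the AnsiString header, in the same way
// as GetStringReferenceCount from the question (Delphi 2007 layout assumed).
function RefCount(const s: AnsiString): Integer;
var
  p: PLongWord;
begin
  p := Pointer(s);
  if p = nil then
    Result := MaxInt // empty strings are nil pointers
  else begin
    Dec(p, 2); // two longwords back: the reference count
    Result := p^;
  end;
end;

// Function form: the caller has to provide a destination for the result.
function InternFunc(const s: AnsiString): AnsiString;
var
  Idx: Integer;
begin
  if Pool.Find(s, Idx) then
    Result := Pool[Idx]
  else begin
    Result := s;
    UniqueString(Result);
    Pool.Add(Result);
  end;
end;

// Procedure form: the caller's variable is updated in place.
procedure InternProc(var s: AnsiString);
var
  Idx: Integer;
begin
  if Pool.Find(s, Idx) then
    s := Pool[Idx]
  else begin
    UniqueString(s);
    Pool.Add(s);
  end;
end;

procedure Demo;
var
  s1, s2, Tmp: AnsiString;
begin
  s1 := InternFunc('hallo');
  s2 := s1;
  UniqueString(s2);      // s2 is now a private copy of 'hallo'

  Tmp := InternFunc(s2); // explicit version of the hidden variable
  s2 := Tmp;
  WriteLn('temporary still alive: ', RefCount(s2));

  Tmp := '';             // the real hidden variable lives until the routine exits
  WriteLn('temporary cleared:     ', RefCount(s2));

  UniqueString(s2);      // detach s2 again
  InternProc(s2);        // no function result, so no extra variable is involved
  WriteLn('procedure variant:     ', RefCount(s2));
end;

begin
  Pool := TStringList.Create;
  try
    Pool.Sorted := True;
    Demo;
  finally
    Pool.Free;
  end;
end.
```

If the explanation above holds, the first line should report one reference more than the other two, because Tmp is still holding on to the pooled string at that point. On Unicode versions of Delphi the TStringList stores UnicodeString values, so this AnsiString-based sketch would not share string data with the list and the counts would not be comparable.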
In case anybody is interested in this TStringPool implementation: It's open source under the MPL and available as part of dzlib, which in turn is part of dzchart:
https://sourceforge.net/p/dzlib/code/HEAD/tree/dzlib/trunk/src/u_dzStringPool.pas
But as said above: It's not exactly rocket science. ;-)