How can I search faster for name/value pairs in a Delphi TStringList?
Asked Answered
T

5

6

I implemented language translation in an application by putting all strings at runtime in a TStringList with:

procedure PopulateStringList;
begin  
  EnglishStringList.Append('CAN_T_FIND_FILE=It is not possible to find the file');
  EnglishStringList.Append('DUMMY=Just a dummy record');
  // total of 2000 record appended in the same way
  EnglishStringList.Sorted := True; // Updated comment: this is USELESS!
end;

Then I get the translation using:

function GetTranslation(ResStr:String):String;
var
  iIndex : Integer;
begin
  iIndex := -1;
  iIndex :=  EnglishStringList.IndexOfName(ResStr);
  if iIndex >= 0 then
  Result :=  EnglishStringList.ValueFromIndex[iIndex] else
  Result := ResStr + ' (Translation N/A)';
end;

Anyway with this approach it takes about 30 microseconds to locate a record, is there a better way to achieve the same result?

UPDATE: For future reference I write here the new implementation that uses TDictionary as suggested (works with Delphi 2009 and newer):

procedure PopulateStringList;
begin  
  EnglishDictionary := TDictionary<String, String>.Create;
  EnglishDictionary.Add('CAN_T_FIND_FILE','It is not possible to find the file');
  EnglishDictionary.Add('DUMMY','Just a dummy record');
  // total of 2000 record appended in the same way
end;


function GetTranslation(ResStr:String):String;
var
  ValueFound: Boolean;
begin
  ValueFound:=  EnglishDictionary.TryGetValue(ResStr, Result);
  if not ValueFound then Result := Result + '(Trans N/A)';
end;

The new GetTranslation function performs 1000 times faster (on my 2000 sample records) then the first version.

Timeous answered 14/9, 2010 at 14:23 Comment(2)
Although IndexOf benefits from a TStringList being sorted, IndexOfName does not. That's not to say it couldn't in some descendant class, but TStringList does not override the basic linear search that TStrings provides.Roderick
Yes, I checked, it is the same (so I can save time not sorting the list. Since I am not modifying it it would make more sense to call Sort, instead of setting sorted True (anyway I won't do it). Thanks.Timeous
R
15

In Delphi 2009 or later I would use TDictionary< string,string > in Generics.Collections. Also note that there are free tools such as http://dxgettext.po.dk/ for translating applications.

Randeerandel answered 14/9, 2010 at 17:30 Comment(6)
Thanks, yes I know tdxggettext but the implementation of the translation i described was already done from a previous developer, so I am just trying to improve that, and with THasedStringList I obtained a satisfactory result. Anyway I'll give a try (for my personal knowledge to the TDictionary suggestion you gave).Timeous
@VilleK: There are some bugs in TDictionary and some of the other generics in Delphi 2009, so it's best to avoid it unless you are using Delphi 2010 or XE. And to be honest, there is very little documentation available on how to use it and its functionality. It's given me a much harder time than it should have.Gait
Yes, I remember there was a bug with TDictionary in Delphi 2009 but now I use it a lot in Delphi 2010 and have not encountered any problems. I think it works great. Also TObjectDictionary in the same unit is very useful because it can optionally take ownership of both keys and values. For me the Generics.Collections unit is one of the main reasons to upgrade from older Delphi versions.Randeerandel
I have Delpgi 2009 and I tried generics for the first time using as suggested TDictionary<String, String>... It is 40 times faster than THashedStringList in my case. I am only using the TDictionary methods Add and TryGetValue. Am I "safe" even in 2009 or not? So I accept this as answer since now my speed is 1000 times faster. TStringList to THashedStringList gave x25, THashedStringList to TDictionary<String, String> gave extra x40. Total x1000.Timeous
Thanks @user193655. Those timings are good to know. I believe you are safe in Delphi 2009 if the code runs. The trouble is that some constructs will just give compile errors. But if it runs, it won't give you wrong results.Gait
Yes it works. I measured the timings with the ProDelphi profiler. Thanks.Timeous
B
18

THashedStringList should be better, I think.

Backandforth answered 14/9, 2010 at 14:28 Comment(2)
THANKS!!!! Now identifying 45 records passed from 50 ms to 2 ms. 25 times faster!Timeous
Cool - somehow I never knew about this thing. I'll have to look into it.Aerostat
R
15

In Delphi 2009 or later I would use TDictionary< string,string > in Generics.Collections. Also note that there are free tools such as http://dxgettext.po.dk/ for translating applications.

Randeerandel answered 14/9, 2010 at 17:30 Comment(6)
Thanks, yes I know tdxggettext but the implementation of the translation i described was already done from a previous developer, so I am just trying to improve that, and with THasedStringList I obtained a satisfactory result. Anyway I'll give a try (for my personal knowledge to the TDictionary suggestion you gave).Timeous
@VilleK: There are some bugs in TDictionary and some of the other generics in Delphi 2009, so it's best to avoid it unless you are using Delphi 2010 or XE. And to be honest, there is very little documentation available on how to use it and its functionality. It's given me a much harder time than it should have.Gait
Yes, I remember there was a bug with TDictionary in Delphi 2009 but now I use it a lot in Delphi 2010 and have not encountered any problems. I think it works great. Also TObjectDictionary in the same unit is very useful because it can optionally take ownership of both keys and values. For me the Generics.Collections unit is one of the main reasons to upgrade from older Delphi versions.Randeerandel
I have Delpgi 2009 and I tried generics for the first time using as suggested TDictionary<String, String>... It is 40 times faster than THashedStringList in my case. I am only using the TDictionary methods Add and TryGetValue. Am I "safe" even in 2009 or not? So I accept this as answer since now my speed is 1000 times faster. TStringList to THashedStringList gave x25, THashedStringList to TDictionary<String, String> gave extra x40. Total x1000.Timeous
Thanks @user193655. Those timings are good to know. I believe you are safe in Delphi 2009 if the code runs. The trouble is that some constructs will just give compile errors. But if it runs, it won't give you wrong results.Gait
Yes it works. I measured the timings with the ProDelphi profiler. Thanks.Timeous
G
14

If THashedStringList works for you, that's great. Its biggest weakness is that every time you change the contents of the list, the Hash table is rebuilt. So it will work for you as long as your list remains small or doesn't change very often.

For more info on this, see: THashedStringList weakness, which gives a few alternatives.

If you have a big list that may be updated, you might want to try GpStringHash by gabr, that doesn't have to recompute the whole table at every change.

Gait answered 14/9, 2010 at 16:6 Comment(2)
I do not need to update the list, it is filled only once, so I think I can stay with THashedStringList since now the bottleneck is gone.Timeous
Just checked the THashedStringList implementation in XE4, and the weakness of "every time the content of the string list is changed the hash table is invalided andmust be rebuilt." is no longer exist.Roundhouse
O
4

I think that you don't use the EnglishStringList(TStringList) correctly. This is a sorted list, you add elements (strings), you sort it, but when you search, you do this by a partial string (only the name, with IndexOfName).

If you use IndexOfName in a sorted list, the TStringList can't use Dicotomic search. It use sequential search.

(this is the implementation of IndexOfName)

  for Result := 0 to GetCount - 1 do
  begin
    S := Get(Result);
    P := AnsiPos('=', S);
    if (P <> 0) and (CompareStrings(Copy(S, 1, P - 1), Name) = 0) then Exit;
  end;

I think that this is the reason of poor performance.
The alternative is use 2 TStringList:
* The first (sorted) only containts the "Name" and a pointer to the second list that contain the value; You can implement this pointer to the second list using the "pointer" of Object property.
* The second (not sorted) list containt the values.

When you search, you do it at first list; In this case you can use the Find method. when you find the name, the pointer (implemented with Object property) give you the position on second list with the value.

In this case, Find method on Sorted List is more efficient that HashList (that must execute a funcion to get the position of a value).

Regards.

Pd:Excuse-me for mistakes with english.

Organizer answered 14/9, 2010 at 16:18 Comment(0)
U
2

You can also use a CLASS HELPER to re-program the "IndexOfName" function:

TYPE
  TStringsHelper = CLASS HELPER FOR TStrings
                     FUNCTION IndexOfName(CONST Name : STRING) : INTEGER;
                   END;

FUNCTION TStringsHelper.IndexOfName(CONST Name : STRING) : INTEGER;
  VAR
    SL  : TStringList ABSOLUTE Self;
    S,T : STRING;
    I   : INTEGER;

  BEGIN
    IF (Self IS TStringList) AND SL.Sorted THEN BEGIN
      S:=Name+NameValueSeparator;
      IF SL.Find(S,I) THEN
        Result:=I
      ELSE IF (I<0) OR (I>=Count) THEN
        Result:=-1
      ELSE BEGIN
        T:=SL[I];
        IF CompareStrings(COPY(T,1,LENGTH(S)),S)=0 THEN Result:=I ELSE Result:=-1
      END;
      EXIT
    END;
    Result:=INHERITED IndexOfName(Name)
  END;

(or implement it in a descendant TStrings class if you dislike CLASS HELPERs or don't have them in your Delphi version).

This will use a binary search on a sorted TStringList and a sequential search on other TStrings classes.

Udella answered 8/10, 2010 at 12:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.