Loading millions of records into a stringlist can be very slow
Asked Answered
T

6

5

how can i load millions of records from tadotable into a stringlist very fast?

procedure TForm1.SlowLoadingIntoStringList(StringList: TStringList);
begin
  StringList.Clear;
  with SourceTable do
  begin
    Open;
    DisableControls;
    try
      while not EOF do
    begin
      StringList.Add(FieldByName('OriginalData').AsString);
      Next;
    end;
   finally
   EnableControls;
   Close;
  end;
end;
Tondatone answered 7/12, 2011 at 11:18 Comment(6)
The bottleneck here is probably the database and/or the ado driver more than the stringlist, though you could try profiling to be sure. If confirmed, you'll need a faster database or a better driver to speed things up.Imposture
You could try setting Stringlist.Capacity := SourceTable.RecordCount; This would save a lot of reallocation of the underlying arrayMarras
What is slow here is FieldByName('OriginalData').Exmoor
And try to use a direct link to the OleDB provider to bypass the ADO layer. See for instance our Open Source unit.Exmoor
Scrub my earlier comment - I just tried a very simple 10 million record loop, and it was only slightly (about 15%) faster.Marras
If the TStringList is sorted it may also slow the process down. Set Sorted := False while adding the lines then turn sorting back on if necessary at the end.Ormuz
K
10

in your loop you get the field. Search the field out of the loop

procedure TForm1.SlowLoadingIntoStringList(StringList: TStringList); 
var
  oField: TField;
begin
  StringList.Clear;   
  with SourceTable do   
  begin     
    Open;     
    DisableControls;     
    try       
      oField:= FieldByName('OriginalData');
      if oField<>Nil then
      begin
        while not EOF do
        begin       
          StringList.Add(oField.AsString);       
          Next;     
        end;   
      end; 
    finally    
      EnableControls;    
      Close;   
    end; 
  end;  
end;
Kung answered 7/12, 2011 at 11:55 Comment(2)
I would also add BeginUpdate, EndUpdate for cases when someone will pass as an input of this method e.g. TListBox.Items :) No, seriously, Kamyar, what's the reason for loading so many rows ? Are you saving those lines as a file ? Or for some dynamic search ? If it's the dynamic search then you should let the DB server to do this for you.Bateau
If the input is a TStringList parameter, a TStrings can not be passed.Exmoor
H
4

Unfortunately, you can't do this quickly. It is an inherently slow operation that involves large amounts of CPU time and memory bandwidth to achieve. You could throw more hardware at it, but I suspect you should be re-thinking your task instead.

Hawkie answered 7/12, 2011 at 11:20 Comment(0)
G
1

With 'millions of records' you may consider : 1/ Change your Query from

SELECT * FROM MYTABLE;

in

SELECT OriginalData FROM MYTABLE;

You'll use less memory and be more efficient.

2/ Look another component than TStringList depending on your needs.

3/ Look all good previous advices, mainly :

  • don't use FieldByName
  • direct link to the OleDB provider
Grogan answered 7/12, 2011 at 18:8 Comment(0)
D
0

Is it sorted?

  // Turn off the sort for now
  StringList.Sorted := False;
  // Preallocate the space
  StringList.Capacity := recordCount;
  // Now add the data with Append()
  ...
  // Now turn the sort back on
  StringList.Sorted := True;
Draconic answered 7/12, 2011 at 16:1 Comment(0)
V
0

Seriously? Millions of records in a stringlist?

Ok, let's assume you really do need to take this approach...

There are some good suggestions already posted.

If you want to experiment with a different approach you could consider concatenating the individual records server side (via a stored procedure) and then returning the concatenated data as a blob (or possibly nvarchar(max)), which is basically the list of concatenated strings delimited by say a carriage return (assuming this is a reasonable delimiter for your needs).

You can then simply assign the returned value to the Text property of the TStringList.

Even if you cannot do all of the strings in a single hit, you could do them in groups of say 1000 at a time.

This should save you a ton of time looping around each record client side.

Vide answered 8/12, 2011 at 8:16 Comment(0)
M
0

Expanding on @Ravaut123's answer I would suggest the following code:

Make sure your Query is not connected to any visual other component, and does not have any events set that fire on rowchanges because this will cause it to to updates on every change in the active record, slowing things way down.
You can disable the visual controls using disablecontrols, but not the events and non-visual controls.

...
SQLatable:= 'SELECT SingleField FROM atable ORDER BY indexedfield ASC';
AQuery:= TAdoQuery.Create(Form1);
AQuery.Connection:= ....
AQuery.SQL.Text:= SQLatable;  

Using a Query makes sure you only select 1 field, in the order that you want, this reduces network traffic. A table fetches all fields, causing much more overhead.

function TForm1.LoadingAllIntoStringList(AQuery: TAdoQuery): TStringList;  
var 
  Field1: TField; 
begin 
  Result:= nil;
  try
    if not(AQuery.Active) then begin
      AQuery.Open;
    end else begin
      AQuery.First;
    end;
    AQuery.DisableControls;
    AQuery.Filtered:= false;                    //Filter in the SQL `where` clause
    AQuery.FetchAll;                            //Preload all data into memory
    Result:= TStringlist.Create;
  except
    {ignore error, will return nil}
  end;
  try
    Result.Sorted:= false;                      //Make sure you don't enable sorting
    Result.Capacity:= AQuery.RecordCount;       //Preallocate the needed space     
    Field1:= AQuery.FieldByName('SingleField'); //Never use `fieldbyname` in a loop!
    while not AQuery.EOF do begin
      Result.Add(Field1.AsString);
      AQuery.Next;
    end; {while} 
    AQuery.EnableControls;
  except
    FreeAndNil(Result);
  end;   

If you want to load the data into the stringlist to do some processing, consider doing that in the SQL statement instead. The DB can use indexes and other optimizations that the stringlist cannot use.
If you want to save that data into a CSV file, consider using a build-in DB function for that.
e.g. MySQL has:

SELECT X FROM table1 INTO OUTFILE 'c:/filename_of_csv_file.txt'

Which will create a CSV file for you.
Many DB's have simular functions.

Magalimagallanes answered 8/12, 2011 at 10:33 Comment(1)
+1, good point about letting the server side do the hard work. Anyway I would declare the output string list as a parameter rather than as a function result ;)Bateau

© 2022 - 2024 — McMap. All rights reserved.