Why is my code so slow?

Top-posted (sorry) answer, for those who don't have time to get into it but may have similar problems.

Rule #1, as always: move as much as you can out of loops.

  2. Moving the TField variables (var := ADODataSet.FieldByName(...)) out of the loop
  3. Wrapping the loop in ADODataSet.DisableControls() and ADODataSet.EnableControls()
  4. Calling stringGrid.Rows[r].BeginUpdate() and EndUpdate() on each row (it cannot be done on the whole control)

Each of these shaved off a few seconds, but I got it down to "faster than the eye can see" by changing

loop
  stringGrid.RowCount := stringGrid.RowCount + 1;
end loop

to setting stringGrid.RowCount := ADODataSet.RecordCount; once before the loop.
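Putting these together, here is a minimal sketch of the final loop. It is not my exact code; it assumes the ADODataSet, TestRunDataStringGrid and startDateTime names used in the question below, and shows only the first two columns:

var
  r: Integer;
  TimeStampField, InputTempField: TField;  // one TField variable per column
begin
  // look the fields up once, outside the loop
  TimeStampField := ADODataSet.FieldByName('time_stamp');
  InputTempField := ADODataSet.FieldByName('inputTemperature');

  // size the grid once instead of growing it one row at a time
  TestRunDataStringGrid.RowCount := ADODataSet.RecordCount;

  ADODataSet.DisableControls;   // stop per-record UI notifications while iterating
  try
    r := 0;                     // adjust the start row if the grid has fixed header rows
    while not ADODataSet.Eof do
    begin
      TestRunDataStringGrid.Cells[0, r] :=
        FormatDateTime('hh:mm:ss', UnixToDateTime(TimeStampField.AsInteger) - startDateTime);
      TestRunDataStringGrid.Cells[1, r] := FloatToStrWithPrecision(InputTempField.AsFloat);
      // ... remaining columns go here ...
      Inc(r);
      ADODataSet.Next;
    end;
  finally
    ADODataSet.EnableControls;
  end;
end;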

+1 and heartfelt thanks to all who helped.

(now I will go and see what I can do to optimize drawing a TChart, which is also slow ;-)


With about 3,600 rows in the table, this takes 45 seconds to populate the string grid. What am I doing wrong?

   ADODataSet := TADODataSet.Create(Nil);
   ADODataSet.Connection := AdoConnection;

   ADODataSet.CommandText := 'SELECT * FROM measurements';
   ADODataSet.CommandType := cmdText;
   ADODataSet.Open();

   while not ADODataSet.eof do
   begin
      TestRunDataStringGrid.RowCount := TestRunDataStringGrid.RowCount + 1;

      measurementDateTime   := UnixToDateTime(ADODataSet.FieldByname('time_stamp').AsInteger);
      DoSQlCommandWithResultSet('SELECT * FROM start_time_stamp', AdoConnection, resultSet);
      startDateTime := UnixToDateTime(StrToInt64(resultSet.Strings[0]));
      elapsedTime   := measurementDateTime - startDateTime;
      TestRunDataStringGrid.Cells[0, Pred(TestRunDataStringGrid.RowCount)] := FormatDateTime('hh:mm:ss', elapsedTime);
      TestRunDataStringGrid.Cells[1, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('inputTemperature').AsFloat);
      TestRunDataStringGrid.Cells[2, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('outputTemperature').AsFloat);
      TestRunDataStringGrid.Cells[3, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('flowRate').AsFloat);
      TestRunDataStringGrid.Cells[4, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('waterPressure').AsFloat * convert);
      TestRunDataStringGrid.Cells[5, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('waterLevel').AsFloat);
      TestRunDataStringGrid.Cells[6, Pred(TestRunDataStringGrid.RowCount)] := FloatToStrWithPrecision(ADODataSet.FieldByname('cod').AsFloat);
      ADODataSet.Next;
   end;

   ADODataSet.Close();
   ADODataSet.Free();

Update:

Function DoSQlCommandWithResultSet(const command: String; AdoConnection: TADOConnection; resultSet: TStringList): Boolean;
var
  i: Integer;
  AdoQuery: TADOQuery;
begin
  Result := True;
  resultSet.Clear();

  AdoQuery := TADOQuery.Create(nil);
  try
    AdoQuery.Connection := AdoConnection;
    AdoQuery.SQL.Add(command);
    AdoQuery.Open();
    i := 0;
    while not  AdoQuery.eof do
    begin
      resultSet.Add(ADOQuery.Fields[i].Value);
      i := i + 1;
      AdoQuery.Next;
    end;

  finally
    AdoQuery.Close();
    AdoQuery.Free();
  end;
end;

Echinoderm answered 22/1, 2011 at 3:11 Comment(9)
The best way to approach the answer is to somehow profile your code. Also, there's no telling how many rows the SELECT * FROM start_time_stamp query should return; maybe it's a zillion rows?Queen
+1 Thanks. Any recommendation on profiling? In this case I was taking measurements at one-second intervals for approx one hour (plus/minus a few seconds)Echinoderm
You should try SamplingProfiler, available at delphitools.info/samplingprofiler. It gives you a good look at what your program is spending time on, it works well with Delphi apps, and it's free. :)Targe
The slower your program is, the easier it is to find out why. Try random-pausing.Mcewen
@Mason: The profiler you linked to looks like it does all the right things. It samples the stack, on wall-clock time, and gives percent by line of code, similar to RotateRight's Zoom. I do random-pausing, but I think that kind of profiler is definitely the next-best thing.Mcewen
@Mike: I'd even say it's better than random-pausing, since it can sample 1000 times per second, without actually pausing and restarting your program, which means you can interact with the program normally, uninterrupted, and just have it watch and collect data on the bottlenecks in your standard workflow.Targe
@Mason: I'm sure it's pretty good, but FWIW here's what I think. 1) What you really need is samples only during the time you care about, like when it's subjectively making you wait. 2) In that interval, the useful information is contained in any 100 samples or less (much less) - the extra samples give more timing precision, but not any more precision of location of the problem. 3) When you sample yourself you see whole stacks, therefore you can see why the time is being spent, which is how you know if it's actually being wasteful. It's the difference between measuring and locating.Mcewen
@Mike: SamplingProfiler allows you to put OutputDebugString statements into your app to turn sampling on and off, which helps you focus on actual bottlenecks. It also has an option to disable sampling while the app is idle waiting for input. And it samples the top two stack frames automatically, which gives you all the context you need in 90+% of real-world cases; it's not like you need the entire stack or anything. With it I can do things like determine why my game engine is scrolling slowly when I move, which I couldn't do by random-pausing because that would interrupt my input.Targe
@Mason: Sounds like they know what they're doing. The kind of thing I run into is, like in a big .net app, if it's loading a worksheet with lots of rows, at 10-15 levels down in the stack, it's setting cells, and that's triggering another 10-15 levels of notifications. Or during startup, again we're talking a 30-level stack, extracting strings from resources. When you see what the strings actually are, you can see there's no need to put most of them in resources. These things are totally un-guessable, and take large percents of time.Mcewen

  1. You are executing the command SELECT * FROM start_time_stamp 3,600 times, but it does not appear to me that it is correlated with your outer loop in any way. Why not execute it once before the loop (see the sketch after this list)?

  2. That SELECT command appears to return only a single column of a single record, yet you use "*" to load all columns, and no WHERE clause to limit the results to a single row (if there's more than one row in the table).

  3. You use only a limited number of columns from Measurements, but you retrieve all columns with "*".

  4. You don't show the contents of DoSQlCommandWithResultSet, so it's not clear if there's a problem in that routine.

  5. It's not clear whether the problem is in your database access or the string grid. Comment out all the lines pertaining to the string grid and run the program. How long does the database access alone take?
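A rough sketch of points 1-3 (an illustration only; the column name inside start_time_stamp is not shown in the question, so the single-value read below is an assumption):

var
  StartQuery: TADOQuery;
begin
  // fetch the start timestamp once, before the measurements loop
  StartQuery := TADOQuery.Create(nil);
  try
    StartQuery.Connection := AdoConnection;
    // ideally name the actual column; it is not shown in the question, so * is kept
    StartQuery.SQL.Text := 'SELECT * FROM start_time_stamp';
    StartQuery.Open;
    startDateTime := UnixToDateTime(StartQuery.Fields[0].AsInteger);
  finally
    StartQuery.Free;
  end;

  // ask only for the columns the grid actually displays
  ADODataSet.CommandText :=
    'SELECT time_stamp, inputTemperature, outputTemperature, flowRate, ' +
    'waterPressure, waterLevel, cod FROM measurements';
  ADODataSet.Open;
  // ... populate the grid as before ...
end;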

Illaudable answered 22/1, 2011 at 3:45 Comment(5)
Do come back and let us know what you find.Illaudable
+1 some excellent feedback, thanks. 1) d'oh! I have moved that to before the loop 2) good point 3) actually I want to access every field of "measurements". In that case is SELECT * acceptable? 4) that might be the problem. It is accessing the same ADO connection. I have posted the code above.Echinoderm
The version of DoSQLCommandWithResultSet you posted is not the one being called by your program, the signatures are different (the one in your program has three arguments, the one you posted does nothing with the results). Naming columns is always preferred to using "*" since your code then tells you exactly what columns you can refer to and, if the database can't provide any of the columns you expect, the command will fail (which is what you usually want).Illaudable
Wow, that's an awful lot of work to recover that single date. Move the query outside the loop and ditch the subroutine (why transfer multiple rows into a stringlist at all?), use just an ADOCommand to get the result back.Illaudable
+1 +1 apologies. I mistakenly posted the version without the resultSet parameter. Corrected now. Yes, the query is outside of the loop. Good point about the string list.Echinoderm

In addition to Larry Lustig's points:

  1. In general, FieldByName is a comparatively slow method, and you are calling it inside the loop for the same fields every time. Move the field lookups out of the loop and store the references in variables, like: InputTempField := ADODataSet.FieldByName('inputTemperature');
  2. You are resizing the grid inside the loop with TestRunDataStringGrid.RowCount := TestRunDataStringGrid.RowCount + 1. This is a case where you should use ADODataSet.RecordCount before the loop instead: TestRunDataStringGrid.RowCount := ADODataSet.RecordCount.
  3. It is good practice to call ADODataSet.DisableControls before the loop and ADODataSet.EnableControls after it. This matters even more for ADO datasets, whose implementation is not optimal, so those calls really help.
  4. Depending on the DBMS you are using, you can improve fetching performance by setting a larger "rowset size". I am not sure how this is controlled in ADO; setting ADODataSet.CacheSize to a greater value will probably help (see the sketch below). There are also cursor settings :)
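A minimal sketch of point 4, assuming the TADODataSet from the question (the values are illustrative, not tuned):

// fetch more records per round trip and use a cheap read-only, client-side cursor
ADODataSet.CacheSize := 500;               // default is 1; larger values mean fewer round trips
ADODataSet.CursorLocation := clUseClient;  // client-side cursor, fetched in one go
ADODataSet.CursorType := ctStatic;         // a static snapshot is enough for display
ADODataSet.LockType := ltReadOnly;         // no locking/update overhead for display-only data
ADODataSet.Open;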
Gifted answered 22/1, 2011 at 7:57 Comment(3)
+1 Thanks. #1 and 2 are definitely good ideas (are you saying that InputTempField is a TField and I use InputTempField.AsFloat in the loop?). I have already taken #3 on board. For #4, I am a bit of a n00b, so will leave that for later. Also, it needs to be ODBC compliant.Echinoderm
TestRunDataStringGrid.RowCount := ADODataSet.RecordCount; turned out to make the difference. Some of the others helped, but that one really did it. See the updated question for details.Echinoderm
ADODataSet.DisableControls gave me a speed-up of (fasten your seat belts) 1000x while iterating through an 850,000-record result set. How can I upvote more than once? And why is that damn thing enabled by default?Amorete

Instead of calling ADODataSet.FieldByName('Fieldname') inside the loop, you should declare a local variable of type TField for each field, assign ADODataSet.FindField('Fieldname') to the variables, and use the variables inside the loop. FieldByName searches the dataset's field list on every call.

Update:

procedure TForm1.Button1Click(Sender: TObject);
var
  InputTemp, OutputTemp: TField;
begin
  ADODataSet := TADODataSet.Create(Nil);
  try
    ADODataSet.Connection := ADOConnection;
    ADODataSet.CommandText := 'SELECT * FROM measurements';
    ADODataSet.Open;
    InputTemp := ADODataSet.FindField('inputTemperature');
    OutputTemp := ADODataSet.FindField('outputTemperature');
    // assign more fields here
    while not ADODataSet.Eof do begin
      // do something with the fields, for example:
      // GridCell := Format ('%3.2f', [InputTemp.AsFloat]);
      // GridCell := InputTemp.AsString;
      ADODataSet.Next;
    end;
  finally
    ADODataSet.Free;
  end;
end;

Another option would be to drop the TADODataSet component on the form (or use a TDataModule) and define the fields at design time.
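A sketch of what the design-time route looks like; the persistent field names below follow the IDE's DataSet+FieldName naming convention and are only illustrative:

type
  TForm1 = class(TForm)
    ADODataSet1: TADODataSet;
    // created via the Fields Editor (right-click the dataset, Add all fields);
    // the IDE generates strongly typed members like these:
    ADODataSet1inputTemperature: TFloatField;
    ADODataSet1outputTemperature: TFloatField;
  end;

// ...then the loop can read the typed fields directly, with no lookups:
// GridCell := FloatToStrWithPrecision(ADODataSet1inputTemperature.AsFloat);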

Hasin answered 22/1, 2011 at 5:48 Comment(1)
+1 Could I not use TFloatField if I know that it is a float? If not, can you post a few lines, so that I can see how to do it? ThanksEchinoderm

In addition to the Larry Lustig answer, consider using data-aware controls instead, like the TDBGrid component.
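A minimal sketch of that route, reusing the dataset from the question; the TDataSource and TDBGrid names are illustrative (normally you would just drop the components on the form):

var
  DataSource1: TDataSource;
  DBGrid1: TDBGrid;
begin
  DataSource1 := TDataSource.Create(Self);
  DataSource1.DataSet := ADODataSet;   // the dataset opened in the question

  DBGrid1 := TDBGrid.Create(Self);
  DBGrid1.Parent := Self;
  DBGrid1.Align := alClient;
  DBGrid1.DataSource := DataSource1;   // the grid renders rows on demand, no copy loop needed
end;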

Infanticide answered 22/1, 2011 at 4:36 Comment(0)

If you aren't using data-aware controls, you should call TestRunDataStringGrid.BeginUpdate before the loop and TestRunDataStringGrid.EndUpdate after it. Without this, your grid constantly redraws after each modification (adding a new row, updating a cell).

Another tip is to set AdoQuery.LockType := ltReadOnly before opening the query.

Aviate answered 22/1, 2011 at 9:17 Comment(0)

You could also try an instrumenting profiler instead of a sampling profiler to get more detailed results (sampling profilers miss a lot of detail, and most of the time they take fewer than 1000 samples per second; even 1000 is low, only good for a quick overview).

Instrumenting profilers:

Potion answered 24/1, 2011 at 8:3 Comment(0)
