TClientDataSet is VERY SLOW with 100K+ rows
I have a problem retrieving data with Delphi's TClientDataSet.

Code with ADO:

ADOQuery1.SQL.Text:='SELECT * FROM Table1 WITH (NoLock)';
DataSource1.DataSet:=ADOQuery1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;

The code above returns over 180k rows in 3-6 seconds when using pure ADO.

Same code with TClientDataSet:

ADOQuery1.SQL.Text:='SELECT * FROM Table1 WITH (NoLock)';
CDS1.SetProvider(ADOQuery1);
DataSource1.DataSet:=CDS1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;

The code above returns the same number of rows (over 180k), but takes 3-4 minutes.

What's wrong with CDS? It's about 100 times slower than using pure ADO. Is it possible to fix it?

Crankcase answered 15/1, 2015 at 9:22 Comment(13)
I don't know the answer, but returning 180k(!) records to the client is a very bad design in any case...Pendulous
What Delphi version? There was a huge slowdown from XE2 to XE3 (so much that we reversed our Delphi update), I don't know if it has been fixed since.Windywindzer
@Pendulous The name of the component (TClientDataSet) does not imply that the client is a human being. The application where this component is used could be any kind of program, and some programs must be able to process a huge dataset in memory. Unless the documentation explicitly says "do not use this component for more than X records".Beaverboard
I wouldn't be 100% sure that ADO is returning all the rows. Depending on settings such as CursorLocation, you may be getting only the first few and last few rows. There are also other settings that can give ADO an opportunity to implement certain efficiency improvements. E.g. if configured to read "forwards-only", ADO would not need to store or process any intermediate rows when you execute DataSet.Last.Tenement
That said: pulling all data into a ClientDataSet requires each row to be converted into an internal CDS format. This format provides "enrichment features" such as bi-directional navigation, change caching, and local indexing. All of this requires extra processing and memory allocation for the supporting internal structures. (If you need these features, then you need the extra processing. If you don't need them, then TClientDataSet is probably not the best tool for your requirements.)Tenement
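A minimal sketch of the forward-only, read-only configuration described in the comments above (the component name is taken from the question; how much ADO can exploit these settings depends on the provider):

```delphi
// Sketch: configure the ADO query so ADO can stream rows instead of
// buffering them. These are standard TCustomADODataSet properties.
ADOQuery1.CursorLocation := clUseServer;      // don't cache the whole set on the client
ADOQuery1.CursorType := ctOpenForwardOnly;    // no backward scrolling, minimal buffering
ADOQuery1.LockType := ltReadOnly;             // read-only: no row-versioning overhead
ADOQuery1.SQL.Text := 'SELECT * FROM Table1 WITH (NOLOCK)';
ADOQuery1.Open;
```

Note that a forward-only cursor cannot service DataSet.Last by jumping to the end; it only makes a single sequential scan cheap.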
Just trying to exclude the obvious but you don't have any data-aware component attached to the DataSource do you? Have you tried Disable-/EnableControls?Jurywoman
Disabling controls has no effect. Microsoft Management Studio executes the same command within 3-4 seconds.Crankcase
Disabling controls is a must. And I agree with most comments here. For fetching 180k rows to the client side you need to have serious reasons. And if you are sure you need to fetch that many tuples, then choose a different storage. TDataSet descendants are slow in general. For fetching, optimize the query to use a forward-only, read-only cursor and access the underlying Recordset interface. @mjn, it's not documented, just as it's not documented how many items it is reasonable to use with e.g. a TList.Subnormal
The query period could be for the last 10 years, so the number of rows returned could be even bigger. Anyway, the main question is how to avoid a huge waste of time.Crankcase
What are you supposed to do with the result set of 180k records? Do you need to display them? Import them? Analyze them? Maybe all you need is a correct SQL statement/SP to get what you want. It only makes sense to fetch all records if you need to import the whole result set, and in that case I would not use a CDS at all. @mjn, I was referring to the "client"-side application.Pendulous
The first version of the code will simply navigate the underlying ADO recordset (i.e. it executes inside ADO), whereas the second version transfers the whole recordset, so the overhead of the ADO TDataSet wrapper is applied as well as the overhead of the TClientDataSet. It may not be relevant, but the Delphi ADO wrapper shows a major performance improvement from calling DisableControls, even for TDataSets with no controls linked to them. This may make a difference, but it will still be slow.Leyes
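The DisableControls pattern mentioned above, as a sketch against the question's components:

```delphi
// Even with no data-aware controls attached, wrapping the fetch in
// DisableControls/EnableControls avoids per-record notification
// overhead in the Delphi dataset layer.
DataSource1.DataSet.DisableControls;
try
  DataSource1.DataSet.Open;
  DataSource1.DataSet.Last;  // forces the full fetch
finally
  DataSource1.DataSet.EnableControls;
end;
```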
I had a similar problem which I then discovered that a round trip was being made to database server for each record for blob data. Now FetchOnDemand on CDS is true and poFetchBlobsOnDemand on provider is also true and controls are disabled at open time, but I don't remember further details.Babylonian
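Those settings might be applied like this (a sketch; the question wires the CDS directly to the query with SetProvider, so the explicit TDataSetProvider named DSP1 here is an assumption):

```delphi
// Fetch record packets and blobs on demand instead of everything up front.
DSP1.DataSet := ADOQuery1;
DSP1.Options := DSP1.Options + [poFetchBlobsOnDemand];
CDS1.SetProvider(DSP1);
CDS1.FetchOnDemand := True;   // pull further packets only as they are needed
CDS1.PacketRecords := 100;    // rows per packet; -1 would fetch everything at once
CDS1.Open;
```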
@TLama: I'd rather say that while some specific dataset descendants are slow (due to poor internal design/implementation), others (such as, say, kbmMT or my own CDS implementation) are quite fast and have no problem loading/handling hundreds of thousands or even millions of records. For example, users of my apps sometimes load up to 500k records (100+ fields, including many string/memo ones) into a CDS and are mostly restricted only by the available address space (32-bit mode).Dichroite

Code above returns over 180k rows in 3-6 seconds when using pure ADO.

For some reason, I wouldn't expect the code you posted to return every single one of the 180k records... I'd expect the first "X" records to be loaded after TADOQuery.Open is called, and then the last "X" records to be sent when TADOQuery.Last is called. Going while not Eof do instead of ".Last" would probably be a better performance test, since (I assume) you actually want to browse all the records.

When you call TClientDataset.Last while it is linked to a DataSetProvider, it most likely does the equivalent of a while not Eof do over your query, which transfers all 180k records. Also, TClientDataset Insert/Append operations tend to get slower and slower the more records it holds. My best guess is that it has to realloc its memory buffer every now and then. If that's the case, I haven't found a way to tell the TClientDataset: "Hey! Brace yourself, 180k records incoming!" (akin to TList.SetCapacity).
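A sketch of the suggested test, iterating instead of calling .Last:

```delphi
// Walk every record explicitly; unlike .Last, this measures the cost of
// actually visiting all rows regardless of what the driver optimizes away.
DataSource1.DataSet.Open;
while not DataSource1.DataSet.Eof do
  DataSource1.DataSet.Next;
```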

If you have an older version of Delphi, one thing that could help is the Midas Speed Fix.

Rabbitry answered 15/1, 2015 at 18:35 Comment(1)
Indeed, the standard TDataPacketWriter.WriteDataSet not only loops over the source dataset until either Eof is hit or the requested number of records has been fetched, it also writes every field value (into the resulting stream) in an inner fields loop wrapped in an empty try/except block.Dichroite

This is pretty old, but there are a lot of new Delphi programmers out there now, so here is a bit of a scoop.

When using a CDS in Delphi you are in fact creating an in-memory table. Your query probably went to swap hell.

To get the most out of a CDS, use DBX components to grab the data. They are so-called "fast forward" cursors that don't create a temp table with a cursor in the database. Forward-only doesn't do the fancy stuff that ADO does. If you need massive datasets with full update notifications and total control, use ADO. If you need to plow through a ton of data in a hurry with little load on the server, that is where CDS/DBX shines.

The DBX way is harder. It is just a dragster: First and Next are the only things that work. No updates, no commits, just a fast one-way relationship. Hook up a DBX/provider/CDS combo and you have it all: speed and editability. Use a version number to detect another user changing the data while you were editing. Study the provider options to see how to get power with flexibility. This is pretty much as hard-core as it gets in Delphi.
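A rough sketch of the DBX/provider/CDS combo described above (component names are illustrative, not from the question):

```delphi
// Unidirectional DBX query feeding an editable in-memory CDS via a provider.
SQLQuery1.SQLConnection := SQLConnection1;
SQLQuery1.SQL.Text := 'SELECT * FROM Table1';
DataSetProvider1.DataSet := SQLQuery1;   // provider reads the query forward-only
CDS1.SetProvider(DataSetProvider1);
CDS1.Open;                               // CDS now holds an editable local copy
// ... edit locally, then post changes back through the provider:
CDS1.ApplyUpdates(0);                    // 0 = abort on the first update error
```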

Euell answered 14/2, 2016 at 17:31 Comment(2)
Sorry, the question was why a CDS is so slow with that number of records compared with an ADO dataset. The correct answer is that it has always been slow for more than a few tens of thousands of records. Originally it was because of the way it allocated memory, and even though that has improved over the years, it was designed for small, client-side datasets, and 180k rows is not "small" in that context. Btw, a TAdoQuery feeding a CDS via a TDatasetProvider is way faster than a DBX TSqlQuery retrieving the same data.Oleta
And, properly used, a TAdoQuery can do most of the nifty things a CDS can (like its "briefcase" model) without (unlike DBX) requiring a CDS to do the client-side stuff, and it will be much quicker at it, too, ime.Oleta

Setting the CDS1.LogChanges property to either False or True makes no difference; the problem is the same.

Hassock answered 29/10, 2022 at 7:51 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Fourth

Try setting the CDS1.LogChanges property to False before loading the data. This needs to be done in code, as it is not a published property.

From the help file: "For large datasets, a value of True for LogChanges may severely impact the performance of the application."

You can then turn it back on after the initial load.
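In code, that looks roughly like this (LogChanges is a public but not published property of TClientDataSet, so it must be set at runtime):

```delphi
CDS1.LogChanges := False;  // skip the change log during the bulk load
CDS1.Open;                 // initial load of the data
CDS1.LogChanges := True;   // re-enable change tracking for subsequent edits
```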

Parve answered 15/1, 2015 at 16:11 Comment(1)
Switching the change log off shouldn't affect loading time, as the data is streamed through the provider, not loaded via Insert/Append.Dichroite

© 2022 - 2024 — McMap. All rights reserved.