The Fastest Way to Load an Array DML in Delphi FireDAC

I am using Delphi XE8 with FireDAC to load a large SQLite database. To do so, I'm using the Array DML execution technique to efficiently insert a large number of records at once, like this:

FDQueryAddINDI.SQL.Text := 'insert into indi values ('
  + ':indikey, :hasdata, :gedcomnames, :sex, :birthdate, :died, '
  + ':deathdate, :changed, :eventlinesneedprocessing, :eventlines, '
  + ':famc, :fams, :linkinfo, :todo, :nextreportindi, :firstancestralloop'
  + ')';
FDQueryAddINDI.Params.BindMode := pbByNumber; { more efficient than by name }
FDQueryAddINDI.Params.ArraySize := MaxParams; { large enough to load all of them } 

NumParams := 0;
repeat
  { the code to determine IndiKey, ... is not shown, but goes here }

  FDQueryAddINDI.Params[0].AsStrings[NumParams] := IndiKey;   
  FDQueryAddINDI.Params[1].AsIntegers[NumParams] := HasData;
  FDQueryAddINDI.Params[2].AsStrings[NumParams] := GedcomNames;
  FDQueryAddINDI.Params[3].AsStrings[NumParams] := Sex;
  FDQueryAddINDI.Params[4].AsStrings[NumParams] := Birthdate;
  FDQueryAddINDI.Params[5].AsIntegers[NumParams] := Died;
  FDQueryAddINDI.Params[6].AsStrings[NumParams] := Deathdate;
  FDQueryAddINDI.Params[7].AsStrings[NumParams] := Changed;
  FDQueryAddINDI.Params[8].AsIntegers[NumParams] := EventLinesNeedProcessing;
  FDQueryAddINDI.Params[9].AsStrings[NumParams] := EventLines;
  FDQueryAddINDI.Params[10].AsIntegers[NumParams] := FamC;
  FDQueryAddINDI.Params[11].AsIntegers[NumParams] := FamS;
  FDQueryAddINDI.Params[12].AsIntegers[NumParams] := Linkinfo;
  FDQueryAddINDI.Params[13].AsIntegers[NumParams] := ToDo;
  FDQueryAddINDI.Params[14].AsIntegers[NumParams] := NextReportIndi;
  FDQueryAddINDI.Params[15].AsIntegers[NumParams] := FirstAncestralLoop;
  inc(NumParams);
until done;
FDQueryAddINDI.Params.ArraySize := NumParams;  { Reset to actual number }

FDQueryAddINDI.Execute(FDQueryAddINDI.Params.ArraySize);

The actual loading of the data into the SQLite database is very fast, and I have no problem with the speed of that.

What is slowing me down is the amount of time being taken in the repeat loop to assign all the values to the parameters.

The Params are built into FireDAC and are a TCollection. I do not have access to the source code, so I can't see what the AsStrings and AsIntegers property setters are actually doing.

Assigning each value to each parameter for every insert does not look like a very efficient way to load this TCollection. Is there a faster way? I'm thinking maybe a way to load a whole set of parameters at once, e.g. (IndiKey, HasData, ... FirstAncestralLoop) all in one call. Or maybe to load my own TCollection as efficiently as I can, and then use the TCollection's Assign method to copy my TCollection into FireDAC's TCollection.

So my question is what would be the fastest way to load this TCollection of parameters that FireDAC requires?


Update: I'm including some timings for Arnaud.

As stated in Using SQLite with FireDAC (see its Array DML section):

Starting with v 3.7.11, SQLite supports the INSERT command with multiple VALUES. FireDAC uses this feature to implement Array DML, when Params.BindMode = pbByNumber. Otherwise, FireDAC emulates Array DML.

I've tested inserting 33,790 records, varying the ArraySize (the number of records loaded per Execute), and timed the load with both pbByName (emulated Array DML) and pbByNumber (multi-row VALUES insert).
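A stopwatch around the whole load is enough to reproduce these numbers. This is just a sketch of the harness, not the exact test code; TStopwatch comes from System.Diagnostics, Format from System.SysUtils, and TotalRows and Memo1 are placeholder names:

var
  Stopwatch: TStopwatch; { uses System.Diagnostics }
begin
  Stopwatch := TStopwatch.StartNew;
  { fill the Params and call FDQueryAddINDI.Execute for each block, as above }
  Stopwatch.Stop;
  Memo1.Lines.Add(Format('%d rows loaded in %d ms',
    [TotalRows, Stopwatch.ElapsedMilliseconds]));
end;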

This was the timing:

Arraysize: 1, Executes: 33,790, Timing: 1530 ms (pbByName), 1449 ms (pbByNumber)
Arraysize: 10, Executes: 3,379, Timing: 1034 ms (pbByName), 782 ms (pbByNumber)
Arraysize: 100, Executes: 338, Timing:  946 ms (pbByName), 499 ms (pbByNumber)
Arraysize: 1000, Executes: 34, Timing: 890 ms (pbByName), 259 ms (pbByNumber)
Arraysize: 10000, Executes: 4, Timing: 849 ms (pbByName), 227 ms (pbByNumber)
Arraysize: 20000, Executes: 2, Timing: 594 ms (pbByName), 172 ms (pbByNumber)
Arraysize: 50000, Executes: 1, Timing: 94 ms (pbByName), 94 ms (pbByNumber)

Now the interesting thing about these timings is that the loading of those 33,790 records into the TCollection is taking a full 93 ms every single test run. It doesn't matter whether they are being added 1 at a time or 10000 at a time, this overhead of filling the TCollection of Params is always there.

For comparison, I did a larger test with 198,522 inserts just for pbByNumber:

Arraysize: 100, Executes: 1986, Timing: 2774 ms (pbByNumber)
Arraysize: 1000, Executes: 199, Timing: 1371 ms (pbByNumber)
Arraysize: 10000, Executes: 20, Timing: 1292 ms (pbByNumber)
Arraysize: 100000, Executes: 2, Timing: 894 ms (pbByNumber)
Arraysize: 1000000, Executes: 1, Timing: 506 ms (pbByNumber)

For all cases of this test, the overhead of loading the TCollection of Params takes about 503 ms.

So the loading of the TCollection seems to be at about 400,000 records per second. This is a significant portion of the insert time, and once I start working with large databases in the millions, this added time will be quite noticeable to the user of my program.

I would like to improve this, but I have not yet found a way to speed the loading of the Params.


Update 2: I was able to get about a 10% time improvement by putting all my code between a StartTransaction and a Commit, so that all blocks will be processed at once.
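For reference, the wrapping looks something like this (just a sketch; FDConnection1 stands for whatever TFDConnection the query is attached to):

FDConnection1.StartTransaction;
try
  { fill the Params and call FDQueryAddINDI.Execute for each block, as above }
  FDConnection1.Commit;
except
  FDConnection1.Rollback;
  raise;
end;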

But I'm still looking for some way to load the TCollection of Params much faster.


One other idea:

One thing that might work well, and could in theory be up to 16 times faster, would be something like the ParamValues property. It assigns multiple params at once, takes a variant array directly, and avoids having to cast each value.

It would work like this:

    FDQueryAddINDI.Params.ParamValues['indikey;hasdata;gedcomnames;sex;birthdate;died;deathdate;changed;eventlinesneedprocessing;eventlines;famc;fams;linkinfo;todo;nextreportindi;firstancestralloop']
       := VarArrayOf([Indikey, 0, ' ', ' ', ' ', 0, ' ', ' ', 1, ' ', -1, -1, -1, -1, -1, -1]);

However, ParamValues will only assign to the first set of Params, i.e. where NumParams = 0.

Is there a way to do this for each index in the loop, i.e. for every value of NumParams?
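The closest thing I can see with what FireDAC does expose is the per-parameter Values property, which does take an array index. A small helper could at least hide the repetition - this is only a sketch of the idea, and SetParamRow is my own hypothetical name, not a FireDAC method:

uses FireDAC.Stan.Param; { for TFDParams / TFDParam }

procedure SetParamRow(Params: TFDParams; RowIndex: Integer;
  const RowValues: array of Variant);
var
  i: Integer;
begin
  { assign one Variant per parameter, all for the same Array DML row }
  for i := 0 to High(RowValues) do
    Params[i].Values[RowIndex] := RowValues[i];
end;

{ then, inside the repeat loop: }
SetParamRow(FDQueryAddINDI.Params, NumParams,
  [IndiKey, HasData, GedcomNames, Sex, Birthdate, Died, Deathdate, Changed,
   EventLinesNeedProcessing, EventLines, FamC, FamS, Linkinfo, ToDo,
   NextReportIndi, FirstAncestralLoop]);

Behind the scenes this still assigns the values one at a time, so it tidies the code more than it speeds it up.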


Bounty: I really want to speed up the loading of the Params. I am now offering a bounty for someone to help me find a way to speed up the loading of the Params array TCollection as implemented in FireDAC.

Tincture answered 20/7, 2015 at 17:19

Sounds a bit like premature optimization to me. IMHO a profiler would show that the repeat ... until done loop takes much less time than the Execute call itself. Assigning an integer is almost instant, just like assigning a string, thanks to the copy-on-write paradigm of the Delphi string type, which copies the text by reference.

Note that in practice, there is no Array DML feature in the SQLite3 engine itself. FireDAC emulates Array DML by generating a multi-row INSERT, i.e. executing

insert into indi values (?,?,?,....),(?,?,?,....),(?,?,?,....),....,(?,?,?,....);

AFAIK this is the fastest way of inserting data using SQLite3. At least until the upcoming OTA feature is available.

Also ensure that you wrap your inserts in transactions, and that the number of parameters bound at once is not too high. From my tests, you should use several transactions when you have a lot of rows to insert: keeping everything in one single transaction slows the process down, and around 10,000 rows per transaction is a good number, from experiment.
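In FireDAC terms, that chunking would look something like the following (a sketch only; FDConnection1 is the TFDConnection, and Block, BlockCount and RowsInTransaction are placeholder names):

RowsInTransaction := 0;
FDConnection1.StartTransaction;
try
  for Block := 0 to BlockCount - 1 do
  begin
    { fill the Params for this block as in the question, then execute it }
    FDQueryAddINDI.Execute(FDQueryAddINDI.Params.ArraySize);
    Inc(RowsInTransaction, FDQueryAddINDI.Params.ArraySize);
    if RowsInTransaction >= 10000 then
    begin
      FDConnection1.Commit;           { flush this transaction }
      FDConnection1.StartTransaction; { and start the next one }
      RowsInTransaction := 0;
    end;
  end;
  FDConnection1.Commit;
except
  FDConnection1.Rollback;
  raise;
end;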

BTW, our ORM is able to do all this low-level plumbing on its own, depending on the backend engine it runs on.

Update: It sounds like the FireDAC parameter handling may be a real bottleneck in your case. You should therefore bypass FireDAC and bind your TCollection content directly to the SQLite3 engine. Try e.g. our SynSQLite3.pas unit. Remember to prepare your INSERT statement using a multi-row insert ((?,?,?,....),(?,?,?,....),....), then bind your values directly. BTW, DB.pas itself may be a real bottleneck, which is why our whole ORM bypasses this layer (but may use it if needed).

Update2: Since you asked for it, here is a version using mORMot.

First you define your record:

type
  TSQLIndy = class(TSQLRecord)
...
  published
    property indikey: string read findikey write findikey;
    property hasdata: boolean read fhasdata write fhasdata;
    property gedcomnames: string read fgedcomnames write fgedcomnames;
    property sex: string read fsex write fsex;
    property birthdate: string read fbirthdate write fbirthdate;
    property died: boolean read fdied write fdied;
...
  end;

Then you run the insertion via the ORM:

db := TSQLRestServerDB.CreateWithOwnModel([TSQLIndy],'test.db3');
db.CreateMissingTables; // will CREATE TABLE if not existing
batch := TSQLRestBatch.Create(db,TSQLIndy,10000);
try
  indy := TSQLIndy.Create;
  try
    for i := 1 to COUNT do begin
      indy.indikey := IntToString(i);
      indy.hasdata := (i and 1) = 0;
      ...
      batch.Add(indy,true);
    end;
  finally
    indy.Free;
  end;
  db.BatchSend(batch);
finally
  batch.Free;
end;

The full source code is available online on paste.ee.

Here are the timing for 1,000,000 records:

Prepared 1000000 rows in 874.54ms
Inserted 1000000 rows in 5.79s

If I calculate well, that is more than 170,000 rows per second for insertion. Here, the ORM is not an overhead, it is an advantage. All the multi-INSERT work, transactions (every 10,000 rows) and marshalling are done by the framework. The TSQLRestBatch stores all the content as JSON in memory, then computes the SQL at once. I'm curious how direct FireDAC performs in comparison. And you would be able to switch to another database if needed - another RDBMS (MySQL, Oracle, MSSQL, Firebird) or even MongoDB - by just adding a new line.

Hope it helps!

Data answered 20/7, 2015 at 18:8
Thanks, Arnaud, for your ideas. This isn't premature optimization. I am actually now doing the optimization. :-) I had previously been testing various arraysizes per transaction and have now included some of those for you in my Update to my question. I agree with you that an integer or string assignment should be instantaneous, so obviously there is a lot more going on in adding to a TCollection than just the assignments. Hopefully someone can give me an insight as to how to make that part faster.Tincture
@Tincture So do not use the FireDAC abstraction, but directly the SQLite3 layer. See my update.Data
@Tincture I've just optimized the TSQLRequest.BindS method to avoid any memory allocation during string parameter binding. It may help to directly bind the string values from your TCollection. See this commit.Data
Thank you Arnaud. I'll try it. I am also aware of your mORMot framework, which I might try if I can't get satisfaction with FireDAC.Tincture
Arnaud: Using your SynSQLite3.pas unit is much more complicated than I expected and I really want to stick within the FireDAC framework if I can. If you (or anyone else) can tell me how I can adapt the code in my loop to set up the FireDAC parameters more efficiently, then I'll be happy to give the bounty that I've now offered. If not, then for now, the loading speed I've got will have to do ... until I decide I really need to do better and purchase the source code to FireDAC for myself and see if I can speed it up.Tincture
@Tincture I've updated my answer. And written a new sample for you. Feedback is welcome! Hope it helps.Data
Arnaud: Thank you for taking the time to create the sample. I downloaded it and installed your mORMot framework and managed to get it working. My computer seems slower than yours, giving 1.04s and 6.91s to prepare and insert. I implemented your example with FireDAC and Array DML. For 1,000,000 records with a batch size of 40,000, it took 0.97s and 5.735s, actually a bit faster than mORMot. Larger batch sizes took the same time to prepare and improved the insert speed by 5% while increasing RAM use. The FireDAC that Delphi includes (originally DA-SOFT's AnyDAC) does appear to work well with SQLite.Tincture
I thought I'd try 64-bit as well. I downloaded the sqlite3-64.dll and tried both examples. mORMot was 1.01s to prepare and 7.23s to insert, a bit slower than 32-bit. FireDAC was 0.89s to prepare and 3.94s to insert, quite a good speed improvement over 32-bit. What this at least has shown me is that the preparation time FireDAC is taking is reasonable, but I'm still interested in improving it, as a second is too long. I'd be happy to provide more details to you about my testing if you'd like through private conversation. You can email me at lkessler at lkessler.comTincture
+lkessler Thanks for the feedback. Did you try to increase the TSQLRestBatch batch size from 10000 to 40000? To my taste, the ORM way of working is much cleaner, and is client/server by design (you could run it over the Internet, using REST/JSON). In fact, the mORMot "prepare" does much more than FireDAC parameters, since it creates a full JSON array of all values, ready to be sent to a server (from any kind of client, including Mobile or AJAX).Data
+lkessler Also consider that once your SQLite3 database is in production, inserting batches into existing millions of rows, probably with indexes on columns, performance will degrade compared with inserting into a blank database. A MongoDB server may be a good idea when storing huge amounts of data - better than SQLite3, perhaps. With our framework, switching to MongoDB would be a matter of a few initialization lines of code...Data
@Tincture I've enhanced the Batch JSON preparation to reduce memory use and CPU. Now preparation is 40% faster, and uses 60% less memory.Data
Changing the TSQLRestBatch size to 40000 makes no speed difference. I'll try your JSON preparation changes tomorrow when they get into your next nightly build. But really, I'm hoping someone can suggest a similar improvement to the FireDAC Array DML preparation. If I had their source code, I might be able to find some inefficiencies. But someone who has the code (either the Enterprise version or the Client/Server add-on pack provides it) might be able to suggest a way to extend one of the Params methods to achieve an improvement like you were able to make to mORMot's preparation.Tincture
@Tincture The FireDAC source code is already very optimized. Speed enhancements such as those I made in mORMot won't be possible unless the whole library design is modified.Data
Personally, I always build my SQL statements from A to Z using string functions and operators. In the end, that's what the database driver will execute, and I'd rather do this myself than delegate this (simple) job to some third-party component and cross my fingers. My 2c.Repose
@Repose I hope you still bind parameters: manual value concatenation is unsafe (SQL injection) and slower to execute. The ORM lets you focus on the objects, maintains the table layout, and allows switching from one SQL dialect to another - even NoSQL for mORMot, which supports MongoDB and emulates SQL for it.Data
@Arnaud, nope, in fact I escape them manually using the db driver's escape functions. I feel more at ease doing it myself. As for the speed, I guess me concatenating it versus the library doing it wouldn't make much of a difference, if any. Again, there are multiple ways to kill a fly ;)Repose

The best improvement I can find is to replace the AsStrings and AsIntegers calls with Values calls. This avoids assigning the data type (string or integer) to each item, and saves about 10% of the overhead.

So 93 ms in the small test is down to 83 ms. And 503 ms in the big test is down to 456 ms.

FDQueryAddINDI.Params[0].Values[NumParams] := IndiKey;   
FDQueryAddINDI.Params[1].Values[NumParams] := HasData;
FDQueryAddINDI.Params[2].Values[NumParams] := GedcomNames;
FDQueryAddINDI.Params[3].Values[NumParams] := Sex;
FDQueryAddINDI.Params[4].Values[NumParams] := Birthdate;
FDQueryAddINDI.Params[5].Values[NumParams] := Died;
FDQueryAddINDI.Params[6].Values[NumParams] := Deathdate;
FDQueryAddINDI.Params[7].Values[NumParams] := Changed;
FDQueryAddINDI.Params[8].Values[NumParams] := EventLinesNeedProcessing;
FDQueryAddINDI.Params[9].Values[NumParams] := EventLines;
FDQueryAddINDI.Params[10].Values[NumParams] := FamC;
FDQueryAddINDI.Params[11].Values[NumParams] := FamS;
FDQueryAddINDI.Params[12].Values[NumParams] := Linkinfo;
FDQueryAddINDI.Params[13].Values[NumParams] := ToDo;
FDQueryAddINDI.Params[14].Values[NumParams] := NextReportIndi;
FDQueryAddINDI.Params[15].Values[NumParams] := FirstAncestralLoop;

The data types can optionally be set once, up front, when the database is opened. Maximum string lengths can be set as well. Neither has any effect on the timing, and setting the lengths does not reduce the memory used. The types and lengths are set in this manner:

FDQueryAddINDI.Params[0].DataType := ftString;
FDQueryAddINDI.Params[1].DataType := ftInteger;
FDQueryAddINDI.Params[2].DataType := ftString;
FDQueryAddINDI.Params[3].DataType := ftString;
FDQueryAddINDI.Params[4].DataType := ftString;
FDQueryAddINDI.Params[5].DataType := ftInteger;
FDQueryAddINDI.Params[6].DataType := ftString;
FDQueryAddINDI.Params[7].DataType := ftString;
FDQueryAddINDI.Params[8].DataType := ftInteger;
FDQueryAddINDI.Params[9].DataType := ftString;
FDQueryAddINDI.Params[10].DataType := ftInteger;
FDQueryAddINDI.Params[11].DataType := ftInteger;
FDQueryAddINDI.Params[12].DataType := ftInteger;
FDQueryAddINDI.Params[13].DataType := ftInteger;
FDQueryAddINDI.Params[14].DataType := ftInteger;
FDQueryAddINDI.Params[15].DataType := ftInteger;
FDQueryAddINDI.Params[0].Size := 20;
FDQueryAddINDI.Params[2].Size := 1;
FDQueryAddINDI.Params[3].Size := 1;
FDQueryAddINDI.Params[4].Size := 1;
FDQueryAddINDI.Params[6].Size := 1;
FDQueryAddINDI.Params[7].Size := 1;
FDQueryAddINDI.Params[9].Size := 1;
Tincture answered 27/7, 2015 at 1:53
