Improve UPDATE-per-second performance of SQLite?
My question comes directly from this one, although I'm only interested in UPDATE performance, nothing else.

I have an application written in C/C++ which makes heavy use of SQLite, mostly SELECT/UPDATE, on a very frequent interval (about 20 queries every 0.5 to 1 second).

My database is not big, about 2500 records at the moment. Here is the table structure:

CREATE TABLE player (
   id INTEGER PRIMARY KEY AUTOINCREMENT,
   name VARCHAR(64) UNIQUE,
   stats VARBINARY,
   rules VARBINARY
);

Up to this point I had not used transactions, because I was still improving the code and wanted stability rather than performance.

Then I measured my database performance by executing just 10 update queries like the following (in a loop, with different values):

// 10 times execution of this
UPDATE player SET stats = ? WHERE (name = ?)

where stats is a JSON string of exactly 150 characters and name is 5-10 characters long.

Without transactions, the result is unacceptable: about 1 full second (0.096 each).

With transactions, the time drops by a factor of about 7.5: about 0.11 - 0.16 seconds (0.013 each).

I tried deleting a large part of the database and/or reordering/deleting columns to see if that changed anything, but it did not. I get the above numbers even if the database contains just 100 records (tested).

I then tried playing with PRAGMA options:

PRAGMA synchronous = NORMAL
PRAGMA journal_mode = MEMORY

This gave me smaller times, but not consistently: more like 0.08 - 0.14 seconds.

PRAGMA synchronous = OFF
PRAGMA journal_mode = MEMORY

This finally gave me extremely small times, about 0.002 - 0.003 seconds, but I don't want to use it: my application saves the database every second, and with synchronous = OFF there's a high chance of a corrupted database on an OS crash or power failure.

My C SQLite code for queries is: (comments/error handling/unrelated parts omitted)

// start transaction
sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, NULL);

// query
sqlite3_stmt *statement = NULL;
int out = sqlite3_prepare_v2(db, query.c_str(), -1, &statement, NULL);
// bindings
for(size_t x = 0, sz = bindings.size(); x < sz; x++) {
   out = sqlite3_bind_text(statement, x+1, bindings[x].text_value.c_str(), bindings[x].text_value.size(), SQLITE_TRANSIENT);
   ...
}

// execute
out = sqlite3_step(statement);   // a successful UPDATE returns SQLITE_DONE, not SQLITE_OK

// finalize the statement regardless of any error
if (statement != NULL) {
   sqlite3_finalize(statement);
}

// end the transaction
sqlite3_exec(db, "END TRANSACTION", NULL, NULL, NULL);

As you can see, it's a pretty typical table, the record count is small, and I'm doing a plain simple UPDATE exactly 10 times. Is there anything else I could do to decrease my UPDATE times? I'm using the latest SQLite 3.16.2.

NOTE: The timings above come directly from a single END TRANSACTION query. The queries are done inside a single transaction and I'm using a prepared statement.

UPDATE:

I performed some tests with transactions enabled and disabled and with various update counts. The tests used the following settings:

VACUUM;
PRAGMA synchronous = NORMAL;  -- def: FULL
PRAGMA journal_mode = WAL;    -- def: DELETE
PRAGMA page_size = 4096;      -- def: 1024

The results follows:

no transactions (10 updates)

  • 0.30800 secs (0.0308 per update)
  • 0.30200 secs
  • 0.36200 secs
  • 0.28600 secs

no transactions (100 updates)

  • 2.64400 secs (0.02644 each update)
  • 2.61200 secs
  • 2.76400 secs
  • 2.68700 secs

no transactions (1000 updates)

  • 28.02800 secs (0.028 each update)
  • 27.73700 secs
  • ..

with transactions (10 updates)

  • 0.12800 secs (0.0128 each update)
  • 0.08100 secs
  • 0.16400 secs
  • 0.10400 secs

with transactions (100 updates)

  • 0.088 secs (0.00088 each update)
  • 0.091 secs
  • 0.052 secs
  • 0.101 secs

with transactions (1000 updates)

  • 0.08900 secs (0.000089 each update)
  • 0.15000 secs
  • 0.11000 secs
  • 0.09100 secs

My conclusion is that with transactions the per-query time cost is negligible. Perhaps the times get bigger with a colossal number of updates, but I'm not interested in those numbers. There's literally no time-cost difference between 10 and 1000 updates in a single transaction. However, I'm wondering if this is a hardware limit on my machine and there's not much I can do. It seems I cannot go below ~100 milliseconds for a single transaction of anywhere between 10 and 1000 updates, even by using WAL.

Without transactions there's a fixed time cost of around 0.025 seconds per update.

Aficionado answered 3/2, 2017 at 18:30 Comment(7)
@Olaf, the only C++ stuff is a std::string; the rest is C. I specifically emphasize that above. Secondly, I don't want someone to review my code; I want a better approach to SQLite to solve my problem. - Aficionado
Please stop editing in wrong tags! Either you provide a C minimal reproducible example or you leave the C++ tag! Note that there might be different answers to your question, depending on the language. - Gans
You already know that the fastest way is to use a single transaction. - Infirmity
@Infirmity yes, transactions reduced the times by about 10x, but aren't the times I get still slow? I mean, is it usual for 10 simple updates to take about 100 ms with transactions, or am I doing something wrong? - Aficionado
@user6096479: Why are you timing the creation of the prepared statement? The call to sqlite3_prepare_v2 should not be part of the code being timed. - Laminar
@NicolBolas, I am not. The timings actually come directly from END TRANSACTION, i.e. from the queries themselves. I did try timing everything, but everything besides the transaction was close to zero. I just wanted to test everything to verify it's not an error of mine. I edited the question to state this. - Aficionado
Your question led me to add a transaction for bulk updates across thousands of update queries and the performance skyrocketed 100000%. I cannot thank you enough!! - Jacklynjackman

With such small amounts of data, the time for the database operation itself is insignificant; what you're measuring is the transaction overhead (the time needed to force the write to the disk), which depends on the OS, the file system, and the hardware.

If you can live with its restrictions (mostly, no network), you can use asynchronous writes by enabling WAL mode.

Infirmity answered 4/2, 2017 at 11:14 Comment(3)
I tried WAL by setting PRAGMA synchronous = NORMAL and PRAGMA journal_mode = WAL, but I did not get any improvement at all. I don't need network access and I would like to benefit from WAL; I just don't get any gain (perhaps some additional options are required?). On the other side, I only need to run at most 20 queries per second. They may be fewer, but cannot be more. It seems to me that with synchronous = NORMAL or FULL I can't break the barrier of 10 ms per update query, unless I choose to hand control to the OS, reducing safety. - Aficionado
To check if WAL is enabled, execute PRAGMA journal_mode;. - Infirmity
I don't know why, but my PRAGMA journal_mode = WAL was not being executed by my C call at all, so the mode remained DELETE. When I ran the same query in PHPLiteAdmin it executed normally. After fixing this, my times dropped slightly, to about 0.07 - 0.11, which seems OK. - Aficionado

You may still be limited by the time it takes to commit a transaction. In your first example each transaction took about 0.10 seconds to complete, which is pretty close to the transaction time for inserting 10 records. What kind of results do you get if you batch 100 or 1000 updates in a single transaction?

Also, SQLite expects around 60 transactions per second on an average hard drive, while you're only getting about 10. Could your disk performance be the issue here?

https://sqlite.org/faq.html#q19

Logbook answered 4/2, 2017 at 1:35 Comment(2)
Am I limited by hard disk speed, with synchronous mode waiting for SQLite to verify that the data was written? So if I choose safety over performance, am I limited to ~10 ms per update query on a typical 7200 RPM disk? I did not test with 100 or 1000 updates because my application only handles a maximum of 15-20 queries per transaction (with some SELECTs in between), so I kind of emulated that scenario in the question. I defragmented my disk a week ago; I'll do it again and reply :) - Aficionado
Nope, the times still range from 0.10 to 0.14 secs for a transaction of 10 queries. - Aficionado

Try adding an index to your database:

CREATE INDEX IDXname ON player (name)
Prodrome answered 5/3, 2017 at 13:3 Comment(1)
Using an index would make a huge difference on a large table. - Comintern

© 2022 - 2024 — McMap. All rights reserved.