Before you mark this question as a duplicate, PLEASE HEAR ME OUT!!
I have already read the questions asked here on how to improve performance, e.g. (to mention just a few) Improve INSERT-per-second performance of SQLite? and What are the performance characteristics of sqlite with very large database files?
I am struggling to make SQLite work with a database file that is 5 GB in size. In contrast, there are people out there who claim that SQLite works 'great' for them even with databases as big as 160 GB. I have not tried it myself, but from the questions asked, I guess all that benchmarking is perhaps done with only one table in the database.
I am using a database with:
- 20 or so tables
- Half of the tables have more than 15 columns
- Each of these 15-or-so-column tables has 6 or 7 foreign key columns
- A few of these tables have already grown to 27 million records within a month
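For concreteness, here is a stripped-down sketch of the kind of schema I mean (all table and column names are invented for illustration; the real tables have ~15 columns and 6 or 7 foreign keys each):

```sql
CREATE TABLE customers (
    id     INTEGER PRIMARY KEY,
    name   TEXT,
    region TEXT
);

CREATE TABLE products (
    id   INTEGER PRIMARY KEY,
    name TEXT
);

-- Stand-in for one of the big 27-million-row tables.
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    product_id  INTEGER REFERENCES products(id),
    total       REAL,
    created_at  TEXT
);
```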
The development machine I am using is a 3 GHz quad-core machine with 4 GB of RAM, and yet it takes more than 3 minutes just to query the row count in these big tables.
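Assuming that row-count query is a plain SELECT COUNT(*): in SQLite that is a full scan of the table (or of its smallest index), so the time grows with the row count. A sketch of a cheap upper-bound alternative, valid only for a rowid table from which rows are never deleted (names from the hypothetical schema above):

```sql
-- Full count: walks every row (or every entry of the smallest index).
SELECT COUNT(*) FROM orders;

-- Upper bound: MAX(rowid) is optimized into a single B-tree descent,
-- so it returns almost instantly. It only matches the true count if
-- rowids were assigned sequentially and rows are never deleted.
SELECT MAX(rowid) FROM orders;
```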
I couldn't find any way to partition the data horizontally. The best option I have is to split the data across multiple database files, one for each table. But in that case, as far as I know, foreign key constraints can't be enforced across files, so I would have to create self-sufficient tables (without any foreign keys).
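For reference, the one-file-per-table idea would look something like the sketch below (file and alias names invented). Cross-database joins do work after ATTACH, but a FOREIGN KEY clause cannot reference a table in a different database file, which is exactly the limitation above:

```sql
-- Hypothetical: the orders table lives in its own file.
ATTACH DATABASE 'orders.db' AS odb;

-- Joins across attached databases are allowed...
SELECT c.name, o.total
FROM main.customers AS c
JOIN odb.orders     AS o ON o.customer_id = c.id;

-- ...but a FOREIGN KEY constraint inside odb.orders cannot point at
-- main.customers, so integrity checks move into application code.
```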
So my questions are:
a) Am I using the wrong database for the job?
b) Where do you think I am going wrong?
c) I have not added indexes on the foreign keys yet, but if just a row-count query takes four minutes, how are foreign-key indexes going to help me? (A sketch follows below.)
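On question c), as far as I understand it: a bare row count is a full scan regardless of what indexes exist, so its four minutes say little about joins and filtered lookups; those are where foreign-key indexes actually pay off. A sketch against the hypothetical schema above:

```sql
-- One index per FK column that appears in joins or WHERE clauses
-- (names from the hypothetical schema above).
CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders(customer_id);
CREATE INDEX IF NOT EXISTS idx_orders_product  ON orders(product_id);

-- Refresh the query planner's statistics so it uses the new indexes well.
ANALYZE;
```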
EDIT: To provide more information, even though no one has asked for it :) I am using SQLite version 3.7.9 with system.data.sqlite.dll version 1.0.77.0.
EDIT2: I THINK where I differ from the 160 GB guys is that they may select an individual record or a small range of records, whereas I have to load all 27 million rows in my table, join them against other tables, group the records as requested by the user, and return the results. Any input on the best way to optimize the database for such queries?
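One sketch of that, assuming the hypothetical schema above and that the user's grouping can be expressed in SQL: push the join and the GROUP BY into the database, so that only the aggregated rows cross into the application instead of all 27 million:

```sql
SELECT c.region,
       COUNT(*)     AS order_count,
       SUM(o.total) AS revenue
FROM orders    AS o
JOIN customers AS c ON c.id = o.customer_id
GROUP BY c.region;

-- A covering index on the joined/aggregated columns can let the planner
-- answer from the index alone, never touching the 27M base-table rows.
CREATE INDEX IF NOT EXISTS idx_orders_cust_total
    ON orders(customer_id, total);
```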
I can't cache the results of a previous query, as that does not make sense in my case; the chances of a cache hit are fairly low.
PRAGMA cache_size=16000 (this reduced a process that was taking over 20 mins to under 2 mins). You could use a higher value if needed, but be warned that the value is in pages, so the memory used is cache_size (in pages) times page_size (in bytes). – Hoxie
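A minimal sketch of that tuning, with illustrative values (WAL mode exists since SQLite 3.7.0, so the 3.7.9 build mentioned above has it; the cache arithmetic follows Hoxie's formula):

```sql
PRAGMA page_size  = 4096;     -- only affects a new database (or one rebuilt via VACUUM)
PRAGMA cache_size = 16000;    -- in pages: 16000 * 4096 bytes ≈ 62 MB of cache
PRAGMA journal_mode = WAL;    -- readers stop blocking the writer and vice versa
PRAGMA synchronous = NORMAL;  -- fewer fsyncs; a common pairing with WAL
```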