SQLite3 how do I use indices?
Asked Answered
A

2

13

I’m working on SQLite3 indices.

Here’s a table COMAPNY:

CREATE TABLE COMPANY(
ID INT PRIMARY KEY     NOT NULL,
NAME           TEXT    NOT NULL,
AGE            INT     NOT NULL,
ADDRESS        CHAR(50),
SALARY         REAL
);

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (1, 'Paul', 32, 'California', 20000.00 );

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) 
VALUES (2, 'Allen', 25, 'Texas', 15000.00 );

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (3, 'Teddy', 23, 'Norway', 20000.00 );

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (4, 'Mark', 25, 'Rich-Mond ', 65000.00 );

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (5, 'David', 27, 'Texas', 85000.00 );

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (6, 'Kim', 22, 'South-Hall', 45000.00 );

INSERT INTO COMPANY VALUES (7, 'James', 24, 'Houston', 10000.00 );

=======================================================

SELECT * FROM COMPANY;

Results:

1|Paul|32|California|20000.0
2|Allen|25|Texas|15000.0
3|Teddy|23|Norway|20000.0
4|Mark|25|Rich-Mond |65000.0
5|David|27|Texas|85000.0
6|Kim|22|South-Hall|45000.0
7|James|24|Houston|10000.0

Let’s create an index salary_index,

CREATE INDEX IF NOT EXISTS salary_index on COMPANY (SALARY);

What does it do and how do I use it?

This time I make an index like this after dropping the older one:

CREATE INDEX IF NOT EXISTS salary_index on COMPANY (SALARY) 
WHERE SALARY > 50000;

After I added the index, I did:

SELECT * FROM COMPANY;

expecting that I’d see only the ones with salary higher than 50000, but I saw people lower than that.

And I also tried to do this:

SELECT * FROM COMPANY INDEXED BY salary_index;

Then I get Error: no query solution Apparently I have to do: SELECT * FROM COMPANY INDEXED BY salary_index WHERE SALARY > 50000; Where the condition must be the same as in the index.

So… how do I use indices?

And answered 9/5, 2014 at 19:18 Comment(0)
D
11

Indexes never change the meaning of your queries. What they can do is to speed up some of your queries; when that is possible, they are used automatically.

An index is useful for

  • looking up records with comparisons on the indexed column:

    SELECT * FROM Company WHERE Salary = 20000.0;
    SELECT * FROM Company WHERE Salary BETWEEN 40000 AND 80000;
    

    which also includes joins on the indexed column; and

  • sorting records:

    SELECT * FROM Company ORDER BY Salary
    

    which also includes GROUP BY and DISTINCT.

See the documentation for details:
Query Planning
The SQLite Query Planner

Digestion answered 9/5, 2014 at 21:12 Comment(0)
A
7

Here is the conversation I had with one of my code master(Thanks S.P.):

An index is usually a tool for performance. If you do not have an index for a field, queries on that field will need to do a full sequential scan of the table. This is not a problem if the table is small but if you have tens of thousands, or above rows, then a full sequential scan is simply too slow.

So if you want to get the rows for which salary < 50000, just create an index on the table, and then issue

SELECT * FROM COMPANY WHERE SALARY < 50000

It will automatically use the correct indexas long as the SALARY field is indexed

So if we have two indexes like

CREATE INDEX salary_index WHERE salary < 50000;
CREATE INDEX age_index WHERE age < 40;

and then we run a query like

SELECT * FROM COMPANY WHERE salary < 50000 AND age < 40;

It automatically uses the above 2 indices for the query.

In most RDBMSs, it is possible to use more than one index in a single query and yes, they're used automatically if they apply. But there might be restrictions on this and they're RBDMS specific. But a better idea is to create an index that contains multiple fields.

In an optimal situation, you would have all the fields needed by the query in a single index So if you want employees that earn more than 50 000 $ and are younger than 40 years you would define an index like this:

CREATE INDEX company_salary_age ON company (salary, age);

The order of the fields matters. This index can be used in a query that has a WHERE clause on salary, or salary and age, but not age without salary. That is, any number of fields of the index can be used as long as they are contiguous in the front of the index That is, in the query you can omit fields from the end, but not in the beginning or middle.

And answered 10/5, 2014 at 20:12 Comment(9)
No, most database use only one index per table. In your two-index example, the database's query optimizer would choose which index to use. Also, it's uncommon to build partial indexes (not all databases support them) to match future queries. A full index on salary will be equally useful in responding to your query as a partial index (and will be able to satisfy queries that your partial index cannot satisfy).Lap
Sorry I made a wrong comment. SQLite3 supports multiple indexes and my question was about SQLite3.And
That is not correct. SQLite will allow you to create multiple indexes (all databases do that), and it allows you to create multi-column indexes. But it will only ever apply a single index per table for the row-filtering operation on that table in a query. The document you linked explains (at a basic level) how SQLite chooses which index to use.Lap
Yes you are right, i meant "multi-column" indexes :) Does DB always choose the most optimal indexes then?And
The issue of choosing the most optimal index is a complicated one because it depends not only on the definition of the index and the nature of the query, but also on the contents of the index. An index with 1000 evenly distributed values is better than an index with only 2 values, or an index with 1000 values if the values are not distributed evenly (unless the search is for one of the uncommon values). The database attempts to find the optimal index and in almost every case is successful.Lap
if there is an index salary_index ON COMPANY("salary", "index"), does it matter whether conditioning "index" comes before "salary" in a query? Or should it be in the same order as an index to use the corresponding index?And
Order in the query doesn't matter. Order in the index does. So Company(salary, index) will increase performance on a query with an qualifying use of salary or salary and index (in either order in the query), but not on a query that only uses the index column for filtering.Lap
Is it then always good to have an index Company(salary, index) rather than Company(salary)? Or if such index is unnecessary it's a waste of disk?And
If you expect to issue queries that include optimizable expressions for both salary and index and need those queries to be as fast as possible, then it makes sense to maintain the compound index. Otherwise the overhead of maintaining the compound index is wasted to no good effect.Lap

© 2022 - 2024 — McMap. All rights reserved.