What does the COLLATE keyword do when creating a sqlite index?
Asked Answered
F

3

20

According to the sqlite3 documentation,

The COLLATE clause following each column name defines a collating sequence used for text entries in that column. The default collating sequence is the collating sequence defined for that column in the CREATE TABLE statement. Or if no collating sequence is otherwise defined, the built-in BINARY collating sequence is used.

What does a collating sequence do, and what is a BINARY collating sequence?

Freeland answered 28/4, 2009 at 21:11 Comment(0)
B
9

It is the way that the sql engine orders data internally. Binary Collation does what it suggests, it does a binary comparison. Generally its the fastest collation though I have never quantified it, as it checks bit patterns, which means it is case and accent sensitive.

Bethought answered 28/4, 2009 at 21:11 Comment(0)
G
7

Binary collation compares your string byte by byte, as in an unicode table. For example: A,B,a,b. A case insensitive (NOCASE) order would be: a,A,b,B.

The advantage of binary collation is its speed, as string comparison is very simple/fast. In the general case, indexes with binary might not produce expected results for sort; however, for exact matches they can be useful.

COLLATE NOCASE also affects case sensitive queries.

If you have a column with these values: 'aa', 'aA'

select * from table where col = 'aa'

If you have created your column with COLLATE NOCASE it will return both 'aa' and 'aA'. Otherwise, if you didn't specify it, it will return only 'aa'.

You can also specify it in a query (this is slower then if you had created your column with COLLATE NOCASE)

select * from table where col = 'aa' COLLATE NOCASE
Gammadion answered 28/4, 2009 at 21:11 Comment(0)
O
0

Collation does not only affect ordering in indexes or results, but may also affect the validity of certain operations on tables. So you might find it appropriate to define a case-insensitive text column as primary key (or with UNIQUE constraint), like this:

CREATE TABLE t1 (
  id TEXT COLLATE NOCASE PRIMARY KEY
);

After inserting an id value 'a',

INSERT INTO t1 VALUES ('a');

it's not possible to add another id with value 'A' (differing just by case). The following SQL

INSERT INTO t1 VALUES ('A');

will fail with

Runtime error: UNIQUE constraint failed: t1.id (19)

It's nevertheless possible to change the id value from lowercase to uppercase:

UPDATE t1 SET id = 'A' WHERE id = 'A';

… which is quite interesting because the where clause actually works case-insensitively and the change is made from something A-like to a proper A.

Ojeda answered 28/4, 2009 at 21:11 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.