Using COLLATE in Android SQLite - Locales is ignored in LIKE statement
Asked Answered
R

5

13

When creating my SQLite database in Android I set the database locale - db.setLocale(new Locale("cz_CZ")). This is a Czech locale.

A SELECT statement works and takes the locale into account, for example:

SELECT * from table WHERE name='sctzy' COLLATE LOCALIZED 

Will find the entry 'ščťžý'.

But using LIKE will fail:

SELECT * from table WHERE name LIKE '%sctzy%' COLLATE LOCALIZED 

No row is returned.

BTW. There is no java.text.Normalized class in Android. I thought I could make a second column with a normalized text, stripped of special characters, which would be used for searching - but I am missing a class or way how to normalize the String.

Regimentals answered 13/8, 2010 at 22:9 Comment(1)
I found a partial answer here: #3212474 But that's only how to remove accents. I would have to make a duplicate column in my database with a normalized text... Isn't there a better way?Regimentals
N
7

Have you had a look at the SQLite documentation for LIKE? It has come information about non ASCII characters and a bug. Maybe Android has an older version of SQLite installed where this is a problem.

I think the second normalised column might be your best option unfortunately.

Nun answered 13/8, 2010 at 23:37 Comment(0)
F
3

Just today i had exactly the same task as you had. And in my situation making additional shadow columns is not a case because i'm having to search more than one column. So i came to the solution like this, which is tested in real project. In my case i'm handling only lower case letters but you can extend the function with upper case letters as well.

db.setLocale(Locale("cz", "CZ"))
val query = "SELECT * FROM table WHERE name GLOB ${getExpr(str)} ORDER BY name COLLATE LOCALIZED ASC"

private fun getExpr(input: String) : String{
    var expr = ""
    for(lettter in input){
        expr += when(lettter){
            's','š' -> "[sš]"
            'a','á' -> "[aá]"
            'e','ě','é' -> "[eěé]"
            'i','í' -> "[ií]"
            'z','ž' -> "[zž]"
            'c','č' -> "[cč]"
            'y','ý' -> "[yý]"
            'r','ř' -> "[rř]"
            'u','ů','ú' -> "[uůú]"
            'o','ó' -> "[oó]"
            'n','ň' -> "[nň]"
            'd','ď' -> "[dď]"
            't','ť' -> "[tť]"
            else -> lettter
        }
     }
     return "'*${expr}*'"
}
Fencible answered 29/11, 2018 at 9:14 Comment(1)
For extending to upper case letters I just added: ...in input.lowercase()... and for each row 's','š' -> "[sšSŠ]", etc.Vicechancellor
S
3

In Android sqlite, LIKE and GLOB ignore both COLLATE LOCALIZED and COLLATE UNICODE (they only work for ORDER BY). However, as @asat explains in his answer, you can use GLOB with a pattern that will replace each letter with all the available alternatives of that letter. In Java:

public static String addTildeOptions(String searchText) {
    return searchText.toLowerCase()
                     .replaceAll("[aáàäâã]", "\\[aáàäâã\\]")
                     .replaceAll("[eéèëê]", "\\[eéèëê\\]")
                     .replaceAll("[iíìî]", "\\[iíìî\\]")
                     .replaceAll("[oóòöôõ]", "\\[oóòöôõ\\]")
                     .replaceAll("[uúùüû]", "\\[uúùüû\\]")
                     .replace("*", "[*]")
                     .replace("?", "[?]");
}

And then (not literally like this, of course):

SELECT * from table WHERE lower(column) GLOB "*addTildeOptions(searchText)*"

This way, for example in Spanish, a user searching for either mas or más will get the search converted into m[aáàäâã]s, returning both results.

It is important to notice that GLOB ignores COLLATE NOCASE, that's why I converted everything to lower case both in the function and in the query. Notice also that the lower() function in sqlite doesn't work on non-ASCII characters - but again those are probably the ones that you are already replacing!

The function also replaces both GLOB wildcards, * and ?, with "escaped" versions.

Senzer answered 17/4, 2019 at 9:40 Comment(1)
I had to fix your regex to get it working, because it is inverted. Where is .replaceAll("[aáàäâã]", "\\[aáàäâã\\]") I had to use .replaceAll("\\.*[aáàäâã]\\.*", "[aáàäâã]")Suspensory
M
2

Creating a second normalised column can be used to go around limitations (as mentioned briefly in other answers).

This means in practice that you have to create another (shadow) column of your first where the same data in a fixed case (e.g. all upper chars) is stored. Case insensitive queries (including like queries) can be made on this new column with search values in the same case.

If the first column "a" contains

AAA
aaa
Bbb
äää
ééé

The second column a_shadow would contain for the same rows

AAA
AAA
BBB
ÄÄÄ
ÉÉÉ

and your original query (example) "select a from mytable where a='äää'"
would be replaced with "select a from mytable where A='ÄÄÄ'"

Your code needs to be updated to fill the converted shadow content when adding the primary content. If the column is added after creation or you cannot change the code existing values may need to be converted using an update query. Example:

UPDATE mytable SET a_shadow=UPPER(a);

Maggiemaggio answered 28/11, 2013 at 9:3 Comment(0)
P
0

Might be time consuming, but you can use the java.text.Normalizer like here

Converting Symbols, Accent Letters to English Alphabet

As is not part of the java subset that Android, you may try to look for it at the code of java, such as Normalizer.java With the Javadoc found here:

And copy the part of the code needed inside your project.

Hope it works!

Primogenial answered 4/5, 2012 at 9:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.