Redisearch full text index not working with Python client
I am trying to follow this Redis documentation link to create a small DB of notable people searchable in real time (using Python client).

I tried similar code, but the final line, which queries for "s", should return two documents; instead, it returns an empty result set. Can anybody help me find the mistake I am making?

import redis
from redis.commands.json.path import Path
import redis.commands.search.aggregation as aggregations
import redis.commands.search.reducers as reducers
from redis.commands.search.field import TextField, NumericField, TagField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import NumericFilter, Query

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}


schema = ( 
    TextField("$.key", as_name="key"),  
    NumericField("$.p", as_name="p"),  
) 

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au") 
rs.create_index(     
    schema,     
    definition=IndexDefinition(     
        prefix=["au:"], index_type=IndexType.JSON   
    )    
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)  
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)  
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)  

rs.search(Query("s"))
Massif answered 26/12, 2023 at 11:33 Comment(0)

When you execute a query from the redis-py client, it sends an FT.SEARCH command to the Redis server. You can observe this by running the MONITOR command from redis-cli, for example.

According to the documentation, a single word in a query is matched as a whole word, not as a partial match. That is why your query returns an empty set. If you want to search by prefix, you need to use the expression prefix*.

However, the documentation says:

The prefix needs to be at least two characters long.

Hence, you cannot search for words starting with just s. What you could do:

rs.search(Query("sa*"))
#Result{1 total, docs: [Document {'id': 'au:mvtv-1236-a', 'payload': None, 'json': '{"key":"salman khan","pl":"{\\"d\\": \\"mvtv\\", \\"id\\": \\"1236-a\\", \\"img\\": \\"fool.jpg\\", \\"t: \\"act\\", \\"tme\\": \\"1965-\\"}","org":"1","p":100}'}]}
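
For a search-as-you-type box, the step from raw input to a prefix expression can be sketched as below. This is only a sketch, not part of redis-py: the helper name to_prefix_query and its min_prefix parameter are made up for illustration, and the default of 2 mirrors the two-character minimum quoted above. As far as I know, that minimum is the default of the server's MINPREFIX configuration option, which can be lowered (for example with FT.CONFIG SET MINPREFIX 1) at some performance cost, but please check that against your RediSearch version.

```python
import re
from typing import Optional


# Hypothetical helper (not part of redis-py): turn raw user input into a
# RediSearch query string whose last word is a prefix expression, e.g. "sa*".
def to_prefix_query(user_input: str, min_prefix: int = 2) -> Optional[str]:
    # Keep only letters and digits so no query-syntax characters slip through.
    words = re.findall(r"[A-Za-z0-9]+", user_input.lower())
    if not words:
        return None
    *complete, last = words
    # The word still being typed becomes a prefix if it is long enough;
    # otherwise it is dropped and only the completed words are searched.
    if len(last) >= min_prefix:
        complete.append(last + "*")
    return " ".join(complete) or None


print(to_prefix_query("sa"))         # sa*
print(to_prefix_query("s"))          # None
print(to_prefix_query("Salman kh"))  # salman kh*
```

When the helper returns None there is nothing to search for yet; otherwise the string can be passed to rs.search(Query(...)) as in the example above.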

Side note

If you want to restrict your search to a specific field, the syntax is:

Query("@field_name: word") # Query("@key: sa*")

where field_name is the name of the field in the schema (its as_name alias, if one is set). Otherwise, the search covers all TextField attributes.
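
To make the field syntax concrete, here is a tiny sketch that builds a field-scoped query string. The scope_to_field helper is hypothetical (it is not a redis-py function), and "key" is the as_name alias from the schema in the question.

```python
# Hypothetical helper: restrict a query expression to one indexed field.
def scope_to_field(field_name: str, expression: str) -> str:
    # Parentheses keep a multi-word expression inside the field scope.
    return f"@{field_name}:({expression})"


print(scope_to_field("key", "sa*"))         # @key:(sa*)
print(scope_to_field("key", "salman kh*"))  # @key:(salman kh*)
```

The resulting string is what you would hand to Query(...) before calling rs.search.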

Jennettejenni answered 31/12, 2023 at 13:59 Comment(3)
Thanks. That makes sense. However, in our use case, we really need to start showing results after just one character has been typed. Is there a way around that restriction? - Massif
@Massif I kept your question in mind, but I don't see a neat solution. One option would be tags corresponding to every first letter, but that would probably be over-engineering and would be problematic when updating the field. - Jennettejenni
Hi, please describe the solution. If needed, we will over-engineer, but we need the first letter to work. :) - Massif
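
The first-letter tag idea from these comments could look roughly like the sketch below. It is a sketch under assumptions, not a tested recipe: the initials attribute and the first_letters, with_initials and one_char_query helpers are made up for illustration, and it assumes the index gets an extra TAG field over that attribute (something like TagField("$.initials[*]", as_name="initials")). As mentioned above, the extra attribute has to be recomputed whenever key changes.

```python
# Hypothetical helpers for a one-character lookup through a TAG field.
def first_letters(name: str) -> list:
    # One lowercase character per word, de-duplicated, in order of appearance.
    seen = []
    for word in name.lower().split():
        if word[0].isalnum() and word[0] not in seen:
            seen.append(word[0])
    return seen


def with_initials(doc: dict) -> dict:
    # Return a copy of the document with the derived "initials" attribute.
    return {**doc, "initials": first_letters(doc["key"])}


def one_char_query(ch: str) -> str:
    # TAG fields are queried with the @field:{value} syntax.
    return "@initials:{" + ch.lower() + "}"


doc = with_initials({"key": "salman khan", "org": "1", "p": 100})
print(doc["initials"])      # ['s', 'k']
print(one_char_query("S"))  # @initials:{s}
```

Documents would then be stored with with_initials(d1) instead of d1, a single typed character would go through one_char_query, and from two characters on the normal prefix query can take over.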

You can try redefining the documents d1, d2, and d3: there are syntax errors in the JSON.

In the following code, I corrected the syntax errors in the JSON strings of the "pl" field and fixed the typo in the query string.

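import redis
from redis.commands.json.path import Path
from redis.commands.search.field import TextField, NumericField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}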
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}


schema = ( 
    TextField("$.key", as_name="key"),  
    NumericField("$.p", as_name="p"),  
) 

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au") 
rs.create_index(     
    schema,     
    definition=IndexDefinition(     
        prefix=["au:"], index_type=IndexType.JSON   
    )    
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)  
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)  
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)  
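
A quick way to catch this kind of typo is to parse the string before storing it, or to build it with json.dumps instead of writing it by hand. Here is a small sketch using only the standard library; the names broken, fixed and is_valid_json are just illustrative.

```python
import json

# The "pl" string from the question, with the missing quote after "t".
broken = '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t: "act", "tme": "1965-"}'


def is_valid_json(text: str) -> bool:
    # json.loads raises JSONDecodeError (a ValueError subclass) on malformed input.
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Building the string from a dict avoids hand-written quoting mistakes.
fixed = json.dumps({"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"})

print(is_valid_json(broken))  # False
print(is_valid_json(fixed))   # True
```

Keep in mind that "pl" is stored as a plain string inside the document, so the index on key and p never looks inside it.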
Almaraz answered 30/12, 2023 at 23:22 Comment(1)
Thanks. That typo was really silly on my part. :) But even with it fixed, the query does not work because, as another answer points out, the prefix must be at least two characters long, and here "s" is just one character. - Massif

The issue might be related to the way you are constructing the Query object: you are searching for documents where the value of the "key" field matches the string "s". However, a plain term on a TextField is not treated as a partial match. It is matched against whole words, so "s" only matches documents that contain the word "s" itself.

So, if you want partial matching on the "key" field, you should use the TextField's wildcard (prefix) search capabilities:

import redis
from redis.commands.json.path import Path
from redis.commands.search.field import TextField, NumericField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}

schema = (
    TextField("$.key", as_name="key"),
    NumericField("$.p", as_name="p"),
)

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au")
rs.create_index(
    schema,
    definition=IndexDefinition(
        prefix=["au:"], index_type=IndexType.JSON
    )
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)

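# Use a prefix wildcard on the "key" field for partial matching.
# Note: with the default two-character minimum prefix length, a one-character
# prefix such as "s*" is not expected to match; "sa*" would.
rs.search(Query("@key:s*"))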
Almaraz answered 31/12, 2023 at 20:47 Comment(0)
