Kibana Regular expression search

3

25

I am new to ELK. I want to search for documents based on the order in which words occur in a field. For example:

In doc1, my_field: "MY FOO WORD BAR EXAMPLE"
In doc2, my_field: "MY BAR WORD FOO EXAMPLE"

I would like to query Kibana for documents where "FOO" is followed by "BAR", and not the other way around. So in this case doc1 should be returned and doc2 should not. I tried the query below in the Kibana search bar, but it does not work: it returns no results at all.

my_field.raw:/.*FOO.*BAR.*/

I also tried the analyzed field (just my_field), although I have since learned that this should not work. As expected, that did not produce any results either.

Please help me with this regex search. Why am I not getting any matching results for that query?

Sunfast answered 13/11, 2016 at 0:12 Comment(0)
3

I'm not sure offhand why that regex query isn't working, but I believe Kibana uses Elasticsearch's query string query (documented here). So, for instance, you could do a phrase query (documented in the link) by putting your search in double quotes, and it would look for the word "foo" directly followed by "bar". This would also perform better, since you would run it against your analyzed field (my_field), where each word has been tokenized for fast lookups. So your search in Kibana would be:

my_field: "FOO BAR"

Update:

Looks like this is an annoying quirk of Kibana (probably for backwards compatibility reasons). Your query isn't matching because you're searching against a non-analyzed field and Kibana apparently lowercases the search by default, so it won't match the non-analyzed uppercase "FOO". You can configure this in the Kibana advanced settings mentioned here, specifically by setting the configuration option "lowercase_expanded_terms" to false.
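The mismatch can be reproduced outside Elasticsearch. This is only a rough sketch: it assumes that a Lucene-style regexp has to match the whole indexed term (emulated here with Python's `re.fullmatch`) and that the pattern is lowercased before matching, as described above. The helper `matches_term` is a made-up name for illustration, not part of any Elasticsearch or Kibana API.

```python
import re


def matches_term(pattern: str, term: str, lowercase_pattern: bool) -> bool:
    """Emulate an anchored regexp match against one indexed term.

    lowercase_pattern mimics the query being lowercased before it runs.
    """
    if lowercase_pattern:
        pattern = pattern.lower()
    return re.fullmatch(pattern, term) is not None


# A non-analyzed field keeps the whole value as a single, case-preserved term.
raw_value = "MY FOO WORD BAR EXAMPLE"
pattern = ".*FOO.*BAR.*"

lowercased_result = matches_term(pattern, raw_value, lowercase_pattern=True)
as_typed_result = matches_term(pattern, raw_value, lowercase_pattern=False)

print(lowercased_result)  # False: ".*foo.*bar.*" cannot match uppercase text
print(as_typed_result)    # True: the pattern as typed matches doc1
```

With lowercasing turned off, the same pattern still rejects doc2 ("MY BAR WORD FOO EXAMPLE"), which is the behavior the question asks for.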

Nonoccurrence answered 13/11, 2016 at 3:8 Comment(4)
Thanks for the reply, but that is not all I need. I need all the docs where "FOO" comes before "BAR", even if they are separated by other words. Example: match doc1, my_field: "MY FOO WORD BAR EXAMPLE"; do not match doc2, my_field: "MY BAR WORD FOO EXAMPLE". - Sunfast
So I will need regex, not phrase matching. - Sunfast
Okay, I figured out why this was happening for you (weird quirk of Kibana), updated the answer. - Nonoccurrence
Also, from a performance standpoint, a span near query (which phrase matching uses) with a high slop value and in_order = true would achieve what your regex does, and you could run it against the analyzed field, which I think should perform better (each token has its position, so in theory it looks for both tokens and then makes sure that indexOf(bar) > indexOf(foo)). Similar answer here: https://mcmap.net/q/539918/-preserving-order-of-terms-in-elasticsearch-query - Nonoccurrence
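For reference, the span near idea from the last comment could look roughly like the sketch below. It builds the request body as a Python dict and prints it as JSON, so it can be pasted into a _search request. Several things here are assumptions: the terms are lowercased on the guess that my_field uses the standard analyzer (which lowercases tokens), the slop of 100 is an arbitrary "large enough" value, and `ordered_terms_query` is a hypothetical helper name.

```python
import json


def ordered_terms_query(field: str, first: str, second: str, slop: int = 100) -> dict:
    """Build a span_near body that requires `first` to occur before `second`."""
    return {
        "query": {
            "span_near": {
                "clauses": [
                    {"span_term": {field: first}},
                    {"span_term": {field: second}},
                ],
                # Maximum number of intervening positions allowed between the terms.
                "slop": slop,
                # Clauses must match in the order listed above.
                "in_order": True,
            }
        }
    }


# Lowercase terms: an assumption about how the analyzed field was indexed.
body = ordered_terms_query("my_field", "foo", "bar")
print(json.dumps(body, indent=2))
```

Because the clause order matters when in_order is true, swapping "foo" and "bar" would select doc2 instead of doc1.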
4

Kibana’s standard query language is based on Lucene query syntax.

The default analyzer tokenizes the text into separate words: [MY, FOO, WORD, BAR, EXAMPLE]

Instead of using regex match, you can try the following search string in Kibana:

my_field: FOO AND my_field: BAR

And if your "my_field" data looks like "MYFOOWORDBAREXAMPLE", which cannot be tokenized, you should use the query string:

my_field: *FOO*BAR*
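One thing worth checking before relying on the AND form: it only requires both tokens to be present, so it does not by itself enforce that FOO comes before BAR. A small sketch of the difference, using plain Python to stand in for the two query styles. Case handling is ignored for simplicity, `and_query_matches` is a hypothetical helper, and `fnmatchcase` is only an approximation of wildcard matching against a single untokenized value.

```python
from fnmatch import fnmatchcase


def and_query_matches(text: str, terms: list[str]) -> bool:
    """Emulate `field: A AND field: B` on a whitespace-tokenized field."""
    tokens = set(text.split())
    return all(term in tokens for term in terms)


doc1 = "MY FOO WORD BAR EXAMPLE"
doc2 = "MY BAR WORD FOO EXAMPLE"

# Both documents contain both tokens, so the AND form cannot tell them apart.
and_results = [and_query_matches(doc, ["FOO", "BAR"]) for doc in (doc1, doc2)]

# The wildcard runs against the whole value, so the order is preserved.
wildcard_results = [
    fnmatchcase(value, "*FOO*BAR*")
    for value in ("MYFOOWORDBAREXAMPLE", "MYBARWORDFOOEXAMPLE")
]

print(and_results)       # [True, True]
print(wildcard_results)  # [True, False]
```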
Emilieemiline answered 6/11, 2018 at 10:36 Comment(0)
4
GET /_search
{
    "query": {
        "regexp": {
            "user": {
                "value": "k.*y",
                "flags" : "ALL",
                "max_determinized_states": 10000,
                "rewrite": "constant_score"
            }
        }
    }
}

More details here

Eadith answered 9/7, 2020 at 10:36 Comment(3)
And how do you do that in the GUI web console? - Writ
@RodneyS.Foley At the top left you should see "Add a filter +". In the popup, click "Edit Query DSL". - Involuted
This query matches records whose user field contains a word like "key" or "kay". But how do I query with more than one word? Say there is a record like "oh kay good better best": the query "k.*y" matches it, but the query "k.*y good" or "k.*y good.*" does not match anything. What's wrong? - Salgado
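On the last comment: a regexp query is matched against individual indexed terms, not against the original string. If the user field is analyzed, each word is its own term, so a pattern containing a space has nothing to match. A rough sketch of that behavior, assuming a simple lowercase-and-split-on-whitespace analyzer; `regexp_query_matches` is a hypothetical stand-in, not an Elasticsearch call.

```python
import re


def regexp_query_matches(pattern: str, text: str) -> bool:
    """Emulate a regexp query on an analyzed field: test each token separately."""
    tokens = text.lower().split()
    return any(re.fullmatch(pattern, token) for token in tokens)


text = "oh kay good better best"

single_word = regexp_query_matches("k.*y", text)
multi_word = regexp_query_matches("k.*y good", text)

print(single_word)  # True: the token "kay" matches
print(multi_word)   # False: no single token contains a space
```

If the field were indexed as one untokenized value instead, the pattern would have to cover the whole value (for example `.*k.*y good.*`), which is the same situation as the my_field.raw query in the question.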

© 2022 - 2024 — McMap. All rights reserved.