Changing alias in Elasticsearch returns 200 and acknowledged but does not change alias

Using Elasticsearch 8.4.3 with Java 17 and a cluster of 3 nodes, all 3 of which are master-eligible, we start with the following situation:

  • index products-2023-01-12-0900 which has an alias current-products

We then start a job that creates a new index products-2023-01-12-1520. At the end, using elastic-rest-client on the client side and the aliases API, we make this call:

At 2023-01-12 16:27:26,893:

POST /_aliases
{
  "actions": [
    { "remove": { "alias": "current-products", "index": "products-*" } },
    { "add": { "alias": "current-products", "index": "products-2023-01-12-1520" } }
  ]
}

And we get the following response 26 ms later, with HTTP response code 200:

{"acknowledged":true}

But when we look at the result, the old index still has the current-products alias.

I don't understand why this happens, and it does not happen every time (it happened 2 times out of around 10 indexing runs). Is it a known bug, or expected behaviour?

Edit for @warkolm:

GET /_cat/aliases?v before indexation as of now:

alias               index                       filter routing.index routing.search is_write_index
current-products    products-2023-01-13-1510    -      -             -              -
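
To check the alias state programmatically rather than by eye, the tabular output above can be parsed into rows. This is only a minimal sketch: `parse_cat_aliases` is a hypothetical helper name, and it assumes the whitespace-separated column layout shown in the `_cat/aliases?v` output above, with no spaces inside any value.

```python
def parse_cat_aliases(text):
    """Parse `_cat/aliases?v` text output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Each data row is whitespace-separated, in the same order as the header.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Sample copied from the output shown in the question.
sample = """\
alias               index                       filter routing.index routing.search is_write_index
current-products    products-2023-01-13-1510    -      -             -              -
"""

rows = parse_cat_aliases(sample)
holders = [row["index"] for row in rows if row["alias"] == "current-products"]
print(holders)
```

Comparing `holders` before and after the swap gives a concrete record of whether the alias actually moved.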
Curule answered 14/1, 2023 at 18:29 Comment(9)
Can you share the output of _cat/aliases?v showing this? - Servile
Why aren't you using a rollover_alias to automate moving the alias on rollover? elastic.co/guide/en/elasticsearch/reference/current/… You could even use data stream indices for your case, which would be better. elastic.co/guide/en/elasticsearch/reference/current/… - Snubnosed
@musabDogan, thanks for your time. Data stream indices do not look well suited to my use case, as it is not a time series index. Regarding rollover_alias, why not, but I'd first like to understand what exactly is happening in my case. I see nothing wrong in the solution we built. - Curule
@LocAnn I tested your scenario and it works perfectly. Given "it happens not 100% of the time", the only possible reason I can see is that the POST _aliases API does not return {"acknowledged":true} every time you run it. Do you have a check-and-retry mechanism that runs until you get {"acknowledged":true}? Are you sure the API call succeeds every time? - Snubnosed
@MusabDogan, we log the response and I can guarantee that we get {"acknowledged":true} and a 200 while no switch is done. Unfortunately it does not always happen: in 3 attempts to reproduce it since yesterday, it never happened. And to answer your question, yes, we introduced a retry mechanism that checks the aliases to see whether the switch really happened, as we cannot rely on {"acknowledged":true}. That looks to me like a bug, or something I don't understand in ES internals related to synchronization, knowing that we have a cluster of 3 nodes, all master-eligible. - Curule
Maybe you can check the Elasticsearch logs for clues. - Snubnosed
I tested your scenario in our local cluster and it works perfectly. I was a bit afraid that the actions might not always be processed in the given order, but this was not the case. I tested it about 20 times and it always executed as expected. Maybe this is some rare case, but definitely not 2 out of 10. - Prithee
Thanks @Prithee for your time. Do you have any idea what would explain this occurrence? Is it worth opening a bug with the ticket information? - Curule
You need proof to open a bug or solve the case. All you have is experience :). Please check the Elasticsearch logs, find a log entry about it, and share it with me. If you can share the logs I can review them for you. - Snubnosed
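
The check-and-retry workaround described in the comments above could look roughly like the following. It is a sketch under stated assumptions, not the asker's actual implementation: `swap_and_verify`, `swap` and `get_alias_holders` are hypothetical names, and the two callables are injected so that any client (for example the REST client) can be plugged in. The fake callables simulate the reported symptom, where the first call is acknowledged but changes nothing.

```python
import time


def swap_and_verify(swap, get_alias_holders, new_index, attempts=3, delay_seconds=1.0):
    """Call swap(), then confirm the alias points only at new_index; retry otherwise.

    Returns the number of attempts used, or raises RuntimeError if the alias
    never ends up on new_index alone.
    """
    holders = set()
    for attempt in range(1, attempts + 1):
        response = swap()
        holders = set(get_alias_holders())
        # Do not trust "acknowledged" alone: also check where the alias points.
        if response.get("acknowledged") and holders == {new_index}:
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(
        f"alias points at {sorted(holders)} after {attempts} attempts"
    )


# In-memory stand-ins for the cluster: the first swap is acknowledged but has no effect.
state = {"holders": ["products-2023-01-12-0900"], "calls": 0}


def fake_swap():
    state["calls"] += 1
    if state["calls"] >= 2:
        state["holders"] = ["products-2023-01-12-1520"]
    return {"acknowledged": True}


def fake_holders():
    return state["holders"]


attempts_used = swap_and_verify(
    fake_swap, fake_holders, "products-2023-01-12-1520", delay_seconds=0
)
print(attempts_used)  # 2
```

Logging the holders seen on each failed attempt would also give the evidence requested above for a bug report.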

It appears that there might be an issue with the way you are updating the alias. When you perform a POST request to the _aliases endpoint with the "remove" and "add" actions, Elasticsearch will update the alias based on the current state of the indices at the time the request is executed.

However, it is possible that there are other processes or actions that are also modifying the indices or aliases at the same time, and this can cause conflicts or inconsistencies. Additionally, when you use the wildcard character (*) in the "index" field of the "remove" action, it will remove the alias from all indices that match the pattern, which may not be the intended behavior.

To avoid this issue, keep using the _aliases endpoint (it is the Indices Aliases API), but name the indices explicitly. Actions sent in a single _aliases request are applied atomically: either all of them take effect or none do. Instead of using the wildcard character, explicitly specify the index that you want to remove the alias from.

Here is an example of how you could use the Indices Aliases API to update the alias:

POST /_aliases
{
    "actions": [
        { "remove": { "index": "products-2023-01-12-0900", "alias": "current-products" } },
        { "add": { "index": "products-2023-01-12-1520", "alias": "current-products" } }
    ]
}

This way, the alias will only be removed from the specific index "products-2023-01-12-0900" and added to the specific index "products-2023-01-12-1520". This can help avoid any conflicts or inconsistencies that may be caused by other processes or actions that are modifying the indices or aliases at the same time.
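
As a rough illustration of building that request with explicit index names, here is a minimal sketch. `build_swap_actions` and `post_aliases` are hypothetical helper names, the base URL passed to `post_aliases` is an assumption about your setup, and the list of current alias holders is assumed to have been looked up beforehand.

```python
import json
import urllib.request


def build_swap_actions(alias, current_holders, new_index):
    """Build an _aliases request body that names every index explicitly."""
    actions = [
        {"remove": {"index": index, "alias": alias}}
        for index in sorted(current_holders)
        if index != new_index  # never remove the alias from the target index
    ]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


def post_aliases(base_url, body):
    """POST the body to <base_url>/_aliases and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/_aliases",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_swap_actions(
    "current-products",
    ["products-2023-01-12-0900"],
    "products-2023-01-12-1520",
)
print(json.dumps(body, indent=2))
# To send it (assumed local, unsecured cluster address):
# post_aliases("http://localhost:9200", body)
```

Building the body from the indices that actually hold the alias keeps the remove actions explicit without hard-coding the old index name in the job.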

Additionally, since you are already on Elasticsearch 8.4.3, it may be worth upgrading to a later release, as newer versions may include bug fixes related to the issue you are facing.

In conclusion, the issue you are encountering may not be a known bug; it can be regular behavior when multiple processes modify the indices or aliases at the same time. Specifying the exact index in each remove and add action of the _aliases request can help avoid it.

Sourwood answered 23/1, 2023 at 18:41 Comment(1)
Thanks for your time. Note that I am already using this _aliases API, but thanks for the * note and the version. - Curule
