All shards failed
I was working with Elasticsearch and it was working perfectly. Today I restarted my remote server (Ubuntu), and now every search against my indexes returns this error:

{"error":"SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]","status":503}

I also checked the cluster health; the status is red. Can anyone tell me what the issue is?

Procambium answered 16/1, 2014 at 9:7 Comment(0)

It is possible that on your restart some shards were not recovered, causing the cluster to stay red.
If you hit http://<yourhost>:9200/_cluster/health?level=shards you can look for the red shards.

I have had issues on restart where shards end up in a non-recoverable state. My solution was to simply delete that index completely, which is not an ideal solution for everyone.
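As a sketch, both steps can be done with curl against a local node; `my_broken_index` is a placeholder for whichever index the health check shows as red:

```shell
# Per-shard health, as mentioned above; red entries are the broken shards
curl -s 'http://localhost:9200/_cluster/health?level=shards&pretty'

# Delete the affected index -- destructive and irreversible.
# "my_broken_index" is a placeholder for the index you identified above.
curl -XDELETE 'http://localhost:9200/my_broken_index'
```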

It is also nice to visualize issues like this with a plugin like:
Elasticsearch Head

Nubia answered 16/1, 2014 at 12:7 Comment(4)
Hi @mconlin, how do you figure out which index to delete in this case? – Flapjack
Using Head you will see greyed-out unrecovered shards on the last row. – Nubia
If you are on Docker, try to force-recreate Elasticsearch and Kibana. – Goalie
Note that a large number of shards will take a long time to initialize; try reducing the number, see: elastic.co/blog/… – Comestible

If you're running a single-node cluster for some reason, you might simply need to avoid replicas, like this:

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_settings' -d '
{
    "index" : {
        "number_of_replicas" : 0
    }
}'

Doing this forces Elasticsearch to run without replicas.
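To confirm the change took effect (a sketch, assuming a local single-node cluster on port 9200), you can read the settings back and re-check the health:

```shell
# number_of_replicas should now read "0" for each index
curl -s 'http://localhost:9200/_settings?pretty'

# With no replicas left to allocate, a single-node cluster should go green
curl -s 'http://localhost:9200/_cluster/health?pretty'
```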

Cushiony answered 11/3, 2019 at 22:5 Comment(3)
What will this do? @paulo, please explain. – Bunn
Tells ES that all your indices are on a single machine: no replicas. – Eichmann
I had one machine with a single Elasticsearch node and no replicas, and this command worked for me. – Etymon

First things first: the all shards failed exception is not as dramatic as it sounds. It means shards failed while serving a request (query or index), and there can be multiple reasons for it:

  1. The shards are actually in a non-recoverable state; if your cluster and index status are yellow or red, this is one of the reasons.
  2. Shards didn't respond because some shard recovery was happening in the background.
  3. The syntax of your query is bad, so ES responds with all shards failed.

In order to fix the issue, you need to determine which of the above categories it falls into and apply the appropriate fix.

The one mentioned in the question clearly falls into the first bucket, as the cluster health is RED, meaning one or more primary shards are missing; my SO answer on fixing a RED cluster will help you resolve it, which will fix the all shards exception in this case.
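To figure out which bucket you are in, a quick check (a sketch, assuming a local node on port 9200) is to look at the overall status and list any shards that are not started:

```shell
# Overall cluster status: green, yellow, or red
curl -s 'http://localhost:9200/_cluster/health?pretty'

# List only shards that are not STARTED (e.g. UNASSIGNED or INITIALIZING)
curl -s 'http://localhost:9200/_cat/shards' | grep -v STARTED
```

If the cluster is green and shards are all started, you are likely in the third bucket, and the failure reason will be in the response body of the failing query itself.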

Pressmark answered 12/2, 2021 at 5:36 Comment(0)

For Elasticsearch > 5.0 it's possible to get some more information from this endpoint:

http://localhost:9200/_cluster/allocation/explain?pretty

I just ran into a case where I hit the virtual disk limit configured in Docker Desktop and adding an additional, unrelated container caused ES to fail.
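The explain endpoint can also be scoped to a single shard with a request body; the index name and shard number below are placeholders:

```shell
# Ask why shard 0 (primary) of a specific index is unassigned.
# "my_index" is a placeholder -- use an index from your red cluster.
curl -s -XGET -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/allocation/explain?pretty' -d '
{
  "index": "my_index",
  "shard": 0,
  "primary": true
}'
```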

Kharkov answered 29/4, 2023 at 20:24 Comment(1)
Thank you for posting this comment. It allowed me to debug my issue. – Crusade

If you encounter this apparent index corruption in a running system, you can work around it by deleting all files called segments.gen. The file is advisory only, and Lucene can recover correctly without it.
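As a sketch, assuming the default Debian/Ubuntu data path (adjust to your installation), the deletion could look like the following; stopping the node first is a cautious extra step, not part of the original advice:

```shell
# Stop the node so Lucene is not writing while files are removed
sudo systemctl stop elasticsearch

# /var/lib/elasticsearch is the default path.data on Debian/Ubuntu packages;
# adjust if your installation uses a different data directory
sudo find /var/lib/elasticsearch -type f -name 'segments.gen' -delete

sudo systemctl start elasticsearch
```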

From ElasticSearch Blog

Knesset answered 23/7, 2014 at 11:13 Comment(1)
The current link is redirecting to the main elastic.co page. It no longer shows the blog entry. Edit submitted. – Leyte

If you are upgrading Elasticsearch and have nodes running multiple versions, you can face this issue. Continue until you have upgraded ALL nodes, then run the daemon reload:

sudo systemctl daemon-reload
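After the reload, restart the service on each node and confirm every node reports the same version (a sketch, assuming the standard systemd unit name):

```shell
sudo systemctl restart elasticsearch

# Verify all nodes now run the same version
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,version'
```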

Intention answered 3/7, 2023 at 23:39 Comment(0)

Depending on how you build the request body, you may include parameters that are not required, or that are filled in with defaults, which can cause this.

Using the Swagger UI to run and test my endpoints caused this problem. I had yellow shards, but that wasn't the fault; it was the pre-populated body.

This caused the problem:

{
  "foo": "string",
  "bar": "string",
  "page": 0,
  "pageSize": 0
}

This solved the problem:

{
  "foo": "",
  "bar": "",
  "page": 1,
  "pageSize": 10
}

The pagination numbers are arbitrary.

Mezzorilievo answered 20/9, 2023 at 10:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.