"LOST" node in EMR Cluster

How do I troubleshoot and recover a LOST node in my long-running EMR cluster?

The node stopped reporting a few days ago. The host seems to be fine, and so does HDFS. I only noticed the issue in the Hadoop Applications UI.

Mathew answered 3/9, 2015 at 20:57 Comment(0)

EMR nodes are ephemeral, and you cannot recover one once it has been marked as LOST. You can guard against this in the first place by enabling the 'Termination Protection' feature when you launch the cluster.

To find the reason for the LOST node, check the YARN ResourceManager logs and/or the instance controller logs of your cluster; they should tell you more about the root cause.
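
Before digging through the logs, it can help to confirm which node YARN considers LOST and when it last reported. Here is a minimal sketch that asks the ResourceManager REST API (the Cluster Nodes endpoint, `/ws/v1/cluster/nodes`) for nodes in the LOST state. It assumes you run it on the master node and that the ResourceManager listens on the default port 8088; `RM_URL`, `fetch_nodes` and `lost_nodes` are names I made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: run on the master node, ResourceManager on the default port 8088.
RM_URL = "http://localhost:8088"


def fetch_nodes(rm_url=RM_URL):
    """Fetch nodes in the LOST state from the ResourceManager REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes?states=LOST", timeout=10) as resp:
        return json.load(resp)


def lost_nodes(payload):
    """Return (node id, host name, last health update in ms) for each LOST node."""
    # The "nodes" value can be null when nothing matches, so guard each level.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n.get("id"), n.get("nodeHostName"), n.get("lastHealthUpdate"))
        for n in nodes
        if n.get("state") == "LOST"
    ]
```

Calling `lost_nodes(fetch_nodes())` should give you the host name and the last health update timestamp, which narrows down the time window to look at in the ResourceManager and instance controller logs.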

Shauna answered 9/6, 2018 at 7:18 Comment(2)
So when a node is lost, will the HDFS data on that node also be lost? - Cargile
Yes, if your HDFS replication factor is just 1. No, if it's greater than 1 and yours is a multi-node cluster. - Shauna
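
To make the replication point concrete, here is a toy model (not real HDFS code): a block is lost only if every one of its replicas lived on the lost node. The block names, node names and the `blocks_lost` helper are hypothetical and only there for illustration.

```python
def blocks_lost(block_replicas, lost_node):
    """Return the blocks whose every replica lived on the lost node."""
    return sorted(
        block
        for block, nodes in block_replicas.items()
        if set(nodes) <= {lost_node}
    )


# Hypothetical placement with replication factor 1: one replica per block.
rf1 = {"blk_1": ["node-a"], "blk_2": ["node-b"]}
# Hypothetical placement with replication factor 2 on a multi-node cluster.
rf2 = {"blk_1": ["node-a", "node-b"], "blk_2": ["node-b", "node-c"]}

print(blocks_lost(rf1, "node-b"))  # ['blk_2']
print(blocks_lost(rf2, "node-b"))  # []
```

With a replication factor of 1, losing node-b takes blk_2 with it. With a factor of 2, every block still has a surviving replica on another node.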
