Container killed by the ApplicationMaster: Exit code is 143

I've been getting the following error in several cases:

2017-03-23 11:55:10,794 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1490079327128_0048_r_000003_0: Container killed by the ApplicationMaster.

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

I noticed it happens on large sorts, but changing the "Sort Allocation Memory" does not help.

I tried changing other memory properties, but the solution still eludes me. Is there a good explanation of how MapReduce works and how the different components interact? What should I change? Where do I locate the Java error leading to this?

Illuviation answered 23/3, 2017 at 10:7 Comment(8)
Hive jobs are throwing this error, right?Golden
Exactly, and I can't find the Java exception.Illuviation
What I am asking is: are these Hive jobs or your own MapReduce jobs?Golden
They're Hive jobs.Illuviation
Is the table partitioned?Golden
I can't use partitioning or bucketing in this case.Illuviation
Could you share your schema and what you are trying to do?Golden
Hello @Illuviation, did you solve the problem? If so, do you remember the solution?Fane

Exit code 143 is related to memory/GC issues. Your default mapper/reducer memory settings may not be sufficient to run the large data set, so try setting higher ApplicationMaster, map, and reduce memory when a large YARN job is invoked.
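
As a rough illustration (the property names are standard MapReduce/YARN settings, but the sizes below are placeholder values you would tune to your own cluster and data), the memory can be raised per Hive session like this:

-- container sizes for map and reduce tasks (MB)
SET mapreduce.map.memory.mb=4096;
SET mapreduce.reduce.memory.mb=8192;
-- JVM heap kept at roughly 80% of the container size so the process fits inside it
SET mapreduce.map.java.opts=-Xmx3276m;
SET mapreduce.reduce.java.opts=-Xmx6553m;
-- ApplicationMaster container size and heap
SET yarn.app.mapreduce.am.resource.mb=4096;
SET yarn.app.mapreduce.am.command-opts=-Xmx3276m;

If a container still exceeds its limit, YARN kills it and the attempt is reported just like the log above, with exit code 143.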

Please check this link out: https://community.hortonworks.com/questions/96183/help-troubleshoot-container-killed-by-the-applicat.html

Please also look into https://www.slideshare.net/SparkSummit/top-5-mistakes-when-writing-spark-applications-63071421, an excellent resource for optimizing your code.

Quitrent answered 19/9, 2018 at 10:2 Comment(0)

I found out that I had mixed up two separate things. The 143 exit code comes from the metrics collector, which is down. The jobs are killed, as far as I understand, because they run out of memory. The problem is with large window functions that cannot reduce the data until the last one, which ends up holding all of it.
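
For instance (a hypothetical table and columns, purely to illustrate the pattern), a window function with no PARTITION BY clause sends every row through a single reducer, while a partitioned window lets the rows be spread across reducers:

-- unpartitioned window: one reducer has to hold and sort the entire data set
SELECT id, ROW_NUMBER() OVER (ORDER BY event_time) AS rn
FROM events;

-- partitioned window: each reducer handles only the rows of one user_id
SELECT id, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
FROM events;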

The place in the logs where the reason for killing the job is stated still eludes me, though.

Illuviation answered 29/3, 2017 at 15:57 Comment(0)

This issue is related to memory. You can resolve it by increasing the driver memory, executor memory, and executor cores. If you are working in the Spark shell, you can pass these configs when opening it to increase the memory according to your data size.
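
For example (the sizes here are placeholders; pick values that fit your data volume and cluster), the Spark shell can be started with larger resources like this:

# request more memory and cores for the driver and executors
spark-shell \
  --driver-memory 4g \
  --executor-memory 8g \
  --executor-cores 4 \
  --num-executors 10

The same flags work with spark-submit for a packaged job.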

Bik answered 18/8, 2023 at 6:53 Comment(0)
