I am trying to run a simple workflow that executes a Hive script. The script just performs a join on very large tables. Once the Hive script finished, I expected the workflow status to change from RUNNING to SUCCEEDED, but this does not happen.
This is the content of the workflow log:
2016-05-31 15:52:34,590 WARN org.apache.oozie.action.hadoop.HiveActionExecutor: SERVER[hadoop02] USER[scapp] GROUP[-] TOKEN[] APP[wf-sqoop-hive-agreement] JOB[0000001-160531143657136-oozie-oozi-W] ACTION[0000001-160531143657136-oozie-oozi-W@hive-query-agreement] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain], exception invoking main(), Output data exceeds its limit [2048]
2016-05-31 15:52:34,591 WARN org.apache.oozie.action.hadoop.HiveActionExecutor: SERVER[hadoop02] USER[scapp] GROUP[-] TOKEN[] APP[wf-sqoop-hive-agreement] JOB[0000001-160531143657136-oozie-oozi-W] ACTION[0000001-160531143657136-oozie-oozi-W@hive-query-agreement] Launcher exception: Output data exceeds its limit [2048]
org.apache.oozie.action.hadoop.LauncherException: Output data exceeds its limit [2048]
at org.apache.oozie.action.hadoop.LauncherMapper.getLocalFileContentStr(LauncherMapper.java:415)
at org.apache.oozie.action.hadoop.LauncherMapper.handleActionData(LauncherMapper.java:391)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:275)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
That error normally means the action sets the <capture-output/> flag but writes too much key/value data to its output. But this is a Hive action, so there should be no output for Oozie to capture and process. Unless you run a plain SELECT that dumps its results to stdout, which would be pointless for a batch job scheduled by Oozie (why flood the YARN logs with SELECT results that nobody will be able to access?) – Mckelvey
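For what it's worth, the [2048] in the message matches the default of the Oozie server property oozie.action.max.output.data, which caps the size (in characters) of the data an action may pass back to Oozie. A commonly suggested workaround is to raise it in oozie-site.xml and restart the Oozie server. A minimal sketch, assuming you can edit the server configuration; the value 8192 is an arbitrary illustration, not a recommendation:

```xml
<!-- oozie-site.xml on the Oozie server.
     8192 is an assumed example value; pick one that fits your actions. -->
<property>
  <name>oozie.action.max.output.data</name>
  <value>8192</value>
</property>
```

This only raises the ceiling. It does not explain why the Hive action produces that much output data in the first place, so that is still worth checking.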