AWS Elastic MapReduce doesn't seem to be correctly converting the streaming to jar

I have a mapper and reducer that work fine when I run them in the piped version:

cat data.csv | ./mapper.py | sort -k1,1 | ./reducer.py

I used the Elastic MapReduce wizard and loaded the inputs, outputs, bootstrap action, etc. The bootstrap succeeds, but I still get an error during execution.

This is the error I'm getting in my stderr for step 1...

+ /etc/init.d/hadoop-state-pusher-control stop
+ PID_FILE=/mnt/var/run/hadoop-state-pusher/hadoop-state-pusher.pid
+ LOG_FILE=/mnt/var/log/hadoop-state-pusher/hadoop-state-pusher.out
+ SVC_FILE=/mnt/var/lib/hadoop-state-pusher/run-hadoop-state-pusher
+ case $1 in
+ stop
+ echo 0
/etc/init.d/hadoop-state-pusher-control: line 35: /mnt/var/lib/hadoop-state-pusher/run-hadoop-state-pusher: No such file or directory
+ /etc/init.d/hadoop-state-pusher-control start
+ PID_FILE=/mnt/var/run/hadoop-state-pusher/hadoop-state-pusher.pid
+ LOG_FILE=/mnt/var/log/hadoop-state-pusher/hadoop-state-pusher.out
+ SVC_FILE=/mnt/var/lib/hadoop-state-pusher/run-hadoop-state-pusher
+ case $1 in
+ start
++ dirname /mnt/var/lib/hadoop-state-pusher/run-hadoop-state-pusher
+ sudo -u hadoop mkdir -p /mnt/var/lib/hadoop-state-pusher
+ echo 1
++ dirname /mnt/var/run/hadoop-state-pusher/hadoop-state-pusher.pid
+ sudo -u hadoop mkdir -p /mnt/var/run/hadoop-state-pusher
++ dirname /mnt/var/log/hadoop-state-pusher/hadoop-state-pusher.out
+ sudo -u hadoop mkdir -p /mnt/var/log/hadoop-state-pusher
+ disown %1
+ sleep 5
+ sudo -u hadoop /usr/bin/hadoop-state-pusher -server --pidfile /mnt/var/run/hadoop-state-pusher/hadoop-state-pusher.pid
+ exit 0
Command exiting with ret '0'

This is cryptic. What on earth does this mean?

It seems to have a problem with mounting something? Which of the other log files might say something informative, and where should I be looking?

I tried a solution I found here, which was simply to make the instance bigger, but that did not work: same error message.

Stoffel answered 1/9, 2013 at 7:34

I was looking in the wrong log file. A different one (there were about six?) actually gave me some useful Python debugging information. It turned out I had used str.format() with auto-numbered fields ("of this kind {}", not the explicitly indexed kind with a digit, "{1}"), which is unsupported in Python < 2.7, and Python < 2.7 was what was installed by default on the EC2 image used in Elastic MapReduce.
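
For anyone hitting the same thing, here is a minimal sketch of the portable form. The helper name format_record and the tab-separated output are my own illustration, not the original mapper. On Python 2.6 an empty {} field fails with "ValueError: zero length field name in format", while explicitly indexed fields work on 2.6, 2.7 and 3.x alike.

```python
def format_record(key, count):
    # Hypothetical helper: builds one tab-separated key/value line,
    # the usual shape of a Hadoop streaming mapper's output.
    # "{0}" and "{1}" are explicit positional indices, so this also
    # runs on Python 2.6, where auto-numbered "{}" fields are rejected.
    return "{0}\t{1}".format(key, count)


def format_record_percent(key, count):
    # Alternative using %-formatting, which predates str.format
    # and behaves the same on old and new interpreters.
    return "%s\t%s" % (key, count)


if __name__ == "__main__":
    print(format_record("word", 3))
    print(format_record_percent("word", 3))
```

Both functions return the same string, so either can replace a "{}"-style call without changing what the reducer sees.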

Stoffel answered 30/9, 2013 at 14:58 Comment(1)
Not sure, it was 8 years ago. Stoffel
