I'm trying to use a cron job to call a healthcheck script I've written to check the status of a web app (api) I've written (a url call doesn't suffice to test full functionality, hence the custom healthcheck). The healthcheck app has several endpoints which are called from a shell script (see below), and this script restarts the bigger web app we are checking. Naturally, I'm having trouble.
How it works: 1) cron job runs every 60s 2) healthcheck script is run by cron job 3) healthcheck script checks url, if url returns non-200 response, it stops and start a service
What works: 1) I can run the script (healthcheck.sh) as the ec2-user 2) I can run the script as root 3) The cron job calls the script and it runs, but it doesn't stop/start the service (I can see this by watching /tmp/crontest.txt and ps aux).
It totally seems like a permissions issue or some very basic linux thing that I'm not aware of.
The log when I run as root or ec2-user (/tmp/crontest.txt):
Fri Nov 23 00:28:54 UTC 2012
healthcheck.sh: api not running, restarting service!
api start/running, process 1939 <--- it restarts the service properly!
The log when the cron job runs:
Fri Nov 23 00:27:01 UTC 2012
healthcheck.sh: api not running, restarting service! <--- no restart
Cron file (in /etc/cron.d):
# Call the healthcheck every 60s
* * * * * root /srv/checkout/healthcheck/healthcheck.sh >> /tmp/crontest.txt
Upstart script (/etc/init/healthcheck.conf)- this is for the healthcheck app, which provides endpoints which we call from the shell script healthcheck.sh:
#/etc/init/healthcheck.conf
description "healthcheck"
author "me"
env USER=ec2-user
start on started network
stop on stopping network
script
# We run our process as a non-root user
# Upstart user guide, 11.43.2 (http://upstart.ubuntu.com/cookbook/#run-a-job-as-a-different-user)
exec su -s /bin/sh -c "NODE_ENV=production /usr/local/bin/node /srv/checkout/healthcheck/app.js" $USER
end script
Shell script permissions:
-rwxr-xr-x 1 ec2-user ec2-user 529 Nov 23 00:16 /srv/checkout/healthcheck/healthcheck.sh
Shell script (healthcheck.sh):
#!/bin/bash
API_URL="http://localhost:4567/api"
echo `date`
status_code=`curl -s -o /dev/null -I -w "%{http_code}" $API_URL`
if [ 200 -ne $status_code ]; then
echo "healthcheck.sh: api not running, restarting service!"
stop api
start api
fi