We're running a Rails 5.1 app on ECS using a Docker image based on the official ruby:2.4.2 image.
On many deploys we receive this exception, which, as far as I understand, comes from the old process:
SignalException: SIGTERM - SignalException in at_exit
Backtrace:
[GEM_ROOT]/gems/puma-3.11.0/lib/puma/launcher.rb:397 :in `block in setup_signals`
[GEM_ROOT]/gems/puma-3.11.0/lib/puma/single.rb:106 :in `join`
[GEM_ROOT]/gems/puma-3.11.0/lib/puma/single.rb:106 :in `run`
[GEM_ROOT]/gems/puma-3.11.0/lib/puma/launcher.rb:183 :in `run`
[GEM_ROOT]/gems/puma-3.11.0/lib/puma/cli.rb:77 :in `run`
[GEM_ROOT]/gems/puma-3.11.0/bin/puma:10 :in `<top (required)>`
/usr/local[GEM_ROOT]/bin/puma:21 :in `load`
19 require "bundler/setup"
20
21 load Gem.bin_path("puma", "puma")
/usr/local[GEM_ROOT]/bin/puma:21 :in `<main>`
I found a suggestion that setting BUNDLE_DISABLE_EXEC_LOAD to true would resolve the issue, but it did not.
Also, note the "in at_exit" part. Is it possible that our shutdown takes so long that ECS sends another SIGTERM before the process has terminated properly?
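To make the "in at_exit" question concrete, here is a minimal sketch. The backtrace points into a block in Puma's setup_signals, which suggests the exception is raised from a signal trap handler. The sketch assumes (this is not confirmed by the trace) that "in at_exit" only means an at_exit hook, such as an error reporter's, saw the pending exception in `$!` while the process was exiting after a single SIGTERM. The child script and the helper name `at_exit_report_after_sigterm` are hypothetical and not part of Puma or our app.

```ruby
require "rbconfig"
require "open3"

# Hypothetical stand-in for the server process: a TERM trap that raises,
# plus an at_exit hook that reports whatever exception is ending the process.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  at_exit do
    # $! holds the exception that is terminating the process, if any
    puts "at_exit saw: #{$!.class}: #{$!.message}" if $!
  end
  Signal.trap("TERM") { raise SignalException, "SIGTERM" }
  puts "ready"
  sleep 10
RUBY

# Starts the child, sends it exactly one SIGTERM, and returns what its
# at_exit hook printed.
def at_exit_report_after_sigterm
  Open3.popen2(RbConfig.ruby, "-e", CHILD_SCRIPT) do |stdin, stdout, wait_thr|
    stdin.close
    stdout.gets                          # wait until the trap is installed
    Process.kill("TERM", wait_thr.pid)   # a single SIGTERM, like the first one from ECS
    stdout.read.strip
  end
end

puts at_exit_report_after_sigterm
```

If this reading is right, one SIGTERM is enough to produce a report mentioning both SignalException and at_exit, and a second signal would not be needed to explain it.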
The command is
"command": [
  "bin/ecs",
  "webserver"
]
and bin/ecs is
#!/usr/bin/env ruby

COMMANDS = {
  "webserver" => "puma -C config/puma.rb",
  "sidekiq" => "sidekiq -C config/sidekiq.yml"
}

system("bundle", "exec", "rake", "db:abort_if_pending_migrations")
exit $?.exitstatus unless $?.success?

command = COMMANDS[ARGV.first].split(" ")
exec(*command)
We do this to avoid having a shell anywhere in the process chain, because a shell swallows signals. We also set a high stop timeout to make sure long-running Sidekiq jobs don't get killed:
ECS_CONTAINER_STOP_TIMEOUT=1h
The container exits quite fast, so the timeout is not the problem. If the process were killed, it couldn't report the exception anyway, could it?
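On the point about avoiding a shell: as far as I know, Kernel#exec never goes through a shell when it is given the program and its arguments as separate strings, which is what the split in bin/ecs achieves. A slightly more defensive sketch of that lookup is below. The names `ECS_COMMANDS` and `argv_for` are hypothetical, and `Shellwords.split` is swapped in for the plain `split(" ")` so that quoted arguments would survive.

```ruby
require "shellwords"

# Hypothetical copy of the command table from bin/ecs.
ECS_COMMANDS = {
  "webserver" => "puma -C config/puma.rb",
  "sidekiq"   => "sidekiq -C config/sidekiq.yml"
}.freeze

# Returns the argv array for a named command, or raises a readable error
# instead of failing with NoMethodError on nil for an unknown name.
def argv_for(name)
  command = ECS_COMMANDS.fetch(name) do
    raise ArgumentError,
          "unknown command #{name.inspect}, expected one of: #{ECS_COMMANDS.keys.join(', ')}"
  end
  Shellwords.split(command)
end

p argv_for("webserver")
```

In bin/ecs the last line would then be `exec(*argv_for(ARGV.first))`, so the container's main process is still replaced directly by puma or sidekiq.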