- Is this behaviour configurable? For instance, I don't want Docker to restart my stack under any circumstances. If it is configurable, then how?
With a version 3 stack, the restart policy moved to the deploy section:
version: '3'
services:
  crash:
    image: busybox
    command: sleep 10
    deploy:
      restart_policy:
        condition: none
        # max_attempts: 2
Documentation on this is available at: https://docs.docker.com/compose/compose-file/#restart_policy
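For a service that is already running, the same policy can in principle be changed without editing the compose file, using the --restart-condition and --restart-max-attempts flags of docker service update. The sketch below only assembles that command line and does not run it. The helper name build_restart_update_cmd is made up for illustration, and restart_crash is the service name used in the examples in this answer. A later docker stack deploy would likely reset the policy to whatever the compose file says, so treat this as a temporary override.

```python
import shlex
from typing import List, Optional


def build_restart_update_cmd(
    service: str,
    condition: str = "none",
    max_attempts: Optional[int] = None,
) -> List[str]:
    """Build a `docker service update` command that sets the restart policy.

    Hypothetical helper for illustration: it only assembles arguments and
    never executes anything.
    """
    if condition not in ("none", "on-failure", "any"):
        raise ValueError(f"unsupported restart condition: {condition!r}")
    cmd = ["docker", "service", "update", "--restart-condition", condition]
    if max_attempts is not None:
        cmd += ["--restart-max-attempts", str(max_attempts)]
    cmd.append(service)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_restart_update_cmd("restart_crash")))
```

Returning an argument list rather than a string keeps the command safe to pass to subprocess.run without shell quoting issues.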
- Is there any Docker journal that records stack restarts as its entries?
Depending on the task history limit (configurable with docker swarm update --task-history-limit), you can view the previously run tasks for a service:
$ docker service ps restart_crash
ID            NAME                 IMAGE           NODE              DESIRED STATE  CURRENT STATE           ERROR  PORTS
30okge1sjfno  restart_crash.1      busybox:latest  bmitch-asusr556l  Shutdown       Complete 4 minutes ago
papxoq1vve1a   \_ restart_crash.1  busybox:latest  bmitch-asusr556l  Shutdown       Complete 4 minutes ago
1hji2oko51sk   \_ restart_crash.1  busybox:latest  bmitch-asusr556l  Shutdown       Complete 5 minutes ago
And you can inspect the state for any one task:
$ docker inspect 30okge1sjfno --format '{{json .Status}}' | jq .
{
  "Timestamp": "2018-11-06T19:55:02.208633174Z",
  "State": "complete",
  "Message": "finished",
  "ContainerStatus": {
    "ContainerID": "8e9310bde9acc757f94a56a32c37a08efeed8a040ce98d84c851d4eef0afc545",
    "PID": 0,
    "ExitCode": 0
  },
  "PortStatus": {}
}
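If you want to collect this state for several tasks, the JSON can be reduced to a few fields in a script. The following is a minimal sketch, assuming the .Status layout shown above. The function names summarize_task_status and inspect_task_status are illustrative, and the second one needs a Docker CLI that can reach a swarm manager.

```python
import json
import subprocess


def summarize_task_status(status_json: str) -> dict:
    """Reduce a task's .Status JSON to the fields useful for a restart log."""
    status = json.loads(status_json)
    # ContainerStatus may be missing if the task never started a container.
    container = status.get("ContainerStatus") or {}
    return {
        "timestamp": status.get("Timestamp"),
        "state": status.get("State"),
        "message": status.get("Message"),
        "exit_code": container.get("ExitCode"),
    }


def inspect_task_status(task_id: str) -> dict:
    """Run `docker inspect` for one task ID and summarize its status."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .Status}}", task_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return summarize_task_status(result.stdout)


if __name__ == "__main__":
    # Sample taken from the output above, shortened to the relevant fields.
    sample = (
        '{"Timestamp": "2018-11-06T19:55:02.208633174Z", "State": "complete", '
        '"Message": "finished", "ContainerStatus": {"PID": 0, "ExitCode": 0}}'
    )
    print(summarize_task_status(sample))
```

A non-zero exit_code together with a state of failed would point to a crash rather than a normal exit, which is the case you usually care about when tracking restarts.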
There's also an event history in the docker engine that you can query:
$ docker events --filter label=com.docker.swarm.service.name=restart_crash --filter event=die --since 15m --until 0s
2018-11-06T14:54:09.417465313-05:00 container die f17d945b249a04e716155bcc6d7db490e58e5be00973b0470b05629ce2cca461 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=1hji2oko51skhv8fv1nw71gb8, com.docker.swarm.task.name=restart_crash.1.1hji2oko51skhv8fv1nw71gb8, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.1hji2oko51skhv8fv1nw71gb8)
2018-11-06T14:54:32.391165964-05:00 container die d6f98b8aaa171ca8a2ddaf31cce7a1e6f1436ba14696ea3842177b2e5e525f13 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=papxoq1vve1adriw6e9xqdaad, com.docker.swarm.task.name=restart_crash.1.papxoq1vve1adriw6e9xqdaad, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.papxoq1vve1adriw6e9xqdaad)
2018-11-06T14:55:00.126450155-05:00 container die 8e9310bde9acc757f94a56a32c37a08efeed8a040ce98d84c851d4eef0afc545 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=30okge1sjfnoicd0lo2g1y0o7, com.docker.swarm.task.name=restart_crash.1.30okge1sjfnoicd0lo2g1y0o7, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.30okge1sjfnoicd0lo2g1y0o7)
See more details on the events command at: https://docs.docker.com/engine/reference/commandline/events/
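To turn those event lines into a simple restart count, they can be parsed in a script. This is a rough sketch that assumes the default text layout shown above ("time type action id (key=value, ...)") and single-word actions such as die. The names parse_event_line and count_die_events_by_service are made up for illustration. For anything long-lived, docker events --format '{{json .}}' emits JSON, which is more robust than matching the text form.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumed layout of the default output: "<time> <type> <action> <id> (k=v, k=v, ...)".
EVENT_RE = re.compile(r"^(\S+) (\S+) (\S+) (\S+) \((.*)\)$")


def parse_event_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one line of default `docker events` output, or return None."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    time, kind, action, actor_id, raw_attrs = match.groups()
    attrs = {}
    for pair in raw_attrs.split(", "):
        key, _, value = pair.partition("=")
        attrs[key] = value  # value may be empty, e.g. "com.docker.swarm.task="
    return {"time": time, "type": kind, "action": action, "id": actor_id, "attrs": attrs}


def count_die_events_by_service(lines: Iterable[str]) -> Counter:
    """Count `die` events per swarm service name, skipping unparseable lines."""
    counts: Counter = Counter()
    for line in lines:
        event = parse_event_line(line)
        if event is None or event["action"] != "die":
            continue
        service = event["attrs"].get("com.docker.swarm.service.name", "<no service>")
        counts[service] += 1
    return counts


if __name__ == "__main__":
    # Shortened sample in the same shape as the output above.
    sample = [
        "2018-11-06T14:54:09.417465313-05:00 container die f17d945b249a "
        "(com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, exitCode=0)",
    ]
    for service, count in count_die_events_by_service(sample).items():
        print(service, count)
```

In practice the lines would come from the docker events command above rather than a hard-coded sample, for example by capturing its output with subprocess or piping it into the script.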
The best practice at larger-scale organizations is to send the container logs to a central location (e.g. Elastic) and monitor the metrics externally (e.g. Prometheus/Grafana).