In my Terraform AWS Docker Swarm module I use cloud-init to initialize the EC2 instance. However, Terraform says the resource is ready before cloud-init finishes. Is there a way of making it wait for cloud-init to finish, ideally without SSHing or checking for a port to be up using a null resource?
Your managers and workers both use template_cloudinit_config. They also have the ec2:CreateTags permission.
You can use an EC2 resource tag like trajano/terraform-docker-swarm-aws/cloudinit-complete to indicate that cloud-init has finished.
You could add this final part to each config to invoke a tagging script:
part {
  filename     = ""
  content      = local.tag_complete_script
  content_type = "text/x-shellscript"
}
And declare tag_complete_script to be the following:
locals {
  tag_complete_script = <<-EOF
    instance_id="$(TOKEN=$(curl -X PUT "" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") \
      && curl -H "X-aws-ec2-metadata-token: $TOKEN" -v)"
    aws ec2 create-tags --resources "$instance_id" --tags 'Key=trajano/terraform-docker-swarm-aws/cloudinit-complete,Value=true'
  EOF
}
Then with a null_resource, you wait for the tag to appear (wrote this on my phone, so use it for a general idea, but I don't expect that it will work without testing and edits):
resource "null_resource" "wait_for_cloudinit" {
  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command = <<-EOF
      poll_tags="aws ec2 describe-tags --filters 'Name=resource-id,Values=${join(",", aws_instance.managers[*].id)}' 'Name=key,Values=trajano/terraform-docker-swarm-aws/cloudinit-complete' --output text --query 'Tags[*].Value'"
      expected='${join(",", [for id in aws_instance.managers[*].id : "true"])}'
      # Text output is tab-separated, so convert tabs to commas before comparing.
      while [[ "$(eval "$poll_tags" | tr '\t' ',')" != "$expected" ]] ; do sleep 5 ; done
    EOF
  }
}
This way, any resource that needs to run after cloud-init has completed can declare a dependency on null_resource.wait_for_cloudinit.
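A bare `while` loop like the one above never gives up if an instance fails to tag itself. As a rough sketch of one way to bound the wait (the `wait_for_value` helper, its argument order, and the timeout values are my own assumptions, not part of the module), you could poll with a deadline:

```shell
# Hypothetical helper: poll a command until it prints the expected value
# or a deadline passes.
# Usage: wait_for_value EXPECTED TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_value() {
  local expected="$1" timeout="$2" interval="$3"
  shift 3
  local deadline actual
  deadline=$(( $(date +%s) + $timeout ))
  while true; do
    # A failing poll command is treated as "value not there yet".
    actual="$("$@" 2>/dev/null || true)"
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out waiting for '$expected', last saw '$actual'" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Example call for a single instance (needs AWS credentials, so it is
# left commented out; $instance_id is a placeholder):
# wait_for_value "true" 600 10 aws ec2 describe-tags \
#   --filters "Name=resource-id,Values=$instance_id" \
#             "Name=key,Values=trajano/terraform-docker-swarm-aws/cloudinit-complete" \
#   --output text --query 'Tags[0].Value'
```

With a non-zero return on timeout, the local-exec provisioner fails and Terraform reports the error instead of hanging.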
Another possible approach is using AWS Systems Manager Run Command, if available on your AMI.
You create an SSM Document with Terraform that runs the cloud-init status --wait command, then trigger the command from a local provisioner and wait for it to complete. This way you don't have to play around with tags, and you can be sure cloud-init has completed.
This is an example of the document you can create with Terraform:
resource "aws_ssm_document" "cloud_init_wait" {
  name            = "cloud-init-wait"
  document_type   = "Command"
  document_format = "YAML"
  content         = <<-DOC
    schemaVersion: '2.2'
    description: Wait for cloud init to finish
    mainSteps:
    - action: aws:runShellScript
      name: StopOnLinux
      precondition:
        StringEquals:
        - platformType
        - Linux
      inputs:
        runCommand:
        - cloud-init status --wait
    DOC
}
You can then use a local-exec provisioner inside the EC2 instance block or in a null_resource, depending on what you need to do with it.
The provisioner would be more or less like this:
provisioner "local-exec" {
  interpreter = ["/bin/bash", "-c"]
  command = <<-EOF
    set -Ee -o pipefail
    command_id=$(aws ssm send-command --document-name ${aws_ssm_document.cloud_init_wait.arn} --instance-ids ${} --output text --query "Command.CommandId")
    if ! aws ssm wait command-executed --command-id $command_id --instance-id ${}; then
      echo "Failed to start services on instance ${}!";
      echo "stdout:";
      aws ssm get-command-invocation --command-id $command_id --instance-id ${} --query StandardOutputContent;
      echo "stderr:";
      aws ssm get-command-invocation --command-id $command_id --instance-id ${} --query StandardErrorContent;
      exit 1;
    fi
    echo "Services started successfully on the new instance with id ${}!"
  EOF
}
I'm getting "An error occurred (InvalidInstanceId) when calling the SendCommand operation: Instances [[i-09eb8edc7df904bd9]] not in a valid state for account 71********30" and I'm not really sure why. (I checked the ID, it's valid.) – Otherdirected
Running aws ssm send-command in the instance after it's created works, though... – Otherdirected
Adding sleep 30 at the beginning of your command block, to make sure the instance was running before it's executed, did the trick. – Otherdirected
Add --profile your_profile to all aws ssm commands.
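Instead of the fixed sleep 30 mentioned in the comments, a retry loop around aws ssm send-command may be more robust, since the time an instance needs to register with SSM varies. This is only a sketch: the `retry` helper and the attempt count and delay are assumptions, not something from the answer.

```shell
# Hypothetical helper: re-run a command until it succeeds or the attempt
# budget is used up.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}

# Example inside the provisioner (needs AWS credentials, so it is left
# commented out; $instance_id is a placeholder):
# command_id="$(retry 12 5 aws ssm send-command \
#   --document-name cloud-init-wait \
#   --instance-ids "$instance_id" \
#   --output text --query "Command.CommandId")"
```

Failed attempts write only to stderr, so the command substitution captures just the command ID printed by the attempt that succeeds.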
