How do I set up cloud-init on custom AMIs in AWS? (CentOS)

4

36

Defining userdata for instances in AWS seems really useful for doing all kinds of bootstrap-type actions. Unfortunately, I have to use a custom CentOS AMI that didn't originate from one of the provided AMIs for PCI reasons, so cloud-init is not already installed and configured. I only really want it to set a hostname and run a small bash script. How do I get it working?

Praedial answered 1/5, 2014 at 15:56 Comment(3)
What is "PCI" in this context?Kanpur
@Kanpur It refers to the guidelines set by the Payment Card Industry Data Security Council. Compliance is needed to process credit card payments, and is generally a strong security standard.Praedial
whereswalden: ok, thanks. gotta watch the "jargon" y'know.Kanpur
55

cloud-init is a very powerful but poorly documented tool. Even once it's installed, there are a lot of modules active by default that overwrite things you may have already defined on your AMI. Here are instructions for a minimal setup from scratch:

Instructions

  1. Install cloud-init from a standard repository. If you're worried about PCI, you probably don't want to use AWS's custom repositories.

    # rpm -Uvh https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
    # yum install cloud-init
    
  2. Edit /etc/cloud/cloud.cfg, a yaml file, to reflect your desired configuration. Below is a minimal configuration with documentation for each module.

    #If this is not explicitly false, cloud-init will change things so that root
    #login via ssh is disabled. If you don't want it to do anything, set it false.
    disable_root: false
    
    #Set this if you want cloud-init to manage hostname. The current
    #/etc/hosts file will be replaced with the one in /etc/cloud/templates.
    manage_etc_hosts: true
    
    #Since cloud-init runs at multiple stages of boot, this needs to be set so
    #it can log in all of them to /var/log/cloud-init.
    syslog_fix_perms: null
    
    #This is the bit that makes userdata work. You need this to have userdata
    #scripts be run by cloud-init.
    datasource_list: [Ec2]
    datasource:
      Ec2:
        metadata_urls: ['http://169.254.169.254']
    
    #modules that run early in boot
    cloud_init_modules:
     - bootcmd  #for running commands in pre-boot. Commands can be defined in cloud-config userdata.
     - set-hostname  #These 3 make hostname setting work
     - update-hostname
     - update-etc-hosts
    
    #modules that run after boot
    cloud_config_modules:
     - runcmd  #like bootcmd, but runs after boot. Use this instead of bootcmd unless you have a good reason for doing so.
    
    #modules that run at some point after config is finished
    cloud_final_modules:
     - scripts-per-once  #all of these run scripts at specific events. Like bootcmd, can be defined in cloud-config.
     - scripts-per-boot
     - scripts-per-instance
     - scripts-user
     - phone-home  #if defined, can make a post request to a specified url when done booting
     - final-message  #if defined, can write a specified message to the log
     - power-state-change  #if defined, can shut down or reboot the instance once cloud-init has finished
    
    system_info:
      #works because amazon's linux AMI is based on CentOS
      distro: amazon
    
  3. If there is a defaults.cfg in /etc/cloud/cloud.cfg.d/, delete it.

  4. To take advantage of this configuration, define the following userdata for new instances:

    #cloud-config
    hostname: myhostname
    fqdn: myhostname.mydomain.com
    runcmd:
     - echo "I did this thing post-boot"
     - echo "I did this too"
    

    You can also simply run a bash script by replacing #cloud-config with #!/bin/bash and putting the bash script in the body, but if you do, you should remove all of the hostname-related modules from cloud_init_modules.
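As a rough sketch of the bash variant, here are the same two commands from the cloud-config example written as a user-data script. The LOG_FILE path and the log helper are hypothetical names chosen for illustration; they are not something cloud-init provides.

```shell
#!/bin/bash
# Sketch of a user-data script: cloud-init treats user data starting with
# "#!" as a script and runs it via the scripts-user module.
# LOG_FILE and log() are illustrative assumptions, not cloud-init features.
LOG_FILE="${LOG_FILE:-/tmp/userdata-example.log}"

log() {
  # Append a timestamped line so it is easy to see when the script ran.
  echo "$(date '+%Y-%m-%dT%H:%M:%S') $*" >> "$LOG_FILE"
}

log "I did this thing post-boot"
log "I did this too"
```

After the instance boots, checking the log file is a quick way to confirm that the script actually ran.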


Additional Notes

Note that this is a minimal configuration, and cloud-init is capable of managing users, ssh keys, mount points, etc. Look at the references below for more documentation on those specific features.

In general, it seems that cloud-init acts based on the modules specified. Some modules, like "disable-ec2-metadata", act simply by being listed. Others, like "runcmd", only act if their parameters are specified, either in cloud.cfg or in cloud-config userdata. Most of the documentation below only tells you what parameters are possible for each module, not what the module is called, but the default cloud.cfg should have a complete module list to begin with. The best way I've found to disable a module is simply to remove it from the list.
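To see which modules are currently enabled, and in which stage, something like the following can help. It is a minimal sketch that assumes the module lists use the simple "- name" form shown in the configuration above; list_modules is a hypothetical helper name, and entries written in other YAML forms would not be parsed correctly.

```shell
# Print "stage: module" for every module listed in a cloud.cfg file.
# Assumes plain "- name" list entries; this is a line-based sketch,
# not a real YAML parser.
list_modules() {
  awk '
    # A stage header starts a new module list.
    /^cloud_(init|config|final)_modules:/ { stage = $1; sub(/:$/, "", stage); next }
    # Any other top-level key ends the current list.
    /^[^ \t#-]/ { stage = "" }
    # List entries inside a stage: the module name is the second field.
    stage != "" && /^[ \t]*-/ { print stage ": " $2 }
  ' "$1"
}
```

Running list_modules /etc/cloud/cloud.cfg before and after editing makes it easy to confirm that a module really was removed from the list.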

In some cases, "rhel" may work better for the "distro" tag than "amazon". I haven't really figured out when.


References

Praedial answered 1/5, 2014 at 15:56 Comment(7)
The config file you want to edit is actually /etc/cloud/cloud.cfg, not /etc/cloud.cfg. I made all my changes to /etc/cloud.cfg (including setting disable_root to false), created an ami and then was promptly locked out of any instances created from the ami and also my original instance because cloud-init doesn't actually read that file.Pragmaticism
Right you are! Edited.Praedial
So, does this /etc/cloud/cloud.cfg get saved in the AMI and the AMI rebuilt? thanksKanpur
@JDS: Yes, it has to exist on all instances where you want the cloud-config userdata to work, otherwise there's nothing to consume the userdata.Praedial
@Praedial THANKS. I looked and looked, and I figured that must be the case, but I couldn't find anything that definitively said, build your AMI with cloud-init installed. It's implied in a lot of places, but the freely-available CentOS AMIs don't seem to have it and it seems I'd have to re-bake. sorry for the loquaciousness but i had some beerKanpur
Re replacing the '#cloud-config' with #!/bin/bash and making it a script. Would the new file be in /etc/cloud/cloud.cfg.d/runcmd.cfg or would it be in /etc/cloud/cloud.cfg.d/runcmd.sh?Unaneled
@TennisSmith: Neither: as the answer says, the script should be provided as userdata to new instances. I believe there is a way to have stuff in the AMI that gets run independent of userdata if that's what you're looking for, but I forget how to do it and I no longer have access to the environment I used when writing this answer.Praedial
12

Here is a brief tutorial on how to run scripts during startup using cloud-init on AWS EC2 (CentOS).

This tutorial explains:

  • how to set configuration file /etc/cloud/cloud.cfg
  • what the cloud path /var/lib/cloud/scripts looks like
  • the script files under the cloud path using an example, and
  • how to check if the script files are executed during startup of the instance

Configuration file

The configuration file below is on AWS CentOS6. For Amazon Linux, see here.

# cat /etc/cloud/cloud.cfg
manage_etc_hosts: localhost
user: root
disable_root: false
ssh_genkeytypes: [ rsa, dsa ]

cloud_init_modules:
 - resizefs
 - update_etc_hosts
 - ssh

cloud_final_modules:
 - scripts-per-once
 - scripts-per-boot
 - scripts-per-instance
 - scripts-user

Directory Tree

Here is what the cloud path /var/lib/cloud/scripts looks like:

# cd /var/lib/cloud/scripts
# tree `pwd`
/var/lib/cloud/scripts
├── per-boot
│     └── per-boot.sh
├── per-instance
│     └── per-instance.sh
└── per-once
       └── per-once.sh

Content of the Script Files

Here are the contents of the example script files.
The files have to be owned by root. See my approach to creating the boot script.

# cat /var/lib/cloud/scripts/per-boot/per-boot.sh
#!/bin/sh
echo per-boot: `date` >> /tmp/per-xxx.txt

# cat /var/lib/cloud/scripts/per-instance/per-instance.sh
#!/bin/sh
echo per-instance: `date` >> /tmp/per-xxx.txt

# cat /var/lib/cloud/scripts/per-once/per-once.sh   
#!/bin/sh
echo per-once: `date` >> /tmp/per-xxx.txt
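A small helper along these lines can put a script into the right directory with the executable bit set. This is only a sketch: install_cloud_script and the CLOUD_SCRIPTS_DIR override are hypothetical names, and the chown step is attempted only when running as root.

```shell
# Base directory for cloud-init scripts; overridable for dry runs.
CLOUD_SCRIPTS_DIR="${CLOUD_SCRIPTS_DIR:-/var/lib/cloud/scripts}"

# Usage: install_cloud_script <per-boot|per-instance|per-once> <script>
install_cloud_script() {
  local frequency="$1" src="$2" dest
  case "$frequency" in
    per-boot|per-instance|per-once) ;;
    *) echo "unknown frequency: $frequency" >&2; return 1 ;;
  esac
  dest="$CLOUD_SCRIPTS_DIR/$frequency/$(basename "$src")"
  mkdir -p "$CLOUD_SCRIPTS_DIR/$frequency" || return 1
  cp "$src" "$dest" || return 1
  # Scripts without the executable bit may be skipped.
  chmod 0755 "$dest" || return 1
  # Only root can hand ownership to root.
  if [ "$(id -u)" -eq 0 ]; then
    chown root:root "$dest" || return 1
  fi
}
```

For example, install_cloud_script per-boot ./per-boot.sh would copy the script to /var/lib/cloud/scripts/per-boot/per-boot.sh.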

Result of Execution

In the case of initial start-up

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST

In the case of a reboot

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:32:24 JST

In the case of launching from the AMI

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:32:24 JST 
per-boot: 1 January 3, 2013 Thursday 17:44:08 JST

Reference
The timing at which the script is run in cloud-init (CentOS6) was examined (translated)

Markley answered 20/7, 2016 at 10:47 Comment(1)
I found that you had to make sure the executable permission was set on shell scriptsEpiclesis
6

Expanding on the prior answer for anyone trying to create a CentOS AMI that is cloud-init enabled (and capable of actually executing your CloudFormation scripts), you might have some success by doing the following:

  1. launch a marketplace CentOS AMI w/Updates and make sure cloud-init is present; if it is not, install it with sudo yum install -y cloud-init
  2. rm -rf /var/lib/cloud/data
  3. rm -rf /var/lib/cloud/instance
  4. rm -rf /var/lib/cloud/instances/*
  5. replace /etc/cloud/cloud.cfg with the configuration in the answer above but make sure you set distro: rhel
  6. Add the CloudFormation helpers (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-helper-scripts-reference.html)
  7. create an AMI image from this instance
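Steps 2 to 4 above can be wrapped in a small function so they are easy to repeat before each bake. This is a sketch, with reset_cloud_state as a hypothetical name; the optional argument exists only so the function can be tried against a scratch directory instead of the real /var/lib/cloud.

```shell
# Remove cloud-init's record of having already run on this machine.
# Usage: reset_cloud_state [cloud_dir]   (defaults to /var/lib/cloud)
reset_cloud_state() {
  local cloud_dir="${1:-/var/lib/cloud}"
  # Steps 2 and 3: cached data and the "instance" entry.
  rm -rf "$cloud_dir/data" "$cloud_dir/instance"
  # Step 4: per-instance state, keeping the instances directory itself.
  rm -rf "$cloud_dir"/instances/*
}
```

On the real machine this needs to run as root, right before creating the image.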

Had a heck of a time trying to figure out why my UserData was not being invoked until I realized that the images in the marketplace naturally only run your UserData once per instance AND of course they had already run. Removing the indicators that those had already been executed, along with setting distro: rhel in the cloud.cfg file, did the trick.

For the curious, the distro: value should correspond to one of the python scripts in /usr/lib/python2.6/site-packages/cloudinit/distros. As it turns out the AMI I launched had no amazon.py, so you need to use rhel for CentOS. Depending on the AMI you launch and the version of cloud-init, YMMV.
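A quick way to check which value fits a given image is to look for the matching file. The function below is a sketch with a hypothetical name (pick_distro); the default path is the one from my AMI and will differ with other Python or cloud-init versions.

```shell
# Suggest a distro: value based on which distro modules are installed.
# Usage: pick_distro [distros_dir]
pick_distro() {
  local distros_dir="${1:-/usr/lib/python2.6/site-packages/cloudinit/distros}"
  if [ -f "$distros_dir/amazon.py" ]; then
    echo amazon
  elif [ -f "$distros_dir/rhel.py" ]; then
    echo rhel
  else
    echo "no amazon.py or rhel.py in $distros_dir" >&2
    return 1
  fi
}
```

If the function fails, listing the directory by hand shows which distro names that cloud-init build actually supports.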

Salivation answered 25/10, 2016 at 7:55 Comment(1)
When referencing another answer it's helpful to link it. The order may change.Kitchens
0

Thank you to those here who have already clarified so much about cloud-init! However, at least regarding AWS, one issue needs emphasis:

By default, user data scripts and cloud-init directives run only during the first boot cycle when an EC2 instance is launched.

In other words, they will not run during subsequent boots. This is not always mentioned and I suspect is causing confusion. More on this:

cloud-init has to determine whether or not the current boot is the first boot of a new instance or not, so that it applies the appropriate configuration. On an instance’s first boot, it should run all “per-instance” configuration, whereas on a subsequent boot it should run only “per-boot” configuration.
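The decision described in that quote can be illustrated with a toy model. This is not cloud-init's actual code: STATE_DIR, the instance-id file, and stages_to_run are all hypothetical names, used only to show the idea of comparing the current instance ID with a cached one.

```shell
# Toy model of the first-boot check; not cloud-init's real implementation.
STATE_DIR="${STATE_DIR:-/tmp/cloud-init-sketch}"

# Usage: stages_to_run <instance-id>
# Prints which kinds of configuration would run on this boot.
stages_to_run() {
  local instance_id="$1"
  local cached="$STATE_DIR/instance-id"
  mkdir -p "$STATE_DIR"
  if [ -f "$cached" ] && [ "$(cat "$cached")" = "$instance_id" ]; then
    # Same instance as last boot: only per-boot work is due.
    echo "per-boot"
  else
    # New (or never seen) instance: remember it and run everything.
    echo "$instance_id" > "$cached"
    echo "per-instance per-boot"
  fi
}
```

In this model the first call for a given instance ID reports both per-instance and per-boot, and later calls with the same ID report only per-boot. An AMI baked with stale state can therefore look like an instance that has already had its first boot.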

Giselegisella answered 23/3, 2021 at 18:54 Comment(0)
