How does cloud-init work?

cloud-init is a package that performs various configuration tasks on a virtual machine on first boot. You write a file containing your configuration and pass it to the VM when you launch it.

But how exactly does it work? How is the user data sent to the VM, and how does cloud-init manage to apply the configuration?

Thank you.

Behead answered 28/6, 2018 at 10:23 Comment(0)
B
0

Short answer: datasources.

cloud-init has the concept of datasources, which tell it where to find the user data and metadata.

Have a look here: https://cloudinit.readthedocs.io/en/latest/topics/datasources.html

Bespangle answered 3/7, 2018 at 7:21 Comment(2)
Can you please add an explanation of the mentioned? - Australasia
I think the answer is superficial, I'm sorry. The documentation is huge and it's more like a dictionary, so it's far from didactic and far from what the user was looking for. - Darondarooge
D
5

Disclaimer: cloud-init is very complex, and there are lots of supported cloud vendors, and it's used in lots of different ways, but I think this is a fairly accurate simplified overview.

A couple of minor corrections first: cloud-init can run on any machine, not just a VM, and it can run on any boot, not just 'first boot'. It's basically just a way to run scripts during boot. Current Ubuntu server images, for example, come with cloud-init pre-installed, and it runs during boot even if you install one on your desktop.

However, the main use case is first boot of "cloud images". The problem here is that cloud vendors want to ship an official distro release which just works, without the end-user having to actually carry out an installation, or the cloud vendor having to modify the distro in some way. cloud-init handles this by retrieving configuration data at various points during the boot process. In practice, this tends to be user names, passwords, ssh keys, locales, hostnames, additional repos, and so on. In other words, the sort of stuff you would have manually typed in during an installation, but normally without the network setup.
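
To make "configuration data" concrete, here is a minimal sketch that renders a small user-data file in the cloud-config format. The keys used (hostname, locale, users, ssh_authorized_keys) are standard cloud-config keys, but the function name and all the values (host name, user, key string) are made up for illustration, and the YAML is written by hand rather than with a YAML library.

```python
def render_cloud_config(hostname, locale, user, ssh_keys):
    """Render a small cloud-config document as plain text."""
    lines = [
        "#cloud-config",  # first line marks the file as cloud-config
        f"hostname: {hostname}",
        f"locale: {locale}",
        "users:",
        f"  - name: {user}",
        "    ssh_authorized_keys:",
    ]
    # One list entry per public key.
    lines += [f"      - {key}" for key in ssh_keys]
    return "\n".join(lines) + "\n"


# Placeholder values only; the key below is not a real public key.
user_data = render_cloud_config(
    hostname="demo-vm",
    locale="en_GB.UTF-8",
    user="alice",
    ssh_keys=["ssh-ed25519 AAAA...example alice@laptop"],
)
print(user_data)
```

The resulting text is the sort of thing you would paste into a provider's user-data box or serve from a datasource.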

cloud-init can frequently determine exactly what it is running on during boot, by querying the DMI/SMBIOS, or a specific file such as /proc/1/environ. In these cases, it has built-in knowledge of where to find the required configuration data. In general, however, the data will come from the network or, failing that, a filesystem that is bundled with the image.
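
As a rough illustration of the DMI/SMBIOS check, the sketch below reads a few fields that Linux exposes under /sys/class/dmi/id. This is not cloud-init's own detection code. The function names and the choice of fields are assumptions made for the example, and the values you get back depend entirely on the machine.

```python
from pathlib import Path

DMI_DIR = Path("/sys/class/dmi/id")  # where Linux exposes DMI/SMBIOS fields


def read_dmi_field(name, dmi_dir=DMI_DIR):
    """Return one DMI field as stripped text, or None if it cannot be read."""
    try:
        return (Path(dmi_dir) / name).read_text().strip()
    except (OSError, UnicodeDecodeError):
        return None


def describe_platform(dmi_dir=DMI_DIR):
    """Collect a few fields that commonly identify a hypervisor or cloud."""
    fields = ("sys_vendor", "product_name", "chassis_asset_tag")
    return {field: read_dmi_field(field, dmi_dir) for field in fields}


print(describe_platform())
```

On a cloud instance these fields typically name the vendor or hypervisor, which is enough to decide which datasource to try first.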

Many (most? all?) cloud vendors run a private webserver for the image. The image itself is set up to use DHCP on eth0 (it can instead retrieve the required network configuration from another data source, but I think it's much more common just to use DHCP, which is the fallback position). The webserver responds to requests from cloud-init for the user, vendor, and instance data. If you've installed a VM at a cloud provider you'll have seen a user-data block that you can fill in - this is returned to cloud-init as the user data.

The docs have a simple tutorial which does exactly this: it uses QEMU to run an image, and the qemu-system-x86_64 command line sets the image smbios info to specify where the Python webserver is (10.0.2.2:8000). In practice, most cloud vendors serve private data from 169.254.169.254. This is the 'Instance Metadata Service' (IMDS).
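
The sketch below shows how a seed location passed via SMBIOS could be turned into the URLs that get fetched. It assumes the NoCloud-style string `ds=nocloud;s=<url>` and the file names meta-data, user-data and vendor-data under that URL. The function names are hypothetical, and `fetch` is only defined here, not called, since it needs a server listening at the seed address.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def parse_nocloud_seed(serial):
    """Extract the seed URL from a string like 'ds=nocloud;s=http://host/'."""
    seed = None
    is_nocloud = False
    for token in serial.split(";"):
        key, _, value = token.partition("=")
        if key == "ds" and value.startswith("nocloud"):
            is_nocloud = True
        elif key == "s":
            seed = value
    return seed if is_nocloud else None


def seed_urls(seed):
    """Build the URLs of the data files served under the seed URL."""
    if not seed.endswith("/"):
        seed += "/"
    names = ("meta-data", "user-data", "vendor-data")
    return {name: urljoin(seed, name) for name in names}


def fetch(url, timeout=5):
    """Download one file as text (requires a reachable server)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


seed = parse_nocloud_seed("ds=nocloud;s=http://10.0.2.2:8000/")
print(seed_urls(seed))
```

With the tutorial's setup, the Python webserver on the host would simply serve files with those names from its working directory.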

There are various other ways to get the data, in addition to or instead of IMDS: a disk partition labelled config-2, for example, which is attached to the instance when it boots, or the kernel command line, or specific files in the filesystem.
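
The idea of trying several sources in turn can be sketched as below. This is a simplification, not cloud-init's actual search logic. The probe functions and their order are assumptions, the command-line parsing is naive (it splits on whitespace and ignores quoting), and the two probes return different things: the first a datasource hint, the second the user-data text itself. The seed directory shown is the one used by the NoCloud datasource.

```python
from pathlib import Path


def from_kernel_cmdline(cmdline_path="/proc/cmdline"):
    """Return the value of a 'ds=' token on the kernel command line, if any."""
    try:
        tokens = Path(cmdline_path).read_text().split()
    except OSError:
        return None
    for token in tokens:
        if token.startswith("ds="):
            return token[len("ds="):]
    return None


def from_seed_dir(seed_dir="/var/lib/cloud/seed/nocloud"):
    """Return the user-data text from a seed directory, if the file exists."""
    try:
        return (Path(seed_dir) / "user-data").read_text()
    except OSError:
        return None


def first_available(probes):
    """Run (name, probe) pairs in order; return the first non-None result."""
    for name, probe in probes:
        result = probe()
        if result is not None:
            return name, result
    return None, None


print(first_available([
    ("kernel-cmdline", from_kernel_cmdline),
    ("seed-dir", from_seed_dir),
]))
```

On a machine with neither source present, this prints a pair of None values, which corresponds to falling through to the next datasource.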

Note that cloud-init fits a very specific niche, where a vendor has to provide a standard image to an end-user, with some customisation. You can run custom images at a cloud vendor without cloud-init, but some vendors won't let you install custom images, for reasons best known to themselves.

Detain answered 17/6, 2023 at 12:12 Comment(0)
