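cloud-init is a tool for performing the initial configuration of cloud instances. It currently supports many cloud platforms, such as OpenStack, AWS, and Alibaba Cloud (Aliyun).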

1. Overview

Everything about cloud-init, a set of python scripts and utilities to make your cloud images be all they can be!

Most of what follows is drawn from the official documentation.

1.1 User-configurable

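Users can configure cloud-init's behavior through user-data, which is supplied when the instance is launched. Each cloud solution provides its own way to do this, for example through a --user-data or --user-data-file argument.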

  • user-data string
  • user-data file
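
As a rough illustration of the "user-data string" form, the sketch below renders a small cloud-config document from a Python dict. The helper name make_cloud_config and the sample values are hypothetical. It relies on the fact that JSON is valid YAML, so no third-party YAML library is needed.

```python
import json


def make_cloud_config(settings):
    """Render a dict as cloud-config user-data text (hypothetical helper)."""
    # cloud-config is YAML; JSON is a subset of YAML, so json.dumps is enough here.
    return "#cloud-config\n" + json.dumps(settings, indent=2) + "\n"


# Sample values chosen only for illustration.
user_data = make_cloud_config({
    "hostname": "demo-host",
    "runcmd": [["echo", "hello from cloud-init"]],
})
```

The resulting string can be written to a file and passed through whatever user-data option the cloud platform in use provides.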

1.2 Feature detection

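The features list shows which features the running cloud-init supports. The list is stored in cloudinit.version.FEATURES; see the official documentation for the features currently defined.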
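
A minimal sketch of how a script might check for a feature, assuming the cloudinit Python package is importable on the machine. The helper names load_features and has_feature are hypothetical, and the code falls back to an empty list when cloud-init is not installed.

```python
def load_features():
    """Return cloud-init's feature list, or [] if cloud-init is not importable."""
    try:
        from cloudinit.version import FEATURES
    except ImportError:
        return []
    return list(FEATURES)


def has_feature(name, features=None):
    """Check whether a feature name is present (features can be injected for testing)."""
    if features is None:
        features = load_features()
    return name in features
```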

1.3 Boot Stages

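To provide its functionality, cloud-init must be integrated into system boot in a fairly controlled way. There are five stages in total: Generator, Local, Network, Config, and Final. They are mentioned briefly here to make the command-line interface easier to follow.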

1.4 Command-line interface

Only the two most important subcommands are covered here.

cloud-init init

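The OS boot system normally runs cloud-init's init and init-local stages. They can also be run from the command line, but because of the semaphores in /var/lib/cloud/instance/sem/ and /var/lib/cloud/sem, they normally run only once.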

  • --local: Run init-local stage instead of init.

cloud-init modules

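cloud-init has five boot stages. The modules to run in three of them (Network, Config, and Final) are declared under the corresponding keys in /etc/cloud/cloud.cfg. They can also be run from the command line, but because of the semaphores in /var/lib/cloud/, each module runs only once.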

  • --mode (init|config|final): Run modules:init, modules:config or modules:final cloud-init stages.
modules:init is the one that runs in the Network stage.
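
The two subcommands can be driven from a script. The sketch below only builds the argument lists described above and leaves execution to a small wrapper. The function names are hypothetical, and running the commands for real would normally require root on a machine with cloud-init installed.

```python
import subprocess

VALID_MODES = ("init", "config", "final")


def init_command(local=False):
    """Build the argv for `cloud-init init`, optionally with --local."""
    cmd = ["cloud-init", "init"]
    if local:
        cmd.append("--local")
    return cmd


def modules_command(mode):
    """Build the argv for `cloud-init modules --mode <mode>`."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s" % (VALID_MODES,))
    return ["cloud-init", "modules", "--mode", mode]


def run(cmd):
    """Run a command and return its exit code (not called in this sketch)."""
    return subprocess.run(cmd, check=False).returncode
```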

2. User-Data formats

  • Gzip Compressed Content
  • Mime Multi Part Archive
  • User-Data Script (Begins with: #! or Content-Type: text/x-shellscript when using a MIME archive.)
  • Include File (Begins with: #include or Content-Type: text/x-include-url when using a MIME archive. )
  • Cloud Config Data (Begins with: #cloud-config or Content-Type: text/cloud-config when using a MIME archive.)
  • Upstart Job (Begins with: #upstart-job or Content-Type: text/upstart-job when using a MIME archive.)
  • Cloud Boothook
  • Part Handler
Cloud Config Data (cloud.cfg) examples: http://cloudinit.readthedocs.io/en/latest/topics/examples.html
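
To make the prefix and Content-Type pairs above concrete, here is a sketch that maps a payload's leading marker to its MIME type and bundles several parts into a MIME multipart archive using Python's standard email package. The helper names are hypothetical, and only the four prefixes spelled out in the list above are handled.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Prefix -> Content-Type pairs taken from the format list above.
PREFIX_TO_CONTENT_TYPE = (
    ("#cloud-config", "text/cloud-config"),
    ("#include", "text/x-include-url"),
    ("#upstart-job", "text/upstart-job"),
    ("#!", "text/x-shellscript"),
)


def detect_content_type(payload):
    """Return the Content-Type implied by the payload's first characters, or None."""
    for prefix, content_type in PREFIX_TO_CONTENT_TYPE:
        if payload.startswith(prefix):
            return content_type
    return None


def build_mime_archive(parts):
    """Combine (filename, payload) pairs into one MIME multipart string."""
    archive = MIMEMultipart()
    for filename, payload in parts:
        content_type = detect_content_type(payload)
        if content_type is None:
            raise ValueError("unrecognized user-data format: %s" % filename)
        subtype = content_type.split("/", 1)[1]
        part = MIMEText(payload, subtype, "utf-8")
        part.add_header("Content-Disposition", "attachment", filename=filename)
        archive.attach(part)
    return archive.as_string()
```

For example, build_mime_archive([("init.sh", "#!/bin/sh\necho hi\n"), ("config.yaml", "#cloud-config\nhostname: demo\n")]) yields a single string carrying both a script part and a cloud-config part.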

3. Boot Stages

To provide its functionality, cloud-init must be integrated into system boot in a fairly controlled way. There are five stages in total:

  • Generator
  • Local
  • Network
  • Config
  • Final

3.1 Generator

After systemd starts, a generator runs to decide whether cloud-init should be included in the boot targets. By default cloud-init is enabled; it is disabled in either of the following cases:

  • A file exists: /etc/cloud/cloud-init.disabled
  • The kernel command line as found in /proc/cmdline contains cloud-init=disabled. When running in a container, the kernel command line is not honored, but cloud-init will read an environment variable named KERNEL_CMDLINE in its place.
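
The two disable conditions can be expressed as a small check. This is a Python illustration of the rules above, not cloud-init's own generator code. The function names are hypothetical, and passing the command line in as a string is a choice made to keep the check easy to test.

```python
import os


def read_kernel_cmdline(in_container=False):
    """Return the kernel command line, or KERNEL_CMDLINE when inside a container."""
    if in_container:
        return os.environ.get("KERNEL_CMDLINE", "")
    with open("/proc/cmdline") as fh:
        return fh.read()


def cloud_init_disabled(cmdline, disable_file="/etc/cloud/cloud-init.disabled"):
    """Apply the two rules: the marker file exists, or the cmdline disables cloud-init."""
    if os.path.exists(disable_file):
        return True
    return "cloud-init=disabled" in cmdline.split()
```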

3.2 Local

  • systemd service: cloud-init-local.service
  • runs: As soon as possible with / mounted read-write.
  • blocks: as much of boot as possible, must block network bringup.
  • modules: none

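The purpose of the local stage is to:

  • locate the datasource
  • apply the network configuration to the system

The network configuration can come from: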

  • datasource
  • fallback
  • none (disabled)

3.3 Network

  • systemd service: cloud-init.service
  • runs: After local stage and configured networking is up.
  • blocks: As much of remaining boot as possible.
  • modules: cloud_init_modules in /etc/cloud/cloud.cfg

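This stage requires all configured networking to be online, because it fully processes any user-data it finds (which may involve fetching data over the network).

This stage runs the disk_setup and mounts modules, which may partition and format disks and configure mount points.

It runs the cloud_init_modules listed in /etc/cloud/cloud.cfg.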

3.4 Config

  • systemd service: cloud-config.service
  • runs: After network stage.
  • blocks: None.
  • modules: cloud_config_modules in /etc/cloud/cloud.cfg

This stage runs config modules only. Modules that do not really have an effect on other stages of boot are run here.

It runs the cloud_config_modules listed in /etc/cloud/cloud.cfg.

3.5 Final

  • systemd service: cloud-final.service
  • runs: As final part of boot (traditional “rc.local”)
  • blocks: None.
  • modules: cloud_final_modules in /etc/cloud/cloud.cfg

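This stage runs as late in boot as possible. Any scripts that a user is accustomed to running after logging in to a system should run correctly here, including: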

  • package installations
  • configuration management plugins (puppet, chef, salt-minion)
  • user-scripts (including runcmd).

It runs the cloud_final_modules listed in /etc/cloud/cloud.cfg.
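
The stage descriptions in this section can be summarized as a lookup table. The sketch below records, for each stage, the systemd service and the cloud.cfg key named above, plus the --mode value that selects it on the command line. The Stage class and stage_for_mode helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    service: Optional[str]       # systemd unit, if the stage has one
    modules_key: Optional[str]   # key in /etc/cloud/cloud.cfg
    modules_mode: Optional[str]  # value for `cloud-init modules --mode`


STAGES = (
    Stage("Generator", None, None, None),
    Stage("Local", "cloud-init-local.service", None, None),
    Stage("Network", "cloud-init.service", "cloud_init_modules", "init"),
    Stage("Config", "cloud-config.service", "cloud_config_modules", "config"),
    Stage("Final", "cloud-final.service", "cloud_final_modules", "final"),
)


def stage_for_mode(mode):
    """Return the stage selected by a --mode value, or None if there is none."""
    for stage in STAGES:
        if stage.modules_mode == mode:
            return stage
    return None
```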

4. Datasources

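A datasource is the source of cloud-init's configuration data. That data typically comes from the user (userdata) or from the stack that created the configuration drive (metadata).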

userdata

  • files
  • yaml
  • shell scripts

metadata

  • server name
  • instance id
  • display name
  • other cloud specific details

Each cloud solution provides this data in its own way.

instance-data

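cloud-init saves all metadata, vendordata, and userdata to /run/cloud-init/instance-data.json. This JSON file is the instance-data. It contains keys and names specific to the datasource in use, but cloud-init also maintains a minimal set of standardized keys that remain stable on any cloud. These keys appear under the v1 key. Any datasource metadata that cloud-init consumes is placed under the ds key.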

Below is an instance-data.json example from an OpenStack instance:

{
    "base64-encoded-keys": [
        "ds/meta-data/random_seed",
        "ds/user-data"
    ],
    "ds": {
        "ec2_metadata": {
            "ami-id": "ami-0000032f",
            "ami-launch-index": "0",
            "ami-manifest-path": "FIXME",
            "block-device-mapping": {
                "ami": "vda",
                "ephemeral0": "/dev/vdb",
                "root": "/dev/vda"
            },
            "hostname": "xenial-test.novalocal",
            "instance-action": "none",
            "instance-id": "i-0006e030",
            "instance-type": "m1.small",
            "local-hostname": "xenial-test.novalocal",
            "local-ipv4": "10.5.0.6",
            "placement": {
                "availability-zone": "None"
            },
            "public-hostname": "xenial-test.novalocal",
            "public-ipv4": "10.245.162.145",
            "reservation-id": "r-fxm623oa",
            "security-groups": "default"
        },
        "meta-data": {
            "availability_zone": null,
            "devices": [],
            "hostname": "xenial-test.novalocal",
            "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
            "launch_index": 0,
            "local-hostname": "xenial-test.novalocal",
            "name": "xenial-test",
            "project_id": "e0eb2d2538814...",
            "random_seed": "A6yPN...",
            "uuid": "3e39d278-0644-4728-9479-678f92..."
        },
        "network_json": {
            "links": [{
                    "ethernet_mac_address": "fa:16:3e:7d:74:9b",
                    "id": "tap9ca524d5-6e",
                    "mtu": 8958,
                    "type": "ovs",
                    "vif_id": "9ca524d5-6e5a-4809-936a-6901..."
                }
            ],
            "networks": [{
                    "id": "network0",
                    "link": "tap9ca524d5-6e",
                    "network_id": "c6adfc18-9753-42eb-b3ea-18b57e6b837f",
                    "type": "ipv4_dhcp"
                }
            ],
            "services": [{
                    "address": "10.10.160.2",
                    "type": "dns"
                }
            ]
        },
        "user-data": "I2Nsb3VkLWNvbmZpZ...",
        "vendor-data": null
    },
    "v1": {
        "availability-zone": null,
        "cloud-name": "openstack",
        "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
        "local-hostname": "xenial-test",
        "region": null
    }
}
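
A sketch of how a script might consume this file: it loads the JSON, looks up values by slash-separated paths such as those in base64-encoded-keys, and decodes the entries listed there. The helper names are hypothetical, and the code assumes the listed values are plain base64 strings.

```python
import base64
import json


def load_instance_data(path="/run/cloud-init/instance-data.json"):
    """Load instance-data from disk."""
    with open(path) as fh:
        return json.load(fh)


def lookup(data, key_path):
    """Follow a slash-separated path such as 'v1/cloud-name' through nested dicts."""
    node = data
    for key in key_path.split("/"):
        node = node[key]
    return node


def decode_encoded_keys(data):
    """Return {key_path: bytes} for every entry listed in base64-encoded-keys."""
    decoded = {}
    for key_path in data.get("base64-encoded-keys", []):
        try:
            value = lookup(data, key_path)
        except (KeyError, TypeError):
            continue  # listed path is absent in this document
        if isinstance(value, str):
            decoded[key_path] = base64.b64decode(value)
    return decoded
```

Scripts that need to work across clouds would normally read only the v1 keys, for example lookup(data, "v1/instance-id"), since the ds contents differ per datasource.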

datasource api

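Because each cloud provides its data (metadata) differently, cloud-init defines an abstract datasource class internally. Subclasses implement the methods for each cloud system, so the data can be retrieved in one uniform way.

Currently, a datasource object must implement the following interface: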

# returns a mime multipart message that contains
# all the various fully-expanded components that
# were found from processing the raw userdata string
# - when filtering only the mime messages targeting
#   this instance id will be returned (or messages with
#   no instance id)
def get_userdata(self, apply_filter=False)

# returns the raw userdata string (or none)
def get_userdata_raw(self)

# returns an integer (or none) which can be used to identify
# this instance in a group of instances which are typically
# created from a single command, thus allowing programmatic
# filtering on this launch index (or other selective actions)
@property
def launch_index(self)

# the data sources' config_obj is a cloud-config formatted
# object that came to it from ways other than cloud-config
# because cloud-config content would be handled elsewhere
def get_config_obj(self)

# returns a list of public ssh keys
def get_public_ssh_keys(self)

# translates a device 'short' name into the actual physical device
# fully qualified name (or none if said physical device is not attached
# or does not exist)
def device_name_to_device(self, name)

# gets the locale string this instance should be applying
# which is typically used to adjust the instance's locale settings files
def get_locale(self)

@property
def availability_zone(self)

# gets the instance id that was assigned to this instance by the
# cloud provider or when said instance id does not exist in the backing
# metadata this will return 'iid-datasource'
def get_instance_id(self)

# gets the fully qualified domain name that this host should be using
# when configuring network or hostname related settings, typically
# assigned either by the cloud provider or the user creating the vm
def get_hostname(self, fqdn=False)

def get_package_mirror_info(self)
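
To show what implementing part of this interface might look like, here is a toy class backed by an in-memory dict. It is a standalone sketch and does not subclass cloud-init's real base class. The class name StaticDataSource and the metadata keys it reads (such as public_keys) are assumptions made for illustration.

```python
class StaticDataSource:
    """Hypothetical datasource that serves values from a plain dict."""

    def __init__(self, metadata, userdata_raw=None):
        self.metadata = metadata
        self.userdata_raw = userdata_raw

    def get_userdata_raw(self):
        # Raw userdata string, or None when nothing was supplied.
        return self.userdata_raw

    @property
    def launch_index(self):
        return self.metadata.get("launch_index")

    @property
    def availability_zone(self):
        return self.metadata.get("availability_zone")

    def get_public_ssh_keys(self):
        # "public_keys" is an assumed key name for this sketch.
        return list(self.metadata.get("public_keys", []))

    def get_instance_id(self):
        # Fall back to 'iid-datasource' when the metadata has no instance id.
        return self.metadata.get("instance-id", "iid-datasource")

    def get_hostname(self, fqdn=False):
        hostname = self.metadata.get("local-hostname", "localhost")
        return hostname if fqdn else hostname.split(".", 1)[0]
```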

cloud-init already implements datasource objects for many platforms. A few of the most commonly used ones are worth knowing:

  • config drive
  • OpenStack
  • Amazon EC2

config drive

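The config drive datasource supports OpenStack's configuration drive disk.

In OpenStack, metadata can be written to a special configuration drive that is attached to the instance as it boots. By mounting this disk, the instance can read files from it to obtain its metadata. The advantage is that some configuration, such as IP addresses, is available before networking is configured.

By default, cloud-init treats this datasource as a complete datasource. Otherwise, it uses the config drive only to obtain network information and then looks up the full datasource through the metadata service. For the relevant configuration, refer to the official documentation.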
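
As a sketch of reading metadata from a config drive without any network, the function below loads a JSON file from an already mounted drive. It assumes the drive is mounted at a caller-supplied directory and that the metadata lives at openstack/latest/meta_data.json inside it. That path is not given in the text above and is stated here as an assumption, and the function name is hypothetical.

```python
import json
import os


def read_config_drive_metadata(mount_point):
    """Load metadata from a mounted config drive (assumed file layout)."""
    path = os.path.join(mount_point, "openstack", "latest", "meta_data.json")
    with open(path) as fh:
        return json.load(fh)
```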

OpenStack

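The OpenStack datasource reads metadata from the OpenStack metadata service, which requires networking to be online. For the relevant configuration, refer to the official documentation.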

Amazon EC2

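Amazon EC2 exposes metadata through a magic IP address. OpenStack in fact imitates Amazon here, down to using the same IP address. This also requires networking to be online. For the relevant configuration, refer to the official documentation.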
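
A minimal sketch of querying such a metadata service from inside an instance. The address 169.254.169.254 and the example paths are not stated in the text above and are assumptions here: /latest/meta-data/ is the EC2-style layout and /openstack/latest/meta_data.json is the OpenStack one. The function names are hypothetical, and some platforms require extra request headers or a session token that this sketch does not handle.

```python
import urllib.request

# Assumed link-local address of the metadata service.
METADATA_HOST = "169.254.169.254"


def metadata_url(path):
    """Build the URL for a metadata path such as 'latest/meta-data/instance-id'."""
    return "http://%s/%s" % (METADATA_HOST, path.lstrip("/"))


def fetch_metadata(path, timeout=2.0):
    """Fetch one metadata document; only works from inside an instance."""
    with urllib.request.urlopen(metadata_url(path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```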

5. Directory layout

While running, cloud-init records information in the following directory structure:

/var/lib/cloud/
    - data/
       - instance-id
       - previous-instance-id
       - datasource
       - previous-datasource
       - previous-hostname
    - handlers/
    - instance
    - instances/
        i-00000XYZ/
          - boot-finished
          - cloud-config.txt
          - datasource
          - handlers/
          - obj.pkl
          - scripts/
          - sem/
          - user-data.txt
          - user-data.txt.i
    - scripts/
       - per-boot/
       - per-instance/
       - per-once/
    - seed/
    - sem/

By default this data is stored under /var/lib/cloud, but the location can be changed through configuration.

Some of these directories are described below:

data/

Contains information related to instance ids, datasources and hostnames of the previous and current instance if they are different. These can be examined as needed to determine any information related to a previous boot (if applicable).

handlers/

Custom part-handler code is written out here. Files that end up here are named according to the scheme part-handler-XYZ, where XYZ is the handler number (the first handler found starts at 0).

instance

A symlink to the current instances/ subdirectory that points to the currently active instance (which is active is dependent on the datasource loaded).

instances/

All instances that were created using this image end up with instance identifier subdirectories (and corresponding data for each instance). The currently active instance is the target of the instance symlink described above.

scripts/

Scripts that are downloaded/created by the corresponding part-handler will end up in one of these subdirectories.

seed/

TBD

sem/

Cloud-init has a concept of a module semaphore, which basically consists of the module name and its frequency. These files are used to ensure a module is only run per-once, per-instance, or per-always. This folder contains semaphore files for modules that are only supposed to run per-once (not tied to the instance id).
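
The layout described in this section can be inspected with a few lines of Python. The sketch below resolves the instance symlink, checks for the boot-finished marker, and lists the semaphore files in both sem/ directories. The function names are hypothetical, and labelling instance/sem as "per-instance" is an interpretation of the layout rather than something the text states outright.

```python
import os


def current_instance_dir(cloud_dir="/var/lib/cloud"):
    """Resolve the 'instance' symlink, or return None if it is missing."""
    link = os.path.join(cloud_dir, "instance")
    return os.path.realpath(link) if os.path.islink(link) else None


def boot_finished(cloud_dir="/var/lib/cloud"):
    """True when the active instance directory contains boot-finished."""
    inst = current_instance_dir(cloud_dir)
    return inst is not None and os.path.exists(os.path.join(inst, "boot-finished"))


def list_semaphores(cloud_dir="/var/lib/cloud"):
    """Return semaphore file names grouped by where they are stored."""
    locations = (
        ("per-once", os.path.join(cloud_dir, "sem")),
        ("per-instance", os.path.join(cloud_dir, "instance", "sem")),
    )
    result = {}
    for label, sem_dir in locations:
        result[label] = sorted(os.listdir(sem_dir)) if os.path.isdir(sem_dir) else []
    return result
```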