您现在的位置是:首页 > 文章详情

openstack vm_lifecycle

日期:2016-12-22点击:658

nova instance状态:power_state, vm_state, task_state

nova instance有3种状态:power_state, vm_state, task_state,分别对应horizon界面上的Power State,Status,Task

Openstack wiki上有介绍:

  • power_state is the hypervisor state, loaded “bottom-up” from compute worker;
  • vm_state reflects the stable state based on API calls, matching user expectation, revised “top-down” within API implementation.
  • task_state reflects the transition state introduced by in-progress API calls.
  • power_state and vm_state may conflict with each other, which needs to be resolved case-by-case.

Power_state

power_state反映的是hypervisor的状态。目前都是用libvirt作为hypervisor。
例如,nova会调用libvirt api去查询instance的state

[root@node-2 ~]# virsh list --all Id Name State ---------------------------------------------------- 229 instance-00000139 running 230 instance-00000138 running 231 instance-00000137 running 232 instance-00000155 running - instance-00000136 shut off - instance-0000013a shut off [root@node-2 ~]# virsh domstate instance-00000136 shut off

在libvirt中,共有几种状态:

  • undefined
  • defined(也叫stopped)
  • running
  • paused
  • saved

具体解释在这里
下图表示在几种状态的转换关系:
enter image description here
Notes:

  • 有的转换依赖于domain(instance)是transient domain还是persistent domain, 如running ---(shutdown)---> defined/undefined。两者的区别在这里。思考:Openstack创建的domain是否都是persistent domain?
  • virsh create创建transient domain;virsh define;virsh start创建、启动persistent domain

How is it updated?

  1. 调hypervisor api取得的状态总是对的;如果nova数据库中的状态与其不一致,那么更新nova数据库。
  2. nova有定时任务来检查、更新power_state
  3. 如果对instance做了操作(task),并且该操作可能影响power_state,那么在该task结束之前需要更新power_state

power_state list

目前(2015/9/21 master)有6种状态:

# nova/compute/power_state.py # NOTE(maoy): These are *not* virDomainState values from libvirt. # The hex value happens to match virDomainState for backward-compatibility # reasons. NOSTATE = 0x00 RUNNING = 0x01 PAUSED = 0x03 SHUTDOWN = 0x04 # the VM is powered off CRASHED = 0x06 SUSPENDED = 0x07 # TODO(justinsb): Power state really needs to be a proper class, # so that we're not locked into the libvirt status codes and can put mapping # logic here rather than spread throughout the code STATE_MAP = { NOSTATE: 'pending', RUNNING: 'running', PAUSED: 'paused', SHUTDOWN: 'shutdown', CRASHED: 'crashed', SUSPENDED: 'suspended', } 

Vm_state

vm_state反映基于API调用的一种稳定状态,而task_state反映的是一种瞬时状态。

How is it updated?

  • 只有当一个task(一个compute api call)结束时,才能更新vm_state。
  • 如果成功则同时设置task_state=None,如果失败,vm_state可能不变(task支持rollback),也可能设置为ERROR。

vm_state list

# nova/compute/vm_states.py ACTIVE = 'active' # VM is running BUILDING = 'building' # VM only exists in DB PAUSED = 'paused' SUSPENDED = 'suspended' # VM is suspended to disk. STOPPED = 'stopped' # VM is powered off, the disk image is still there. RESCUED = 'rescued' # A rescue image is running with the original VM image attached. RESIZED = 'resized' # a VM with the new size is active. The user is expected to manually confirm or revert. SOFT_DELETED = 'soft-delete' # VM is marked as deleted but the disk images are still available to restore. DELETED = 'deleted' # VM is permanently deleted. ERROR = 'error' SHELVED = 'shelved' # VM is powered off, resources still on hypervisor SHELVED_OFFLOADED = 'shelved_offloaded' # VM and associated resources are not on hypervisor ALLOW_SOFT_REBOOT = [ACTIVE] # states we can soft reboot from ALLOW_HARD_REBOOT = ALLOW_SOFT_REBOOT + [STOPPED, PAUSED, SUSPENDED, ERROR] # states we allow hard reboot from 

Task_state

task_state反映了基于API调用的瞬时状态。
它表示instance当前处于某个API调用中的某个阶段,比vm_state更细化,让用户了解某个action/task执行到哪个步骤了,比如创建instance时会有scheduling -> block_device_mapping -> networking -> spawning。但调用完之后设置为None。

task_state list

# nova/compute/task_states.py """Possible task states for instances. urrent moment. These tasks can be generic, such as 'spawning', or specific, such as 'block_device_mapping'. These task states allow for a better view into what an instance is doing and should be displayed to users/administrators as necessary. """ # possible task states during create() SCHEDULING = 'scheduling' BLOCK_DEVICE_MAPPING = 'block_device_mapping' NETWORKING = 'networking' SPAWNING = 'spawning' # possible task states during snapshot() IMAGE_SNAPSHOT = 'image_snapshot' IMAGE_SNAPSHOT_PENDING = 'image_snapshot_pending' IMAGE_PENDING_UPLOAD = 'image_pending_upload' IMAGE_UPLOADING = 'image_uploading' # possible task states during backup() IMAGE_BACKUP = 'image_backup' # possible task states during set_admin_password() UPDATING_PASSWORD = 'updating_password' # possible task states during resize() RESIZE_PREP = 'resize_prep' RESIZE_MIGRATING = 'resize_migrating' RESIZE_MIGRATED = 'resize_migrated' RESIZE_FINISH = 'resize_finish' # possible task states during revert_resize() RESIZE_REVERTING = 'resize_reverting' # possible task states during confirm_resize() RESIZE_CONFIRMING = 'resize_confirming' # possible task states during reboot() REBOOTING = 'rebooting' REBOOT_PENDING = 'reboot_pending' REBOOT_STARTED = 'reboot_started' REBOOTING_HARD = 'rebooting_hard' REBOOT_PENDING_HARD = 'reboot_pending_hard' REBOOT_STARTED_HARD = 'reboot_started_hard' # possible task states during pause() PAUSING = 'pausing' # possible task states during unpause() UNPAUSING = 'unpausing' # possible task states during suspend() SUSPENDING = 'suspending' # possible task states during resume() RESUMING = 'resuming' # possible task states during power_off() POWERING_OFF = 'powering-off' # possible task states during power_on() POWERING_ON = 'powering-on' # possible task states during rescue() RESCUING = 'rescuing' # possible task states during unrescue() UNRESCUING = 'unrescuing' # possible task states during rebuild() REBUILDING = 'rebuilding' REBUILD_BLOCK_DEVICE_MAPPING = "rebuild_block_device_mapping" REBUILD_SPAWNING = 'rebuild_spawning' # possible task states during live_migrate() MIGRATING = "migrating" # possible task states during delete() DELETING = 'deleting' # possible task states during soft_delete() SOFT_DELETING = 'soft-deleting' # possible task states during restore() RESTORING = 'restoring' # possible task states during shelve() SHELVING = 'shelving' SHELVING_IMAGE_PENDING_UPLOAD = 'shelving_image_pending_upload' SHELVING_IMAGE_UPLOADING = 'shelving_image_uploading' # possible task states during shelve_offload() SHELVING_OFFLOADING = 'shelving_offloading' # possible task states during unshelve() UNSHELVING = 'unshelving' 

task_state state machine diagram (in-progress):
https://docs.google.com/spreadsheets/d/1uvrFI_L86_tBcZGlE2ck3RnMgtsgjIqYVtWTIT9eD8I/edit#gid=3

http://docs.openstack.org/developer/nova/devref/vmstates.html
http://docs.openstack.org/developer/nova/vmstates.html

Code walk-through

nova-compute定义了如下文件:

  • nova/compute/power_state.py
  • nova/compute/vm_states.py
  • nova/compute/task_states.py

nova-api从自身角度定义了vm_state和task_state的对应关系

  • nova/api/openstack/common.py
from nova.compute import task_states from nova.compute import vm_states _STATE_MAP = { vm_states.ACTIVE: { 'default': 'ACTIVE', task_states.REBOOTING: 'REBOOT', # 前面是task_state, 后面是显示在cli/horizon上的'Status' task_states.REBOOT_PENDING: 'REBOOT', task_states.REBOOT_STARTED: 'REBOOT', task_states.REBOOTING_HARD: 'HARD_REBOOT', task_states.REBOOT_PENDING_HARD: 'HARD_REBOOT', task_states.REBOOT_STARTED_HARD: 'HARD_REBOOT', task_states.UPDATING_PASSWORD: 'PASSWORD', task_states.REBUILDING: 'REBUILD', task_states.REBUILD_BLOCK_DEVICE_MAPPING: 'REBUILD', task_states.REBUILD_SPAWNING: 'REBUILD', task_states.MIGRATING: 'MIGRATING', task_states.RESIZE_PREP: 'RESIZE', task_states.RESIZE_MIGRATING: 'RESIZE', task_states.RESIZE_MIGRATED: 'RESIZE', task_states.RESIZE_FINISH: 'RESIZE', }, vm_states.BUILDING: { 'default': 'BUILD', }, vm_states.STOPPED: { 'default': 'SHUTOFF', task_states.RESIZE_PREP: 'RESIZE', task_states.RESIZE_MIGRATING: 'RESIZE', task_states.RESIZE_MIGRATED: 'RESIZE', task_states.RESIZE_FINISH: 'RESIZE', task_states.REBUILDING: 'REBUILD', task_states.REBUILD_BLOCK_DEVICE_MAPPING: 'REBUILD', task_states.REBUILD_SPAWNING: 'REBUILD', }, vm_states.RESIZED: { 'default': 'VERIFY_RESIZE', # Note(maoy): the OS API spec 1.1 doesn't have CONFIRMING_RESIZE # state so we comment that out for future reference only. #task_states.RESIZE_CONFIRMING: 'CONFIRMING_RESIZE', task_states.RESIZE_REVERTING: 'REVERT_RESIZE', }, vm_states.PAUSED: { 'default': 'PAUSED', task_states.MIGRATING: 'MIGRATING', }, vm_states.SUSPENDED: { 'default': 'SUSPENDED', }, vm_states.RESCUED: { 'default': 'RESCUE', }, vm_states.ERROR: { 'default': 'ERROR', task_states.REBUILDING: 'REBUILD', task_states.REBUILD_BLOCK_DEVICE_MAPPING: 'REBUILD', task_states.REBUILD_SPAWNING: 'REBUILD', }, vm_states.DELETED: { 'default': 'DELETED', }, vm_states.SOFT_DELETED: { 'default': 'SOFT_DELETED', }, vm_states.SHELVED: { 'default': 'SHELVED', }, vm_states.SHELVED_OFFLOADED: { 'default': 'SHELVED_OFFLOADED', }, } def status_from_state(vm_state, task_state='default'): """Given vm_state and task_state, return a status string.""" task_map = _STATE_MAP.get(vm_state, dict(default='UNKNOWN')) status = task_map.get(task_state, task_map['default']) if status == "UNKNOWN": LOG.error(_LE("status is UNKNOWN from vm_state=%(vm_state)s " "task_state=%(task_state)s. Bad upgrade or db " "corrupted?"), {'vm_state': vm_state, 'task_state': task_state}) return status 

status_from_state方法看,_STATE_MAP表示vm_state处于某个状态时,所有可能的task_state有哪些,都穷举出来了。这里注意,某些task_state的default值,在nova/compute/task_states.py中没有定义,如:

 vm_states.ACTIVE: { 'default': 'ACTIVE', ...} vm_states.STOPPED: { 'default': 'SHUTOFF', ...} vm_states.RESIZED: { 'default': 'VERIFY_RESIZE', ...} vm_states.ERROR: { 'default': 'ERROR', ...}

给定一对vm_state/task_state,status_from_state能返回一个status字符串,这个status就是cli/horizon上的'Status':

[root@node-129 ~]# nova list | cut -d"|" -f1-6 +--------------------------------------+-------------------------------------------------------+---------+------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ID | Name | Status | Task State | Power State +--------------------------------------+-------------------------------------------------------+---------+------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+ | f6d4e67b-885c-4a10-891f-d91675e52bae | 123 | SHUTOFF | - | Shutdown | 05be07c9-6b2a-4db9-b03b-8322c035816c | 652 | SHUTOFF | - | Shutdown | 19d1308c-f19b-4860-979f-220fd376ef58 | ITMGMT | SHUTOFF | - | Shutdown | c496c710-29f8-46c3-90f8-6c2606d4b252 | Winboo-vm | ACTIVE | - | Running | a27f4030-ece5-4aeb-a84e-70bf0389f81f | abc | ACTIVE | - | Running 

比较一下,nova-api列出的"Status"其实比nova-compute的vm_state(14个)多,共20个:

  • ACTIVE
  • REBOOT
  • HARD_REBOOT
  • PASSWORD
  • REBUILD
  • MIGRATING
  • RESIZE
  • BUILD
  • SHUTOFF
  • VERIFY_RESIZE
  • REVERT_RESIZE
  • PAUSED
  • SUSPENDED
  • RESCUE**
  • ERROR
  • DELETED
  • SOFT_DELETED**
  • SHELVED
  • SHELVED_OFFLOADED**
  • UNKNOWN

状态转移

参考:

enter image description here

不同的操作,只有当instance处于特定的vm_state和task_state时才能进行。这个控制不是由nova-api来完成的,而是由nova-compute来决定的:
nova/compute/api.py定义了nova-compute暴露给外部的接口。大多数方法名称就是操作名,如create,restore, shelve, soft_delete

基本上所有方法都用check_instance_state修饰了。这个修饰器限定了该操作在某些vm_state/task_state下才可以进行。比如:

@wrap_check_policy @check_instance_lock @check_instance_state(vm_state=[vm_states.SOFT_DELETED]) def restore(self, context, instance): """Restore a previously deleted (but not reclaimed) instance.""" @wrap_check_policy @check_instance_cell @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED, vm_states.PAUSED, vm_states.SUSPENDED]) def snapshot(self, context, instance, name, extra_properties=None): """Snapshot the given instance.""" @wrap_check_policy @check_instance_lock @check_instance_state(vm_state=set( vm_states.ALLOW_SOFT_REBOOT + vm_states.ALLOW_HARD_REBOOT), task_state=[None, task_states.REBOOTING, task_states.REBOOT_PENDING, task_states.REBOOT_STARTED, task_states.REBOOTING_HARD, task_states.RESUMING, task_states.UNPAUSING, task_states.PAUSING, task_states.SUSPENDING]) def reboot(self, context, instance, reboot_type): """Reboot the given instance.""" 

所以只要查看check_instance_state的参数就能确定操作可行与否。

原文链接:https://yq.aliyun.com/articles/439401
关注公众号

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。

持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。

文章评论

共有0条评论来说两句吧...

文章二维码

扫描即可查看该文章

点击排行

推荐阅读

最新文章