
Safe Ceph Migration

Date: 2020-04-02

    There are many ways to migrate a Ceph cluster. The simplest is to bring up a new node and tear down an old one. But that approach is not safe: an exception may occur halfway through the data migration, and moving the old data back is time-consuming, labor-intensive, and can introduce new problems. So today I am sharing a relatively safe migration approach.

1 Design plan

    1.1 Prepare the hardware

            Hardware preparation includes installing the operating system, disabling the firewall, attaching the disks, configuring the network, and installing the Ceph packages (the concrete commands are shown in section 3.1).

    1.2 Migration plan

Before migration:

Host        IP            Components
ceph-admin  172.18.0.131  mon, osd
ceph-node1  172.18.0.132  mon, osd
ceph-node2  172.18.0.133  mon, osd

After migration:

Host        IP            Components
ceph-admin  172.18.0.131  mon
ceph-node1  172.18.0.132  mon
ceph-node2  172.18.0.133  mon
transfer01  172.18.0.135  osd
transfer02  172.18.0.34   osd
transfer03  172.18.0.51   osd

2 Migration principle

    The migration relies on the pseudo-random placement of Ceph's CRUSH algorithm. Simply put, when the replica count increases, the data already sitting in the original bucket is not moved; an extra copy is simply written into the new bucket. Likewise, when the replica count decreases, only the data in the designated bucket is deleted, and the data in the other buckets is not moved.
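    You can observe this behaviour yourself by checking which OSDs hold a PG before and after a rule or replica-count change. A minimal sketch (test-obj is an arbitrary object name used only for illustration, not taken from this article):

# Show the up/acting OSD set of the PG that a given object maps to (test-obj is only an example name)
ceph osd map rbd test-obj
# Or dump a brief list of all PGs together with their acting sets
ceph pg dump pgs_brief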

   2.1 Build a standalone bucket tree (new_root)

   2.2 Choose 3 replicas from the old bucket tree (default)

   2.3 Increase the pool's replica count from 3 to 6

   2.4 Additionally choose 3 replicas from the new bucket tree (new_root), then wait for the data held by the 3 replicas chosen from the old bucket tree to be copied onto the 3 replicas in the new bucket tree, reaching 6 replicas in total

   2.5 Choose only 3 replicas from the new bucket tree (new_root), effectively giving up the 3 replicas in the old bucket tree, and lower the pool's replica count from 6 back to 3 (the data of the 3 old-bucket replicas is deleted automatically while the 3 new-bucket replicas are kept, which is what makes the migration seamless and safe); the full command sequence is sketched below
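Put together, the whole procedure boils down to the command sequence below (a sketch only, assuming the pool is named rbd and that the three CRUSH rules replicated_rule_1/2/3 defined in section 3.4 already exist):

ceph osd pool set rbd crush_rule replicated_rule_1   # 3 replicas from the old tree, no data movement
ceph osd pool set rbd size 6                         # raise the replica count to 6
ceph osd pool set rbd crush_rule replicated_rule_2   # 3 old + 3 new replicas, data is copied into new_root
# wait until ceph -s reports all PGs active+clean again
ceph osd pool set rbd crush_rule replicated_rule_3   # keep only the 3 replicas in new_root
ceph osd pool set rbd size 3                         # drop the replica count back to 3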

   Advantages:

    1. Safe. Until the very last step, where lowering the replica count automatically deletes the old-bucket data, the data in the old buckets is never moved, so there is no need to worry about data being lost or corrupted by an exception during the migration.

    2. Time-saving. Every step is reversible: if any problem appears during the procedure, simply switching the crush rule and the replica count back achieves the rollback, avoiding the large amount of time and effort a rollback normally costs mid-migration.

Step | Command | Rollback command | Cluster data change | Risk
Set rule 1 | ceph osd pool set [pool-name] crush_rule replicated_rule_1 | ceph osd pool set [pool-name] crush_rule replicated_rule | No change |
Set pool size to 6 | ceph osd pool set [pool-name] size 6 | ceph osd pool set [pool-name] size 3 | No change |
Set rule 2 | ceph osd pool set [pool-name] crush_rule replicated_rule_2 | ceph osd pool set [pool-name] crush_rule replicated_rule_1 | Data is copied from the old bucket tree to the new bucket tree | Extremely low
Set rule 3 | ceph osd pool set [pool-name] crush_rule replicated_rule_3 | ceph osd pool set [pool-name] crush_rule replicated_rule_2 | Data in the old bucket tree is deleted | Extremely low
Set pool size to 3 | ceph osd pool set [pool-name] size 3 | | No change |

Before rule 3 is applied, the old data has not moved an inch, so business traffic is unaffected and there is no concern about data loss or corruption. By the time rule 3 is executed, rule 2 has already copied a complete set of data into the new bucket tree, so the whole process is a safe, seamless handover.
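For example, if anything looks wrong while rule 2 is still copying data, the change can be undone without the original replicas ever having moved. A minimal rollback sketch, again using the rbd pool and the rule names from this article:

# Point the pool back at the old bucket tree only and drop the extra replicas
ceph osd pool set rbd crush_rule replicated_rule_1
ceph osd pool set rbd size 3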

 

3 Migration implementation

    3.1 Prepare the base environment

# Prepare the disk
[root@transfer01 ~]# lsblk|grep vda
vda 252:0 0 100G 0 disk
# Update the yum repos
[root@transfer01 yum.repos.d]# ll|grep ceph.repo
-rw-r--r--. 1 root root 614 Apr 1 14:34 ceph.repo
[root@transfer01 yum.repos.d]# ll|grep epel.repo
-rw-r--r--. 1 root root 921 Apr 1 14:34 epel.repo
# Install the ceph packages
[root@transfer01 yum.repos.d]# yum install ceph net-tools vim -y
# Stop and disable the firewall
[root@transfer01 yum.repos.d]# systemctl stop firewalld && systemctl disable firewalld

   3.2 Initialize the OSDs

# Change the cluster parameter osd_crush_update_on_start. This is the most critical step and must not be forgotten, otherwise the migration fails.
# Add osd_crush_update_on_start = false to ceph.conf to stop OSD changes from triggering CRUSH map changes.
# Restart the mon service after adding the parameter so that it takes effect.
# (You can verify the value with: ceph --show-config | grep osd_crush_update_on_start)

# Initialize the OSDs
ceph-deploy osd create transfer01 --data /dev/vda
ceph-deploy osd create transfer02 --data /dev/vda
ceph-deploy osd create transfer03 --data /dev/vda

# The ceph tree after initialization looks like this
[root@ceph-admin ceph]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
 -9       0       root new_root
 -1       0.14699 root default
 -3       0.04900     host ceph-admin
  0   hdd 0.04900         osd.0            up  1.00000 1.00000
 -5       0.04900     host ceph-node1
  1   hdd 0.04900         osd.1            up  1.00000 1.00000
 -7       0.04900     host ceph-node2
  2   hdd 0.04900         osd.2            up  1.00000 1.00000
  3   hdd       0 osd.3                    up  1.00000 1.00000
  4   hdd       0 osd.4                    up  1.00000 1.00000
  5   hdd       0 osd.5                    up  1.00000 1.00000
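For reference, the ceph.conf change described above can be applied and verified roughly like this (a sketch; placing the option under [global] and pushing the file with ceph-deploy are assumptions based on the ceph-deploy managed cluster used in this article):

# On the admin node, add to ceph.conf:
#   [global]
#   osd_crush_update_on_start = false
ceph-deploy --overwrite-conf config push ceph-admin ceph-node1 ceph-node2 transfer01 transfer02 transfer03
# Restart the mon service, as described above, so the new setting is picked up
systemctl restart ceph-mon.target
# Verify
ceph --show-config | grep osd_crush_update_on_start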

   3.3 Build the new bucket tree

# Add the buckets that make up the new tree
[root@ceph-admin ceph]# ceph osd crush add-bucket new_root root
[root@ceph-admin ceph]# ceph osd crush add-bucket transfer01 host
added bucket transfer01 type host to crush map
[root@ceph-admin ceph]# ceph osd crush add-bucket transfer02 host
added bucket transfer02 type host to crush map
[root@ceph-admin ceph]# ceph osd crush add-bucket transfer03 host
added bucket transfer03 type host to crush map
[root@ceph-admin ceph]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-13       0       host transfer03
-12       0       host transfer02
-11       0       host transfer01
 -9       0       root new_root
 -1       0.14699 root default
 -3       0.04900     host ceph-admin
  0   hdd 0.04900         osd.0            up  1.00000 1.00000
 -5       0.04900     host ceph-node1
  1   hdd 0.04900         osd.1            up  1.00000 1.00000
 -7       0.04900     host ceph-node2
  2   hdd 0.04900         osd.2            up  1.00000 1.00000
  3   hdd       0 osd.3                    up  1.00000 1.00000
  4   hdd       0 osd.4                    up  1.00000 1.00000
  5   hdd       0 osd.5                    up  1.00000 1.00000

# Move the host-level buckets under the root-level bucket
[root@ceph-admin ceph]# ceph osd crush move transfer01 root=new_root
moved item id -11 name 'transfer01' to location {root=new_root} in crush map
[root@ceph-admin ceph]# ceph osd crush move transfer02 root=new_root
moved item id -12 name 'transfer02' to location {root=new_root} in crush map
[root@ceph-admin ceph]# ceph osd crush move transfer03 root=new_root
moved item id -13 name 'transfer03' to location {root=new_root} in crush map
[root@ceph-admin ceph]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
 -9       0       root new_root
-11       0           host transfer01
-12       0           host transfer02
-13       0           host transfer03
 -1       0.14699 root default
 -3       0.04900     host ceph-admin
  0   hdd 0.04900         osd.0            up  1.00000 1.00000
 -5       0.04900     host ceph-node1
  1   hdd 0.04900         osd.1            up  1.00000 1.00000
 -7       0.04900     host ceph-node2
  2   hdd 0.04900         osd.2            up  1.00000 1.00000
  3   hdd       0 osd.3                    up  1.00000 1.00000
  4   hdd       0 osd.4                    up  1.00000 1.00000
  5   hdd       0 osd.5                    up  1.00000 1.00000

# Add the OSD-level buckets under the host-level buckets to finish building the new bucket tree
[root@ceph-admin ceph]# ceph osd crush add osd.3 0.049 host=transfer01
add item id 3 name 'osd.3' weight 0.049 at location {host=transfer01} to crush map
[root@ceph-admin ceph]# ceph osd crush add osd.4 0.049 host=transfer02
add item id 4 name 'osd.4' weight 0.049 at location {host=transfer02} to crush map
[root@ceph-admin ceph]# ceph osd crush add osd.5 0.049 host=transfer03
add item id 5 name 'osd.5' weight 0.049 at location {host=transfer03} to crush map
[root@ceph-admin ceph]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
 -9       0.14699 root new_root
-11       0.04900     host transfer01
  3   hdd 0.04900         osd.3            up  1.00000 1.00000
-12       0.04900     host transfer02
  4   hdd 0.04900         osd.4            up  1.00000 1.00000
-13       0.04900     host transfer03
  5   hdd 0.04900         osd.5            up  1.00000 1.00000
 -1       0.14699 root default
 -3       0.04900     host ceph-admin
  0   hdd 0.04900         osd.0            up  1.00000 1.00000
 -5       0.04900     host ceph-node1
  1   hdd 0.04900         osd.1            up  1.00000 1.00000
 -7       0.04900     host ceph-node2
  2   hdd 0.04900         osd.2            up  1.00000 1.00000
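At this point the new tree can be double-checked before any CRUSH rule references it (a sketch; both commands only read cluster state):

# Show the CRUSH hierarchy with weights, including the new new_root tree
ceph osd crush tree
# Or include per-OSD utilisation alongside the tree layout
ceph osd df tree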

  3.4 Edit and update the crushmap

# Get the current crushmap
[root@ceph-admin opt]# ceph osd getcrushmap -o /opt/map2
59
# Decompile it into a readable text file with crushtool
[root@ceph-admin opt]# crushtool -d /opt/map2 -o /opt/map2.txt
# Edit /opt/map2.txt
vim /opt/map2.txt

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
#### add the newly initialized osd devices
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd

### add the host buckets
……
host transfer01 {
    id -11        # do not change unnecessarily
    id -16 class hdd        # do not change unnecessarily
    # weight 0.049
    alg straw2
    hash 0    # rjenkins1
    item osd.3 weight 0.049
}
host transfer02 {
    id -12        # do not change unnecessarily
    id -15 class hdd        # do not change unnecessarily
    # weight 0.049
    alg straw2
    hash 0    # rjenkins1
    item osd.4 weight 0.049
}
host transfer03 {
    id -13        # do not change unnecessarily
    id -14 class hdd        # do not change unnecessarily
    # weight 0.049
    alg straw2
    hash 0    # rjenkins1
    item osd.5 weight 0.049
}

### add the newly built standalone bucket tree new_root
root new_root {
    id -9        # do not change unnecessarily
    id -10 class hdd        # do not change unnecessarily
    # weight 0.147
    alg straw2
    hash 0    # rjenkins1
    item transfer01 weight 0.049
    item transfer02 weight 0.049
    item transfer03 weight 0.049
}

### add 3 new rules
# Rule 1: choose 3 replicas from the old bucket tree (the default is already 3 replicas, so nothing changes)
rule replicated_rule_1 {
    id 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 3 type host
    step emit
}
# Rule 2: choose 3 replicas from the old bucket tree and 3 replicas from the new bucket tree
rule replicated_rule_2 {
    id 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 3 type host
    step emit
    step take new_root
    step chooseleaf firstn 3 type host
    step emit
}
# Rule 3: choose 3 replicas from the new bucket tree (effectively abandoning the old bucket tree)
rule replicated_rule_3 {
    id 3
    type replicated
    min_size 1
    max_size 10
    step take new_root
    step chooseleaf firstn 3 type host
    step emit
}
# Compile the edited crushmap text file /opt/map2.txt back into a binary map2.bin
[root@ceph-admin opt]# crushtool -c /opt/map2.txt -o /opt/map2.bin
# Inject the new crushmap
[root@ceph-admin opt]# ceph osd setcrushmap -i /opt/map2.bin
60
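Before injecting the map, the new rules can also be dry-run against the compiled binary with crushtool to confirm they return the expected OSDs (a sketch; rule IDs 2 and 3 match the rules defined above):

# Rule 2 with 6 replicas: the mappings should span both the default and the new_root tree
crushtool -i /opt/map2.bin --test --rule 2 --num-rep 6 --show-mappings
# Rule 3 with 3 replicas: the mappings should only use osd.3, osd.4 and osd.5
crushtool -i /opt/map2.bin --test --rule 3 --num-rep 3 --show-mappings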

  3.5 Start the migration

# Change the pool's crush_rule to our new rule 1. We use the rbd pool for the demonstration; in a real environment every pool must be handled.
# The current rbd crush_rule is the default 0
[root@ceph-admin ~]# ceph osd dump|grep rbd|grep crush_rule
pool 1 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 398 flags hashpspool stripe_width 0
# Set crush_rule to rule 1 (replicated_rule_1) from the edited crushmap
[root@ceph-admin ~]# ceph osd pool set rbd crush_rule replicated_rule_1
set pool 1 crush_rule to replicated_rule_1
# Since the default rule and rule 1 are identical, the data layout does not change
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.32GiB used, 444GiB / 450GiB avail
    pgs:     144 active+clean

# Set the pool's replica count to 6
[root@ceph-admin ~]# ceph osd pool set rbd size 6
set pool 1 size to 6
# Rule 1 only selects 3 replicas from the old bucket tree, which cannot satisfy the new pool size of 6, so the cluster state becomes active+undersized
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_WARN
            Degraded data redundancy: 32 pgs undersized
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.33GiB used, 444GiB / 450GiB avail
    pgs:     112 active+clean
             32  active+undersized

# Set crush_rule to rule 2; once rule 2 is applied, data copying starts
[root@ceph-admin ~]# ceph osd pool set rbd crush_rule replicated_rule_2
set pool 1 crush_rule to replicated_rule_2
# The data starts to be copied (migrated)
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_WARN
            Reduced data availability: 13 pgs peering
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.33GiB used, 444GiB / 450GiB avail
    pgs:     9.028% pgs not active
             131 active+clean
             13  peering

# After the copy finishes, the old and the new bucket tree each hold 3 complete replicas (6 in total), and the cluster returns to active+clean
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.33GiB used, 444GiB / 450GiB avail
    pgs:     144 active+clean

# Switch to rule 3
[root@ceph-admin ~]# ceph osd pool set rbd crush_rule replicated_rule_3
set pool 1 crush_rule to replicated_rule_3
# Rule 3 keeps only the data in the new bucket tree, so the cluster automatically deletes the 3 replicas in the old bucket tree. Because the pool size is still 6 at this point, the state shows active+clean+remapped
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in; 32 remapped pgs
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.33GiB used, 444GiB / 450GiB avail
    pgs:     112 active+clean
             32  active+clean+remapped
  io:
    client: 4.00KiB/s rd, 0B/s wr, 3op/s rd, 2op/s wr

# Set the pool's replica count back to 3
[root@ceph-admin ~]# ceph osd pool set rbd size 3
set pool 1 size to 3
# The pool size being larger than the number of replicas the rule can place caused the active+clean+remapped state; after dropping the size to 3 the cluster is healthy again
[root@ceph-admin ~]# ceph -s
  cluster:
    id:     e3a671b9-9bf2-4f25-9c04-af79b5cffc7a
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-admin,ceph-node1,ceph-node2
    mgr: ceph-admin(active), standbys: ceph-node1, ceph-node2
    mds: cephfs-1/1/1 up {0=ceph-admin=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active
  data:
    pools:   9 pools, 144 pgs
    objects: 347 objects, 5.79MiB
    usage:   6.33GiB used, 444GiB / 450GiB avail
    pgs:     144 active+clean
  io:
    client: 4.00KiB/s rd, 0B/s wr, 3op/s rd, 2op/s wr

  3.6 Check and verify

  

The PGs of the rbd pool have been remapped to the new osd.3, osd.4 and osd.5, which is exactly what the migration requires.
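A quick way to confirm this is to list the PGs of the pool together with their acting OSD sets (a minimal sketch):

# Every PG of the rbd pool should now report an up/acting set made up of osd.3, osd.4 and osd.5
ceph pg ls-by-pool rbd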

 

Original article: https://my.oschina.net/wangzilong/blog/3217618