elasticsearch如何安全重启节点
问题:
elasticsearch集群,有时候可能需要修改配置,增加硬盘,扩展内存等操作,需要对节点进行维护升级。但是业务不能停,如果直接kill掉节 点,可能导致数据丢失。而且集群会认为该节点挂掉了,就开始转移数据,当重启之后,它又会恢复数据,如果你当前的数据量已经很大了,这是很耗费机器和网络 资源的。
stackoverflow的问题上说得更准确:
I have an ES cluster with 4 nodes:
number_of_replicas: 1
search01 - master: false, data: false
search02 - master: true, data: true
search03 - master: false, data: true
search04 - master: false, data: true
I had to restart search03, and when it came back, it rejoined the cluster no problem, but left 7 unassigned shards laying about.
{
"cluster_name" : "tweedle",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 3,
"active_primary_shards" : 15,
"active_shards" : 23,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 7
}
说的就是说shard 表现出来的征兆是出现unassigned,然后开始初始化,rebalance数据,然后就是大量的等待。
解决方案就是:rolling restart
本文转载官方提供的安全重启集群节点的方法:https://www.elastic.co/guide/cn/elasticsearch/guide/current/_rolling_restarts.html
滚动重启