Remove a node from an Elasticsearch cluster

1. Stop shard allocation for this node

$ curl -XGET "127.0.0.1:9200/_cat/allocation?v"
shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
   412      960.3gb     1.8tb     15.6tb     17.4tb           10 172.29.4.156 172.29.4.156 es_node_156_2
   411      478.9gb     1.5tb     15.9tb     17.4tb            8 172.29.4.158 172.29.4.158 es_node_158_2
   411      557.5gb   558.7gb     16.9tb     17.4tb            3 172.29.4.157 172.29.4.157 es_node_157
   411      743.5gb     1.5tb     15.9tb     17.4tb            8 172.29.4.158 172.29.4.158 es_node_158
   411          1tb       1tb      9.9tb     10.9tb            9 172.29.4.177 172.29.4.177 es_node_177
   411      840.6gb     1.8tb     15.6tb     17.4tb           10 172.29.4.156 172.29.4.156 es_node_156
   248        9.3tb     9.3tb      1.5tb     10.9tb           85 172.29.4.178 172.29.4.178 es_node_178

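Suppose we want to remove the node es_node_158_2. Any one of the following three commands excludes a node from shard allocation, matching it by IP, by name, or by node ID. Note that in this cluster es_node_158 and es_node_158_2 share the IP 172.29.4.158, so excluding by _ip would drain both nodes; the _name form is the one used here.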

curl -XPUT 127.0.0.1:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" :{
    "cluster.routing.allocation.exclude._ip": "<node_ip_address>"
  }
}'


curl -XPUT 127.0.0.1:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" :{
    "cluster.routing.allocation.exclude._name": "es_node_158_2"
  }
}'


curl -XPUT 127.0.0.1:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" :{
    "cluster.routing.allocation.exclude._id": "<node_id>"
  }
}'
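
The <node_ip_address> and <node_id> placeholders above have to be looked up first. A minimal sketch, assuming the _cat/nodes endpoint with its h (column selection) and full_id options is available on your version; the helper name node_id_for_name is hypothetical and only filters the listing:

```shell
# Print the full node ID for a given node name.
# Reads "id name" rows on stdin, as produced by:
#   curl -s "127.0.0.1:9200/_cat/nodes?h=id,name&full_id=true"
# (node_id_for_name is an illustrative helper, not part of Elasticsearch.)
node_id_for_name() {
  awk -v name="$1" '$2 == name { print $1 }'
}

# Usage against the cluster address used in this post:
# curl -s "127.0.0.1:9200/_cat/nodes?h=id,name&full_id=true" | node_id_for_name es_node_158_2
```

Matching on the exact name matters here, because es_node_158 is a prefix of es_node_158_2 and a plain grep would return both rows.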

Confirm that the setting was applied:

curl -XGET "127.0.0.1:9200/_cluster/settings?pretty=true"
{
  "persistent" : {
    "cluster" : {
      "max_shards_per_node" : "30000"
    },
    "indices" : {
      "breaker" : {
        "fielddata" : {
          "limit" : "20%"
        }
      }
    },
    "search" : {
      "max_buckets" : "87000"
    },
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "all",
          "exclude" : {
            "_name" : "es_node_158_2"
          }
        }
      }
    }
  }
}
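
For scripting, the same check can be done without reading the nested JSON by eye. This is a rough sketch, assuming the flat_settings=true option of the cluster settings API; the helper name excludes_name is hypothetical, and it only matches when the setting holds exactly one node name (a comma-separated list would need a looser pattern):

```shell
# Succeed if the flat cluster settings JSON on stdin excludes the given node name.
# Expects the output of:
#   curl -s "127.0.0.1:9200/_cluster/settings?flat_settings=true"
# (excludes_name is an illustrative helper, not part of Elasticsearch.)
excludes_name() {
  grep -Eq "\"cluster\.routing\.allocation\.exclude\._name\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage against the cluster address used in this post:
# curl -s "127.0.0.1:9200/_cluster/settings?flat_settings=true" | excludes_name es_node_158_2 \
#   && echo "exclusion is set"
```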

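Elasticsearch then relocates the shards on es_node_158_2 to the remaining nodes. Checking the shard allocation again shows the shard count on es_node_158_2 dropping noticeably (from 411 to 285 in the output below).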

$ curl -XGET "127.0.0.1:9200/_cat/allocation?v"
shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
   248        9.3tb     9.3tb      1.5tb     10.9tb           85 172.29.4.178 172.29.4.178 es_node_178
   438          1tb       1tb      9.9tb     10.9tb            9 172.29.4.177 172.29.4.177 es_node_177
   417      559.9gb   561.1gb     16.9tb     17.4tb            3 172.29.4.157 172.29.4.157 es_node_157
   441      963.1gb     1.8tb     15.6tb     17.4tb           10 172.29.4.156 172.29.4.156 es_node_156_2
   443      842.5gb     1.8tb     15.6tb     17.4tb           10 172.29.4.156 172.29.4.156 es_node_156
   443      747.1gb     1.5tb     15.9tb     17.4tb            8 172.29.4.158 172.29.4.158 es_node_158
   285      472.7gb     1.5tb     15.9tb     17.4tb            8 172.29.4.158 172.29.4.158 es_node_158_2  # shard count is dropping

2. Stop the node and clean up

Once the shard count on es_node_158_2 reaches 0, log in to es_node_158_2 and shut down the Elasticsearch service.
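
Rather than re-running the allocation query by hand, the wait can be scripted. The following is only a sketch under stated assumptions: it uses the h=shards,node column selection of _cat/allocation, the helper names shards_on_node and wait_for_drain are hypothetical, and the 30-second interval is an arbitrary choice.

```shell
# Print the shard count for one node from "shards node" rows on stdin,
# as produced by: curl -s "127.0.0.1:9200/_cat/allocation?h=shards,node"
# (shards_on_node and wait_for_drain are illustrative helpers.)
shards_on_node() {
  awk -v node="$1" '$2 == node { print $1 }'
}

# Poll until the given node holds 0 shards.
# Returns 1 if the node has no row at all, which usually means
# the request failed or the node name is misspelled.
wait_for_drain() {
  node="$1"
  while :; do
    count=$(curl -s "127.0.0.1:9200/_cat/allocation?h=shards,node" | shards_on_node "$node")
    if [ -z "$count" ]; then
      echo "no allocation row for $node" >&2
      return 1
    fi
    [ "$count" -eq 0 ] && return 0
    echo "$node still holds $count shards"
    sleep 30   # arbitrary polling interval
  done
}

# Usage:
# wait_for_drain es_node_158_2 && echo "es_node_158_2 is empty, safe to stop"
```

An empty result is treated as an error instead of as zero, so a failed request is never mistaken for a drained node.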

Run on es_node_158_2:

$ systemctl stop elasticsearch
$ systemctl disable elasticsearch

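Run on any other node to clear the exclusion (reset whichever exclude key was set earlier; _name in this example):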

$ curl -XPUT 127.0.0.1:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" :{
    "cluster.routing.allocation.exclude._name": null
  }
}'

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#cluster-shard-allocation-filtering

