April 2019 – 月与灯依旧

Common Elasticsearch Commands

1, Index

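curl -X GET 'localhost:9200/_cat/indices?v'                          # list all indices

curl -X GET "localhost:9200/_cat/indices/INDEX_PATTERN-*?v&s=index"  # list the indices whose names match a pattern, sorted by name

curl -X GET "localhost:9200/_cat/indices?v&s=docs.count:desc"        # list indices sorted by document count, largest first (use s=store.size:desc to sort by size on disk)

curl -XDELETE "localhost:9200/INDEX_NAME"                            # delete an index (INDEX_NAME may also be a comma-separated list)

curl -X GET "localhost:9200/INDEX_NAME/_count"                       # show the number of documents in an index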

curl -XGET "127.0.0.1:9200/_all/_settings?pretty=true"               # show the settings of every index; all indices are listed, so the output can be very long

curl -XGET "127.0.0.1:9200/office_dns_log-*/_settings?pretty=true"   # show the settings of the indices matching a pattern

curl -XGET "127.0.0.1:9200/office_dns_log-*/_settings/index.number_*?pretty=true" # show only the shard and replica counts of those indices

# Change the number of replicas of an index
curl -XPUT "localhost:9200/INDEX_NAME/_settings?pretty" -H 'Content-Type: application/json' -d '{ "number_of_replicas": 2 }'

# Show the mapping of an index (include_type_name is only accepted by 6.7 through 7.x; omit it on 8.x)
curl -XGET "127.0.0.1:9200/INDEX_NAME/_mapping?include_type_name=true&pretty=true"

Main reference:
Get index settings API

2, Node

(This section is mainly based on this article.)

curl -X GET "localhost:9200/_cat/nodes?v"    # show memory usage and load for each node
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.18.192.101           37          80   4    0.41    0.49     0.52 dim       -      it-elk-node3
172.18.192.102           69          90   4    1.19    1.53     1.56 dim       *      it-elk-node4
172.18.192.100           36         100   1    2.91    2.60     2.86 dim       -      it-elk-node2
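
To spot nodes under memory pressure without reading the table by eye, the output above can be filtered with awk. The following is only a sketch: the function name high_heap_nodes and the threshold of 60 are made up for illustration, and it assumes that no column in the table is empty.

```shell
#!/bin/sh
# Print "name heap.percent" for every node whose heap usage is at or above
# a threshold. Reads `_cat/nodes?v` output on stdin and finds the two
# columns by header name, so it does not depend on the column order.
# (high_heap_nodes is a hypothetical helper, not part of Elasticsearch.)
high_heap_nodes() {
  threshold="$1"
  awk -v t="$threshold" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "heap.percent") heap = i
        if ($i == "name") name = i
      }
      next
    }
    heap && name && $heap + 0 >= t + 0 { print $name, $heap }
  '
}

# Sample input copied from the output above. Against a live cluster:
#   curl -s -X GET "localhost:9200/_cat/nodes?v" | high_heap_nodes 60
high_heap_nodes 60 <<'EOF'
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.18.192.101           37          80   4    0.41    0.49     0.52 dim       -      it-elk-node3
172.18.192.102           69          90   4    1.19    1.53     1.56 dim       *      it-elk-node4
172.18.192.100           36         100   1    2.91    2.60     2.86 dim       -      it-elk-node2
EOF
```

With the sample above, only it-elk-node4 (heap at 69%) is printed.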

curl -X GET "localhost:9200/_nodes/stats"
curl -X GET "localhost:9200/_nodes/nodeId1,nodeId2/stats"

# return just indices
curl -X GET "localhost:9200/_nodes/stats/indices"
# return just os and process
curl -X GET "localhost:9200/_nodes/stats/os,process"
# return just process for node with IP address 10.0.0.1
curl -X GET "localhost:9200/_nodes/10.0.0.1/stats/process"

# return just process
curl -X GET "localhost:9200/_nodes/process"
# same as above
curl -X GET "localhost:9200/_nodes/_all/process"
# return just jvm and process of only nodeId1 and nodeId2
curl -X GET "localhost:9200/_nodes/nodeId1,nodeId2/jvm,process"
# same as above
curl -X GET "localhost:9200/_nodes/nodeId1,nodeId2/info/jvm,process"
# return all the information of only nodeId1 and nodeId2
curl -X GET "localhost:9200/_nodes/nodeId1,nodeId2/_all"


# Fielddata summarised by node
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?fields=field1,field2"
# Fielddata summarised by node and index
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?level=indices&fields=field1,field2"
# Fielddata summarised by node, index, and shard
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?level=shards&fields=field1,field2"
# You can use wildcards for field names
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?fields=field*"

3, segment

# List the segments of all indices (with many indices this list can be very long)
curl -u elastic:HMEaQXtLiJaD4zn1ZxzM -X GET "127.0.0.1:9200/_cat/segments?v"

# List the segments of the indices matching a pattern
curl -u elastic:HMEaQXtLiJaD4zn1ZxzM -X GET "127.0.0.1:9200/_cat/segments/INDEX_PATTERN-*?v"

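4, Templates

A template defines the settings of each index, the type of each field, and so on. It applies only to indices created after the template is in place, not to indices that already exist.

# List all templates
curl -X GET "127.0.0.1:9200/_cat/templates?v&s=name"

# Show one template
curl -X GET "127.0.0.1:9200/_template/template_1?pretty=true"

# Define a template for the indices matching a pattern (applies only to indices created afterwards)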
curl -XPUT 127.0.0.1:9200/_template/template_1 -H 'Content-Type: application/json' -d'{
  "index_patterns": ["office_dns*"],
  "settings" : {
    "index.refresh_interval": "30s",
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index.translog.durability": "request",
    "index.translog.sync_interval": "30s"
  },
  "order" : 1
}'

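# Notes:
index.refresh_interval: how often the index is refreshed, default 1s; that is, how long it takes before newly written data becomes searchable in ES.
number_of_shards: number of primary shards (important). Setting it to the number of nodes is recommended. Once an index grows beyond 30G, queries become very slow, so the index must be spread over more shards.
number_of_replicas: number of replicas.
index.translog.durability: how translog data (index/update/delete operations) is persisted to disk; request is the system default.
index.translog.sync_interval: how often the translog is synced to disk, default 5s.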
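
The 30G guideline for number_of_shards can be turned into a quick calculation. This sketch reads the guideline as "keep every primary shard at or below 30 GB", which is an interpretation rather than a rule stated above; suggest_shards is a hypothetical helper name and the sizes are made-up inputs.

```shell
#!/bin/sh
# Smallest number of primary shards that keeps each shard at or below a
# size cap. Arguments: expected index size in GB, optional cap in GB
# (default 30, taken from the 30G guideline; treating it as a per-shard
# cap is an assumption).
suggest_shards() {
  size_gb="$1"
  max_gb="${2:-30}"
  # Integer ceiling division: ceil(size_gb / max_gb)
  n=$(( (size_gb + max_gb - 1) / max_gb ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

suggest_shards 20     # small index: one shard is enough
suggest_shards 100    # 100 GB with a 30 GB cap
suggest_shards 100 50 # same index with a looser 50 GB cap
```

The result is only a lower bound; the earlier advice of matching the shard count to the number of nodes may call for a higher value.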

Reference: https://blog.csdn.net/u014646662/article/details/99293604

5, License

# Show the license
curl -XGET 'http://127.0.0.1:9200/_license'

# Delete the license
curl -X DELETE "localhost:9200/_license"

# Install a license from the local file aaa.json. If username/password authentication is enabled, add the credentials, e.g. -u elastic:password
curl -XPUT 'http://127.0.0.1:9200/_xpack/license' -H "Content-Type: application/json" -d @aaa.json

6, Shard management

curl -XGET "127.0.0.1:9200/_cluster/settings?pretty"    # show the cluster settings, including the maximum number of shards per node
{
  "persistent" : {
    "cluster" : {
      "max_shards_per_node" : "30000"    # each node may hold up to 30000 shards; the default is 1000
    },
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  }
}


curl -XGET "127.0.0.1:9200/_cluster/health?pretty"    # show cluster health, including the current shard counts
{
  "cluster_name" : "it-elk",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 7233,
  "active_shards" : 7248,
  "relocating_shards" : 0,
  "initializing_shards" : 12,
  "unassigned_shards" : 5323,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 2085,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 3167844,
  "active_shards_percent_as_number" : 57.601525868234916        # percentage of shards that are active; 100 means everything is OK
}
}
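
Putting the two outputs together shows why max_shards_per_node was raised on this cluster. The sketch below assumes the limit is enforced cluster-wide as max_shards_per_node times the number of data nodes, and that initializing and unassigned shards count toward it; both points are assumptions worth checking against the documentation of the version in use. shard_headroom is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the cluster-wide shard limit with the shards currently in the
# cluster. Arguments: max_shards_per_node, number_of_data_nodes,
# active_shards, initializing_shards, unassigned_shards.
# Assumption: limit = max_shards_per_node * data nodes, and every shard
# of an open index counts, assigned or not.
shard_headroom() {
  max_per_node="$1"
  data_nodes="$2"
  active="$3"
  initializing="$4"
  unassigned="$5"
  limit=$(( max_per_node * data_nodes ))
  used=$(( active + initializing + unassigned ))
  echo "limit=$limit used=$used remaining=$(( limit - used ))"
}

# Numbers taken from the two outputs above (30000 per node, 3 data nodes)
shard_headroom 30000 3 7248 12 5323
# The same cluster with the default of 1000 shards per node
shard_headroom 1000 3 7248 12 5323
```

With the default of 1000 the remaining value is negative, which under these assumptions means the cluster would already be over the limit and could not create new indices.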

# Count the unassigned shards
curl -XGET "127.0.0.1:9200/_cat/shards?h=index,shard,prirep,state,unassigned.*&pretty" | grep UNASSIGNED | wc -l

# List the unassigned shards together with the reason each one is unassigned
curl -XGET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED
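
When thousands of shards are unassigned, a count per reason is easier to read than the raw list. A minimal sketch, assuming the five-column layout requested with h=index,shard,prirep,state,unassigned.reason; the helper name unassigned_by_reason and the index names in the sample are invented for illustration.

```shell
#!/bin/sh
# Count unassigned shards per reason, most frequent reason first.
# Reads `_cat/shards?h=index,shard,prirep,state,unassigned.reason`
# output on stdin: column 4 is the state, column 5 the reason.
unassigned_by_reason() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print count[r], r }' |
    sort -k1,1nr -k2,2
}

# Hypothetical sample input. Against a live cluster:
#   curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | unassigned_by_reason
unassigned_by_reason <<'EOF'
logs-2019.04.01 0 p STARTED
logs-2019.04.01 0 r UNASSIGNED NODE_LEFT
logs-2019.04.02 0 r UNASSIGNED NODE_LEFT
logs-2019.04.03 0 r UNASSIGNED INDEX_CREATED
EOF
```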

7, Template management

# List all templates
curl -X GET "127.0.0.1:9200/_cat/templates?v&s=name"

# Show one template
curl -X GET "127.0.0.1:9200/_template/TEMPLATE_NAME?pretty=true"

# Define a template named mail so that future mail-w3svc1-* indices get the settings below
# (if authentication is enabled, add -u USER:PASSWORD)
curl -XPUT 127.0.0.1:9200/_template/mail -H 'Content-Type: application/json' -d'{
  "index_patterns": ["mail-w3svc1-*"],
  "settings" : {
    "index.refresh_interval": "30s",
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index.translog.durability": "request",
    "index.translog.sync_interval": "30s"
  },
  "mappings": {
    "properties": {
      "rt": { "type": "integer" },
      "status": { "type": "integer" },
      "width": { "type": "float" },
      "uri": { "type": "text", "fielddata": true },
      "username": { "type": "text", "fielddata": true },
      "server_ip": { "type": "text", "fielddata": true },
      "client_ip": { "type": "text", "fielddata": true }
    }
  },
  "order" : 1
}'

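# Notes:
1, index.refresh_interval: how often the index is refreshed, default 1s; that is, how long it takes before newly written data becomes searchable in ES.
Refreshing is expensive, so set it longer (e.g. 30s) during an initial bulk import, then lower it to 5s or 10s afterwards.
2, number_of_shards: number of primary shards (important), default 1 (5 before Elasticsearch 7.0).
Once an index grows beyond 30G, queries become very slow; in that case spread the index over more shards.
3, number_of_replicas: number of replicas, default 1.
Note that an index with 5 shards and 1 replica has 5*(1+1) = 10 shards in total, which slows the cluster down.
Recommendation: number of nodes <= number of shards * (number of replicas + 1)
4, index.translog.durability: how translog data (index/update/delete operations) is persisted to disk.
5, index.translog.sync_interval: how often the translog is synced to disk, default 5s.
6, mappings.properties: declares rt and status as integer and width as float, and enables fielddata on uri, username, server_ip and client_ip so that external scripts can aggregate on them.
For the common field types see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html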
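
The shard arithmetic from note 3 can be checked quickly in the shell. This is a small sketch with hypothetical helper names (total_shards, spread_check); the 3-node figure matches the cluster shown earlier, and the check simply applies the recommendation that the number of nodes should not exceed shards * (replicas + 1).

```shell
#!/bin/sh
# Total shards an index occupies: primaries * (1 + replicas).
total_shards() {
  echo $(( $1 * (1 + $2) ))
}

# Apply the recommendation nodes <= shards * (replicas + 1).
# Arguments: number of nodes, primaries, replicas.
spread_check() {
  nodes="$1"
  total=$(total_shards "$2" "$3")
  if [ "$nodes" -le "$total" ]; then
    echo "ok: $total shards for $nodes nodes"
  else
    echo "warn: only $total shards for $nodes nodes"
  fi
}

total_shards 5 1    # the example from note 3
spread_check 3 5 1  # 5 shards, 1 replica, 3 nodes
spread_check 3 1 0  # the mail template settings on 3 nodes
```

The last call flags the mail template: with one shard and no replicas, each new index lives on a single node, so the other two nodes do no work for it.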

# Show the mail template defined above
curl -X GET "127.0.0.1:9200/_template/mail?pretty=true"

8, Document

# Index one document (the index is created automatically if it does not exist)
curl -XPOST "127.0.0.1:9200/INDEX_NAME/_doc/1" -H "Content-Type: application/json" -d'{"name": "zhu kun"}'

# Get one document
curl -XGET "127.0.0.1:9200/INDEX_NAME/_doc/1?pretty=true"

# Search for documents
curl -XGET "127.0.0.1:9200/INDEX_NAME/_search?q=name:zhu&pretty=true"

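9, Important note

A text field can normally only be searched and sorted from within Kibana. To aggregate on it (sorting, counting, summarising, etc.) from an external tool such as Grafana, set fielddata to true on the field; otherwise you may get the following error (see the official documentation):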

Fielddata is disabled on text fields by default. Set `fielddata=true` on [`your_field_name`] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.
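
For an index that already exists, fielddata can be switched on through the mapping API instead of a template. The sketch below only builds the request body; fielddata_body is a hypothetical helper, the field names are examples, and the commented curl call assumes the typeless _mapping endpoint of 7.x (6.x also expects a type name in the path).

```shell
#!/bin/sh
# Build a mapping body that enables fielddata on the given text fields.
# Field names are assumed to be plain identifiers that need no JSON
# escaping. (fielddata_body is a hypothetical helper.)
fielddata_body() {
  body='{"properties":{'
  sep=''
  for field in "$@"; do
    body="$body$sep\"$field\":{\"type\":\"text\",\"fielddata\":true}"
    sep=','
  done
  echo "$body}}"
}

fielddata_body username server_ip

# Against a live cluster (assumes the 7.x typeless endpoint):
#   curl -XPUT "127.0.0.1:9200/INDEX_NAME/_mapping" \
#        -H 'Content-Type: application/json' \
#        -d "$(fielddata_body username server_ip)"
```

As the error message itself warns, fielddata is loaded into heap memory, so it is worth enabling only on the fields that really need aggregation.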

Reference:
https://www.datadoghq.com/blog/elasticsearch-unassigned-shards/#reason-2-too-many-shards-not-enough-nodes

Posted on 2019-04-25 by bear in 电脑网络. Tags: ElasticSearch, Linux.