
Tag: Linux

Repost: Capturing packets with tcpdump

1, Common tcpdump options

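-n do not resolve IP addresses to names
-nn do not resolve IP addresses or port numbers to names
-i network interface to capture on
-w write packets to the given file; the file is created if it does not exist and overwritten if it does
-f print "foreign" IPv4 addresses numerically rather than symbolically. The filter expression (port, protocol, etc.) is not an option; it is passed as the last argument of the command
-r read packets from the given capture file
-F read the filter expression from the given file
-D list the network interfaces tcpdump can capture on
-c number of packets to capture or read; the number follows -c directly
-l make stdout line buffered, so packets can be viewed while they are being saved (for example when piping through tee)
-t do not print a timestamp
-tt print the timestamp as seconds since the epoch (January 1, 1970, 00:00:00 UTC), with fractions of a second
-ttt print the delta between the current and the previous line, at microsecond or nanosecond resolution depending on the --time-stamp-precision option
-s number of bytes to capture from each packet
-S print absolute rather than relative TCP sequence numbers
-v/-vv/-vvv print verbose output; the more v's, the more detail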

The options above are the common ones; see the official tcpdump documentation for the rest. The following is a basic introduction to capturing with filter expressions.

2, Common tcpdump commands

# TCP packets whose source or destination port is 80, written to packets.pcap
tcpdump -nni ens33 -w packets.pcap 'tcp port 80'

# TCP packets with destination port 80
tcpdump -nni ens33 -w packets.pcap 'tcp dst port 80' -c10

# TCP packets with source port 80
tcpdump -nni ens33 -w packets.pcap 'tcp src port 80' -c10

# Read TCP packets with destination port 80 from a file
tcpdump -nnr packets.pcap 'tcp dst port 80' -c10

# Copy the packets with destination port 443 from packets.pcap into dst_port_443.pcap
tcpdump -r packets.pcap 'dst port 443' -w dst_port_443.pcap

# Source or destination IP address is 14.215.177.39
tcpdump -nni ens33 host 14.215.177.39 -c5

# Source IP address is 192.168.248.134
tcpdump -nni ens33 src 192.168.248.134 -c5

# Destination IP address is 192.168.248.134
tcpdump -nni ens33 dst 192.168.248.134 -c5

# Source or destination address is in network 192.168.248.0/24
tcpdump -nni ens33 net 192.168.248.0/24 -c5
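
The primitives used above (host, port, src, dst, net) can be combined with and, or and not. As a minimal sketch, the hypothetical helper build_filter below (it is not part of tcpdump) joins a host primitive and a TCP port primitive into one expression; the address and port are just the sample values from this post.

```shell
#!/usr/bin/env bash
# build_filter is a hypothetical helper, not part of tcpdump.
# It joins a host primitive and a TCP port primitive with "and".
#   $1 = IP address, $2 = TCP port
build_filter() {
    local host="$1" port="$2"
    printf 'host %s and tcp port %s\n' "$host" "$port"
}

# Print the expression. To capture with it, pass it as one quoted argument:
#   tcpdump -nni ens33 -c5 "$(build_filter 14.215.177.39 443)"
build_filter 14.215.177.39 443
```

Quoting the expression as a single argument keeps the shell from interpreting characters such as parentheses when the filter grows more complex.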

Source of this post:
Capturing packets with tcpdump filter expressions (basics)


Fixing "Access Denied" when opening Kibana Monitoring

When using "Stack Monitoring" in Kibana, the following message appears

Access Denied

You are not authorized to access Monitoring. To use Monitoring, you need the privileges granted by both the `kibana_user` and `monitoring_user` roles.

If you are attempting to access a dedicated monitoring cluster, this might be because you are logged in as a user that is not configured on the monitoring cluster.

Fix: stop using the remote cluster feature of the Elasticsearch cluster by clearing all existing cluster.remote entries.

# Show the existing remote clusters
curl -XGET "127.0.0.1:9200/_cluster/settings?pretty"
{
  "persistent" : {
    "cluster" : {
      "remote" : {
        "aaa" : {
          "skip_unavailable" : "true",
          "seeds" : [
            "172.29.4.168:9300"
          ]
        },
        "leader" : {
          "seeds" : [
            "172.29.4.168:9300"
          ]
        },
        "hello-elk" : {
          "skip_unavailable" : "false",
          "seeds" : [
            "127.0.0.1:9300"
          ]
        }
      }
    },
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : { }
}

# Clear one of the remote cluster entries
curl -X PUT "127.0.0.1:9200/_cluster/settings" -H 'Content-Type: application/json' -d'{
  "persistent" : {
    "cluster" : {
      "remote" : {
        "leader" : {
          "seeds" : null
        }
      }
    }
  }
}'

Note: if a remote cluster entry has a skip_unavailable setting, clearing its seeds directly may fail with "Cannot configure setting [cluster.remote.hello-elk.skip_unavailable] if remote cluster is not enabled." The fix is to set skip_unavailable to null first, and then set seeds to null.
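
The two-step order described in the note can be sketched as follows. remote_setting_null is a hypothetical helper that only prints the request body; the curl call that would send it is left as a comment because it needs a live cluster. The entry name hello-elk is taken from the settings output above.

```shell
#!/usr/bin/env bash
# remote_setting_null is a hypothetical helper: it prints the JSON body that
# resets one setting of a remote cluster entry to null.
#   $1 = remote cluster entry name, $2 = setting (skip_unavailable or seeds)
remote_setting_null() {
    local name="$1" setting="$2"
    printf '{"persistent":{"cluster":{"remote":{"%s":{"%s":null}}}}}\n' "$name" "$setting"
}

# Step 1 (skip_unavailable) must come before step 2 (seeds).
# Against a live cluster the bodies would be sent like this:
#   for s in skip_unavailable seeds; do
#     curl -X PUT "127.0.0.1:9200/_cluster/settings" \
#          -H 'Content-Type: application/json' \
#          -d "$(remote_setting_null hello-elk "$s")"
#   done
remote_setting_null hello-elk skip_unavailable
remote_setting_null hello-elk seeds
```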


Migrate data for Elasticsearch cluster

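The best tool for migrating the data of an ElasticSearch cluster is ElasticSearch's CCR (cross-cluster replication) feature. Unfortunately, after a whole day of configuring it today, I still could not get it to work. CCR exists for more than data migration: its more important purpose is to keep multiple copies of the data so that the ElasticSearch cluster stays highly available. If all you need is to migrate data, the reindex feature of ES is enough.

To bring the mail-w3svc1-2020.06.06 index over from the old cluster (172.29.4.168:9200), just run the following command on the new cluster. (Reindexing from a remote host requires that host to be listed in reindex.remote.whitelist in the new cluster's elasticsearch.yml.)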

curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d'
{
  "source": {
    "remote": {
      "host": "http://172.29.4.168:9200",
      "username": "elastic",
      "password": "MyPassword"
    },
    "index": "mail-w3svc1-2020.06.06"
  },
  "dest": {
    "index": "mail-2020.06.06"
  }
}'

Reference:
Reindex API


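Changing the Type of a Field in ElasticSearch

Once an Index has been created in ElasticSearch, the types of its Fields can no longer be changed. For example, you cannot turn an integer field into a string field; otherwise the data already in that field becomes invalid. So how do you solve this? The answer is to rebuild the index (Reindex).

This post shows how to migrate the data of the old index below into a new Index with zero downtime.
Old Index: mail-w3svc1-2020.06.09
New Index: mail-w3svc1-2020.06.09-v2

1, Create a template that defines the mapping of the new Index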

curl -XPUT 127.0.0.1:9200/_template/mail-test -H 'Content-Type: application/json' -d'{
  "index_patterns": ["mail-w3svc1-2020.06.09-v2"],
  "settings" : {
    "index.refresh_interval": "10s",
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index.translog.durability": "request",
    "index.translog.sync_interval": "30s"
  },
  "mappings": {
    "properties": {
      "rt": { "type": "integer" },
      "status": { "type": "integer" },
      "sub_status": { "type": "integer" }
    }
  },
  "order" : 5
}'

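2, Create the new Index

curl -XPUT 127.0.0.1:9200/mail-w3svc1-2020.06.09-v2

# Inspect it (the URL is quoted so the shell does not treat "?" as a glob)
curl -XGET "127.0.0.1:9200/mail-w3svc1-2020.06.09-v2?pretty=true"

3, Create an alias that points to the old Index (the alias in this example is named mail-w3svc1-2020.06.09-alt)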

curl -XPOST localhost:9200/_aliases -H 'Content-Type: application/json' -d '{
  "actions": [
    { "add": { "index": "mail-w3svc1-2020.06.09", "alias": "mail-w3svc1-2020.06.09-alt" } }
  ]
}'

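ElasticSearch: fixing 429: es_rejected_execution_exception

The following message appears in the log (the line format is that of Logstash's elasticsearch output, which retries the rejected bulk request):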

[INFO ] 2020-06-04 11:04:45.093 [[main]>worker16] elasticsearch - retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [74000580][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[mail-w3svc1-2020.05.31][0]] containing [10] requests, target allocation id: 6OPCe3kOTWqG5k68lRozUA, primary term: 1 on EsThreadPoolExecutor[name = it_elk_node171/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@31aa695f[Running, pool size = 40, active threads = 40, queued tasks = 200, completed tasks = 31488344]]"})

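Cause:

To protect the cluster from overload, Elasticsearch bounds the size of its request queues by design, which improves stability and reliability. Without such a limit, a misbehaving or malicious client could easily bring down the whole cluster. When the volume of concurrent requests exceeds what a single Elasticsearch node can process, the server protects itself by refusing new requests and throwing EsRejectedExecutionException.

This protection and the resulting exception are governed by the thread pools and their queues inside Elasticsearch. In the log above, the write thread pool, which handles index and bulk operations, has pool size = 40 and queue capacity = 200. Once all 40 threads are busy and 200 tasks are already waiting in the queue, any new task is rejected with EsRejectedExecutionException.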
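
The decision described above can be written down as a tiny model. This is a simplified illustration, not Elasticsearch code: submit_task is a hypothetical function that says what happens to one new task given the current state of a fixed-size pool with a bounded queue.

```shell
#!/usr/bin/env bash
# Simplified model of a fixed thread pool with a bounded queue.
# This is an illustration only, not how Elasticsearch is implemented.
#   $1 = active threads, $2 = queued tasks, $3 = pool size, $4 = queue capacity
submit_task() {
    local active="$1" queued="$2" pool_size="$3" capacity="$4"
    if [ "$active" -lt "$pool_size" ]; then
        echo "run"        # a thread is free
    elif [ "$queued" -lt "$capacity" ]; then
        echo "queue"      # all threads busy, but the queue has room
    else
        echo "reject"     # threads busy and queue full: the 429 case
    fi
}

# The state reported in the log: 40 of 40 threads active, 200 of 200 queued.
submit_task 40 200 40 200
```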

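The message is logged at INFO level, which suggests it is not a serious problem. According to the official ElasticSearch documentation, it can sometimes even be ignored, but there are still a few effective things you can do to improve the situation.

Diagnosis and fix

According to the official ElasticSearch documentation, there are only the following ways to deal with 429 errors:

1, Pause the bulk write for 3-5 seconds, then retry.

2, Increase the settings of the write thread pool.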
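
Option 1 amounts to a retry loop with a pause. Here is a minimal sketch, assuming a hypothetical helper retry_with_pause and some command of your own that submits the batch and exits non-zero when it was rejected. Note that a bulk request can also report 429 per item inside an otherwise successful response, which this sketch does not inspect.

```shell
#!/usr/bin/env bash
# retry_with_pause is a hypothetical helper: run a command, and if it fails,
# sleep and try again, up to a maximum number of attempts.
#   $1 = max attempts, $2 = pause in seconds, remaining args = the command
retry_with_pause() {
    local max="$1" pause="$2" attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1      # give up after the last attempt
        fi
        attempt=$((attempt + 1))
        sleep "$pause"
    done
    return 0
}

# Usage, where send_bulk stands for your own submit command (an assumption):
#   retry_with_pause 5 4 send_bulk
```

A pause of 4 seconds sits inside the 3-5 second range mentioned above; the attempt limit keeps a persistently overloaded cluster from blocking the sender forever.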

The settings of the write thread pool can be retrieved with the following command:

curl -XGET "127.0.0.1:9200/_cat/thread_pool/write?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size"
node_name      ip           name  active queue rejected completed core type  pool_size queue_size
it_elk_node169 172.29.4.169 write      0     0     1174  22801844      fixed        40        200
it_elk_node168 172.29.4.168 write      0     0     1811  32356307      fixed        40        200
it_elk_node167 172.29.4.167 write     30     0    11797  39593375      fixed        40        200
it_elk_node171 172.29.4.171 write      0     0    12304  28194291      fixed        40        200
it_elk_node170 172.29.4.170 write      1     0      743  25107047      fixed        40        200

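Here, active and queue are the number of threads currently busy and the number of tasks currently queued; pool_size and queue_size are the configured number of threads and the configured queue capacity.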
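
To spot the nodes that have been rejecting writes, the rejected column can be filtered with awk. A minimal sketch, assuming the exact column order requested by the command above; nodes_with_rejections is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# nodes_with_rejections is a hypothetical helper: it reads _cat/thread_pool
# output (in the column order used above) on stdin and prints
# "node_name rejected" for every node whose rejected counter is non-zero.
nodes_with_rejections() {
    # NR > 1 skips the header row; field 1 is node_name, field 6 is rejected.
    awk 'NR > 1 && $6 > 0 { print $1, $6 }'
}

# Usage against a live cluster:
#   curl -s "127.0.0.1:9200/_cat/thread_pool/write?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size" | nodes_with_rejections
```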

Since ElasticSearch 5, thread pool settings can no longer be changed through the cluster settings API (from the official documentation):

Thread pool settings are now node-level settings. As such, it is not possible to update thread pool settings via the cluster settings API.

Solution

size is the number of threads and must not exceed the number of CPU cores.

vim elasticsearch.yml    # add the following
thread_pool:
    write:
        size: 30
        queue_size: 1000

Then restart ES. After the restart, check again and you will see that the pool_size and queue_size values have changed.

curl -X GET "127.0.0.1:9200/_cat/thread_pool/write?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size"
node_name      ip           name  active queue rejected completed core type  pool_size queue_size
it_elk_node167 172.29.4.167 write      0     0        0   1305481      fixed        40       3000
it_elk_node169 172.29.4.169 write      0     0        0   4266653      fixed        40        200
it_elk_node168 172.29.4.168 write      0     0     7280   7097169      fixed        40        200
it_elk_node170 172.29.4.170 write      0     0        0   2271912      fixed        40       3000
it_elk_node171 172.29.4.171 write      0     0        0   4377790      fixed        40       3000

In the same way, similar commands show the pool_size/queue_size of the get, ccr (cross-cluster replication) and search pools.

# Show detailed thread_pool information for all nodes
curl -X GET "127.0.0.1:9200/_nodes/thread_pool?pretty=true"


# Show pool_size/queue_size of the search pool
curl -X GET "127.0.0.1:9200/_cat/thread_pool/search?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size"
node_name      ip           name   active queue rejected completed core type                  pool_size queue_size
it_elk_node167 172.29.4.167 search      0     0        0      8418      fixed_auto_queue_size        61       1000
it_elk_node169 172.29.4.169 search      0     0        0     64923      fixed_auto_queue_size        61       1000
it_elk_node168 172.29.4.168 search      0     0        0    111409      fixed_auto_queue_size        61       1000
it_elk_node170 172.29.4.170 search      0     0        0     50890      fixed_auto_queue_size        61       1000
it_elk_node171 172.29.4.171 search      0     0        0     38820      fixed_auto_queue_size        61       1000

# Show pool_size/queue_size of the get pool
curl -X GET "127.0.0.1:9200/_cat/thread_pool/get?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size"
node_name      ip           name active queue rejected completed core type  pool_size queue_size
it_elk_node167 172.29.4.167 get       0     0        0       819      fixed        40       1000
it_elk_node169 172.29.4.169 get       0     0        0       331      fixed        40       1000
it_elk_node168 172.29.4.168 get       0     0        0      4756      fixed        40       1000
it_elk_node170 172.29.4.170 get       0     0        0         9      fixed         9       1000
it_elk_node171 172.29.4.171 get       0     0        0      6193      fixed        40       1000


# Show pool_size/queue_size of the ccr (cross-cluster replication) pool
curl -X GET "127.0.0.1:9200/_cat/thread_pool/ccr?v&h=node_name,ip,name,active,queue,rejected,completed,core,type,pool_size,queue_size"


# Show all thread pools
curl -XGET "127.0.0.1:9200/_cat/thread_pool?v&h=id,ip,name,queue,rejected,completed"
id                     ip           name                queue rejected completed
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 analyze                 0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 ccr                     0        0    140182
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 fetch_shard_started     0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 fetch_shard_store       0        0        60
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 flush                   0        0      8700
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 force_merge             0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 generic                 0        0   1267594
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 get                     0        0    158496
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 listener                0        0     16736
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 management              0        0   7373132
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 refresh                 0        0  15758014
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 rollup_indexing         0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 search                  0        0   1474802
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 search_throttled        0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 security-token-key      0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 snapshot                0        0         2
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 transform_indexing      0        0         0
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 warmer                  0        0    749996
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 watcher                 0        0     23390
VbBadlehSNqlXKtWGlyUvA 172.29.4.156 write                   0        0  34229769
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 analyze                 0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 ccr                     0        0     70051
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 fetch_shard_started     0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 fetch_shard_store       0        0        34
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 flush                   0        0      8912
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 force_merge             0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 generic                 0        0   1688212
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 get                     0        0     95389
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 listener                0        0     11281
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 management              0        0  11715898
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 refresh                 0        0  15957993
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 rollup_indexing         0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 search                  0        0    884079
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 search_throttled        0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 security-token-key      0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 snapshot                0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 transform_indexing      0        0         0
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 warmer                  0        0    482303
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 watcher                 0        0     46692
Py5AN5uKS7OWoDXwdr_ayw 172.29.4.157 write                   0        0  50279093
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 analyze                 0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 ccr                     0        0     70124
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 fetch_shard_started     0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 fetch_shard_store       0        0        50
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 flush                   0        0     14722
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 force_merge             0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 generic                 0        0   1473305
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 get                     0        0    113200
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 listener                0        0     12675
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 management              0        0  27809756
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 refresh                 0        0  12937481
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 rollup_indexing         0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 search                  0        0    987998
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 search_throttled        0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 security-token-key      0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 snapshot                0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 transform_indexing      0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 warmer                  0        0    670607
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 watcher                 0        0         0
wrKFBPZGQCm9kHxDAe8krw 172.29.4.158 write                   0        0  69187927

 


ElasticSearch 7.x: fixing TooManyBucketsException

ElasticSearch 7.x reports the following: Caused by: org.elasticsearch.search.aggregations.MultiBucketConsumerService$TooManyBucketsException: Trying to create too many buckets. Must be less than or equal to: [10000] but was [10314]. This limit can be set by changing the [search.max_buckets] cluster level setting.

Analysis: this is a feature of versions after 6.x. It limits very large aggregations in order to avoid performance risks.

Solution 1: raise the search.max_buckets limit in ElasticSearch

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '{"persistent": { "search.max_buckets": 50000 }}'

Solution 2: constrain the time interval or the document count so that fewer buckets are produced

To minimize these either add change min time interval on datasource or panel level or either add min doc count on date histogram to 1.

Cause

The Elasticsearch website explains buckets as follows:

the buckets effectively define document sets. In addition to the buckets themselves, the bucket aggregations also compute and return the number of documents that “fell into” each bucket.

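Put simply, a bucket is a set of documents. My understanding is that a query result contains as many buckets as it has distinct groups of data. The example below illustrates what I take a bucket to be (my understanding may not be correct; corrections are welcome).

Suppose I have the following index, where type can only take one of these 5 values: query[A], query[AAAA], forwarded, reply, cached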

@timestamp type
Jun 23, 2020 @ 19:32:45.000 query[AAAA]
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 cached
Jun 23, 2020 @ 19:32:45.000 cached
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 query[A]
Jun 23, 2020 @ 19:32:45.000 cached
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 cached
Jun 23, 2020 @ 19:32:45.000 config
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 forwarded
Jun 23, 2020 @ 19:32:45.000 cached
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 reply
Jun 23, 2020 @ 19:32:45.000 cached

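Suppose the query covers the past 15 minutes.

With "Min time interval" set to 1s, there are 15*(60/1)=900 time slots, and each slot can hold 5 different buckets, which produces 900*5=4500 buckets.

With "Min time interval" set to 30s, there are 15*(60/30)=30 time slots, and each slot can hold 5 different buckets, which produces 30*5=150 buckets.

This explains nicely why increasing Min time interval solves the problem.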
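
The arithmetic above fits in a small shell function. This is only a sketch of the estimate made in this post (time slots multiplied by distinct values), not a formula taken from Elasticsearch; bucket_count is a hypothetical name, and the integer division assumes the interval divides the range evenly.

```shell
#!/usr/bin/env bash
# bucket_count estimates the number of buckets for a date histogram that is
# split by a terms aggregation, following the reasoning in this post.
#   $1 = time range in minutes, $2 = interval in seconds, $3 = distinct values
bucket_count() {
    local minutes="$1" interval="$2" values="$3"
    echo $(( minutes * 60 / interval * values ))
}

bucket_count 15 1 5     # prints 4500
bucket_count 15 30 5    # prints 150
```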

References:
Increasing max_buckets for specific Visualizations
ElasticSearch search_phase_execution_exception
ElasticSearch 7.x too_many_buckets_exception #17327


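Installing and configuring Kafka on CentOS 7

Logstash generally has limited processing capacity. To keep a peak in log volume from bringing Logstash down, Kafka is commonly used to buffer the log messages.

Kafka depends on ZooKeeper, so ZooKeeper has to be installed and configured first, and Kafka after it.

Environment:

All systems run CentOS 7.8 64bit

IP address    ZooKeeper install dir  ZooKeeper data dir    Kafka install dir  Kafka internal hostname
172.29.4.168  /data/zookeeper        /data/zookeeper/data  /data/kafka        kafka1.zhukun.net
172.29.4.169  /data/zookeeper        /data/zookeeper/data  /data/kafka        kafka2.zhukun.net
172.29.4.170  /data/zookeeper        /data/zookeeper/data  /data/kafka        kafka3.zhukun.net

1, Deploy and configure ZooKeeper

Perform the following steps on all 3 servers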

$ wget http://mirror.bit.edu.cn/apache/zookeeper/stable/apache-zookeeper-3.5.8-bin.tar.gz
$ tar zxvf apache-zookeeper-3.5.8-bin.tar.gz
$ mv apache-zookeeper-3.5.8-bin /data/
$ ln -s /data/apache-zookeeper-3.5.8-bin /data/zookeeper && mkdir /data/zookeeper/data

Prepare the ZooKeeper configuration file

$ vim /data/zookeeper/conf/zoo.cfg    # write the following configuration
dataDir=/data/zookeeper/data
clientPort=2181
maxClientCnxns=0
admin.enableServer=false
# admin.serverPort=8080
initLimit=10
syncLimit=5
server.1=172.29.4.168:2888:3888
server.2=172.29.4.169:2888:3888
server.3=172.29.4.170:2888:3888
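
Each server in a ZooKeeper ensemble also needs a myid file in its dataDir containing the N of its own server.N line. A minimal sketch for deriving that number from the configuration above; myid_for_ip is a hypothetical helper, and the command that writes the file is left as a comment.

```shell
#!/usr/bin/env bash
# myid_for_ip is a hypothetical helper: given a zoo.cfg and this host's IP,
# print the N of the matching "server.N=IP:port:port" line.
#   $1 = path to zoo.cfg, $2 = IPv4 address of this host
myid_for_ip() {
    local cfg="$1" ip="$2"
    # Split on ".", "=" and ":", then rebuild the IP from fields 3 to 6.
    awk -F'[.=:]' -v ip="$ip" \
        '$1 == "server" && ($3 "." $4 "." $5 "." $6) == ip { print $2 }' "$cfg"
}

# Usage on each server (writes into the data directory):
#   myid_for_ip /data/zookeeper/conf/zoo.cfg 172.29.4.169 > /data/zookeeper/data/myid
```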

Prepare the system service
