Monitoring nginx Status with Collectd + Prometheus + Grafana

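If Nginx was built with --with-http_stub_status_module, it can be monitored with the method described here. Nginx packages installed via rpm or apt generally include this module already.

The usual recommendation is to install Collectd and collectd_exporter on the same machine as nginx. Collectd pushes its data to collectd_exporter, Prometheus (usually running on a separate server) scrapes the exporter, and Grafana reads the data from Prometheus.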

1. Configure nginx

Installing Nginx is not covered here. First, confirm that Nginx was built with http_stub_status_module.

$ sudo nginx -V
nginx version: nginx/1.14.0
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
built with OpenSSL 1.0.2k-fips  26 Jan 2017
TLS SNI support enabled
configure arguments: --user=nginx --group=nginx --prefix=/usr/local/nginx --conf-path=...

$ sudo nginx -V 2>&1 | grep -o with-http_stub_status_module   # nginx -V prints to stderr, so redirect it before grep
with-http_stub_status_module

Configure Nginx to enable the module:

location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}

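After reloading Nginx, the status information is available at http://ip/nginx_status. With the allow/deny rules above, only requests from the local host are permitted.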

$ curl http://127.0.0.1/nginx_status
Active connections: 29
server accepts handled requests
 17750380 17750380 6225361
Reading: 0 Writing: 1 Waiting: 28
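
The fields in this output can be pulled apart with a few lines of awk, which is handy for a quick check before Collectd is wired up. The following is a minimal sketch, not part of the original setup: the function name parse_stub_status and the key names it prints are made up for illustration, and it assumes the four-line output format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read stub_status text on stdin, print key=value lines.
parse_stub_status() {
    awk '
        /^Active connections:/ { print "active=" $3 }
        # The counters line consists of three numbers: accepts, handled, requests.
        /^ *[0-9]+ +[0-9]+ +[0-9]+/ {
            print "accepts=" $1
            print "handled=" $2
            print "requests=" $3
        }
        /^Reading:/ {
            print "reading=" $2
            print "writing=" $4
            print "waiting=" $6
        }
    '
}
```

With the location above in place, it could be used as: curl -s http://127.0.0.1/nginx_status | parse_stub_status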

2. Install Collectd

It is recommended to install Collectd on the same host as Nginx and have it collect metrics from that host only.

$ sudo yum install epel-release
$ sudo yum install collectd collectd-nginx

$ sudo vim /etc/collectd.conf       # Edit the options below. I also added CPU/memory settings here; remove them if you do not need them

LoadPlugin cpu
LoadPlugin memory
LoadPlugin write_http

<Plugin cpu>
  ReportByCpu true
  ReportByState true
  ValuesPercentage true
</Plugin>

<Plugin memory>
        ValuesAbsolute true
        ValuesPercentage false
</Plugin>

<Plugin write_http>
  <Node "collectd_exporter">
    URL "http://127.0.0.1:9103/collectd-post"
    Format "JSON"
    StoreRates false
  </Node>
</Plugin>

Part of this configuration comes from the collectd_exporter documentation.

$ sudo vim /etc/collectd.d/nginx.conf  # Edit the options below

LoadPlugin nginx                  # the LoadPlugin statement goes here
<Plugin "nginx">
    URL "http://127.0.0.1/nginx_status"
</Plugin>

Then start the service:

$ sudo systemctl restart collectd && \
sudo systemctl status collectd

$ sudo systemctl enable collectd

3. Install collectd_exporter

collectd_exporter receives the metrics that Collectd pushes to it as JSON and exposes them for Prometheus to scrape. It is likewise recommended to install collectd_exporter on the same machine as Nginx and Collectd.

$ wget https://github.com/prometheus/collectd_exporter/releases/download/v0.4.0/collectd_exporter-0.4.0.linux-amd64.tar.gz

$ tar zxvf collectd_exporter-0.4.0.linux-amd64.tar.gz

$ nohup collectd_exporter-0.4.0.linux-amd64/collectd_exporter --web.listen-address=":9103" --web.collectd-push-path="/collectd-post" &

collectd_exporter can also be set up as a systemd service:

$ sudo mv collectd_exporter-0.4.0.linux-amd64 /usr/local/collectd_exporter

$ sudo groupadd collectd_exporter
$ sudo useradd -g collectd_exporter -m -d /usr/local/collectd_exporter -s /sbin/nologin collectd_exporter

$ sudo chown -R collectd_exporter:collectd_exporter /usr/local/collectd_exporter
$ sudo vim /etc/systemd/system/collectd_exporter.service  # write the following content

[Unit]
Description=Collectd_exporter
After=network-online.target
Requires=network-online.target

[Service]
Type=simple
User=collectd_exporter
# Run the binary directly: with "bash -c", only the first word is taken as the command and the flags are dropped.
ExecStart=/usr/local/collectd_exporter/collectd_exporter --web.listen-address=:9103 --web.collectd-push-path=/collectd-post
Restart=on-failure

[Install]
WantedBy=multi-user.target

$ sudo systemctl daemon-reload
$ sudo systemctl restart collectd_exporter.service && sudo systemctl status collectd_exporter.service
$ sudo systemctl enable collectd_exporter.service

Confirm that the service started successfully:

$ sudo netstat -anp | grep  :9103
tcp6       0      0 :::9103             :::*            LISTEN      12770/collectd_expo
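
The listening port only shows that the process is up. To see whether nginx data is actually arriving from Collectd, the exporter's /metrics page can be filtered for nginx series. This is a small sketch under assumptions: the helper name nginx_samples is hypothetical, and it assumes the exporter names these series with a collectd_nginx prefix, which is worth confirming against your own exporter's output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only nginx samples from Prometheus text format on stdin.
# Assumes nginx series start with "collectd_nginx"; adjust the prefix if yours differ.
nginx_samples() {
    # Matching on the metric-name prefix also skips "# HELP" and "# TYPE" comment lines.
    grep '^collectd_nginx' || {
        echo "no nginx samples found" >&2
        return 1
    }
}
```

It could be used as: curl -s http://127.0.0.1:9103/metrics | nginx_samples. A non-zero exit status suggests that Collectd is not yet posting nginx data to the exporter.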

4. Install Prometheus

$ wget https://github.com/prometheus/prometheus/releases/download/v2.5.0/prometheus-2.5.0.linux-amd64.tar.gz

$ tar zxvf prometheus-2.5.0.linux-amd64.tar.gz

$ cd prometheus-2.5.0.linux-amd64

$ vim prometheus.yml    # add two new jobs

global:
  scrape_interval:     10s
  evaluation_interval: 10s
  # scrape_timeout is set to the global default (10s).
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
rule_files:

scrape_configs:

  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'nginx_172.17.27.238'
    static_configs:
      - targets: ['172.17.27.238:9103']
        labels:
          instance: '172.17.27.238'
  - job_name: 'nginx_172.17.27.239'
    static_configs:
      - targets: ['172.17.27.239:9103']
        labels:
          instance: '172.17.27.239'
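
With more than a couple of nginx hosts, writing these jobs by hand gets repetitive. The stanzas follow a fixed pattern, so they can be generated. Here is a minimal sketch, assuming each host runs collectd_exporter on port 9103 as configured earlier; gen_scrape_jobs is a hypothetical helper name, not part of Prometheus.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one scrape job per host given as an argument,
# using the same layout as the hand-written nginx jobs in prometheus.yml.
gen_scrape_jobs() {
    local host
    for host in "$@"; do
        printf "  - job_name: 'nginx_%s'\n" "$host"
        printf "    static_configs:\n"
        printf "      - targets: ['%s:9103']\n" "$host"
        printf "        labels:\n"
        printf "          instance: '%s'\n" "$host"
    done
}
```

Running gen_scrape_jobs 172.17.27.238 172.17.27.239 prints the two nginx jobs in the form used above, ready to paste under scrape_configs. The promtool binary shipped in the same tarball can then validate the file with ./promtool check config prometheus.yml.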

$ nohup ./prometheus &    # run Prometheus

Prometheus can also be set up as a systemd service:

$ cd ..
$ sudo mv prometheus-2.5.0.linux-amd64 /usr/local/prometheus

$ sudo groupadd prometheus
$ sudo useradd -g prometheus -m -d /usr/local/prometheus -s /sbin/nologin prometheus

$ sudo chown -R prometheus:prometheus /usr/local/prometheus

$ sudo vim /etc/systemd/system/prometheus.service  # write the following content

[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/usr/local/prometheus
Restart=on-failure
[Install]
WantedBy=multi-user.target

$ sudo systemctl daemon-reload
$ sudo systemctl restart prometheus.service && \
sudo systemctl status prometheus.service
$ sudo systemctl enable prometheus.service

Confirm that the service started successfully:

$ sudo netstat -antp | grep :9090
tcp6       0      0 :::9090           :::*          LISTEN      5313/prometheus

Next, add a data source of type Prometheus in Grafana.

The resulting dashboard

Formulas for some of the metrics
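
Counters such as accepts, handled, and requests only ever grow, so dashboards usually plot their rate of change rather than the raw value. As a rough illustration of that arithmetic (a sketch of the general idea, not necessarily the formulas used in the dashboard here), the helper below computes an average per-second rate from two samples. The name counter_rate and the reset handling are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average per-second rate of a counter between two samples.
# Usage: counter_rate PREVIOUS_VALUE CURRENT_VALUE INTERVAL_SECONDS
counter_rate() {
    awk -v prev="$1" -v curr="$2" -v secs="$3" 'BEGIN {
        if (secs <= 0) {
            print "interval must be positive" > "/dev/stderr"
            exit 1
        }
        # A current value below the previous one is treated as a counter reset
        # (for example after an nginx restart), so counting restarts from zero.
        delta = (curr >= prev) ? curr - prev : curr
        printf "%.2f\n", delta / secs
    }'
}
```

With made-up sample values, counter_rate 1000 1600 10 prints 60.00: the counter grew by 600 over a 10-second scrape interval, or 60 per second.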

