关于prometheus的配置


普罗米修斯

参考文章

https://blog.csdn.net/qq_45277554/article/details/130917620?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522171659760416800227433055%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=171659760416800227433055&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-130917620-null-null.14

https://blog.csdn.net/weixin_72625764/article/details/136681422?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522171659760416800227433055%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=171659760416800227433055&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-6-136681422-null-null.142

部署prometheus(master节点上)
安装prometheus主程序 
wget https://github.com/prometheus/prometheus/releases/download/v2.52.0/prometheus-2.52.0.linux-amd64.tar.gz

tar xf prometheus-2.52.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/prometheus-2.52.0.linux-amd64.tar.gz

执行
./prometheus --config.file=prometheus.yml &

然后就可以通过浏览器测试了,建议通过vscode将服务器的端口映射到本地,然后在自己电脑上通过浏览器访问,具体可以查看上述参考文章

监控一个远端业务机器(worker节点上)
安装监控客户端
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.1/node_exporter-1.8.1.linux-amd64.tar.gz

tar xf node_exporter-1.8.1.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/node_exporter-1.8.1.linux-amd64/

后台运行
nohup /usr/local/node_exporter-1.8.1.linux-amd64/node_exporter & [1] 7281

是否运行正确也可以参考上述文章

然后需要在master节点上修改prometheus的配置文件

master节点上运行
nano prometheus.yml
具体内容如下
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "workers" #定义名称

    static_configs:
      - targets: ["192.168.1.14:9100"] //两个workers节点,都要执行前面安装node_exporter的过程
      - targets: ["192.168.1.15:9100"]
在master节点上将prometheus部署成服务
nano /usr/lib/systemd/system/prometheus.service
内容如下,具体的路径配置请自行修改
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io
After=network.target
 
[Service]
Type=simple
ExecStart=/usr/local/prometheus-2.52.0.linux-amd64/prometheus \
--config.file=/usr/local/prometheus-2.52.0.linux-amd64/prometheus.yml \
--storage.tsdb.path=/usr/local/prometheus-2.52.0.linux-amd64/data/ \
--storage.tsdb.retention=15d \
--web.enable-lifecycle
  
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
 
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start prometheus
systemctl reload prometheus

systemctl status prometheus
如果服务启动失败,可以查看一下是否配置路径写错,或者是9090端口已被占用
lsof -i :9090
如果被占用,通过kill指令杀死相应进程,并重新执行上述指令

上述做完后,也可以在浏览器中查看是否成功,具体可以参考前面的参考文章

nerdctl run -d –net flannel –name prometheus ubuntu:20.04 /bin/bash


文章作者: Swung
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 Swung !
  目录