Simple service discovery in Prometheus via Consul

Pareto law (Pareto principle, 80/20 rule): "20% of the effort yields 80% of the result, and the remaining 80% of the effort yields only 20% of the result."
Wikipedia

Greetings, dear reader!


My first article on Habr is devoted to a simple and, I hope, useful solution that made it easier for me to collect metrics in Prometheus from heterogeneous servers. I will touch on a few details that many Prometheus users may not have looked into, and share my approach to organizing lightweight service discovery in it.


For this you will need: Prometheus, HashiCorp Consul, systemd, a little Bash code, and an understanding of what is going on.


If you want to know how all of this fits together and how it works, welcome under the cut.


Prometheus + Bash + Consul


Meet Prometheus


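Prometheus is most often seen next to Kubernetes. Prometheus works on a pull model, so it has to know where to fetch metrics from. For Kubernetes this is configured in prometheus.yml with kubernetes_sd_configs: Prometheus asks kube-apiserver for the IP addresses of the pods.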


scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+);(.+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  - action: labelmap
    regex: __meta_(kubernetes_namespace|kubernetes_pod_name|kubernetes_pod_label_manifest_sha1|kubernetes_pod_node_name)

The part that matters here is role: pod in kubernetes_sd_configs, which tells Prometheus to discover pods as scrape targets.


Besides what runs inside Kubernetes, for example a DaemonSet with prometheus/node_exporter, Prometheus also has to collect metrics from servers outside Kubernetes: node_exporter, Zookeeper, Kafka, ClickHouse, CEPH, Elasticsearch, Tarantool ...


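Listing targets under static_configs works for a handful of Kafka brokers or CEPH OSDs. But every change, even when it is rolled out with Ansible, for example after adding a CEPH OSD, means editing prometheus.yml again. With a separate job_name for each service, several Kafka and ClickHouse installations, CEPH, and prometheus/node_exporter on every server, the static_configs sections of prometheus.yml keep growing.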


Don’t specify default values unnecessarily: simple, minimal configuration will make errors less likely.

Kubernetes Documentation, Configuration Best Practices, General Configuration Tips.


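* A short aside on configuration files: keep them minimal. A pristine copy of the shipped defaults can sit next to the config as .default or .original, so that the working file contains only the 2-3 lines that actually differ; the full set of defaults can always be looked up on GitHub.*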


HashiCorp Consul


Back to prometheus.yml. Prometheus has another discovery mechanism: consul_sd_configs. With it, Prometheus asks HashiCorp Consul which targets to scrape. A job then looks like this:


scrape_configs:
- job_name: SERVICE_NAME
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [test] # dev|test|stage|prod|...
    services: [prometheus-SERVICE_NAME-exporter]

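Consul normally expects an agent on every node that registers services. Here only the Consul HTTP API is used, so the servers run no Consul agent. Consul is useful for more than this: it provides a KV store, can serve as a backend for HashiCorp Vault, and integrates with Traefik, and it can also answer DNS queries. For this setup, one Consul instance whose HTTP API is reachable by Prometheus and by the servers is enough. Because that API is reachable over the network, it is served over HTTPS.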


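Below is a Kubernetes StatefulSet with a single Consul instance behind Traefik, used for service discovery in Prometheus. The Ingress also gives access to Consul's web UI. Traefik obtains the HTTPS certificate from Let's Encrypt using the DNS challenge.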


consul.yml
# https://consul.io/docs/agent/options.html

---
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    app: consul
spec:
  selector:
    app: consul
  ports:
  - name: http
    port: 8500

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
  labels:
    app: consul
spec:
  serviceName: consul
  selector:
    matchLabels:
      app: consul
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: cephfs
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  template:
    metadata:
      labels:
        app: consul
    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 60
      containers:
      - name: consul
        image: consul:1.6
        volumeMounts:
        - name: data
          mountPath: /consul/data
        args:
        - agent
        - -server
        - -client=0.0.0.0
        - -bind=127.0.0.1
        - -bootstrap
        - -bootstrap-expect=1
        - -disable-host-node-id
        - -dns-port=0
        - -ui
        ports:
        - name: http
          containerPort: 8500
        readinessProbe:
          initialDelaySeconds: 10
          httpGet:
            port: http
            path: /v1/agent/members
        livenessProbe:
          initialDelaySeconds: 30
          httpGet:
            port: http
            path: /v1/agent/members
        resources:
          requests:
            cpu: 0.2
            memory: 256Mi

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  labels:
    app: consul
  annotations:
    traefik.ingress.kubernetes.io/frontend-entry-points: http,https
    traefik.ingress.kubernetes.io/redirect-entry-point: https
spec:
  rules:
  - host: consul.example.com
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: http

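Prometheus can now read its targets from Consul, but the services still have to get into Consul somehow. A little Bash takes care of that.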


Bash and systemd.service


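Our servers run CoreOS Container Linux, a setup presented at DevOpsConf Russia 2018. Services run in Docker containers that are started from systemd.service units. Flatcar Linux is a compatible fork of CoreOS Container Linux, so everything described here carries over from CoreOS to Flatcar.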


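The key piece is systemd.service. systemd is the init system of most modern Linux distributions. In the [Service] section of a systemd.service unit, the ExecStartPost and ExecStop directives make it possible to hook registration in Consul into the start and stop of a service.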


Take prometheus-node-exporter.service as an example. With static_configs, this service alone would account for around 100 entries.


[Unit]
After=docker.service
[Service]
Environment=CONSUL_URL=https://consul.example.com
ExecStartPre=-/usr/bin/docker rm --force %N
ExecStart=/usr/bin/docker run \
    --name=%N \
    --rm=true \
    --network=host \
    --pid=host \
    --volume=/:/rootfs:ro \
    --label=logger=json \
    --stop-timeout=30 \
    prom/node-exporter:v0.18.1 \
    --log.format=logger:stdout?json=true \
    --log.level=error
ExecStartPost=/opt/bin/consul-service register -e prod -n %N -p 9100 -t prometheus,node-exporter
ExecStop=/opt/bin/consul-service deregister -e prod -n %N
ExecStop=-/usr/bin/docker stop %N
Restart=always
StartLimitInterval=0
RestartSec=10
KillMode=process
[Install]
WantedBy=multi-user.target

The registration itself is done by /opt/bin/consul-service. Its environment variables and arguments are described below.


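Environment variables:


  • CONSUL_URL: the URL of the Consul server.
  • CONSUL_TOKEN: sent as the «X-Consul-Token» HTTP header to authenticate against the Consul HTTP API. The token is created in the Consul web UI, under ACLs.

Arguments:


  • register/deregister: the action to perform, either registering the service in Consul or removing it.
  • -e: the environment, one of [dev|test|prod|…].
  • -n: the service name, for example prometheus-node-exporter. In a unit file, %N expands to the systemd.unit name without its type suffix.
  • -p: the port on which Prometheus scrapes the metrics. It is stored with the service in Consul.
  • -t: a comma-separated list of tags that are attached to the service in Consul.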

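consul-service follows the lifecycle of the systemd.service. When the systemd.service starts, ExecStartPost runs consul-service register, the service appears in Consul, and Prometheus picks it up as a target without any change to its configuration. When the unit is stopped, ExecStop runs consul-service deregister, the service disappears from Consul, and Prometheus stops scraping it. If a server goes down without a clean stop, the service stays registered and Prometheus reports the target as down, which is what a Service Availability alert needs.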
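The register and deregister steps can be sketched in a few lines of Bash. This is an illustrative sketch, not the actual consul-service script: the function names are hypothetical, and the use of Consul's catalog endpoints (/v1/catalog/register and /v1/catalog/deregister) instead of the agent endpoints is an assumption that fits a setup without local agents. Storing the -e value as the first tag is likewise an assumption, based on the tags: [prod] filter in the Prometheus job.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a consul-service-like helper (not the original script).

# Build the JSON body for PUT /v1/catalog/register.
# Arguments: node FQDN, service name, port, environment, comma-separated tags.
build_register_payload() {
  local node=$1 name=$2 port=$3 env=$4 tags=$5
  local tag tag_list json_tags="\"${env}\""   # assumption: environment is the first tag
  IFS=, read -r -a tag_list <<< "$tags"
  for tag in "${tag_list[@]}"; do
    json_tags+=",\"${tag}\""
  done
  printf '{"Node":"%s","Address":"%s","Service":{"ID":"%s","Service":"%s","Port":%d,"Tags":[%s]}}\n' \
    "$node" "$node" "$name" "$name" "$port" "$json_tags"
}

# Build the JSON body for PUT /v1/catalog/deregister.
# Arguments: node FQDN, service ID.
build_deregister_payload() {
  printf '{"Node":"%s","ServiceID":"%s"}\n' "$1" "$2"
}

# Send a payload to Consul. CONSUL_URL and CONSUL_TOKEN come from the environment;
# curl omits the token header when its value is empty.
# Arguments: endpoint name (register|deregister), JSON payload.
consul_put() {
  curl --silent --show-error --fail \
    --request PUT \
    --header "X-Consul-Token: ${CONSUL_TOKEN:-}" \
    --data "$2" \
    "${CONSUL_URL}/v1/catalog/$1"
}
```

A registration would then look like `consul_put register "$(build_register_payload web01.example.com prometheus-node-exporter 9100 prod prometheus,node-exporter)"`, where web01.example.com is a placeholder host name. Keeping the payload builders separate from the HTTP call makes them easy to check without a running Consul.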


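consul-service builds the address of the server from its hostname and the search domain in resolv.conf. The resulting name has to resolve in DNS, which is best practice anyway.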
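One way to derive such a name is sketched below. It assumes that the first domain on the search line of resolv.conf is the one to use, and both function names are hypothetical; the real script may do this differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive an FQDN from a short hostname and resolv.conf.

# Print the first domain on the "search" line of a resolv.conf file.
# Argument: path to the file (defaults to /etc/resolv.conf).
first_search_domain() {
  local resolv_conf=${1:-/etc/resolv.conf}
  awk '$1 == "search" { print $2; exit }' "$resolv_conf" 2>/dev/null
}

# Combine a hostname with the search domain.
# Falls back to the bare hostname when no search domain is configured.
# Arguments: hostname, optional path to resolv.conf.
fqdn_from_resolv_conf() {
  local host=$1 resolv_conf=${2:-/etc/resolv.conf}
  local domain
  domain=$(first_search_domain "$resolv_conf")
  if [ -n "$domain" ]; then
    # Strip anything after the first dot so an existing domain is not doubled.
    printf '%s.%s\n' "${host%%.*}" "$domain"
  else
    printf '%s\n' "$host"
  fi
}
```

On a server this would typically be called as `fqdn_from_resolv_conf "$(hostname)"`. Passing the file path as a parameter is only there to make the function testable against a temporary file.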




consul-service is a Bash script of about 60 lines. Its only dependencies are cURL and Bash. It is published under the MIT license.


The scrape_configs entry in prometheus.yml for prometheus/node_exporter now looks like this:


scrape_configs:
- job_name: node-exporter
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [prod]
    services: [prometheus-node-exporter]

tags: [prod] limits the job to production services. The value comes from the -e argument of consul-service, which stores it as a tag in Consul.
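To check what Prometheus will find for such a job, the same service and tag can be looked up in Consul's catalog by hand. The sketch below uses the documented GET /v1/catalog/service/:service endpoint with its tag query parameter; the function names are hypothetical, and CONSUL_URL and CONSUL_TOKEN are assumed to be set as for consul-service.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the instances of a service that carry a given tag.

# Build the catalog query URL.
# Arguments: Consul base URL, service name, tag.
catalog_service_url() {
  local base=${1%/} service=$2 tag=$3   # drop one trailing slash from the base URL
  printf '%s/v1/catalog/service/%s?tag=%s\n' "$base" "$service" "$tag"
}

# Fetch the matching catalog entries as JSON.
# Arguments: service name, tag.
list_registered() {
  curl --silent --show-error --fail \
    --header "X-Consul-Token: ${CONSUL_TOKEN:-}" \
    "$(catalog_service_url "$CONSUL_URL" "$1" "$2")"
}
```

For the job above, `list_registered prometheus-node-exporter prod` should return one JSON entry per registered server.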


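On Flatcar Linux the script is delivered to /opt/bin with Ansible. The playbook consists of two tasks. It can also be run on a schedule, for example from crontab.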


tasks:
- name: Create directory "/opt/bin"
  with_items: [/opt/bin]
  file:
    state: directory
    path: "{{ item }}"
    owner: root
    group: root
    mode: 0755
- name: Download "consul-service.sh" to "/opt/bin/consul-service"
  get_url:
    url: https://raw.githubusercontent.com/devinotelecom/consul-service/master/consul-service.sh
    dest: /opt/bin/consul-service
    owner: root
    group: root
    mode: 0755
    force: yes

Ansible needs Python, which Flatcar Linux does not ship out of the box, so Python has to be provided separately on Flatcar Linux hosts.




