Simple service discovery in Prometheus via Consul

Pareto principle (Pareto's law, the 80/20 rule): "20% of the effort produces 80% of the result, and the remaining 80% of the effort produces only 20% of the result."
Wikipedia

Greetings, dear reader!


My first article on Habr is devoted to a simple and, I hope, useful solution that made it convenient to collect metrics in Prometheus from heterogeneous servers. I will touch on some details that many people may not have dug into while exploring Prometheus, and I will share my approach to organizing lightweight service discovery.


For this you will need: Prometheus, HashiCorp Consul, systemd, a bit of Bash code, and an understanding of what is going on.


If you are interested in how all of this fits together and how it works, read on.


Prometheus + Bash + Consul


Meet Prometheus


Prometheus collects metrics with a pull model: it scrapes its targets itself, so it has to know where they are. For Kubernetes, prometheus.yml uses kubernetes_sd_configs, which asks kube-apiserver for the IP addresses of the pods.


scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+);(.+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  - action: labelmap
    regex: __meta_(kubernetes_namespace|kubernetes_pod_name|kubernetes_pod_label_manifest_sha1|kubernetes_pod_node_name)

, . role=pod kubernetes_sd_configs.
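
To make the second relabel rule in the listing concrete: Prometheus joins the values of source_labels with ";" (the default separator), matches the result against the anchored regex, and writes the replacement into target_label. The Bash sketch below only imitates that single rule for illustration. The function name relabel_address and the sample values are hypothetical, and Bash's regex engine stands in for the RE2 engine Prometheus uses.

```shell
#!/usr/bin/env bash
# Imitate the "__address__" relabel rule: join the two source labels with ';',
# match the joined string against '(.+);(.+)' and emit "$1:$2".
relabel_address() {
  local pod_ip=$1 port=$2
  local joined="${pod_ip};${port}"
  local re='^(.+);(.+)$'   # Prometheus anchors relabel regexes on both ends
  if [[ $joined =~ $re ]]; then
    printf '%s:%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    # No match: the rule does not apply, so nothing is printed.
    return 1
  fi
}
```

Calling relabel_address 10.0.0.5 9102 prints the address Prometheus would scrape. A pod without the port annotation yields an empty second value, the regex does not match, and the rule leaves __address__ alone.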


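Prometheus, however, has to collect metrics not only from inside Kubernetes (for example, from a prometheus/node_exporter DaemonSet) but also from services that live outside Kubernetes: node_exporter, Zookeeper, Kafka, ClickHouse, CEPH, Elasticsearch, Tarantool ...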


targets static_configs Kafka OSD CEPH . . , , Ansible, , CEPH OSD . prometheus.yml. , .


, . prometheus.yml, , job_name . prometheus.yml Kafka 6 . 3. 3 ClickHouse, 4 . CEPH . , — prometheus/node_exporter . prometheus.yml static_configs.


Don’t specify default values unnecessarily: simple, minimal configuration will make errors less likely.

Kubernetes Documentation, Configuration Best Practices, General Configuration Tips.


* - - , . !


, . , , , .default .original. , , 2-3 - : , GitHub . , .*


HashiCorp Consul


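Prometheus has another service discovery mechanism that can be configured in prometheus.yml: consul_sd_configs. With it, Prometheus gets its targets from HashiCorp Consul. For example: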


scrape_configs:
- job_name: SERVICE_NAME
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [test] # dev|test|stage|prod|...
    services: [prometheus-SERVICE_NAME-exporter]

Consul, , , agent. HTTP API Consul, . , : , . Consul. , Consul : KV-, HashiCorp Vault, Traefik. , , DNS. , Consul agent. Consul, HTTP API Prometheus , . , HTTPS, .


Below is a Kubernetes manifest that runs Consul as a StatefulSet for Prometheus service discovery. The Ingress exposes Consul's web UI, and Traefik serves it over HTTPS with a Let's Encrypt certificate obtained through the DNS challenge.


consul.yml
# https://consul.io/docs/agent/options.html

---
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    app: consul
spec:
  selector:
    app: consul
  ports:
  - name: http
    port: 8500

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
  labels:
    app: consul
spec:
  serviceName: consul
  selector:
    matchLabels:
      app: consul
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: cephfs
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  template:
    metadata:
      labels:
        app: consul
    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 60
      containers:
      - name: consul
        image: consul:1.6
        volumeMounts:
        - name: data
          mountPath: /consul/data
        args:
        - agent
        - -server
        - -client=0.0.0.0
        - -bind=127.0.0.1
        - -bootstrap
        - -bootstrap-expect=1
        - -disable-host-node-id
        - -dns-port=0
        - -ui
        ports:
        - name: http
          containerPort: 8500
        readinessProbe:
          initialDelaySeconds: 10
          httpGet:
            port: http
            path: /v1/agent/members
        livenessProbe:
          initialDelaySeconds: 30
          httpGet:
            port: http
            path: /v1/agent/members
        resources:
          requests:
            cpu: 0.2
            memory: 256Mi

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  labels:
    app: consul
  annotations:
    traefik.ingress.kubernetes.io/frontend-entry-points: http,https
    traefik.ingress.kubernetes.io/redirect-entry-point: https
spec:
  rules:
  - host: consul.example.com
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: http

Prometheus, Consul , Consul, . . Bash.


Bash and systemd.service


CoreOS Container Linux. DevOpsConf Russia 2018. Docker , systemd.service. Flatcar Linux, , CoreOS Container Linux. CoreOS Flatcar !


systemd.service. systemd — , Linux. systemd.service, [Service], ExecStartPost, ExecStop. - Consul.


prometheus-node-exporter.service. , static_configs 100 .


[Unit]
After=docker.service
[Service]
Environment=CONSUL_URL=https://consul.example.com
ExecStartPre=-/usr/bin/docker rm --force %N
ExecStart=/usr/bin/docker run \
    --name=%N \
    --rm=true \
    --network=host \
    --pid=host \
    --volume=/:/rootfs:ro \
    --label=logger=json \
    --stop-timeout=30 \
    prom/node-exporter:v0.18.1 \
    --log.format=logger:stdout?json=true \
    --log.level=error
ExecStartPost=/opt/bin/consul-service register -e prod -n %N -p 9100 -t prometheus,node-exporter
ExecStop=/opt/bin/consul-service deregister -e prod -n %N
ExecStop=-/usr/bin/docker stop %N
Restart=always
StartLimitInterval=0
RestartSec=10
KillMode=process
[Install]
WantedBy=multi-user.target

/opt/bin/consul-service. , .


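Environment variables:


  • CONSUL_URL: the address of the Consul server.
  • CONSUL_TOKEN: the value of the «X-Consul-Token» HTTP header sent with requests to the Consul HTTP API. The token is managed in the Consul web UI, in the ACLs section.

Arguments:


  • register/deregister: the action to perform, registering the service or removing it.
  • -e: the environment, one of [dev|test|prod|…].
  • -n: the service name, for example prometheus-node-exporter. In a unit file, %N expands to the systemd unit name without its suffix.
  • -p: the port on which Prometheus scrapes the metrics. It is registered in Consul.
  • -t: a comma-separated list of tags, registered in Consul together with the service.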
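
As an illustration of the command-line interface described above, here is a minimal sketch of how such arguments could be parsed with getopts. It is not the actual consul-service code. The function name parse_args, the variable names, and the validation rules (-e and -n always required, -p required and numeric for register) are my assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical parser for: register|deregister -e ENV -n NAME [-p PORT] [-t TAGS]
# On success it sets ACTION, ENVIRONMENT, NAME, PORT and TAGS.
parse_args() {
  ACTION=${1:-}
  shift || true
  ENVIRONMENT='' NAME='' PORT='' TAGS=''
  local opt OPTIND=1   # local OPTIND so the function can be called repeatedly

  case $ACTION in
    register|deregister) ;;
    *) echo "unknown action: ${ACTION}" >&2; return 2 ;;
  esac

  while getopts ':e:n:p:t:' opt; do
    case $opt in
      e) ENVIRONMENT=$OPTARG ;;
      n) NAME=$OPTARG ;;
      p) PORT=$OPTARG ;;
      t) TAGS=$OPTARG ;;
      *) echo "bad or incomplete option: -${OPTARG}" >&2; return 2 ;;
    esac
  done

  # Assumed rule: environment and name are always needed.
  [[ -n $ENVIRONMENT && -n $NAME ]] || { echo '-e and -n are required' >&2; return 2; }
  # Assumed rule: a numeric port is needed only when registering.
  if [[ $ACTION == register && ! $PORT =~ ^[0-9]+$ ]]; then
    echo '-p must be a number for register' >&2
    return 2
  fi
}
```

With the ExecStartPost line from the unit file, parse_args register -e prod -n prometheus-node-exporter -p 9100 -t prometheus,node-exporter leaves ACTION=register, ENVIRONMENT=prod, PORT=9100 and TAGS=prometheus,node-exporter.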

consul-service systemd.service . systemd.service , ExecStartPost consul-service register, Consul, Prometheus , . , . ExecStop, consul-service deregister, . Consul Prometheus , . , , Service Availability . , , , .
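
The register and deregister calls made from ExecStartPost and ExecStop can be sketched as two HTTP requests to Consul. This is an assumption-laden outline rather than the real script. I assume the catalog endpoints of the Consul HTTP API (PUT /v1/catalog/register and PUT /v1/catalog/deregister), a service ID equal to the service name, and hypothetical helper names (tags_to_json, register_payload, deregister_payload, consul_put).

```shell
#!/usr/bin/env bash
# Turn "a,b,c" into the JSON array ["a","b","c"].
tags_to_json() {
  local out='' tag
  local -a tags=()
  IFS=',' read -ra tags <<< "$1"
  for tag in "${tags[@]}"; do
    out+="${out:+,}\"${tag}\""
  done
  printf '[%s]' "$out"
}

# JSON body for PUT /v1/catalog/register (assumed endpoint).
# Arguments: node name, node address, service name, port, comma-separated tags.
register_payload() {
  local node=$1 address=$2 service=$3 port=$4 tags=$5
  printf '{"Node":"%s","Address":"%s","Service":{"ID":"%s","Service":"%s","Tags":%s,"Port":%d}}' \
    "$node" "$address" "$service" "$service" "$(tags_to_json "$tags")" "$port"
}

# JSON body for PUT /v1/catalog/deregister (assumed endpoint).
deregister_payload() {
  local node=$1 service=$2
  printf '{"Node":"%s","ServiceID":"%s"}' "$node" "$service"
}

# Send a payload to Consul; CONSUL_URL and CONSUL_TOKEN come from the unit's
# Environment= lines. The token header is added only when a token is set.
consul_put() {
  local path=$1 payload=$2
  local -a auth=()
  if [[ -n ${CONSUL_TOKEN:-} ]]; then
    auth=(--header "X-Consul-Token: ${CONSUL_TOKEN}")
  fi
  curl --silent --show-error --fail --request PUT \
    "${auth[@]}" --data "$payload" "${CONSUL_URL}${path}"
}
```

A registration would then be consul_put /v1/catalog/register "$(register_payload "$node" "$node" prometheus-node-exporter 9100 prod,prometheus,node-exporter)", and the ExecStop counterpart sends deregister_payload to /v1/catalog/deregister. Writing to the catalog directly means no local Consul agent is involved, so nothing removes the entry automatically if the host disappears without running ExecStop.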


consul-service hostname search ( ) resolv.conf. DNS-, best practice .
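
One way to combine a short hostname with the search entry of resolv.conf is sketched below. It only illustrates the idea. The function name fqdn_from_resolv, the choice of the first search domain, and the rule of leaving dotted names untouched are my assumptions, not a description of the script's exact behavior.

```shell
#!/usr/bin/env bash
# Print HOST qualified with the first "search" domain from a resolv.conf file.
# Usage: fqdn_from_resolv HOST [RESOLV_CONF]
fqdn_from_resolv() {
  local host=$1 resolv=${2:-/etc/resolv.conf} domain
  # First domain on the first "search" line, if there is one.
  domain=$(awk '$1 == "search" && NF >= 2 { print $2; exit }' "$resolv" 2>/dev/null) || domain=''
  if [[ $host == *.* || -z $domain ]]; then
    # Already qualified, or nothing to append.
    printf '%s\n' "$host"
  else
    printf '%s.%s\n' "$host" "$domain"
  fi
}
```

On a host named kafka-1 whose resolv.conf contains "search example.com", fqdn_from_resolv "$(hostname)" prints kafka-1.example.com, a name that Prometheus can resolve through the same DNS.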


, , !

IP, - . . .


consul-service is a Bash script of about 60 lines. It needs only cURL and Bash. It is released under the MIT license.


The scrape_configs entry in prometheus.yml for prometheus/node_exporter then looks like this:


scrape_configs:
- job_name: node-exporter
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [prod]
    services: [prometheus-node-exporter]

tags: [prod] restricts discovery to production services. The tag corresponds to the -e argument of consul-service, which stores it in Consul.
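
To check which instances a job like this should discover, the Consul catalog can be queried by service name and tag. The helper below only builds the URL for such a query. The function name catalog_service_url is hypothetical, and the endpoint is the standard /v1/catalog/service/<name> with its tag query parameter.

```shell
#!/usr/bin/env bash
# Build the catalog URL that lists instances of SERVICE, optionally filtered by TAG.
# Usage: catalog_service_url BASE_URL SERVICE [TAG]
catalog_service_url() {
  local base=${1%/} service=$2 tag=${3:-}
  printf '%s/v1/catalog/service/%s%s\n' "$base" "$service" "${tag:+?tag=${tag}}"
}

# Example, not executed here:
#   curl --silent --fail "$(catalog_service_url "$CONSUL_URL" prometheus-node-exporter prod)"
```

If the returned JSON array is empty, either nothing was registered under that name or the tag passed with -e does not match the one in tags.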


Flatcar Linux /opt/bin Ansible. Playbook tasks. , crontab.


tasks:
- name: Create directory "/opt/bin"
  with_items: [/opt/bin]
  file:
    state: directory
    path: "{{ item }}"
    owner: root
    group: root
    mode: 0755
- name: Download "consul-service.sh" to "/opt/bin/consul-service"
  get_url:
    url: https://raw.githubusercontent.com/devinotelecom/consul-service/master/consul-service.sh
    dest: /opt/bin/consul-service
    owner: root
    group: root
    mode: 0755
    force: yes

Ansible, Python, Flatcar Linux , . Flatcar Linux.



, . , «+++», . .
— “ ! !”.

