Simple service discovery in Prometheus via Consul

Pareto's law (Pareto principle, 80/20 principle): "20% of the effort yields 80% of the result, and the remaining 80% of the effort yields only 20% of the result."
Wikipedia

Greetings, dear reader!


My first article on Habr is devoted to a simple and, I hope, useful solution that made collecting metrics in Prometheus from heterogeneous servers convenient for me. I will touch on some details that many people may not have dug into while operating Prometheus, and I will share my approach to organizing lightweight service discovery in it.


For this you will need: Prometheus, HashiCorp Consul, systemd, a little Bash code, and an understanding of what is going on.


If you are interested in how all of this fits together and how it works, welcome under the cut.


Prometheus + Bash + Consul


Get to know: Prometheus


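Prometheus is often used together with Kubernetes. Prometheus follows a pull model, so it has to know the address of every target it scrapes. For Prometheus in Kubernetes this is handled in prometheus.yml by the kubernetes_sd_configs section: Prometheus asks kube-apiserver for the IP addresses of the pods.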


scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+);(.+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  - action: labelmap
    regex: __meta_(kubernetes_namespace|kubernetes_pod_name|kubernetes_pod_label_manifest_sha1|kubernetes_pod_node_name)

Here, role=pod in kubernetes_sd_configs tells Prometheus to discover pods as scrape targets.
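To make the second relabel rule in the configuration above concrete: by default Prometheus joins the values of source_labels with ";" and applies the regex anchored to the whole string. The sketch below imitates only that step in shell. The function name relabel_address and the sample IP and port are illustrative and are not part of Prometheus.

```bash
#!/usr/bin/env bash
# Imitate the rule: source_labels joined with ";", regex (.+);(.+),
# replacement $1:$2. Illustration only, not how Prometheus is implemented.
relabel_address() {
  joined="$1;$2"
  case $joined in
    ?*";"?*)
      # Both parts are non-empty, so the anchored regex matches.
      printf '%s\n' "$joined" | sed -E 's/^(.+);(.+)$/\1:\2/'
      ;;
    *)
      # No match: Prometheus would leave __address__ unchanged.
      return 1
      ;;
  esac
}

relabel_address 10.42.0.17 9102
```

If the port annotation is missing, the second value is empty, the regex does not match, and the rule has no effect on that target.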


Besides what runs inside Kubernetes, for example a DaemonSet with prometheus/node_exporter, Prometheus also has to collect metrics from services outside Kubernetes: node_exporter, Zookeeper, Kafka, ClickHouse, CEPH, Elasticsearch, Tarantool ...


targets static_configs Kafka OSD CEPH . . , , Ansible, , CEPH OSD . prometheus.yml. , .


, . prometheus.yml, , job_name . prometheus.yml Kafka 6 . 3. 3 ClickHouse, 4 . CEPH . , — prometheus/node_exporter . prometheus.yml static_configs.


Don’t specify default values unnecessarily: simple, minimal configuration will make errors less likely.

Kubernetes Documentation, Configuration Best Practices, General Configuration Tips.


* - - , . !


, . , , , .default .original. , , 2-3 - : , GitHub . , .*


HashiCorp Consul


Prometheus offers another way to keep prometheus.yml short: consul_sd_configs. With it, Prometheus obtains its list of targets from HashiCorp Consul. A job then looks like this:


scrape_configs:
- job_name: SERVICE_NAME
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [test] # dev|test|stage|prod|...
    services: [prometheus-SERVICE_NAME-exporter]

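Consul is normally deployed with an agent on every node. Here only the Consul HTTP API is used. Consul is useful for more than this: it provides a KV store, can serve as a backend for HashiCorp Vault and Traefik, and can answer DNS queries. For this setup no Consul agent is needed on the servers. A single Consul whose HTTP API is reachable by Prometheus is enough, preferably over HTTPS.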


Below is a Kubernetes StatefulSet that runs Consul behind Traefik for Prometheus service discovery. The Ingress also exposes Consul's web UI. Traefik obtains the HTTPS certificate from Let's Encrypt via the DNS challenge.


consul.yml
# https://consul.io/docs/agent/options.html

---
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    app: consul
spec:
  selector:
    app: consul
  ports:
  - name: http
    port: 8500

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
  labels:
    app: consul
spec:
  serviceName: consul
  selector:
    matchLabels:
      app: consul
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: cephfs
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  template:
    metadata:
      labels:
        app: consul
    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 60
      containers:
      - name: consul
        image: consul:1.6
        volumeMounts:
        - name: data
          mountPath: /consul/data
        args:
        - agent
        - -server
        - -client=0.0.0.0
        - -bind=127.0.0.1
        - -bootstrap
        - -bootstrap-expect=1
        - -disable-host-node-id
        - -dns-port=0
        - -ui
        ports:
        - name: http
          containerPort: 8500
        readinessProbe:
          initialDelaySeconds: 10
          httpGet:
            port: http
            path: /v1/agent/members
        livenessProbe:
          initialDelaySeconds: 30
          httpGet:
            port: http
            path: /v1/agent/members
        resources:
          requests:
            cpu: 0.2
            memory: 256Mi

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  labels:
    app: consul
  annotations:
    traefik.ingress.kubernetes.io/frontend-entry-points: http,https
    traefik.ingress.kubernetes.io/redirect-entry-point: https
spec:
  rules:
  - host: consul.example.com
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: http

Prometheus can now discover targets through Consul. What remains is registering the services in Consul, and that is a job for Bash.


Bash and systemd.service


CoreOS Container Linux. DevOpsConf Russia 2018. Docker , systemd.service. Flatcar Linux, , CoreOS Container Linux. CoreOS Flatcar !


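systemd is the init system used by most modern Linux distributions. In a systemd.service unit, the [Service] section supports ExecStartPost and ExecStop. These are the hooks used here to register a service in Consul and to deregister it.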


Take prometheus-node-exporter.service as an example: listing it for 100 servers in static_configs is exactly the chore to avoid.


[Unit]
After=docker.service
[Service]
Environment=CONSUL_URL=https://consul.example.com
ExecStartPre=-/usr/bin/docker rm --force %N
ExecStart=/usr/bin/docker run \
    --name=%N \
    --rm=true \
    --network=host \
    --pid=host \
    --volume=/:/rootfs:ro \
    --label=logger=json \
    --stop-timeout=30 \
    prom/node-exporter:v0.18.1 \
    --log.format=logger:stdout?json=true \
    --log.level=error
ExecStartPost=/opt/bin/consul-service register -e prod -n %N -p 9100 -t prometheus,node-exporter
ExecStop=/opt/bin/consul-service deregister -e prod -n %N
ExecStop=-/usr/bin/docker stop %N
Restart=always
StartLimitInterval=0
RestartSec=10
KillMode=process
[Install]
WantedBy=multi-user.target

Registration and deregistration are handled by /opt/bin/consul-service.


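Environment variables:


  • CONSUL_URL: the address of Consul.
  • CONSUL_TOKEN: the value of the "X-Consul-Token" HTTP header sent to the Consul HTTP API. Tokens are managed in the Consul web UI, under ACLs.

Arguments:


  • register/deregister: the action to perform.
  • -e: the environment, one of [dev|test|prod|…].
  • -n: the service name, for example prometheus-node-exporter. With %N, systemd substitutes the unit name.
  • -p: the port from which Prometheus scrapes metrics. It is stored in Consul.
  • -t: tags to attach to the service in Consul.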

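consul-service is tied to the life cycle of the systemd.service unit. When the unit starts, ExecStartPost runs consul-service register, the service appears in Consul, and Prometheus begins scraping it. When the unit is stopped, ExecStop runs consul-service deregister, the service disappears from Consul, and Prometheus stops scraping it.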
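What a register or deregister call might send to Consul can be sketched as follows. This is not the actual consul-service script. It assumes the Consul catalog endpoints /v1/catalog/register and /v1/catalog/deregister, and it assumes that the environment and the extra tags both end up in the service's tag list. The function names register_payload and deregister_payload and the host name host1.example.com are hypothetical.

```bash
#!/usr/bin/env bash
# Build the JSON body for PUT /v1/catalog/register (assumed endpoint).
# Arguments: environment, service name, port, comma-separated tags, node name.
register_payload() {
  env_tag="$1"; name="$2"; port="$3"; tags_csv="$4"; node="$5"
  tags_json="\"$env_tag\""
  old_ifs=$IFS
  IFS=','
  for tag in $tags_csv; do
    tags_json="$tags_json,\"$tag\""
  done
  IFS=$old_ifs
  printf '{"Node":"%s","Address":"%s","Service":{"ID":"%s","Service":"%s","Tags":[%s],"Port":%d}}\n' \
    "$node" "$node" "$name" "$name" "$tags_json" "$port"
}

# Build the JSON body for PUT /v1/catalog/deregister (assumed endpoint).
deregister_payload() {
  printf '{"Node":"%s","ServiceID":"%s"}\n' "$2" "$1"
}

register_payload prod prometheus-node-exporter 9100 prometheus,node-exporter host1.example.com

# Sending it would look roughly like this (network call, left commented out):
# curl -fsS -X PUT -H "X-Consul-Token: $CONSUL_TOKEN" \
#   --data "$(register_payload prod prometheus-node-exporter 9100 prometheus,node-exporter host1.example.com)" \
#   "$CONSUL_URL/v1/catalog/register"
```

Using the catalog endpoints rather than the agent endpoints fits a setup without a local Consul agent, but that mapping is an assumption of this sketch.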


consul-service builds the node name from hostname and the search domain in resolv.conf. It therefore relies on DNS names for the servers, which is best practice anyway.
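One way to derive a fully qualified node name from the host name and the search domain in resolv.conf is sketched below. It is a guess at the approach, not the code of consul-service. The function name node_fqdn is hypothetical, and it is an assumption of this sketch that only the first domain on the first search line is used.

```bash
#!/usr/bin/env bash
# Print "<host>.<first search domain>", or just <host> when the host name
# already contains a dot or no search domain is configured.
# Arguments: host name, optional path to resolv.conf.
node_fqdn() {
  host="$1"
  resolv="${2:-/etc/resolv.conf}"
  # First domain on the first "search" line, if there is one.
  domain=$(awk '$1 == "search" { print $2; exit }' "$resolv" 2>/dev/null)
  case $host in
    *.*) printf '%s\n' "$host"; return 0 ;;
  esac
  if [ -n "$domain" ]; then
    printf '%s.%s\n' "$host" "$domain"
  else
    printf '%s\n' "$host"
  fi
}

node_fqdn "$(uname -n)"
```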


, , !

IP, - . . .


consul-service is a Bash script of about 60 lines. Its only dependencies are cURL and Bash. It is published under the MIT license.


The scrape_configs entry in prometheus.yml for prometheus/node_exporter then looks like this:


scrape_configs:
- job_name: node-exporter
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [prod]
    services: [prometheus-node-exporter]

tags: [prod] selects only the production services. The tag comes from the -e argument of consul-service, which stores it in Consul.
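To check by hand which instances a job like this would find, one option is to ask the Consul catalog for the service, filtered by tag. The helper below only builds the URL. The function name catalog_query_url is hypothetical, and the sketch assumes the catalog endpoint /v1/catalog/service/<name> with its tag query parameter. Prometheus itself may use a different Consul endpoint depending on its version.

```bash
#!/usr/bin/env bash
# Build the catalog query URL for a service filtered by one tag.
# Arguments: Consul base URL, service name, tag.
catalog_query_url() {
  base="${1%/}"   # drop a single trailing slash, if present
  printf '%s/v1/catalog/service/%s?tag=%s\n' "$base" "$2" "$3"
}

catalog_query_url https://consul.example.com prometheus-node-exporter prod

# Actual lookup (network call, left commented out):
# curl -fsS -H "X-Consul-Token: $CONSUL_TOKEN" \
#   "$(catalog_query_url "$CONSUL_URL" prometheus-node-exporter prod)"
```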


On Flatcar Linux the script is delivered to /opt/bin with Ansible. The playbook needs only the tasks below.


tasks:
- name: Create directory "/opt/bin"
  with_items: [/opt/bin]
  file:
    state: directory
    path: "{{ item }}"
    owner: root
    group: root
    mode: 0755
- name: Download "consul-service.sh" to "/opt/bin/consul-service"
  get_url:
    url: https://raw.githubusercontent.com/devinotelecom/consul-service/master/consul-service.sh
    dest: /opt/bin/consul-service
    owner: root
    group: root
    mode: 0755
    force: yes

Ansible, Python, Flatcar Linux , . Flatcar Linux.



, . , «+++», . .
— “ ! !”.

