Pareto's law (the Pareto principle, the 80/20 rule): "20% of the effort produces 80% of the result, and the remaining 80% of the effort yields only 20% of the result."
Wikipedia
Greetings, dear reader!
This is my first article on Habr. It describes a simple and, I hope, useful solution that made it convenient for me to collect Prometheus metrics from heterogeneous servers. I will touch on a few details that many people may not have dug into while exploring Prometheus, and share my approach to organizing lightweight service discovery.
You will need Prometheus, HashiCorp Consul, systemd, some Bash code, and an understanding of what is going on.
If you want to know how all of this fits together and how it works, read on.

Meet Prometheus
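Prometheus works well with Kubernetes. Because Prometheus uses a pull model, it has to know the address of every target it scrapes. In Kubernetes this is handled by the kubernetes_sd_configs section of prometheus.yml: Prometheus asks kube-apiserver for the IP addresses of the pods.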
scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+);(.+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  - action: labelmap
    regex: __meta_(kubernetes_namespace|kubernetes_pod_name|kubernetes_pod_label_manifest_sha1|kubernetes_pod_node_name)
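To see what the rule that rewrites __address__ does, it can help to replay it by hand. Prometheus joins the values of source_labels with ";" (the default separator) and matches the regex against the whole joined string. The sketch below imitates that single rule with sed. The function name relabel_address and the sample IP and port are made up for illustration, and sed's regex engine only approximates the RE2 syntax that Prometheus uses.

```shell
#!/usr/bin/env bash
# Illustration only: imitates one Prometheus relabel rule with sed.
# relabel_address is a hypothetical helper, not part of Prometheus.

relabel_address() {
  local pod_ip="$1" port="$2"
  # The source_labels values are joined with ";" before matching,
  # then "(.+);(.+)" is rewritten to "$1:$2".
  printf '%s;%s\n' "$pod_ip" "$port" | sed -E 's/^(.+);(.+)$/\1:\2/'
}

# Pod IP plus the port taken from the prometheus.io/port annotation.
relabel_address 10.0.0.12 9102
```

The result, 10.0.0.12:9102, is the value that ends up in __address__, which is the host and port Prometheus scrapes.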
, . role=pod
kubernetes_sd_configs
.
, Prometheus Kubernetes, , DaemonSet prometheus/node_exporter, Kubernetes: node_exporter, Zookeeper, Kafka, ClickHouse, CEPH, Elasticsearch, Tarantool ...
targets
static_configs
Kafka OSD CEPH . . , , Ansible, , CEPH OSD . prometheus.yml
. , .
, . prometheus.yml
, , job_name
. prometheus.yml
Kafka 6 . 3. 3 ClickHouse, 4 . CEPH . , — prometheus/node_exporter . prometheus.yml
static_configs
.
Don’t specify default values unnecessarily: simple, minimal configuration will make errors less likely.
Kubernetes Documentation, Configuration Best Practices, General Configuration Tips.
* - - , . !
, . , , , .default
.original
. , , 2-3 - : , GitHub . , .*
HashiCorp Consul
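prometheus.yml does not have to list every target by hand. Prometheus has another discovery mechanism: consul_sd_configs. With it, Prometheus gets its list of targets from HashiCorp Consul. For example: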
scrape_configs:
- job_name: SERVICE_NAME
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [test] # dev|test|stage|prod|...
    services: [prometheus-SERVICE_NAME-exporter]
Consul, , , agent. HTTP API Consul, . , : , . Consul. , Consul : KV-, HashiCorp Vault, Traefik. , , DNS. , Consul agent. Consul, HTTP API Prometheus , . , HTTPS, .
Here Consul is deployed in Kubernetes as a StatefulSet behind Traefik and serves as service discovery for Prometheus. The Ingress exposes Consul's web UI, and Traefik obtains the HTTPS certificate from Let's Encrypt using the DNS Challenge.
consul.yml
# https://consul.io/docs/agent/options.html
---
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    app: consul
spec:
  selector:
    app: consul
  ports:
  - name: http
    port: 8500
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
  labels:
    app: consul
spec:
  serviceName: consul
  selector:
    matchLabels:
      app: consul
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: cephfs
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  template:
    metadata:
      labels:
        app: consul
    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 60
      containers:
      - name: consul
        image: consul:1.6
        volumeMounts:
        - name: data
          mountPath: /consul/data
        args:
        - agent
        - -server
        - -client=0.0.0.0
        - -bind=127.0.0.1
        - -bootstrap
        - -bootstrap-expect=1
        - -disable-host-node-id
        - -dns-port=0
        - -ui
        ports:
        - name: http
          containerPort: 8500
        readinessProbe:
          initialDelaySeconds: 10
          httpGet:
            port: http
            path: /v1/agent/members
        livenessProbe:
          initialDelaySeconds: 30
          httpGet:
            port: http
            path: /v1/agent/members
        resources:
          requests:
            cpu: 0.2
            memory: 256Mi
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  labels:
    app: consul
  annotations:
    traefik.ingress.kubernetes.io/frontend-entry-points: http,https
    traefik.ingress.kubernetes.io/redirect-entry-point: https
spec:
  rules:
  - host: consul.example.com
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: http
Prometheus, Consul , Consul, . . Bash.
Bash and systemd.service
CoreOS Container Linux. DevOpsConf Russia 2018. Docker , systemd.service. Flatcar Linux, , CoreOS Container Linux. CoreOS Flatcar !
— systemd.service. systemd — , Linux. systemd.service, [Service]
, ExecStartPost
, ExecStop
. - Consul.
prometheus-node-exporter.service
. , static_configs
100 .
[Unit]
After=docker.service

[Service]
Environment=CONSUL_URL=https://consul.example.com
ExecStartPre=-/usr/bin/docker rm --force %N
ExecStart=/usr/bin/docker run \
  --name=%N \
  --rm=true \
  --network=host \
  --pid=host \
  --volume=/:/rootfs:ro \
  --label=logger=json \
  --stop-timeout=30 \
  prom/node-exporter:v0.18.1 \
  --log.format=logger:stdout?json=true \
  --log.level=error
ExecStartPost=/opt/bin/consul-service register -e prod -n %N -p 9100 -t prometheus,node-exporter
ExecStop=/opt/bin/consul-service deregister -e prod -n %N
ExecStop=-/usr/bin/docker stop %N
Restart=always
StartLimitInterval=0
RestartSec=10
KillMode=process

[Install]
WantedBy=multi-user.target
/opt/bin/consul-service. , .
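Environment variables:
- CONSUL_URL: the URL of the Consul server.
- CONSUL_TOKEN: the value of the «X-Consul-Token» HTTP header sent with requests to the Consul HTTP API. Tokens are managed in the Consul web UI, in the ACLs section.
Arguments:
- register/deregister: the action to perform, registering the service or removing it.
- -e: the environment, one of [dev|test|prod|…].
- -n: the service name, for example prometheus-node-exporter. In a unit file, %N expands to the systemd.unit name without its type suffix.
- -p: the port on which Prometheus scrapes the metrics. It is stored in Consul.
- -t: a comma-separated list of tags. They are stored in Consul.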
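The argument list above maps naturally onto getopts. The following is a minimal sketch of how such a command line could be parsed. It is not the actual consul-service source, and the function name parse_args and the variable names ACTION, ENVIRONMENT, NAME, PORT and TAGS are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of argument parsing; not the real consul-service.

parse_args() {
  # The first positional argument is the action.
  [ "$#" -ge 1 ] || { echo "usage: consul-service register|deregister -e ENV -n NAME [-p PORT] [-t TAGS]" >&2; return 2; }
  ACTION="$1"
  shift
  ENVIRONMENT="" NAME="" PORT="" TAGS=""

  local opt
  OPTIND=1  # reset so the function can be called more than once
  while getopts 'e:n:p:t:' opt; do
    case "$opt" in
      e) ENVIRONMENT="$OPTARG" ;;
      n) NAME="$OPTARG" ;;
      p) PORT="$OPTARG" ;;
      t) TAGS="$OPTARG" ;;
      *) return 2 ;;
    esac
  done

  case "$ACTION" in
    register|deregister) return 0 ;;
    *) echo "unknown action: $ACTION" >&2; return 2 ;;
  esac
}

# Same arguments as in the ExecStartPost line, with %N already expanded.
parse_args register -e prod -n prometheus-node-exporter -p 9100 -t prometheus,node-exporter
echo "$ACTION $NAME in $ENVIRONMENT on port $PORT with tags $TAGS"
```

Resetting OPTIND before the loop matters only when the function is called repeatedly in one shell, but it is cheap and avoids a confusing class of bugs.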
consul-service systemd.service . systemd.service , ExecStartPost
consul-service register
, Consul, Prometheus , . , . ExecStop
, consul-service deregister
, . Consul Prometheus , . , , Service Availability . , , , .
consul-service hostname
search
( ) resolv.conf
. DNS-, best practice .
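The paragraph above mentions hostname and the search entry of resolv.conf. One plausible reading is that the node's fully qualified name is built from the short host name plus the first search domain. The sketch below shows that idea only. The function name node_fqdn and the exact rule are assumptions, and the real script may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a node FQDN from a short host name and
# the first "search" domain in resolv.conf. Not the actual script.

node_fqdn() {
  local host="$1" resolv="${2:-/etc/resolv.conf}" domain=""
  if [ -r "$resolv" ]; then
    # Take the first domain listed on the "search" line, if any.
    domain="$(awk '$1 == "search" { print $2; exit }' "$resolv")"
  fi
  # Append the domain only when one was found.
  printf '%s\n' "${host}${domain:+.${domain}}"
}
```

A call such as node_fqdn "$(hostname)" would then produce the name to register. A host with no search line simply keeps its short name.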
consul-service Bash ~60 . cURL Bash. , . - MIT, , .
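To give a feel for what a cURL-plus-Bash registration can look like, here is a minimal sketch. It assumes the Consul catalog endpoints /v1/catalog/register and /v1/catalog/deregister, which accept registrations over the HTTP API without a local agent. Whether the real consul-service uses these endpoints or these fields is not stated in the text, so the function names, the payload layout, and the choice to add the environment as a tag are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the register/deregister calls; the real
# consul-service script may use different endpoints or fields.
# Values are not JSON-escaped, so they must not contain quotes.

# Turn "a,b,c" into the JSON array body "a","b","c".
tags_to_json() {
  local IFS=',' out="" tag
  for tag in $1; do
    out="${out:+${out},}\"${tag}\""
  done
  printf '%s' "$out"
}

# Body for PUT /v1/catalog/register; the environment becomes a tag.
build_register_payload() {
  local node="$1" name="$2" port="$3" env="$4" tags="$5"
  printf '{"Node":"%s","Address":"%s","Service":{"ID":"%s","Service":"%s","Port":%d,"Tags":[%s]}}' \
    "$node" "$node" "$name" "$name" "$port" "$(tags_to_json "${env},${tags}")"
}

# Body for PUT /v1/catalog/deregister.
build_deregister_payload() {
  local node="$1" name="$2"
  printf '{"Node":"%s","ServiceID":"%s"}' "$node" "$name"
}

# Send a PUT request to Consul, adding the ACL token header if set.
consul_put() {
  local path="$1" body="$2"
  set -- --silent --show-error --fail --request PUT --data "$body"
  if [ -n "${CONSUL_TOKEN:-}" ]; then
    set -- "$@" --header "X-Consul-Token: ${CONSUL_TOKEN}"
  fi
  curl "$@" "${CONSUL_URL}${path}"
}

# Print a sample payload; no network access is needed for this.
build_register_payload node1.example.com prometheus-node-exporter 9100 prod prometheus,node-exporter
echo
```

With CONSUL_URL set, a registration would be sent with consul_put /v1/catalog/register "$(build_register_payload ...)", and the matching cleanup with consul_put /v1/catalog/deregister "$(build_deregister_payload ...)". Keeping payload construction separate from the HTTP call makes the script easy to check without touching a live Consul.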
scrape_configs
prometheus.yml
prometheus/node_exporter, , .
scrape_configs:
- job_name: node-exporter
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [prod]
    services: [prometheus-node-exporter]
tags: [prod]
production . -e
consul-service, Consul.
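Before pointing Prometheus at Consul, it can be useful to check which instances match a given service name and tag. The sketch below queries the Consul catalog endpoint /v1/catalog/service/NAME with a tag filter and prints address:port pairs. The helper names are made up for this example, list_targets needs curl and jq, and Prometheus itself may query Consul through other endpoints, so treat this as a preview rather than a replica of its behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for previewing what is registered in Consul.

# Build the catalog query URL for one service filtered by one tag.
consul_catalog_url() {
  local scheme="$1" server="$2" service="$3" tag="$4"
  printf '%s://%s/v1/catalog/service/%s?tag=%s\n' "$scheme" "$server" "$service" "$tag"
}

# Print "address:port" for every matching instance (needs curl and jq).
# The service address is preferred; the node address is the fallback.
list_targets() {
  curl --silent --show-error --fail "$(consul_catalog_url "$@")" |
    jq -r '.[] | (if .ServiceAddress != "" then .ServiceAddress else .Address end) + ":" + (.ServicePort | tostring)'
}

# Same server, scheme, service and tag as in the scrape config.
consul_catalog_url https consul.example.com prometheus-node-exporter prod
```

Running list_targets https consul.example.com prometheus-node-exporter prod against a live Consul should list the same hosts that show up as targets of the node-exporter job.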
Flatcar Linux /opt/bin
Ansible. Playbook tasks
. , crontab
.
tasks:
- name: Create directory "/opt/bin"
  with_items: [/opt/bin]
  file:
    state: directory
    path: "{{ item }}"
    owner: root
    group: root
    mode: 0755
- name: Download "consul-service.sh" to "/opt/bin/consul-service"
  get_url:
    url: https://raw.githubusercontent.com/devinotelecom/consul-service/master/consul-service.sh
    dest: /opt/bin/consul-service
    owner: root
    group: root
    mode: 0755
    force: yes
Ansible, Python, Flatcar Linux , . Flatcar Linux.
, . , «+++», . .
— “ ! !”.