Pareto law (Pareto principle, 80/20 principle): "20% of the effort yields 80% of the result, and the remaining 80% of the effort yields only 20% of the result."
Wikipedia
Greetings, dear reader!
My first article on Habr is devoted to a simple and, I hope, useful solution that made collecting metrics in Prometheus from heterogeneous servers convenient for me. I will touch on some details that many people may not have looked into when using Prometheus, and share my approach to setting up lightweight service discovery for it.
For this you will need Prometheus, HashiCorp Consul, systemd, a little Bash code, and an understanding of what is going on.
If you want to know how all of this fits together and how it works, read on.

Meet: Prometheus
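Prometheus is often used together with Kubernetes. Prometheus works on a pull model, so it needs to know the addresses of its targets. For Kubernetes, prometheus.yml provides the kubernetes_sd_configs section, which asks kube-apiserver for the IP addresses of pods.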
scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+);(.+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  - action: labelmap
    regex: __meta_(kubernetes_namespace|kubernetes_pod_name|kubernetes_pod_label_manifest_sha1|kubernetes_pod_node_name)
, . role=pod
kubernetes_sd_configs
.
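To make the second relabel rule in the config above more concrete: Prometheus joins the values of source_labels with ";", matches the anchored regex against the result, and writes the expanded replacement into target_label. Below is a minimal Bash sketch that emulates just that one rule. The function name relabel_address and the sample IP and port are made up for illustration; this is not how Prometheus is implemented.

```shell
#!/usr/bin/env bash
# Emulates a single relabel rule (illustration only):
#   source_labels: [pod_ip, port_annotation]  ->  joined with ';'
#   regex: (.+);(.+)   replacement: $1:$2   target_label: __address__
relabel_address() {
  local pod_ip=$1 port=$2
  local joined="${pod_ip};${port}"
  local re='^(.+);(.+)$'   # Prometheus anchors relabel regexes on both ends
  if [[ $joined =~ $re ]]; then
    printf '%s:%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    # No match (e.g. the port annotation is empty): nothing is replaced.
    return 1
  fi
}

# Hypothetical pod IP and prometheus.io/port annotation value.
relabel_address 10.0.0.12 9102
```

When the port annotation is missing, the joined value ends with ";" and the regex does not match, so the rule leaves the address as it was.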
, Prometheus Kubernetes, , DaemonSet prometheus/node_exporter, Kubernetes: node_exporter, Zookeeper, Kafka, ClickHouse, CEPH, Elasticsearch, Tarantool ...
targets
static_configs
Kafka OSD CEPH . . , , Ansible, , CEPH OSD . prometheus.yml
. , .
, . prometheus.yml
, , job_name
. prometheus.yml
Kafka 6 . 3. 3 ClickHouse, 4 . CEPH . , — prometheus/node_exporter . prometheus.yml
static_configs
.
"Don’t specify default values unnecessarily: simple, minimal configuration will make errors less likely."
Kubernetes Documentation, Configuration Best Practices, General Configuration Tips
* - - , . !
, . , , , .default
.original
. , , 2-3 - : , GitHub . , .*
HashiCorp Consul
prometheus.yml
. Prometheus — consul_sd_configs
. , Prometheus, HashiCorp Consul, . :
scrape_configs:
- job_name: SERVICE_NAME
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [test] # dev|test|stage|prod|...
    services: [prometheus-SERVICE_NAME-exporter]
Consul, , , agent. HTTP API Consul, . , : , . Consul. , Consul : KV-, HashiCorp Vault, Traefik. , , DNS. , Consul agent. Consul, HTTP API Prometheus , . , HTTPS, .
, Kubernetes StatefulSet Consul, Traefik, service discovery Prometheus. Ingress , Consul's web UI. Traefik HTTPS- Let's Encrypt DNS Challenge.
consul.yml
# https://consul.io/docs/agent/options.html
---
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    app: consul
spec:
  selector:
    app: consul
  ports:
  - name: http
    port: 8500
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
  labels:
    app: consul
spec:
  serviceName: consul
  selector:
    matchLabels:
      app: consul
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: cephfs
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  template:
    metadata:
      labels:
        app: consul
    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 60
      containers:
      - name: consul
        image: consul:1.6
        volumeMounts:
        - name: data
          mountPath: /consul/data
        args:
        - agent
        - -server
        - -client=0.0.0.0
        - -bind=127.0.0.1
        - -bootstrap
        - -bootstrap-expect=1
        - -disable-host-node-id
        - -dns-port=0
        - -ui
        ports:
        - name: http
          containerPort: 8500
        readinessProbe:
          initialDelaySeconds: 10
          httpGet:
            port: http
            path: /v1/agent/members
        livenessProbe:
          initialDelaySeconds: 30
          httpGet:
            port: http
            path: /v1/agent/members
        resources:
          requests:
            cpu: 0.2
            memory: 256Mi
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  labels:
    app: consul
  annotations:
    traefik.ingress.kubernetes.io/frontend-entry-points: http,https
    traefik.ingress.kubernetes.io/redirect-entry-point: https
spec:
  rules:
  - host: consul.example.com
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: http
Prometheus, Consul , Consul, . . Bash.
Bash systemd.service
CoreOS Container Linux. DevOpsConf Russia 2018. Docker , systemd.service. Flatcar Linux, , CoreOS Container Linux. CoreOS Flatcar !
— systemd.service. systemd — , Linux. systemd.service, [Service]
, ExecStartPost
, ExecStop
. - Consul.
prometheus-node-exporter.service
. , static_configs
100 .
[Unit]
After=docker.service

[Service]
Environment=CONSUL_URL=https://consul.example.com
ExecStartPre=-/usr/bin/docker rm --force %N
ExecStart=/usr/bin/docker run \
    --name=%N \
    --rm=true \
    --network=host \
    --pid=host \
    --volume=/:/rootfs:ro \
    --label=logger=json \
    --stop-timeout=30 \
    prom/node-exporter:v0.18.1 \
    --log.format=logger:stdout?json=true \
    --log.level=error
ExecStartPost=/opt/bin/consul-service register -e prod -n %N -p 9100 -t prometheus,node-exporter
ExecStop=/opt/bin/consul-service deregister -e prod -n %N
ExecStop=-/usr/bin/docker stop %N
Restart=always
StartLimitInterval=0
RestartSec=10
KillMode=process

[Install]
WantedBy=multi-user.target
/opt/bin/consul-service. , .
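Environment variables:
- CONSUL_URL: the URL of the Consul server.
- CONSUL_TOKEN: the value of the "X-Consul-Token" HTTP header, used to authorize requests to the Consul HTTP API. The token can be created in the Consul web UI under ACLs.
Options:
- register/deregister: the action to perform.
- -e: the environment, one of [dev|test|prod|…].
- -n: the service name, for example prometheus-node-exporter. In a systemd unit you can pass %N, which expands to the unit name.
- -p: the port Prometheus scrapes metrics from. It is registered in Consul.
- -t: comma-separated tags. They are registered in Consul.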
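To give a feel for what a script with these options has to do, here is a rough Bash sketch of the HTTP calls involved. It is not the actual consul-service source. It assumes registration goes through Consul's catalog endpoints (PUT /v1/catalog/register and PUT /v1/catalog/deregister), which suit nodes that do not run a Consul agent, and it assumes the environment from -e is simply stored as one more tag. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the real consul-service script.

# Builds the JSON body for PUT /v1/catalog/register.
# Arguments: node address, service name, port, comma-separated tags.
# Assumption: the environment (-e) is passed in as the first tag.
register_payload() {
  local node=$1 name=$2 port=$3 tags_csv=$4
  local tags_json="" tag
  local IFS=','
  for tag in $tags_csv; do
    tags_json+="${tags_json:+,}\"${tag}\""
  done
  printf '{"Node":"%s","Address":"%s","Service":{"ID":"%s","Service":"%s","Tags":[%s],"Port":%d}}\n' \
    "$node" "$node" "$name" "$name" "$tags_json" "$port"
}

# Builds the JSON body for PUT /v1/catalog/deregister.
deregister_payload() {
  printf '{"Node":"%s","ServiceID":"%s"}\n' "$1" "$2"
}

# Sends a payload to the catalog. CONSUL_URL and CONSUL_TOKEN are read
# from the environment, as in the unit file.
# Arguments: "register" or "deregister", then the JSON body.
consul_put() {
  curl --fail --silent --show-error --request PUT \
    --header "X-Consul-Token: ${CONSUL_TOKEN:-}" \
    --data "$2" \
    "${CONSUL_URL}/v1/catalog/$1"
}

# Offline demonstration: print the body a registration would send.
register_payload node1.example.com prometheus-node-exporter 9100 prod,prometheus,node-exporter
```

With a reachable Consul server, a registration would then be `consul_put register "$(register_payload ...)"`. Keeping the payload builders separate from the cURL call makes them easy to check without touching the network.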
consul-service systemd.service . systemd.service , ExecStartPost
consul-service register
, Consul, Prometheus , . , . ExecStop
, consul-service deregister
, . Consul Prometheus , . , , Service Availability . , , , .
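consul-service builds the address it registers from hostname and the search domain in resolv.conf. It therefore relies on DNS names, which is a best practice in any case.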
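One way to combine a hostname with the search domain from resolv.conf is sketched below. This is only my illustration of the idea, not the code from consul-service. The function names are hypothetical, and it assumes the first domain on the "search" line is the one to use.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a node address from a short hostname plus the
# first domain on the "search" line of a resolv.conf-style file.

# Prints the first search domain, or nothing if the file is unreadable
# or has no "search" line.
search_domain() {
  local resolv_conf=${1:-/etc/resolv.conf}
  [[ -r $resolv_conf ]] || return 0
  awk '$1 == "search" { print $2; exit }' "$resolv_conf"
}

# Arguments: short hostname, optional path to a resolv.conf-style file.
node_fqdn() {
  local host=$1 domain
  domain=$(search_domain "${2:-/etc/resolv.conf}")
  if [[ -n $domain ]]; then
    printf '%s.%s\n' "$host" "$domain"
  else
    # No search domain configured: fall back to the bare hostname.
    printf '%s\n' "$host"
  fi
}

# Print the address for the current machine.
node_fqdn "$(uname -n)"
```

If the hostname is already fully qualified, this sketch would append the domain a second time, so a real script would need to handle that case.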
consul-service is about 60 lines of Bash and depends only on cURL and Bash. It is released under the MIT license.
scrape_configs
prometheus.yml
prometheus/node_exporter, , .
scrape_configs:
- job_name: node-exporter
  consul_sd_configs:
  - server: consul.example.com
    scheme: https
    tags: [prod]
    services: [prometheus-node-exporter]
tags: [prod] restricts the job to production targets. The tag comes from the -e option of consul-service, which is stored in Consul.
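Before pointing Prometheus at Consul, it can help to look at what is registered for a service and tag. The Bash sketch below builds the URL for Consul's catalog endpoint /v1/catalog/service/<name> with its tag query parameter and fetches it with cURL. The helper names are hypothetical, and this only approximates what Prometheus asks Consul for; it is a manual check, not a description of Prometheus internals.

```shell
#!/usr/bin/env bash
# Hypothetical helper: URL that lists catalog entries for a service,
# optionally filtered by a single tag.
# Arguments: Consul base URL, service name, optional tag.
catalog_service_url() {
  local base=${1%/} service=$2 tag=${3:-}
  printf '%s/v1/catalog/service/%s%s\n' "$base" "$service" "${tag:+?tag=$tag}"
}

# Fetches the entries as JSON. CONSUL_URL and CONSUL_TOKEN are read from
# the environment. Arguments: service name, optional tag.
list_instances() {
  curl --fail --silent --show-error \
    --header "X-Consul-Token: ${CONSUL_TOKEN:-}" \
    "$(catalog_service_url "$CONSUL_URL" "$@")"
}

# Offline demonstration: print the URL for production node exporters.
catalog_service_url https://consul.example.com prometheus-node-exporter prod
```

Running `list_instances prometheus-node-exporter prod` against a live server should return one JSON entry per registered node.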
Flatcar Linux /opt/bin
Ansible. Playbook tasks
. , crontab
.
tasks:
- name: Create directory "/opt/bin"
  with_items: [/opt/bin]
  file:
    state: directory
    path: "{{ item }}"
    owner: root
    group: root
    mode: 0755
- name: Download "consul-service.sh" to "/opt/bin/consul-service"
  get_url:
    url: https://raw.githubusercontent.com/devinotelecom/consul-service/master/consul-service.sh
    dest: /opt/bin/consul-service
    owner: root
    group: root
    mode: 0755
    force: yes
Ansible, Python, Flatcar Linux , . Flatcar Linux.
, . , «+++», . .
— “ ! !”.