Prometheus monitoring of microservice applications. Vitaly Levchenko

Transcript of the 2016 talk by Vitaly Levchenko, "Prometheus monitoring of microservice applications"


Unlike classical monitoring systems, Prometheus makes it easy to set up and maintain monitoring for rapidly changing, complex systems. I will talk about our implementation experience, pitfalls, and unexpected behavior, and show how to quickly configure the entire system, including notifications and dashboards.



, . , , , job' – . : – , , .. , , , , .


Prometheus Google Borgmon , . , , – . – , . – .


, Prometheus .




, , , . , . Prometheus , . :


  • , , .
  • .
  • Prometheus.
  • - .


. . . .


, , developer. . . .



, , , .



? . . docker. , , , .



, . . , . , , – . ? , – . , production .



– production . production 10 . , continuous delivery. stages. . . .



, . . .


, . , , .


. , – . , - environments – . stage environment, environment, production.


, , . . , 20 , .



. The twelve factor app – Production. , , . – .



, .


. . . , , , . – . . . . , . . , . . .


– . . , , , Graphite, StatsD. , , push based, . .


– , , . . , , .



, , . , . Graphite, . . , 100 000 – . – , 100 000 . – . , 21 . , , , .



production Zabbix - ? . – . . . Zabbix – , . . , .


, . , , Zabbix . Nagios . , , . . . , , - .


, , , . Zabbix . , , Redis. .



, , . InfluxDB . . Graphite, InfluxDB , , .


InfluxDB , . . production. . . , . . .


, . , , . . . (?) . . .


, , , . . . , . . , – . .


Riemann . : « CPU ». CPU – 100 %, 0 %. .


– , CollectD . , . InfluxDB. - , - , - , CollectD , . , , .


InfluxDB , time-series , .



Prometheus. ?


Google, , , Google Borgmon, , Facebook. , , , . . .


, . . production ( , , , ) , . . Prometheus .


. . Prometheus , . .


, . deprecate- . Grafana. Grafana , .


, Prometheus . . . . , 20 000 – . 24- 400 000 . 3 , 1 200 000. .


production . - . , . . , - .



, , , . . . environment. server, . handler . . . Prometheus . , .


, , . , , , . . , . .


? . : « environment , ». .


. - , , . , , , – .



Prometheus , . . , .


. . , 4 . . , . Zookeeper, , … 95 % , 20 100 %. . – . . . , .



- pull metrics push metrics. , pull metrics .



StatsD – Graphite, . StatsD , . . .


, exporter . , , 50 000 . . , , production, .


StatsD exporter Prometheus . StatsD, StatsD, gauge etc, .


push gateway push'. Prometheus. StatsD, push gateway .



. , Prometheus, , . . . , , . . Amazon, Kubernetes, Mesos, Consul . . . , .


, Ansible, . . . - , , , . , Prometheus. reload , .


, . , , . javascript . , . .



. - .


Prometheus . . open source. , Postgres exporter – , . Node exporter, .


, . , . , . .


systemd. .



, . , .


. , . . , , . , , . Prometheus . . . Influx, .


. .



. , - , . , , . – rm -r <storage path>/*. . . , , production. , , .


federation, , federation , .


. . . Prometheus . .


openTSDB . , . openTSDB , , , , Hadoop.



, , , , histogram summary. , .


, , 100 – . 100 300 – . , 300 – . histogram. , … , , , . . . , histogram , . 10 , 10 . - . , recording rules. , , . , .


Summary – . Summary . Summary , . , , , . . , . , 99- , , . . , . .


summary . . . , .


-. . .



? ? , , , :


  • . , , . . , latency , , . 99 % . , , - . . .
  • .
  • .
  • .
  • + .


?


  • . , . . 10 , 10 , 10 , . ( Service Discovery)
  • , . Grafana . , . : « ». 10 . , . . , , .
  • – , . . . . , , , , .
  • – , , , , .
  • , Prometheus . . , Prometheus docker. . . – . . .


! ! Zabbix. , . , . , , Zabbix. environment, ?


. – environment, . ( Kubernetes Service Discovery)


. ., , developer, ? . . environment?


.


environment?


environment. environment. environment. , environment, stable – environment. ( Kubernetes Service Discovery)


, . . . - , , . , - . . . , - , , . , ?


Zabbix production, , . , , , -. full time , . Zabbix, , , 3-4- . 3-4- , Zabbix – . .


, 3 Zabbix?


.


. . developers Prometheus?


.


? ?


.


. environment, - sender, Prometheus .


sender. Prometheus . API http', . Consul, Prometheus , . .


. , , - Prometheus?


, . .


. docker', , , , - , . overhead, . . . , ?


?


.


health handler, . , , . , , .


! ! 4 . StatsD, , , . StatsD, Influx' Telegraf?


, StatsD, Influx.


Ok. , Grafana . , . Go, . , .


open source, .


GitHub , . . federation, ?


, . .


Ok. -?


, .


, . ? ?


. , . 100 000 . , 200 000-300 000. , , , , . Zabbix, . Influx , . . , queries .


?


?


, - , , .


, - . , . . .

