Monitoring Microservice Applications with Prometheus. Vitaliy Levchenko

Transcript of Vitaliy Levchenko's 2016 talk "Monitoring Microservice Applications with Prometheus"


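Unlike traditional systems, Prometheus makes it easy to monitor and maintain systems that change quickly and have a complex organization. I will talk about our implementation experience, the pitfalls, and the unexpected behavior we ran into, and show how to configure the whole system quickly, including notifications and dashboards.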



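On top of the classic problems of monitoring a monolithic application, microservices bring many new ones. Service locations change constantly, new services appear often, the dependencies between them shift, and ad hoc jobs run in arbitrary places: the notion of a stable configuration disappears. The notion of production blurs as well: many versions of one service run in the same environment, whether during deployment, for different audience segments, or for testing. Encouraged by all this, developers tend to improve their applications quickly, creating many new metrics and constantly retiring old ones, yet they still expect monitoring to stay effective and to catch new problems.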


Prometheus Google Borgmon , . , , — . — , . — .


, Prometheus .




, , , . , . Prometheus , . :


  • , , .
  • .
  • Prometheus.
  • - .


. . . .


, , developer. . . .



, , , .



? . . docker. , , , .



, . . , . , , – . ? , – . , production .



– production . production 10 . , continuous delivery. stages. . . .



, . . .


, . , , .


. , – . , - environments – . stage environment, environment, production.


, , . . , 20 , .



. The twelve factor app – Production. , , . – .



, .


. . . , , , . – . . . . , . . , . . .


– . . , , , Graphite, StatsD. , , push based, . .


– , , . . , , .



, , . , . Graphite, . . , 100 000 – . – , 100 000 . – . , 21 . , , , .



production Zabbix - ? . – . . . Zabbix – , . . , .


, . , , Zabbix . Nagios . , , . . . , , - .


, , , . Zabbix . , , Redis. .



, , . InfluxDB . . Graphite, InfluxDB , , .


InfluxDB , . . production. . . , . . .


, . , , . . . (?) . . .


, , , . . . , . . , – . .


Riemann . : « CPU ». CPU – 100 %, 0 %. .


– , CollectD . , . InfluxDB. - , - , - , CollectD , . , , .


InfluxDB , time-series , .



Prometheus. ?


Google, , , Google Borgmon, , Facebook. , , , . . .


, . . production ( , , , ) , . . Prometheus .


. . Prometheus , . .


, . deprecate- . Grafana. Grafana , .


, Prometheus . . . . , 20 000 – . 24- 400 000 . 3 , 1 200 000. .


production . - . , . . , - .



, , , . . . environment. server, . handler . . . Prometheus . , .


, , . , , , . . , . .


? . : « environment , ». .


. - , , . , , , – .



Prometheus , . . , .


. . , 4 . . , . Zookeeper, , … 95 % , 20 100 %. . – . . . , .



- pull metrics push metrics. , pull metrics .



StatsD – Graphite, . StatsD , . . .


, exporter . , , 50 000 . . , , production, .


StatsD exporter Prometheus . StatsD, StatsD, gauge etc, .


push gateway push’. Prometheus. StatsD, push gateway .



. , Prometheus, , . . . , , . . Amazon, Kubernetes, Mesos, Consul . . . , .


, Ansible, . . . - , , , . , Prometheus. reload , .


, . , , . javascript . , . .



. - .


Prometheus . . open source. , Postgres exporter – , . Node exporter, .


, . , . , . .


systemd. .



, . , .


. , . . , , . , , . Prometheus . . . Influx, .


. .



. , - , . , , . – –rm – r <storage path/*. . . , , production. , , .


federation, , federation , .


. . . Prometheus . .


openTSDB . , . openTSDB , , , , Hadoop.



, , , , histogram summary. , .


, , 100 – . 100 300 – . , 300 – . histogram. , … , , , . . . , histogram , . 10 , 10 . - . , recording rules. , , . , .


Summary – . Summary . Summary , . , , , . . , . , 99- , , . . , . .


summary . . . , .


-. . .



? ? , , , :


  • . , , . . , latency , , . 99 % . , , - . . .
  • .
  • .
  • .
  • + .


?


  • . , . . 10 , 10 , 10 , . ( Service Discovery)
  • , . Grafana . , . : « ». 10 . , . . , , .
  • – , . . . . , , , , .
  • – , , , , .
  • , Prometheus . . , Prometheus docker. . . – . . .


! ! Zabbix. , . , . , , Zabbix. environment, ?


. – environment, . ( Kubernetes Service Discovery)


. ., , developer, ? . . environment?


.


environment?


environment. environment. environment. , environment, stable – environment. ( Kubernetes Service Discovery)


, . . . - , , . , - . . . , - , , . , ?


Zabbix production, , . , , , -. full time , . Zabbix, , , 3-4- . 3-4- , Zabbix – . .


, 3 Zabbix?


.


. . developers Prometheus?


.


? ?


.


. environment, - sender, Prometheus .


sender. Prometheus . API http’, . Consul, Prometheus , . .


. , , - Prometheus?


, . .


. docker’, , , , - , . overhead, . . . , ?


?


.


health handler, . , , . , , .


! ! 4 . StatsD, , , . StatsD, Influx’ Telegraf?


, StatsD, Influx.


Ok. , Grafana . , . Go, . , .


open source, .


GitHub , . . federation, ?


, . .


Ok. -?


, .


, . ? ?


. , . 100 000 . , 200 000-300 000. , , , , . Zabbix, . Influx , . . , queries .


?


?


, - , , .


, - . , . . .

