In this article, we describe how we deployed a Service Mesh, solved some of the problems of a microservice architecture, and reduced the load on developers and infrastructure engineers.
Why we needed a Service Mesh
Service Mesh is gaining popularity, so I see no point in defining it yet again or listing every reason it might come in handy. If you do not know what it is, but your company has many teams and services, or you are about to split up a monolith, it is worth getting familiar with the subject. A good starting point is the article written by William Morgan, the creator of the first Service Mesh project.
Over the past year, the number of product teams in the company has grown 1.5 times. At the time of writing, we have more than 20 teams, and almost all of them include developers. As the number of teams grew, developing the monolith did not get any easier, so we began moving more actively towards a microservice architecture. We are still near the beginning of that journey, although services for various purposes started to multiply at a decent pace 3-4 years ago.
Of course, the more services there are, the harder it is to keep track of them and to manage them and their infrastructure configuration. This would be especially hard in a traditional setup with a wall between developers and administrators. So, as we created microservices, we also began handing most of the responsibility for launching and operating a service to its developers, while providing Ops expertise as a service. Developers, however, naturally do not know all the subtleties of setting up the infrastructure.
A Service Mesh conveniently bundles such concerns and hides them from developers, including:
- Service Discovery;
- Distributed tracing;
- Circuit Breaking / Retries / Timeouts;
- Monitoring and telemetry of interactions, and many others.
, , Docker-. Nomad, Consul. , Kubernetes , Nomad .
, Kubernetes, , HashiCorp-.
Nomad Docker- ( QEMU, raw/isolated fork/exec Java). Consul Service Discovery DNS-.

Nomad- Consul-. - ( ), .
Nomad Consul . Consul , (node), healthcheck . (TCP, HTTP, gRPC, script). Nomad key-value Consul Vault, , -. Nomad consul-template.
Nomad, Consul HTTP API, , , Nomad , โ ip-port - healthcheck. , k8s Pod, Allocation, .
, , operations engineer. , , : , , .
HashiCorp-, . Service Mesh , Redis .
Service Mesh
Service Mesh : (data plane control plane). Data plane , - . control plane, . .
โ sidecar ( Podโe/). , .
Service Mesh . Linkerd linkerd-proxy, Consul Connect Istio third-party . Envoy Lyft. open source , .
, :
- Service Mesh 2018 . , .
- .
- control plane .
control plane
control plane โ , Service Mesh . , , . , , .
Service Mesh , Mesh, . Envoy , , , .
Envoy data plane
Envoy sidecar- . sidecar-, . . (ingress) Consul, (egress) -. , , , .
: Nomad .

Service Discovery
Service Discovery? , ? sidecar, control plane agent, control plane API, API Consul API Envoy. endpoint discovery (EDS). ยซยป - Consul. Envoy , .
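The translation from the Consul API to the Envoy endpoint discovery (EDS) API mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual control plane code: the function name `consul_to_eds`, the `billing` service, and the sample data are hypothetical. The input follows the shape of a Consul health endpoint response, and the output follows the shape of Envoy's `ClusterLoadAssignment` resource, built here as a plain dictionary.

```python
from typing import Any

# Type URL of the Envoy v3 EDS resource.
EDS_TYPE_URL = "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment"


def consul_to_eds(cluster_name: str, entries: list[dict[str, Any]]) -> dict[str, Any]:
    """Build an EDS ClusterLoadAssignment (as a dict) from Consul health entries.

    Hypothetical helper: `entries` is assumed to be the decoded JSON list
    returned by Consul's health endpoint for a single service.
    """
    lb_endpoints = []
    for entry in entries:
        checks = entry.get("Checks", [])
        # Skip instances that have any non-passing health check.
        if any(check.get("Status") != "passing" for check in checks):
            continue
        service = entry["Service"]
        # Consul leaves Service.Address empty when the node address applies.
        address = service.get("Address") or entry["Node"]["Address"]
        lb_endpoints.append(
            {
                "endpoint": {
                    "address": {
                        "socket_address": {
                            "address": address,
                            "port_value": service["Port"],
                        }
                    }
                }
            }
        )
    return {
        "@type": EDS_TYPE_URL,
        "cluster_name": cluster_name,
        "endpoints": [{"lb_endpoints": lb_endpoints}],
    }


# Hypothetical sample data: one healthy and one failing instance.
sample_entries = [
    {
        "Node": {"Address": "10.0.0.5"},
        "Service": {"Address": "", "Port": 8080},
        "Checks": [{"Status": "passing"}],
    },
    {
        "Node": {"Address": "10.0.0.6"},
        "Service": {"Address": "10.0.0.6", "Port": 8080},
        "Checks": [{"Status": "critical"}],
    },
]

assignment = consul_to_eds("billing", sample_entries)
```

Only the healthy instance ends up in the resulting endpoint list, so Envoy never routes traffic to an instance whose health check is failing in Consul. A real control plane would also have to watch Consul for changes and push updates to Envoy, which this sketch leaves out.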
Prometheus. Envoy , , Circuit Breaker - . :

, , Vizceral Netflix:

โ , . , - . , Envoy Distributed tracing Zipkin/Jaeger (X-b3-headers).

, , Jaeger Elastic APM (Application Performance Management). โ Kibana, . Elastic , , , Jaeger, , APM Distributed tracing. Jaeger-, Envoy, , , , Elastic APM. Jaeger, Elastic APM , HTTP- trace-id . Jaeger , Elastic , .
Service Mesh . โ . HTTP/1, TCP Redis. Mesh , . , . , Service Mesh.
Ops- Mesh: , , Canary, . , Istio .
Service Mesh
, Service Mesh:
- โ .
- Service Discovery , - . c-plane 10โโฏ , , , , . , . Envoy 150 RAM, . , 100 RPS, 30 .
- . 2โ3 .
- โ , .
: