Diagnosing problems in a microservice architecture on Node.js with OpenTracing and Jaeger


Hello everyone! Being able to scale an application at the snap of a finger matters a great deal today, because load can vary widely over time. An influx of customers can bring large profits, or losses if the service cannot cope. Splitting an application into separate services addresses scaling: you can always add instances of the services under load. This helps the service survive a surge of customers. But along with their undeniable benefits, microservices make the application's structure more complex and its internal relationships more tangled. What if the problems continue even after a service has been scaled up successfully? What if response time keeps growing and errors multiply? How do you find out where exactly the problem is? After all, each API request can trigger a chain of calls to different microservices, which fetch data from several databases and third-party APIs. Maybe it is a network problem, maybe a partner's API can't cope with the load, or maybe the cache is to blame. In this article I will try to show how to answer these questions and quickly find the point of failure. Read on.


To find the point of failure quickly and fix the problem, you need to collect metrics for every stage a request passes through. The OpenTracing specification helps here. It describes the basic principles and data models for working with traces in distributed systems, but provides no implementation. In this article we will use the JavaScript implementation and write the examples in TypeScript. Before moving on to practice, though, we need to understand the theory.


Theory



The main concepts in the OpenTracing specification are Trace, Span, SpanContext, Carrier, Tracer.


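  • Trace. A set of Spans that share one traceId. Spans within a trace are linked to each other by references. ChildOf means the child span is part of the parent's work, and the parent depends on its result. FollowsFrom means the new span was caused by the parent span, but the parent does not depend on its result.
  • Span. The basic unit of OpenTracing. A Span represents one operation, such as an HTTP request, a database query, or a function call. A Span is created through the OpenTracing implementation, the Tracer (see below). When it is created, the Span receives a start timestamp and a spanId. It also receives a traceId: a root span gets a new traceId, and a child span takes the traceId of its parent. When the operation completes, call finish on the span. This records the end timestamp on the Span, and the span is reported to the tracing system. A span can carry tags in key:value form. Some tag names are standardized by OpenTracing; for example, a Span for a failed operation is marked with the tag error: true. A span can also carry logs, which are key:value records with their own timestamp, useful for recording events that happen during the span's lifetime.
  • SpanContext. The part of a span's state that, according to OpenTracing, crosses process boundaries and links spans together. It holds the traceId, the spanId, and a set of key:value pairs that OpenTracing calls baggage. Baggage is propagated to all descendant spans. Given a SpanContext, you can create a span that references it, even when the referenced span lives in another process.
  • Carrier. A key:value object into which a SpanContext is serialized. The Carrier is written and read by the tracer. OpenTracing defines several carrier formats. FORMAT_TEXT_MAP is a plain key:value map of strings. FORMAT_BINARY is an opaque binary representation whose layout is defined by the tracer. FORMAT_HTTP_HEADERS is like the text map, but restricted to what is valid in http headers.
  • Tracer. The implementation of the OpenTracing interface. It creates spans and sends them to a distributed tracing system such as Jaeger or Elastic APM. The Tracer also moves a SpanContext into and out of a carrier through two methods: inject and extract.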


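Extract reads a SpanContext from a carrier, if the carrier contains one. Inject writes a SpanContext into a carrier so that it can be sent elsewhere, for example over the network to another service. Together they let spans created in different processes join the same trace.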
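To make the inject and extract round trip concrete, here is a minimal sketch. It is a toy model, not the opentracing or jaeger-client API: the ToySpanContext type, the carrier key names and the baggage prefix are all assumptions made for illustration.

```typescript
// Toy model of SpanContext propagation; NOT the opentracing or jaeger-client API.
interface ToySpanContext {
  traceId: string;
  spanId: string;
  baggage: Record<string, string>;
}

type TextMapCarrier = Record<string, string>;

const BAGGAGE_PREFIX = 'baggage-'; // hypothetical key prefix

// inject: serialize the context into a plain string-to-string map.
function inject(context: ToySpanContext, carrier: TextMapCarrier): void {
  carrier['trace-id'] = context.traceId;
  carrier['span-id'] = context.spanId;
  for (const key of Object.keys(context.baggage)) {
    carrier[BAGGAGE_PREFIX + key] = context.baggage[key];
  }
}

// extract: rebuild the context, or return null when the carrier holds none.
function extract(carrier: TextMapCarrier): ToySpanContext | null {
  const traceId = carrier['trace-id'];
  const spanId = carrier['span-id'];
  if (!traceId || !spanId) {
    return null;
  }
  const baggage: Record<string, string> = {};
  for (const key of Object.keys(carrier)) {
    if (key.indexOf(BAGGAGE_PREFIX) === 0) {
      baggage[key.slice(BAGGAGE_PREFIX.length)] = carrier[key];
    }
  }
  return { traceId, spanId, baggage };
}

// Round trip: what one service injects, another can extract.
const carrier: TextMapCarrier = {};
inject({ traceId: 'abc', spanId: '1', baggage: { region: 'eu' } }, carrier);
const restored = extract(carrier);
```

An empty carrier yields null, which is the case a receiving service has to handle when a message arrives from a caller that does no tracing.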



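With the theory covered, let's write an example in TypeScript. The services will communicate over NATS, and the traces will go to Jaeger.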


NATS


NATS is a simple, fast messaging system written in Go (golang). It supports the Publish-Subscribe and Request-Reply patterns. The simplest way to run NATS is with Docker:


docker run -d --name nats -p 4222:4222 -p 6222:6222 -p 8222:8222 nats

Jaeger


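Jaeger is a distributed tracing system released as open source by Uber. Jaeger can store traces in Cassandra or Elasticsearch, or in memory for testing. Kafka can be placed in front of the storage as a buffer for incoming spans. So that tracing does not overload the system, Jaeger supports sampling. The available strategies are: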


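  • Const. The same decision is made for every trace: sample everything (param 1) or nothing (param 0).
  • Probabilistic. Jaeger samples each trace with a fixed probability. For example, with a param of 0.1 roughly 1 trace in 10 is sampled.
  • Rate Limiting. Samples no more than a fixed number of traces per second.
  • Remote. The sampling strategy is requested from the Jaeger backend, so it can be changed in one place without redeploying the services.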
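The first three strategies boil down to a yes/no decision per trace. The sketch below is a toy model of those decisions, not the Jaeger client's code. The function names are hypothetical, and the rate limiter uses a simple fixed one-second window, which is an assumption and may differ from the algorithm the real client uses.

```typescript
// Toy samplers; the real decisions are made inside the Jaeger client.
type Sampler = () => boolean;

// Const: the same decision for every trace.
function constSampler(param: 0 | 1): Sampler {
  return () => param === 1;
}

// Probabilistic: sample with the given probability.
// The random source is injectable so the behaviour can be checked deterministically.
function probabilisticSampler(rate: number, random: () => number = Math.random): Sampler {
  return () => random() < rate;
}

// Rate limiting: at most maxPerSecond sampled traces per one-second window.
// The clock is injectable for the same reason as the random source above.
function rateLimitingSampler(maxPerSecond: number, now: () => number = Date.now): Sampler {
  let windowStart = now();
  let sampledInWindow = 0;
  return () => {
    const current = now();
    if (current - windowStart >= 1000) {
      // A new window begins: reset the counter.
      windowStart = current;
      sampledInWindow = 0;
    }
    if (sampledInWindow < maxPerSecond) {
      sampledInWindow += 1;
      return true;
    }
    return false;
  };
}
```

In the Tracer configuration shown later in the article, the choice is expressed through the sampler's type and param fields rather than through code like this.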



To try Jaeger locally, the all-in-one image can be started with Docker:


docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.8


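Now for the application itself. It exposes an http API: GET /devices/:regionId returns json with the connected devices of a region, together with their users and locations.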



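The services talk to each other over NATS. When the endpoint is called, the api service receives the http request and sends a request over NATS to the devices service. devices loads the devices of the region from a database (say, mongodb), then fetches the location of each device from a cache (say, redis). Next, devices requests the users by id from the users service and returns the combined result to api.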


First, the types the services share:


// Domain entities
export interface Location {
  lat: number;
  lng: number;
}

export interface Device {
  id: string;
  regionId: string;
  userId: string;
  connected: boolean;
}

export interface User {
  id: string;
  name: string;
  address: string;
}

// A connected device enriched with its user and location
export interface ConnectedDevice extends Device {
  user: User;
  connected: true;
  location: Location;
}

The method names (NATS subjects) of the devices and users services:


export const UsersMethods = {
  getByIds: 'users.getByIds',
};

export const DevicesMethods = {
  getByRegion: 'devices.getByRegion',
};

Next comes a transport class that wraps the NATS client and exposes two methods, publish and subscribe.


import * as Nats from 'nats';
import * as uuid from 'uuid';

export class Transport {
  private _client: Nats.Client;
  public async connect() {
    return new Promise(resolve => {
      this._client = Nats.connect({
        url: process.env.NATS_URL || 'nats://localhost:4222',
        json: true,
      });

      this._client.on('error', error => {
        console.error(error);
        process.exit(1);
      });

      this._client.on('connect', () => {
        console.info('Connected to NATS');
        resolve();
      });
    });
  }
  public async disconnect() {
    this._client.close();
  }
  public async publish<Request = any, Response = any>(subject: string, data: Request): Promise<Response> {
    const replyId = uuid.v4();
    return new Promise<Response>(resolve => {
      // Subscribe to the reply subject before publishing, so that a fast reply cannot be missed
      const sid = this._client.subscribe(replyId, (response: Response) => {
        this._client.unsubscribe(sid);
        resolve(response);
      });
      this._client.publish(subject, data, replyId);
    });
  }
  public async subscribe<Request = any, Response = any>(subject: string, handler: (msg: Request) => Promise<Response>) {
    this._client.subscribe(subject, async (msg: Request, replyId: string) => {
      const result = await handler(msg);
      this._client.publish(replyId, result);
    });
  }
}
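One thing to keep in mind about the publish method above: its promise resolves only when a reply arrives, so a request to a service that is down never settles. A minimal sketch of one way to guard against that follows. The withTimeout helper is hypothetical, it is not part of the nats package, and the timeout value in the usage note is an arbitrary assumption.

```typescript
// Hypothetical helper, not part of the nats package:
// rejects if the wrapped promise does not settle within ms milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
    promise.then(
      value => {
        clearTimeout(timer);
        resolve(value);
      },
      error => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

It could be used as withTimeout(transport.publish(subject, data), 2000, subject). Note that this only unblocks the caller; the reply subscription created inside publish would still need to be cleaned up separately.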

The http API is built with express. The index file of the api service creates a Transport and forwards incoming requests to devices.


(async () => {
  const transport = new Transport();
  const port = 5000;

  await transport.connect();
  const api = express();

  api.get('/devices/:regionId', async (request, response) => {
    const result = await transport.publish<GetByRegion, ConnectedDevice[]>(DevicesMethods.getByRegion, {
      regionId: request.params.regionId,
    });

    response.send(result);

    return result;
  });
  api.listen(port, () => {
    console.info(`Server started on port ${port}`);
  });
})();

That leaves the devices and users services. We will look at devices; users is built along the same lines.


First, the repositories. They only imitate mongodb and redis with timeouts:


export class DeviceRepository {
  private db = 'mongodb';
  private devices: Device[] = [...];
  public async getByRegion(regionId: string): Promise<Device[]> {
    return new Promise(resolve => {
      setTimeout(() => resolve(this.devices), 300);
    });
  }
}

export class LocationRepository {
  private db = 'redis';
  private locations = new Map<string, Location>([...]);

  public async getLocation(deviceId: string): Promise<Location> {
    return new Promise(resolve => {
      setTimeout(() => resolve(this.locations.get(deviceId)), 40);
    });
  }
}

The core of the devices service is the getByRegion handler.


export async function getByRegion(request: Msg<GetByRegion>) {
  try {
    const deviceRepository = new DeviceRepository();
    const locationRepository = new LocationRepository();

    const regionId = request.regionId;
    const devices = await deviceRepository.getByRegion(regionId);

    const connectedDevices = await Promise.all(
      devices.map(async device => {
        const location = await locationRepository.getLocation(device.id);
        return { ...device, location };
      }),
    );

    const users: User[] = await transport.publish<GetByIds, User[]>(UsersMethods.getByIds, {
      // Request users by userId: the result is matched on device.userId below
      ids: devices.map(device => device.userId),
    });

    return connectedDevices.map(device => {
      const user = users.find(user => user.id === device.userId);
      return {
        ...device,
        user,
      };
    });
  } catch (error) {
    console.error(error);
    (this as any).createError(error);
  }
}

The index file of devices creates the Transport and subscribes the handler.


export const transport = new Transport();

(async () => {
  try {
    await transport.connect();

    transport.subscribe(DevicesMethods.getByRegion, getByRegion);
  } catch (error) {
    console.error(error);
    process.exit(1);
  }
})();

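Now, when the endpoint is called, we get json with the devices. To see what happens along the way, we need a Tracer. As the OpenTracing implementation we will use jaeger-client.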


import { JaegerTracer, initTracer } from 'jaeger-client';

export class Tracer {
  private _client: JaegerTracer;
  constructor(private serviceName: string) {
    this._client = initTracer(
      {
        serviceName,
        reporter: {
          agentHost: process.env.JAEGER_AGENT_HOST || 'localhost',
          agentPort: parseInt(process.env.JAEGER_AGENT_PORT || '6832', 10),
        },
        sampler: {
          type: 'const',
          param: 1,
        },
      },
      {},
    );
  }
  get client() {
    return this._client;
  }
}

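This creates a tracer that reports to the Jaeger agent. The sampler is const with param 1, so every trace is recorded. We will create spans in publish and subscribe, since every request between services passes through them. For that, the Transport needs access to a Tracer, which we pass to it as an optional constructor argument.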


  constructor(private tracer?: Tracer) {}

Tracing is added with decorators. The subscribePerfomance decorator wraps subscribe:


export function subscribePerfomance(target: any, propertyKey: string, descriptor: PropertyDescriptor) {
  const origin = descriptor.value;
  descriptor.value = async function() {
    if (this.tracer) {
      const { client } = this.tracer as Tracer;
      const subject: string = arguments[0];
      const handler: Handler = arguments[1];
      const wrapperHandler = async (msg: Msg) => {
        const childOf = client.extract(FORMAT_TEXT_MAP, msg[CARRIER]); // 1
        if (childOf) {
          const span = client.startSpan(subject, { childOf }); // 2
          this[CONTEXT] = span; // 3
          try {
            const result = await handler.apply(this, [msg]); // 4
            span.finish(); // 5
            return result;
          } catch (error) {
            span.setTag(Tags.ERROR, true); // 6
            span.log({
              'error.kind': error, 
            });
            span.finish();
            throw error;
          }
        } else {
          return handler(msg);
        }
      };
      return origin.apply(this, [subject, wrapperHandler]);
    }
    return origin.apply(this, arguments);
  };
}

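  1. The Tracer's extract method reads the SpanContext from the carrier that arrived with the message. If there is none, the handler runs without tracing.
  2. Start a new span as a child of the extracted SpanContext.
  3. Store the span on this, so that publish calls can pick it up and continue the trace.
  4. Call the original handler inside the span.
  5. If the handler succeeds, finish the span. If it throws (comment 6 in the code), tag the span with error, log the error, and finish the span before rethrowing.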
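The finish-on-success and tag-on-error steps recur in every decorator in this article, so they can be factored into one helper. The sketch below is an assumption about how that could look, not the author's code. SpanLike is a minimal stand-in for the Span type from the opentracing package, and withSpan and RecordingSpan are hypothetical names. The log fields follow the OpenTracing semantic conventions (event and error.object).

```typescript
// Minimal span shape for this sketch; the real Span comes from the opentracing package.
interface SpanLike {
  setTag(key: string, value: unknown): void;
  log(fields: Record<string, unknown>): void;
  finish(): void;
}

// Runs work inside a span: the span is always finished, and tagged on failure.
async function withSpan<T>(span: SpanLike, work: () => Promise<T>): Promise<T> {
  try {
    return await work();
  } catch (error) {
    span.setTag('error', true);
    span.log({ event: 'error', 'error.object': error });
    throw error;
  } finally {
    // finally runs on both paths, so finish cannot be forgotten in either branch.
    span.finish();
  }
}

// In-memory span that records calls; only here to exercise the helper.
class RecordingSpan implements SpanLike {
  public tags: Record<string, unknown> = {};
  public logs: Array<Record<string, unknown>> = [];
  public finished = 0;

  public setTag(key: string, value: unknown): void {
    this.tags[key] = value;
  }
  public log(fields: Record<string, unknown>): void {
    this.logs.push(fields);
  }
  public finish(): void {
    this.finished += 1;
  }
}
```

With such a helper, the body of each decorator reduces to starting the span and passing the wrapped call to withSpan.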


The publishPerfomance decorator wraps publish:


export function publishPerfomance(target: any, propertyKey: string, descriptor: PropertyDescriptor) {
  const origin = descriptor.value;
  descriptor.value = async function() {
    if (this.tracer) {
      const { client } = this.tracer as Tracer;
      const subject: string = arguments[0];
      let data: Msg = arguments[1];
      // Tracked per call: a flag shared between calls would finish spans that subscribe owns
      let isNewSpan = false;
      let context: Span | SpanContext | null = this[CONTEXT] || null; // 1
      if (!context) {
        context = client.startSpan(subject); // 2
        isNewSpan = true;
      }

      const carrier = {};
      client.inject(context, FORMAT_TEXT_MAP, carrier); // 3
      data[CARRIER] = carrier; // 4
      try {
        const result = await origin.apply(this, [subject, data]);
        if (isNewSpan) {
          (context as Span).finish();
        }
        return result;
      } catch (error) {
        if (isNewSpan) {
          const span = context as Span;
          span.setTag(Tags.ERROR, true);
          span.log({
            'error.kind': error,
          });
          span.finish();
        }
        throw error;
      }
    }
    return origin.apply(this, arguments);
  };
}

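  1. Look for a context stored on this. It is there when publish is called while an incoming message is being handled, that is, when the request is part of a chain. For example, api calls devices, which in turn calls users.
  2. If there is no context, start a new span.
  3. Inject the context into an empty carrier. If the context came from a parent span, the new request carries the same traceId.
  4. Attach the carrier to the outgoing message.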
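The effect of these steps across the api, devices and users chain can be followed in a small in-memory model. This is a sketch under simplifying assumptions: ToyContext, prepare and receive are hypothetical names, ids are plain counters, and no NATS or Jaeger is involved.

```typescript
// Toy model of how a trace survives two hops; no NATS or Jaeger involved.
interface ToyContext {
  traceId: string;
  spanId: string;
}

interface ToyMsg {
  payload: string;
  carrier: ToyContext; // what inject would have written into the message
}

let lastId = 0;
function newId(): string {
  lastId += 1;
  return String(lastId);
}

// A child span keeps the parent's traceId; a root span gets a fresh one.
function startSpan(parent?: ToyContext): ToyContext {
  return { traceId: parent ? parent.traceId : newId(), spanId: newId() };
}

// Publishing side: reuse the active context if there is one, else start a root span.
function prepare(payload: string, active?: ToyContext): ToyMsg {
  const context = active || startSpan();
  return { payload, carrier: { traceId: context.traceId, spanId: context.spanId } };
}

// Subscribing side: open a child span of whatever arrived in the carrier.
function receive(msg: ToyMsg): ToyContext {
  return startSpan(msg.carrier);
}

// api -> devices -> users
const toDevices = prepare('devices.getByRegion'); // api has no active span: new root
const devicesSpan = receive(toDevices);           // devices continues the trace
const toUsers = prepare('users.getByIds', devicesSpan);
const usersSpan = receive(toUsers);               // users ends up in the same trace
```

Every span in the chain ends up with the traceId of the root span started in api, which is what lets the tracing system assemble them into one trace.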

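Apply the decorators to the Transport methods and call the endpoint. The trace then appears in the Jaeger UI, where we can inspect it.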




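In the trace we can see that getByRegion takes 97.22% of the request time. What we cannot see is what happens inside it. Apart from the call to users, the work of the handler is invisible. So what is the time spent on? To find out, we add spans for the repository calls, again with a decorator.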


export function repositoryPerfomance({ client }: Tracer) {
  return function(target: any, propertyKey: string, descriptor: PropertyDescriptor) {
    const original = descriptor.value;
    descriptor.value = async function() {
      if (this.parent[CONTEXT]) { // 1
        const span = client.startSpan(propertyKey, { 
          childOf: this.parent[CONTEXT], // 2
        });
        span.setTag(Tags.DB_TYPE, this.db); // 3
        try {
          const result = await original.apply(this, arguments);
          span.finish();
          return result;
        } catch (error) {
          span.setTag(Tags.ERROR, true);
          span.log({
            'error.kind': error,
          });
          span.finish();
          throw error;
        }
      } else {
        return original.apply(this, arguments);
      }
    };
  };
}

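  1. Check whether there is a parent span. If there is none, call the method without tracing.
  2. Start a span as a child of the parent span.
  3. Tag the span with the database type. OpenTracing standardizes a number of tag names for this, as well as for things like the peer's ip address.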

Repeat the request and look at the trace in Jaeger.




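Now the picture is much clearer. The trace shows where the time in devices goes: 74.38% of it falls on a single step. The full code of the example is available on github.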


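We used NATS as the transport in this example, but the approach does not depend on it. OpenTracing is not tied to any transport; with http, for instance, the carrier can travel in the request headers. The same goes for the tracing system: Jaeger can be replaced with another OpenTracing-compatible tracer without rewriting the instrumented code.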


Collecting and analyzing traces from a distributed application is similar to performing an MRI with contrast in medicine, with which you can not only solve current problems, but also identify a serious disease at an early stage.

