CloudNativeJS/appmetrics-zipkin

The number of traces and spans are shown incorrectly in Zipkin

Opened this issue · 5 comments

Tried this library, unfortunately does not give the right results under high concurrency scenarios.
Ex:
Client—> Service-A —> Service-B

In my case Service-A and Service-B are microservices built using node.js. To simulate the high concurrency scenario, do the following :
-In Service-A, after a request is received from the client, add a setTimeout() of about 10 seconds, after which Service-A calls Service-B
-Have the Client send 5 requests one after the other within 10 seconds.
-Service-A will receive all 5 requests, before it is forwarded to Service-B

Under the above scenario, the expectation is to see 5 different traces with 2 spans in each trace. Instead I see a single trace with 6 spans in it.

Thanks for reporting this. Do you have a the same code for Service A and Service B anywhere (to save time creating a test)?

Service-A code....

'use strict';

const appzip = require('appmetrics-zipkin')({
    host: 'localhost',
    port: 9411,
    serviceName:'service-a',
    sampleRate: 1.0
});

const request = require('request');
const express = require('express');
const app = express();

app.get('/hello', (req, res) => {

    setTimeout(()=> {//simulate a DB operation with high latency via setTimeout
        request('http://localhost:3001/world', (error, response, body) => {
            if (error) {
                res.send(error);
            } else {
                res.send(body);
            }

        });
    },10000);
});

app.listen(3000, () => console.log('service-a listening on port 3000'));

Service-B code...

'use strict';

const appzip = require('appmetrics-zipkin')({
    host: 'localhost',
    port: 9411,
    serviceName:'service-b',
    sampleRate: 1.0
});

const express = require('express');
const app = express();

app.get('/world', (req, res) => {
    res.send('Hello World!');
});

app.listen(3001, () => console.log('service-b listening on port 3001'));

Use curl to invoke Service-A. Invoke it 5 times consecutively
curl -XGET http://localhost:3000/hello &

Check Zipkin and you will see 1 trace with 6 spans.

FYI, work has been happening on this. It looks like the issue is in the way that we propagate the async context - the use of setTimeout() is what's causing the problem.

This means that appmetrics-zipkin does work with concurrent requests/responses, but currently looses the context if their an async call (ie the use of async() or setTimeout() etc) between the incoming request and the outbound request to the downstream service.

@gdams is working on improving the async context propagation for this.

@gadams this could also be related to #40