opentracing/opentracing-python

Implementation of ScopeManager for event loop frameworks

yurishkuro opened this issue · 20 comments

@carlosalberto I finally had a chance to look at the new Scope API. It looks fine, but what about the implementations? There was a PR almost a year ago that showed how the API works with different async frameworks like Tornado, gevent, etc.

Hey!

So I had brought the Java examples to python under https://github.com/carlosalberto/python-examples (there's a branch with the previous PR, and I'm in the middle of updating it - mostly start_manual -> start and similar stuff). Once that is done, prior to a second release, I will create a PR so they live within the actual opentracing-python repo.

That should happen between later today and tomorrow ;)

Hey - so I was about to create a PR to import https://github.com/opentracing-contrib/python-examples (which is now updated with OT 2.0.0rc1), and realized the tests use basictracer as a dependency (to try out code against actual, simple Tornado/Gevent/Asyncio ScopeManagers).

I'm wondering what should we do here:

  1. Have basictracer as a dependency ONLY for the new examples directory (similar to what happens for tests).
  2. Add examples under basictracer itself?

Ideas?

@yurishkuro @palazzem

@carlosalberto I think the preferred approach is to use mock tracer for tests, not the basic tracer.

@carlosalberto I just had a look at the examples in your branch, but couldn't find implicit context propagation with Tornado, which afaik can only be done with StackContext, e.g.

https://github.com/uber-common/opentracing-python-instrumentation/blob/master/opentracing_instrumentation/request_context.py#L217

I am not familiar with other event loop based frameworks like gevent, but I assume they'll have similar requirement since a simple thread-local doesn't work on event loop frameworks.
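To make the thread-local point concrete, here is a minimal stdlib-only sketch (not tied to any tracer API) of why a plain `threading.local` breaks under an event loop: two handlers interleave on the same thread, so they share the single "active span" slot and clobber each other.

```python
import asyncio
import threading

# A plain thread-local "active span" slot; both coroutines below run on the
# same event-loop thread, so they share this single slot.
_local = threading.local()

results = {}

async def handler(name, delay):
    _local.span = name           # "activate" this request's span
    await asyncio.sleep(delay)   # yield to the loop; the other handler runs
    results[name] = _local.span  # read back what we think is our span

async def main():
    await asyncio.gather(handler('a', 0.02), handler('b', 0.01))

asyncio.run(main())
# handler 'a' slept longer, so 'b' overwrote the slot in the meantime:
print(results)  # {'b': 'b', 'a': 'b'} -- 'a' now sees the wrong span
```

This is exactly the gap that Tornado's StackContext (or Task/greenlet-local storage on other frameworks) has to fill.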

@yurishkuro Sorry for the late reply - some of these examples do manual Span propagation (such as listener_per_request or common_request_handler, which try to show how things would work using middleware to create a Span).

But other ones, such as nested_callbacks and subtask_span_propagation, show actual Span propagation, e.g. (from https://github.com/opentracing-contrib/python-examples/blob/master/tests/test_nested_callbacks/test_tornado.py):

        with TracerStackContext():
            with self.tracer.start_active_span('one', finish_on_close=False):
                self.submit()

and then:

    @gen.coroutine
    def submit(self):
        span = self.tracer.scope_manager.active.span

        @gen.coroutine
        def task1():
            self.assertEqual(span, self.tracer.scope_manager.active.span)
            span.set_tag('key1', '1')

            @gen.coroutine
            def task2():
                self.assertEqual(span, self.tracer.scope_manager.active.span)

I have also added a small description in https://github.com/opentracing-contrib/python-examples/blob/master/README.md of how the ScopeManager prototypes for each platform work (none other than Tornado supports out-of-the-box context propagation). Hope that's a good start ;) Let me know.

NOTE: I liked the implementation of Tornado propagation you showed better, and I will use it in the examples, as well as update the codebase for further inclusion in the opentracing-python repository ;)

Jumping in the thread a bit late to share our current implementation.

We added support for Tornado using "a kind of" ScopeManager (internally called ContextProvider). As @yurishkuro wrote, especially at that time there wasn't a real way to do context propagation in a Tornado app without using a TracerStackContext, and I can confirm that reliable storage is not available (at the time of writing, for any Tornado version).

Because of how Tornado works, I would say we don't have many options with/without ScopeManagers. Indeed, our implementation relies on it, and it was enough to cover our Tornado use cases.

For other loops, we support asyncio and gevent out of the box. There, the ScopeManager works very nicely, since we have reliable storage where we can put the stack of opened scopes (the Task object and greenlet-local storage, respectively).
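A stdlib-only sketch of the asyncio half of that idea (all names here are assumptions for illustration, not the actual contrib API): the stack of active scopes is kept as an attribute on the currently running Task, much as the gevent variant would use greenlet-local storage.

```python
import asyncio

class Scope:
    """Hypothetical scope: a handle to an activated span."""
    def __init__(self, manager, span):
        self.manager = manager
        self.span = span

    def close(self):
        # Deactivate: pop this scope off the current Task's stack.
        self.manager._stack().pop()

class TaskScopeManager:
    _ATTR = '_ot_scope_stack'  # attribute name is an arbitrary choice

    def _stack(self):
        task = asyncio.current_task()
        if not hasattr(task, self._ATTR):
            setattr(task, self._ATTR, [])
        return getattr(task, self._ATTR)

    def activate(self, span):
        scope = Scope(self, span)
        self._stack().append(scope)
        return scope

    @property
    def active(self):
        stack = self._stack()
        return stack[-1] if stack else None

async def demo():
    manager = TaskScopeManager()
    scope = manager.activate('span-1')  # a real tracer would pass a Span
    assert manager.active.span == 'span-1'
    scope.close()
    return manager.active  # None: nothing active anymore

result = asyncio.run(demo())
```

Note that because the stack lives on one Task object, it is not inherited by child Tasks, which is precisely the limitation the README quote below is about.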

Now I have a question (it also applies to gevent and threads) about this line from the README:

and does not provide automatic propagation from parent Tasks to their children, which needs to be done manually

What do you think about providing helpers or something to do the propagation? It will be manual instrumentation but part of a contrib API, that will make the propagation reliable and under our control. Since this could be off-topic, feel free to move this conversation in another thread.
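One shape such a contrib helper could take, sketched with stdlib contextvars purely for illustration (the helper name and the ContextVar are assumptions, not any existing API): capture the caller's active span when the child coroutine is created, and re-activate it inside the child, so the propagation is explicit and under our control.

```python
import asyncio
import contextvars

# Hypothetical "active span" slot; a real helper would talk to the
# ScopeManager instead.
_active_span = contextvars.ContextVar('active_span', default=None)

def with_parent_span(coro_fn):
    """Wrap a coroutine function so it runs with the caller's active span."""
    parent = _active_span.get()  # captured at wrapping time, in the parent

    async def wrapper(*args, **kwargs):
        token = _active_span.set(parent)  # re-activate in the child
        try:
            return await coro_fn(*args, **kwargs)
        finally:
            _active_span.reset(token)

    return wrapper

async def main():
    _active_span.set('parent-span')

    async def child():
        return _active_span.get()

    # On Python 3.7+ contextvars already copies context into new Tasks, but
    # wrapping makes the parent->child propagation explicit and reliable.
    return await asyncio.create_task(with_parent_span(child)())

result = asyncio.run(main())
print(result)  # parent-span
```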

@palazzem Hey hey.

What do you think about providing helpers or something to do the propagation? It will be manual instrumentation but part of a contrib API, that will make the propagation reliable and under our control. Since this could be off-topic, feel free to move this conversation in another thread.

Yes, this is definitely the plan ;) I didn't want to get into that here, but we should definitely help with Span propagation in instrumentation libraries (i.e. opentracing_tornado and others, besides the actual TornadoScopeManager, that is).

@yurishkuro So I was refactoring basictracer into a MockTracer version for the examples, getting rid of the sampling, the binary support and the abc package - and still ended up with lots of code (pretty similar-looking to basictracer).

So I'm wondering whether it would be better to keep the Python examples in their own repo, as they are now (we don't have any equivalent of the opentracing-mock package in opentracing-python, as we do in opentracing-java, and using the mock module could get too messy). Otherwise, either:

  1. Have a dependency to basictracer for the examples only (when they exist in opentracing/examples)
  2. Keep this port of basictracer I did locally, and put it as part of the examples, not for public usage, but only to be used along with these examples.

Opinion?

option 1 makes more sense

Upon talking a little bit more to @tedsuo, I started wondering if, given that the basic tracers will probably be deprecated, we should simply port the basictracer as mocktracer under the opentracing repository (as I had already mentioned as a possibility), and later probably have basictracer extend mocktracer. However, I have a few doubts still for this potential scenario.

Usually Python packages use a single, simple directory layout per package (including LICENSE, setup.py, etc.) - as opposed to Java builds, where a set of jars can be part of the same solution (even if distributed as separate artifacts, i.e. opentracing-api, opentracing-mock, etc.).

Given this, should we (if we were to have a mocktracer):

  1. Keep extra packages such as mocktracer or utils (mimicking Java) in opentracing as subfolders, with their entire setup.py and other files under them?

  2. Look for a way to do what Java does, creating different packages from a single setup.py or similar? (After a quick look, I haven't found anything like this).
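For option 1, the subpackage does not actually need its own setup.py: a single top-level setup.py can ship it as part of the one distribution. A hypothetical sketch (names and version are illustrative, not the real file):

```python
# Hypothetical setup.py at the repo root: a single distribution whose
# find_packages() call picks up opentracing.mocktracer as a subpackage,
# with no per-subfolder setup.py or separate artifact needed.
from setuptools import setup, find_packages

setup(
    name='opentracing',
    version='2.0.0',  # illustrative
    packages=find_packages(exclude=['tests*']),
)
```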

Let me know your thoughts. I already have such code locally, and I'd only need to do the hooks here and there for inclusion ;)

@yurishkuro @palazzem

I am not an expert in Python conventions, but my preference would be to keep Mock in the same opentracing module. It could live in a subdirectory, but still be part of the same module. For example, in the Jaeger client we have a submodule metrics, so you can do

from jaeger_client.metrics.prometheus import PrometheusMetricsFactory

The Prometheus client is not a runtime dependency of Jaeger client though, only a dev dependency (which might cause some versioning issues, in theory). But Mock tracer doesn't need any out of band reporting, so I don't think it will have any additional dependencies.
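To illustrate that last point, a toy in-memory mock tracer needs nothing beyond the standard library (the names below are illustrative, not the actual mocktracer API): finished spans are simply appended to a list that tests can inspect, so no out-of-band reporting is involved.

```python
# Minimal stand-in mock tracer sketch: spans are recorded in a plain list,
# so it has no reporting backend and no extra dependencies.
class MockSpan:
    def __init__(self, tracer, name):
        self.tracer = tracer
        self.name = name
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value

    def finish(self):
        self.tracer.finished.append(self)

class MockTracer:
    def __init__(self):
        self.finished = []  # tests inspect this in-memory record

    def start_span(self, name):
        return MockSpan(self, name)

tracer = MockTracer()
span = tracer.start_span('op')
span.set_tag('key', 'value')
span.finish()
assert [s.name for s in tracer.finished] == ['op']
```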

Having mocktracer in a subdirectory within opentracing sounds fine by me, especially given the example you gave. Thanks, will do that and will push it in a PR as part of the second RC ;) Thanks again for the feedback!

Agreed with @yurishkuro. I think it's better to have it in the opentracing module.

@palazzem sweet, looks like we have an agreement :)

After 2ffae7e, shall we close this issue @yurishkuro ?

I'm going to take this for a spin this w/e.

Sounds good! Let me know ;)

Covered by (under review) #83

@yurishkuro Shall we close this issue, now that we merged #83?

Hi all!
We need the same kind of ScopeManager implementation for Twisted (https://twistedmatrix.com/trac/).
Does anyone have an idea of how it could be implemented for Twisted?
I don't see anything similar to the current task (in asyncio), the current greenlet (in gevent), etc.
Would appreciate any help.