openzipkin/zipkin

Consolidation of effort on .net tracers or instrumentation

Closed this issue · 32 comments

I'm opening this issue on the main Zipkin repo, as I think it is important to guide users towards the most sustainable trace instrumentation, and this repo has the most watchers.

This issue is one about the future of the zipkin-csharp repo. Should we keep it and maintain it? Should we replace it with a more used codebase? This discussion is about reuse and sustainability and what makes most sense moving forward.

I'm hoping people chime in with feedback such as...

  • We have an internal tracer and are not likely to use zipkin-csharp
  • We use another open source tracer and are not likely to use zipkin-csharp
  • We use another .net language and are not likely to use zipkin-csharp
  • We are waiting for the code inside zipkin-csharp to be changed to X

Then, ideally a follow-up such as

  • I didn't know about zipkin-csharp, I'll try it or review pull requests
  • I think projects shouldn't be in OpenZipkin unless they have at least 3 users. Drop zipkin-csharp, as it confuses people
  • I'd work on or use a new project, but I don't want to work on other ones.
  • I'd rather tracer X was the official zipkin .net tracer. Let's contribute to that instead
  • I don't care about a tracer being in the OpenZipkin org or not, I use or contribute to X

Here's a survey of known tracers on github:

There are also some who are using internal tracers and/or likely to open source those.

For some background on zipkin-csharp, I noticed people were complaining about lack of a low-level zipkin tracer and facilitated the move of what is now zipkin-csharp into the org. It hasn't had any maintenance, despite having issues raised. One particular issue discussed what folks are looking for. @aliostad started working on zipkin-csharp recently, which will likely sort some issues. However, this won't automatically solve the lack of users problem.

With all this said, I'd appreciate if folks would chime in, so that energy is well spent and highly re-used.

cc @cwe1ss @dawallin @jcarres-mdsol @haf @bvillanueva-mdsol @johnberzy-bazinga @lschreck-mdsol @fedj

fedj commented

At Criteo, we use a custom tracer in C#. It's basically a translation of Finagle's tracer with a few modifications. It also brings a way to pass the trace context without having to modify every API.
We're not likely to use another one, since we have tight constraints on its performance, and we are likely to open source it.
To be honest, since we created our implementation long before zipkin-csharp or any other implementation in C#, we didn't look for any existing implementation.

In fact, I have a few ideas about a tracer project but as I said before, this project only implements key abstractions (such as span, annotations, etc) which can be useful to tracer projects.

There are actually tracer projects such as this or this out there.

I've talked to @bvillanueva-mdsol @lschreck-mdsol and the conclusions are:

  • Having one good tracer which most of the people like is ideal. We would like that also.
  • Do not stress too much about other tracers, if they are not good eventually they will die. It's a pity they do not use the official one but not much can be done.
  • We think our tracer is the most complete but it has big problems. The name sucks, depends on Owin and it is not .NET core compatible. We are aware of these problems and we want to improve them all. Unfortunately our developer team switches projects often so it is difficult to tell how much effort we can put on this.
  • We are not very familiar with @aliostad's work. The idea of a core component and then specific projects for Owin, winAPI or whatever sounds good; we have just not checked the project. The same goes for the work by @sbebrys, which seems like a cleanup of ours (most of the code is the same; it's a pity he did not keep the git history, as it would be easier to see the changes).

My recommendation would be to pick one and add it to the openzipkin org. Then refuse to support any attempt other than that one.

fedj commented

We could also open source ours which is transport independent, compatible with .Net core and is close to finagle APIs (it would bring familiarity). However, there is some work to do to be more open-tracing compliant.

fedj commented

FYI, we just open sourced our .Net tracer https://github.com/criteo/zipkin4net. It doesn't depend on any specific mechanism such as owin or httpcontext.

@fedj @jcarres-mdsol @sbebrys I have now reviewed the code from all the different repositories. It will take some time for me to document the pros and cons of each approach, and doing so could potentially lead to endless discussions.

On one hand, I feel this project, being part of the openzipkin initiative, needs code worthy of its name that is looked at as the de facto library. On the other hand, I do not want to single-handedly make decisions on it before getting community help and feedback, and risk alienating the community members who have already spent a lot of personal time building their implementations.

Do you think it is a good plan to bring together the good parts of all of them? And would you be able to contribute?

If so, I could document a proposal for the future of the project and we can discuss.

One thing is clear: I need a working tracer pretty soon for my company, and I am hoping we can make this project the one.

OK, I have created this gist with my ideas. We could potentially move this conversation to zipkin-csharp if @adriancole sees fit.

haf commented

So what about fsharp? Does the project have to be named by the language? Why not name it dotnet?

fedj commented

@haf One could argue that zipkin-java can be used in scala, groovy, ... and is not called zipkin-jvm. I think the idea is that the package is named after the language it's built in.

haf commented

@fedj So you'd maintain a package I create named zipkin-fsharp separate from zipkin-csharp?

I would agree with @fedj, and frankly I do not like the name of this project. But the scope, I believe, is to provide a .NET library that can be consumed by all .NET languages, including C#, F# and VB.NET. It is true that a fully functional F# library is a possibility, but I do not believe it is within the scope of this project.

OK, any comments, objections, improvements on my proposal? We would need to decide pretty soon.

fedj commented

@aliostad I think that it should not depend on the user to decide when to send traces to the collector. I understand it's async but it's still not enough. Furthermore, creating an N-th library from existing libraries looks like a good idea but I have the feeling that I will personally not use it since we have a working one (by working, I mean proven scalable). Furthermore, this library seems to no longer be supported.
To be a bit more clear, why are we trying to create (or twist hard) an unsupported library while others already exist and address your needs? Shouldn't we replace this lib?

fedj commented

I'm also always a bit wary of using a library that has been unsupported for more than 6 months and then completely rewritten. It makes me feel that it's unstable.
I'm of course subjective here but our library (https://github.com/criteo/zipkin4net) works every day in production with a lot of traffic without any glitch, with various safety nets.
To be honest, using a library made from scratch and used by nobody seems a bit risky.

I think that it should not depend on the user to decide when to send traces to the collector.

You are absolutely right. That is why the user does not decide; the dispatcher does. I must say that, out of all the libraries, I find yours the best, but it was not available when we started this conversation. I also have issues with it, which I can discuss here or in your repo if needed. Just as an example, there seem to be two concepts for dispatching (dispatcher and sender, which together amount to a considerable part of the code), which I find unnecessary and confusing. A common dispatcher can do some in-memory buffering, and then you would have transport-specific implementations.
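To make the "common dispatcher plus transport-specific implementations" idea concrete, here is a minimal sketch. It is written in Python only for brevity and is not the API of zipkin4net, zipkin-csharp, or any other library mentioned in this thread; `BufferingDispatcher`, `InMemoryTransport`, and `max_batch` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a finished span record (not a real Zipkin type).
Span = Dict[str, str]


class BufferingDispatcher:
    """Buffers finished spans in memory and hands them to a transport in batches.

    The dispatcher, not the caller, decides when spans are actually sent.
    """

    def __init__(self, send_batch: Callable[[List[Span]], None], max_batch: int = 3):
        self._send_batch = send_batch  # transport-specific part (HTTP, Kafka, ...)
        self._max_batch = max_batch
        self._buffer: List[Span] = []

    def dispatch(self, span: Span) -> None:
        """Accept one span; flush automatically once the buffer is full."""
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._send_batch(batch)


class InMemoryTransport:
    """Assumed test transport that just records the batches it receives."""

    def __init__(self) -> None:
        self.batches: List[List[Span]] = []

    def __call__(self, batch: List[Span]) -> None:
        self.batches.append(list(batch))


transport = InMemoryTransport()
dispatcher = BufferingDispatcher(transport, max_batch=3)
for i in range(4):
    dispatcher.dispatch({"name": f"span-{i}"})
dispatcher.flush()  # e.g. on shutdown, to drain the remaining span
```

With this shape, only the callable passed as `send_batch` changes per transport, while buffering lives in one place. A production version would also need a timer-based flush and thread safety, which are left out here.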

creating an N-th library from existing libraries looks like a good idea but I have the feeling that I will personally not use it since we have a working one

If everyone feels there is no need for another library, that too could in the end be the solution and the fruit of this conversation. I just had to ask whether you guys are happy to consolidate your contributions here or would prefer to use what you have built and seen working in production.

why are we trying to create (or twist hard) an unsupported library

If this gets off the ground, I will make sure that I, for one, support it. As I said, I have a business use case, and it is an area we will be investing in moving forward.

The .net topic is a pretty important one. There have been several occasions where someone has entered the gitter channel asking about .net tracing and we are stuck between a rock and a hard place: mention the incomplete zipkin-csharp project, or mention one of the two full-featured second-party projects, Criteo's and MdSol's. Unless I hear strong objections from folks in @openzipkin/core, we'll move forward with setting aside zipkin-csharp in favor of what's currently criteo/zipkin4net.

On stopping the zipkin-csharp experiment

I've initiated an effort to move zipkin-csharp either to the attic or under @aliostad's personal account. You can follow that there, but suffice to say it is stalled and not healthy compared to other options. I'm asking folks to comment on this issue if you have strong feelings about zipkin-csharp: openzipkin-attic/zipkin-csharp#36

On Criteo's tracer going into OpenZipkin

On moving forward, I've informally asked @openzipkin/core about .net or at least csharp. There's unsurprisingly still support to have a full-featured tracer available in the OpenZipkin repository. While great work exists both in MdSol and Criteo repositories, Criteo have extra resources available to carry forward a transition.

Why Criteo?

Code aside, I've paid attention to folks at Criteo. Like MdSol, Criteo actively participate in distributed tracing workshops and in tough problems here in Zipkin. @fedj takes time to address requests made by others, including those that make the tracer fit well with projects like zipkin-js. He's also one of the few people who volunteered to help do Zipkin builds, and has fixed bugs on the server. Their site is one of the highest volume sites Zipkin has, so keeping things working is of mutual benefit. Finally, it is more than reasonable to have a tracer written by one of our core team members be the official one. Knowing our other core teammates MdSol are supportive of this is equally important.

But what about fsharp etc?

If we are honest, it is not the right time to hold back a healthy csharp option in favor of one that doesn't exist. There is no project that covers all languages in dotnet, but if that happens, we can revisit it. It is encouraging that Criteo have taken time to address things they may not need themselves, such as .Net Standard 1.5 compat.

But what about OpenTracing?

@fedj has also raised work towards a bridge for those who want to use opentracing. That's currently a pull request, but could be pulled out as a separate repo to address version drift if needed: openzipkin/zipkin4net#74

Thank you @adriancole .

Frankly, in recent days I have been looking deep into the details of this project, and I can see that it could never have worked properly:

  • There was no Duration defined in the Span
  • TraceHeader did not have IsSampled
  • IDs defined as unsigned long and then being converted to long
  • I changed all of this without a single test breaking. The changes were obviously not related to functionality, but it seems there was more that could have been done.
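As a side note on the ID point above: reinterpreting an unsigned 64-bit ID as a signed one keeps the same bits but changes the numeric value, so the two forms only agree once they are serialized. The sketch below is in Python for brevity and is not code from zipkin-csharp; the helper names are hypothetical. The one outside fact it relies on is that Zipkin's B3 headers carry 64-bit IDs as 16 lower-hex characters.

```python
import struct


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit ID as a signed 64-bit integer (same bits)."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


def to_unsigned64(value: int) -> int:
    """Reinterpret a signed 64-bit ID as an unsigned 64-bit integer (same bits)."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]


def to_lower_hex(value: int) -> str:
    """Render a signed or unsigned 64-bit ID as 16 lower-hex characters."""
    return format(value & 0xFFFFFFFFFFFFFFFF, "016x")


unsigned_id = 0xFFFFFFFFFFFFFFFE       # a large ID with the top bit set
signed_id = to_signed64(unsigned_id)   # same bits, but a negative number

# The numeric values differ, yet the hex form sent over the wire is identical.
same_on_the_wire = to_lower_hex(unsigned_id) == to_lower_hex(signed_id)
```

The takeaway is only that a library should pick one in-memory representation and convert at the boundaries; mixing both without tests is where surprises tend to come from.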

My view was to fix the primitives so that this project could serve as the basic building block to build upon. But of course, the primitives are pretty small, and implementing them does not warrant a dedicated project, mainly because everyone has moved on now and built their own. So the Criteo project is the one that shines, since it has all that and more.

But there are problems, not of taste and implementation strategy but actual practicality:

As the name implies, the package is Criteo.Profiling.Tracing - with not even a mention of zipkin. It is very specific to how Criteo does things. They are in the new world of .NET (Core and .NET Standard), which covers only a very small percentage of the .NET ecosystem - my guess is less than 2%. The builds are all purely bash, no bat or powershell, which I guess are run on mac or linux unless you want to use Windows 10's bash for Windows (which comes with no warranty). This probably covers 5% of people's work machines - frankly, many devs will panic on seeing a file with a .sh extension.

I work for a company which is one of the top (if not toppest) Azure consumers on this side of the Atlantic, with 40 development teams. And it is ahead in many respects of the rest of the "Dark Matter" Enterprise. But Criteo is just at the bleeding edge, and they do not seem to make an effort to cover anyone else (and there is nothing wrong with that, as the name of the project implies). I personally cannot use any of the stuff: at work I am running Windows 7 along with the rest of the company (and I use my personal Mac for ssh, etc.), and we are only trialing .NET Core in one of our 40 dev teams.

There is not even a .NET 4.5-4.6 package of the library on nuget - it is all new world. Yes, you can run using the new world's backward compatibility but that is just not something we can do at this point. So if for me this is not usable, you can guess what it is for all those "Dark Matter" developers out there.

As such, my guess is we still need another project, one that:

  • Builds the primitives for all supported .NET Framework versions (4.5.2 and later)
  • Has a multi-target build which also targets .NET Standard
  • Treats Zipkin as the first-class citizen. Logging/Monitoring/Alerting is a big domain, and this needs to be just about the Zipkin side of it.

So here are my 2 cents. Frankly I wish I could just use Criteo packages but I cannot, so I have to build it from scratch.

fedj commented

Hi @aliostad,

Thanks for the context and the history behind this project. I figured I'd answer a few things mentioned.

As a part of moving to openzipkin, we'd expect to change the package names to Zipkin. No worries about that.

From an environment POV, we run both on OSX and Windows 7. We did set up a bash script for Travis, and would be happy to accept a powershell script PR if you find it helpful.

We compile both .Net standard and v4.5 but haven't published nuget v4.5 yet. As a part of publishing openzipkin packages, we'll start with .Net standard and if users ask for it, certainly can find a solution for 4.5

We hope that in moving to openzipkin, folks can share what they need in the issues list or even as a pull request. Either way is far better than guessing.

@fedj thank you for your response.

In fact, it addresses many of my concerns. I am happy to help if that is needed to make it usable for 4.5 users as well. I am really pleased that one project has been selected; this means the efforts in the community get more focused and consolidated.

So please let me know how and if I can help. I surely need a Zipkin addon for my Perfit project and need it quickly.

hmmm.... I have supplied fixes for the .NET 4.5 build, but there has been deafening silence on the zipkin4net repo. As I said, I am desperate to get a .NET 4.5 build of the primitives going so I can carry on with my other work. Are you guys on holiday?

you opened a pull request over a weekend, right?

Weekend? Yes, weekend. Out of everyone, I least expected this from you. As a community leader, I am sure you are aware that many people are not paid to work on OSS, and weekends and late nights are the only time any progress gets made. At least in the circle of people I am in, our day job usually has very little to do with our community contributions. I have 54 repos, and 99% of the time spent on them was outside work hours.

Also, I opened the issue on Friday (not on a weekend). I am sorry that I have been impatient; as I said, I am desperate to get primitives in .NET 4.5:
openzipkin/zipkin4net#94

We compile both .Net standard and v4.5 but haven't published nuget v4.5 yet. As a part of publishing openzipkin packages, we'll start with .Net standard and if users ask for it, certainly can find a solution for 4.5

I was really encouraged to hear this but I am sorry, the repo shows zero sign of it. Everything is .NET Standard, even the Travis build is on Linux. I can do all that contribution myself but please confirm that you are happy for me to do what it takes to make it fully cross-platform.

fedj commented

@aliostad I'm sorry to hear your frustration about this. I began working on this on Friday, but I was out for the weekend. As you can see here, the work is in progress, but I didn't want to PR unfinished, non-compiling work.

As I said, we're trying to make it v4.5 compatible again. "Again" because we were compatible until version 0.3.1. It's a bit complicated for now to have cross-builds working on Travis, but we can aim to have a nuget package working for v4.5 and netstandard for now. From what I understood, this would solve your issue.

Am I mistaken?

@fedj
Thank you. So it seems that you have already started the work. It would be good to have such work expressed as an issue on the repo, so others do not spend time doing exactly the same thing.

So do you need any help from me, or are you happy to carry on?

Ali, I'd recommend in general not using a "demand" tone when asking things of others.

I did apologise. I am sorry I was too demanding.

fedj commented

@aliostad Btw, the new version 0.4.1 is compatible with .Net 4.5 and netstandard 1.5

FYI there's been a wonderful amount of collaboration in https://github.com/criteo/zipkin4net including repackaging efforts. We don't have a timeline for migration to openzipkin org, but I suspect @fedj will drive this, right?

fedj commented

@adriancole I would be honored to. I would be delighted if we could speak together to see what needs to be done in order to prepare a plan.