SQL Commenter donation
SergeyKanzhelev opened this issue · 20 comments
Summary
SQL Commenter is an Apache-licensed open source project created and (currently) maintained by Google. It helps users of ORMs to quickly investigate performance problems by correlating database query executions with application code and parameters. It does this by embedding machine-readable comments in the generated SQL statements -- a form of distributed tracing.
Google proposes to contribute SQL Commenter to the Cloud Native Computing Foundation (CNCF) OpenTelemetry project. SQL Commenter libraries already rely on OpenTelemetry’s implementation of in-process context, so this would be a very natural combination.
SQL Commenter project consists of
- A specification outlining data formats and semantics
- Context injection modules for sql drivers and ORMs
- Components to instrument popular server libraries to collect this context
How does SQL Commenter enhance OpenTelemetry?
Making SQL commenter specification a part of the larger OpenTelemetry specs will allow collaboration among interested vendors, and foster broader adoption of this distributed tracing technique. In addition, components collecting execution context of this project may over time be merged with the corresponding SDKs collecting traces and metrics. Additional integration can be implemented based on user feedback to make the user experience better.
How will SQL Commenter continue to be maintained?
Today, all the SQL Commenter language implementations are developed in a single Git repository. Once the project becomes part of OpenTelemetry, the community will decide how to maintain the code in the long term -- whether and when to migrate from this “monorepo” to the “-contrib” (or "-instrumentation") repositories for each language.
Going forward, Google will continue to maintain this project. We welcome the opportunity to work with more contributors and maintainers, such as AWS and Dynatrace who have expressed interest.
Legal questions
SQL Commenter is NOT a registered trademark and name can be re-used as is.
Google is OK to change the copyright for the project to OpenTelemetry copyright in accordance with this suggestion.
This is awesome news! Look forward to contributing further.
Looks great!
Thank you, Google 💙
IIUC (e.g., here), SQL Commenter is already OTel-compatible... correct, yes?
IIUC (e.g., here), SQL Commenter is already OTel-compatible... correct, yes?
Yes, it OTel-comptible. More info on the launch blog
@SergeyKanzhelev Is there an issue you are tracking the TC evaluation on for SQLCommenter?
@SergeyKanzhelev Is there an issue you are tracking the TC evaluation on for SQLCommenter?
we do not typically create a separate issue. I think we discussed at GC meeting (unfortunately not reflected in notes) that we may take a shortcut in this case and make a decision on accepting this without in-depth due diligence. We can bring it up at Thursday meeting again to make a decision if there are still no concerns.
How does SQL Commenter enhance OpenTelemetry
Everything in that section can be achieved by making SQLCommenter an independent sandbox project at CNCF. That would be my preference.
We need to be careful about bloating the scope of OTEL project, which is already so large that we barely made it to Incubation due to many pieces of the project being pre-GA. SQLCommenter itself has a potential for larger scope, for example it may host plugins to different databases that would extract the context from the SQL comments and do something with it other than put into logs, e.g. actually participate in the distributed trace.
The due diligence document https://docs.google.com/document/d/1xSqEpXhRAWUt30S8OPDRmZYktJO9HOzBdzoEBMwqk9w/edit?usp=sharing
Discussed at today's GC meeting, this donation is accepted! Sqlcommenter team need to work on next steps to bring code into OpenTelemetry org
I have been looking through sqlcommenter as part of open-telemetry/opentelemetry-java-contrib#205
Looking through it, it appears to be instrumentation of SQL queries, notably propagation of context through comments in the queries. Seems like a pretty cool feature.
But I feel the current approach, with a separate repo for sqlcommenter doesn't align with OpenTelemetry. OpenTelemetry == Instrumentation - we already have instrumentation of database clients and having a separate one only for this commenting functionality will cause inconsistency, reduced performance for users, and fragmentation (self-inflicted). When adding instrumentation to OpenTelemetry, we usually go through the spec to define conventions to ensure they have been reviewed by the spec team to align with OTel and produce a consistent user experience, and implement in the instrumentations in each language. Such work can happen in parallel with actual integration into the existing database instrumentation behind an experimental flag. I don't see any reason this sort of approach doesn't align well with sqlcommenter and it provides a much better UX than random instrumentation.
I am seeing that this seems to have been a code-first process, dump existing sqlcommenter code into OTel, sidestepping (seemingly intentionally) work in the spec or any other sort of technical due diligence from language maintainers. Instead, a concept/talent-first process is usually what we'd follow I think, people are interested in adding sqlcommenting to OpenTelemetry, so they discuss in an OTep or spec issue / PR, get alignment / buyin, and integrate among the existing language SIGs. I hope a shift can still be made here.
Governance-wise, instrumentation is currently presented to users by the language maintainers. I think it's unfair that a decision to host language instrumentation in a separate repo was made by the GC+TC independently, at least from what I can tell from this thread, without wider discussion among the language maintainers to understand the approach to integrating this into OpenTelemetry in a way that it's not just a rebranding but a real part of the project.
By the way apologies for commenting on such an old issue - it was referenced in the relatively new one in java-contrib and I didn't notice how far back this one is. If there is a different way to discuss how to move forward with sqlcommenter within the project as a whole, let me know.
Governance-wise, instrumentation is currently presented to users by the language maintainers. I think it's unfair that a decision to host language instrumentation in a separate repo was made by the GC+TC independently, at least from what I can tell from this thread, without wider discussion among the language maintainers to understand the approach to integrating this into OpenTelemetry in a way that it's not just a rebranding but a real part of the project.
This is a temporary situation, and is similar to how we approached the donation of Stanza into OpenTelemetry.
Specifically, it's a multi-step process:
- Accept the code into OpenTelemetry under our governance for folks to see and prototype with. This is what you see today (the straight dump, bring users + contributors into the OTel community).
- Determine how best to contribute it back into OpenTelemetry, including where and what. I've been working w/ folks on SQLCommentor to help outline what this looks like, but having a separate repository is NOT the long term plan. This instrumentation should (eventually) be merged into
-contrib
and-instrumentation
packages. I'd view the existing repository like the proof-of-concept we use for fleshing out the Specification for SQL-Commentor-like features and functionality.
@jsuereth If the goal is to merge into OTel instrumentation that's good to hear. The thread seems to indicate wanting to publish OTel artifacts from this separate repository for use so it's ok to push back on that then?
open-telemetry/opentelemetry-java-contrib#205
Also governance-wise, I want to point out I am specifically talking about instrumentation. I'd expect infrastructure type components like Stanza to be treated differently indeed.
I think the idea was to make a release from OpenTelemetry of the initial form for folks to evaluate. We'd like to take this version and use it as the basis for refining + specification and then migrate into instrumentation components. It is NOT meant to compete with OTel instrumentation, but we would like to transition the user base from the existing SQLCommentor into the OTel community.
Also governance-wise, I want to point out I am specifically talking about instrumentation. I'd expect infrastructure type components like Stanza to be treated differently indeed.
If stanza was a bad analogy, we'll use flowmill (an eBPF instrumentation component) instead. Also, I'm not sure the component type changes the mechanism of donation much. We need a location to look at the component's source (where that source is under OTel governance) and decide how it best fits into the rest of OTEL before contributing back into the long-term home.
Perhaps someone from the Governance committee can comment here? @bhs @alolita @dyladan ?
Have to admit I haven't followed the sqlcommenter donation that closely. The JS implementation to me seems really simple and surprisingly up to date with the latest otel js api/sdk. I agree with @anuraaga though we already instrument both packages the js sqlcommenter supports and it isn't clear how they are meant to work together. It looks like at least the JS implementation require the end-user to make code changes and add hooks so I wouldn't expect it to conflict with the autoinstrumentation at all, but if we ever add these comments using the instrumentation then there could be some conflict.
Personally, this looks like a propagation format that should be added to the spec and integrated into the existing SQL instrumentations. I wouldn't expect the JS code in a separate repo to be maintained long-term. I didn't even know it was there for example and I'm the JS maintainer. Is there a maintainer for this sqlcommenter code?
I think the idea was to make a release from OpenTelemetry of the initial form for folks to evaluate. We'd like to take this version and use it as the basis for refining + specification and then migrate into instrumentation components.
This seems reasonable.
I do think that this evaluation/experimentation should take place in the language contrib repos though. This will help raise awareness of the component, and provide for better community review, guidance, and release tool infra.
And I think it should be understood by everyone that this component is only for evaluation/experimentation (no long term support), and the component will be removed from the repo once sql propagation is spec'd and available in the regular instrumentation components.
For reference I listed out options for Java here
open-telemetry/opentelemetry-java-contrib#205 (comment)
Publishing the existing code in some way from contrib is OK but it seems unproductive and unhelpful. sqlcommenter will just get a rebrand - it will cause work for users to migrate to the new artifact, and then migrate again when it is incorporated into normal DB instrumentation and deleted.
It's unclear what feedback we would hope to get through this user unfriendly experiment given sqlcommenter existing usage should already be a strong enough base to go with, e.g. it doesn't require a code dump of some existing messaging instrumentation to come up with messaging semantic conventions. This is compounded by the sqlcommenter codebase being quite small and easy to reason about - and taking the delta compared to our existing instrumentation it is actually tiny. This is probably not obvious to the TC but would be for any language maintainer, hence the call to engage them proactively and early when accepting donations of instrumentation code.
We need a location to look at the component's source (where that source is under OTel governance)
Calling this out specifically, I would ask why? This is all OSS, we could form a technical game plan just based on the google/ repository I don't see how bringing source into OTel governance is a requirement for this sort of process. Perhaps with a discussion including the language maintainers, we would have thought forking in has some value but maybe not. It doesn't seem like it needs to be a requirement.
hence the call to engage them proactively and early when accepting donations of instrumentation code.
This is why I asked the SQL commentor team to reach out to maintainers and figure out how to contribute the code ASAP. This is that opportunity.
If you're looking for more input before accepting the donation, I guess there could be some improvement to the process. However, a few points:
- The Donation request was opened in July (in public).
- One month later TC posted their due diligence evaluation (in public) on this issue.
- One month later GC accepted the donation (in public).
This is similar to Specification issues where we try to leave the issue open for enough time that maintainers can comment prior to acceptance. If you feel there's improvements to made here, please let us know. We can do a better job of advertising community donation PRs, perhaps, but I'm curious what form of engagement you're looking for here.
In terms of how to contribute SQLcommentor into each language instrumentation, this is the time when we're figuring that out. I think your plan to put things into contrib and attach to existing instrumentation w/ experimental flags is reasonable. My main concerns are:
- SQLCommentor team knows where they should post bug fixes for their current users. At one point, it was assumed that would be OTEL, but it sounds like it's a better idea keep those in the google repository for now.
- SQLCommentor team should be contributing Specification PRs with their proposed format for SQL instrumentation, and it needs to go through formal review.
- SQLCommentor team should be reaching out (and following guidelines) of language maintainers for where/how to integrate their existing instrumentation for proof of concept and validation that integration into OTEL will be succesful.
I like the proposals you put forward and those each seem reasonable next steps.
I don't see how bringing source into OTel governance is a requirement for this sort of process.
This brings the code into OTEL CLA + copyright, which alleviates a lot of clunky license tracking you'd have to do if you sourced directly form the Google repository (or any non-CNCF repository).
We can do a better job of advertising community donation PRs, perhaps, but I'm curious what form of engagement you're looking for here.
One concrete idea is the due technical diligence document - reading over it, there doesn't seem to be a lot of technical diligence in it from my reading, mostly conceptual. For instrumentation, especially when the languages are clear as in the java, nodejs, python, ruby of this project, it seems reasonable to have a signoff table of one maintainer from each of those languages. It's both information sharing and also getting an unofficial PoC involved from the start. This seems like a reasonable amount of process to gate an incoming submission.
I like the proposals you put forward and those each seem reasonable next steps.
Cool - I agree with your point 1 that it seems like both existing and new users will be best served by keeping the support channels separate at least for the time being. I'll provide some more concrete details for 3) in Java then and hope we can find a driver for 2). Thanks!
One concrete idea is the due technical diligence document - reading over it, there doesn't seem to be a lot of technical diligence in it from my reading, mostly conceptual. For instrumentation, especially when the languages are clear as in the java, nodejs, python, ruby of this project, it seems reasonable to have a signoff table of one maintainer from each of those languages. It's both information sharing and also getting an unofficial PoC involved from the start. This seems like a reasonable amount of process to gate an incoming submission.
This is a great idea. I'll bring this up in the next TC meeting I'm able to attend.