ImFlog/schema-registry-plugin

Ability to toggle pretty-print of outputted schema files for Avro/JSON formats

Closed this issue · 6 comments

Currently, all downloaded schemas are saved "as-received" from the Schema Registry, i.e. one-line with a single terminating newline character.

Since generated files have tons of serde fields, their readability is not that great. It would be nice if there was a toggle to save the schema information as formatted JSON for the relevant schema types (avro, json).

In addition, if we want to check in the files to VCS, e.g. Git would always overwrite the files if we were to format them manually and check them in, which is not ideal.

@ImFlog If you also think that something like this is in the spirit of the plugin, I could take a stab at a PR. IMO, it improves my workflow significantly (download schemas, generate, check-in to track schema changes in Git).

As a suggested solution, the best way that I can see to support this would be to add an optional prettyOutput: boolean flag to the global configuration so as to not pollute the subject and subjectPattern signatures. This would output formatted files for the json and avro schema formats and be a no-op for protobuf (unless there is an obvious way to support that use case that I'm not seeing - I'm not very familiar with protobuf).

Hello @filpano,
That's a good point. I think we should definitely support this.
The flag seems to be a good way to do this, I think I saw somewhere that on the confluentinc schema-registry library we could use specify a way to pretty print instead of doing the toString (formattedString for instance ? An issue about formatting was opened in the past => confluentinc/schema-registry#876 (comment)).
Might be worth to check their API before bringing a formatting library or doing it ourselves in the plugin.

We can check later for protobuf (not using it either).

I'll be glad to assist you on this PR, feel free to start tinkering and push something even if not finished (with the allow edits from maintainer) so I can step up to help when you need ;)

Thanks for the quick reply. I had a much deeper look at that whole thread than I would've liked to :) but it doesn't seem like that'll work for this use case:

  1. (Undocumented) support for the format query param was added only for the /schemas/ids/{id} endpoint. From the discussion it seems that this is the intended operation for "retrieve a raw schema".
  2. The io.confluent.kafka.schemaregistry.rest.resources.SubjectVersionsResource#getSchemaOnly resource which allows retrieving the raw schema for a subject-version pair via /subjects/{subject}/versions/{version}/schema does not offer this support at this time. I wouldn't mind opening a PR for this in https://github.com/confluentinc/schema-registry, but...
  3. ...going off the the linked discussion, it doesn't seem like the Confluent team wishes to support a formatting option on /subjects/{subject}/versions/{version}/schema as the schema parsing is pluggable to provide generic support for Avro/JsonSR/Protobuf. @rayokota may be able to provide some insight on whether this is actually the case or if I've misinterpreted.
  4. Lastly, the schema-registry-plugin (this project) uses the /subjects/{subject}/versions/{version} endpoint via schemaRegistryClient#getSchemaMetadata, which is also probably the correct one for the use cases it applies to. I don't think it makes sense to both open a PR to add pretty-print formatting in SR and another one here to use the /schemas/ids/{id} endpoint instead (which would be a significant rework and not at all justified for this issue).

So I think either the schema registry needs support for format/pretty in /subjects/{subject}/versions/{version}/schema (unlikely from what I've read so far), or we'd have to parse and format it ourselves here, as initially thought - though I'd prefer the former. If the latter is OK for you, I can get started on a PR.

What do you think?

I think we should add support on our side for now while opening an issue on the confluent repository (and linking to this issue).
We have less inertia than the confluent project and thus we can have something pretty quickly. But as you said it makes more sense to put this on the schema-registry directly.

I have a small issue with the plugin publishing (an error on Gradle side it seems) but I didn't forget to release this.

It's been released in https://github.com/ImFlog/schema-registry-plugin/releases/tag/v1.13.0
Thank you again for your contribution 🙇