scalapb/ScalaPB

`_typemapper` are defined as package private, causing issues when deriving schemas

Closed this issue · 4 comments

I have a proto with a nested (internal) enum, and I'm trying to get rid of the enum type by mapping it to a String so that it is Avro compatible (otherwise it is recognized as a nested union, which isn't a valid Avro type):

syntax = "proto3";

package test;

import "scalapb/scalapb.proto";

option (scalapb.options) = {
  preserve_unknown_fields: false,
  lenses: false,
  retain_source_code_info: true,
  single_file: true,
  aux_field_options : [
    {
      target: "test.NestedExampleEvent.action"
      options: {
        type: "String"
      }}
  ],
};

message NestedExampleEvent {
  string id = 1;
  Action action = 2;
  enum Action {
    Undefined = 0;
    Allow = 1;
    Deny = 2;
  }
}

This will generate

final case class NestedExampleEvent(
    id: _root_.scala.Predef.String = "",
    action: String = test.nested.NestedExampleEvent._typemapper_action.toCustom(test.nested.NestedExampleEvent.Action.Undefined)
    ) 

with

private[nested] val _typemapper_action: _root_.scalapb.TypeMapper[test.nested.NestedExampleEvent.Action, String] = implicitly[_root_.scalapb.TypeMapper[test.nested.NestedExampleEvent.Action, String]]

In the NestedExampleEvent companion object.

This means, even if I define a mapper as

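import scalapb.{GeneratedEnum, GeneratedEnumCompanion, TypeMapper}

// Maps any generated enum to its name; fromName(...).get throws on an unknown name.
def enumMapper[E <: GeneratedEnum](implicit ec: GeneratedEnumCompanion[E]): TypeMapper[E, String] =
  TypeMapper[E, String](_.name)(ec.fromName(_).get)

given TypeMapper[test.nested.NestedExampleEvent.Action, String] = enumMapper[test.nested.NestedExampleEvent.Action]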

I cannot do

import test.nested.NestedExampleEvent
import test.nested.NestedExampleEvent.given 
import com.sksamuel.avro4s.Encoder as AvroEncoder

trait Thing[A <: scalapb.GeneratedMessage: AvroEncoder]
case object ExampleThing extends Thing[NestedExampleEvent]

Since

value _typemapper_action cannot be accessed as a member of test.nested.NestedExampleEvent.type from object ExampleThing.
  private[nested] value _typemapper_action can only be accessed from package test.nested in package test.
    case object ExampleThing extends Thing[NestedExampleEvent]

I believe AvroEncoder uses magnolia to derive a schema, but the same problem will apply to anything that inspects the constructor.

Can I stop scalapb from making _typemapper_* private?

If that isn't possible (some sed post-hook magic notwithstanding), I might qualify this somewhere halfway between bug and feature request, I suppose, unless there's a reason for this being the way it is that I'm not seeing?

> unless there's a reason for this being the way it is that I'm not seeing?

The cached _typemapper is an implementation detail that is subject to change and not meant to be directly accessed by users and therefore is private.

I'm curious how far the automated derivation of avro4s can take you anyway, considering bytestrings, oneofs, and other things that may show up. Wouldn't it make sense to implement custom Avro Decoder and Encoder for protos using avro4s, similarly to what we do for json4s?

Thanks for the response! That being an implementation detail makes sense, but I wonder if an "evolving" annotation or something of that nature might work too?

Some background -

> Wouldn't it make sense to implement custom Avro Decoder and Encoder for protos using avro4s

The neat thing about deriving the schemas (and coders) from the protos is that I don't have to do that and expanding my consumer system is super simple and doesn't force our non-Scala devs to deal with the Scala code base when working on protos. This includes schema changes as well as net new messages.

All we need to do is run a build and the system will compile both the proto sources as well as the derived schemas, meaning there is no need for code changes, provided we map the non compatible types to a primitive first.

That is mostly a one time effort - for enums, the code in the post works well, for things like custom types, timestamps, byte strings (...), we can come up with simple mappers that cover these cases. I've prototyped this with the enums and it works well, provided I remove the private modifier first.
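A minimal sketch of what such mappers might look like for byte strings and timestamps, assuming scalapb-runtime is on the classpath. The object name `AvroFriendlyMappers`, the given names, and the choice of Base64 and epoch milliseconds as target representations are assumptions made for illustration, not something taken from this thread.

```scala
import com.google.protobuf.ByteString
import com.google.protobuf.timestamp.Timestamp
import scalapb.TypeMapper
import java.util.Base64

// Hypothetical holder object; its name and the chosen encodings are assumptions.
object AvroFriendlyMappers {
  // bytes <-> Base64-encoded String
  given bytesAsBase64: TypeMapper[ByteString, String] =
    TypeMapper[ByteString, String](bs => Base64.getEncoder.encodeToString(bs.toByteArray))(
      s => ByteString.copyFrom(Base64.getDecoder.decode(s))
    )

  // Timestamp <-> epoch milliseconds; sub-millisecond precision is dropped on the way out.
  given timestampAsEpochMillis: TypeMapper[Timestamp, Long] =
    TypeMapper[Timestamp, Long](ts => ts.seconds * 1000L + ts.nanos / 1000000L)(
      ms => Timestamp(
        seconds = Math.floorDiv(ms, 1000L),
        nanos = (Math.floorMod(ms, 1000L) * 1000000L).toInt
      )
    )
}
```

As with the enum mapper, these would have to be in implicit scope wherever the generated code is compiled, and the corresponding fields would need a `type` override in the proto options.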

ScalaPB, being as awesome as it is, has made that very simple, since I can just use transformers with package-level proto files to apply this change across the board to all our protos.

On a side note, the generated companion objects and the ability to add custom imports and inheritance via proto also allowed me to make the whole system very generic with simple type bounds and some type class constraints, which is also very much appreciated!

Maybe you have another idea on how to feed the generated classes to magnolia (or others)?

Can you help me reproduce the problem you described? I've created this repo and it seems to compile with no issues: https://github.com/thesamet/scalapb-issue1664

> Can you help me reproduce the problem you described? I've created this repo and it seems to compile with no issues: https://github.com/thesamet/scalapb-issue1664

This was an interesting morning.

Turns out -

In our repo, -Yretain-trees was enabled, since avro4s sets that: https://github.com/sksamuel/avro4s/blob/2beb8cbdcbb021609bd1b30f922444b81fbc0096/project/Build.scala#L38

Unfortunately, avro4s uses that to set default values in avro schemas (see scala/scala3#16176, I don't think this is documented anywhere).

See https://github.com/chollinger93/scalapb-issue1664

So when you run it without -Yretain-trees, i.e. sbt 'set scalacOptions ~= (_.filterNot(Set("-Yretain-trees")))' run, you get a schema, albeit without defaults:

{"type":"record","name":"NestedExampleEvent","namespace":"test.nested","fields":[{"name":"id","type":"string"},{"name":"action","type":"string"}]}

Compare this to

sbt clean
sbt run
# [error] one error found ....
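# The brackets must be escaped: unescaped, [nested] is a regex character class and the pattern never matches the modifier.
sed -ie 's/private\[nested\]//g' target/scala-3.4.0/src_managed/main/scalapb/test/nested/NestedProto.scala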
sbt run

Gives

{"type":"record","name":"NestedExampleEvent","namespace":"test.nested","fields":[{"name":"id","type":"string","default":""},{"name":"action","type":"string","default":"Undefined"}]}
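One possible reading of why the plain build compiles, sketched under assumptions: Scala compiles a default argument into a getter method on the companion, so the package-private member is only ever referenced from inside its own package. A macro that copies the retained default expression to the derivation site would instead reference it from the caller's package, which is presumably where the access error comes from. The names `demo.nested`, `demo.app`, `Event`, and `cachedDefault` are hypothetical stand-ins for the generated code.

```scala
package demo.nested {
  object Event {
    // Package-private cached value, mirroring the shape of the generated _typemapper_action.
    private[nested] val cachedDefault: String = "Undefined"
  }

  // The default for `action` is evaluated by a getter compiled into this package.
  final case class Event(id: String = "", action: String = Event.cachedDefault)
}

package demo.app {
  object Main {
    def main(args: Array[String]): Unit = {
      // Compiles from another package: the call site invokes the default getter
      // and never names cachedDefault itself.
      val event = demo.nested.Event()
      println(event.action)

      // Would NOT compile here, because the member is private[nested]:
      // println(demo.nested.Event.cachedDefault)
    }
  }
}
```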