another starlark / proto binding
alandonovan opened this issue · 6 comments
Hi, author of go.starlark.net here. Just wanted to let you know that I have an implementation of Starlark support for protocol messages that we use inside Google. It presents a slightly different API that what you have in skycfg. See the example below for comparison.
I haven't open-sourced it yet because it depends on Google's internal implementation of dynamic message and message reflection. The Go protobuf team is hard at work adding support for these features to the open-source proto.Message API (golang/protobuf#364). Once that's done I will port our starlarkproto package to use it and then make it available.
proto:
syntax = "proto2";
package bugtracker;
enum Status {
NEW = 0;
ASSIGNED = 1;
ACCEPTED = 2;
FIXED = 3;
WONTFIX = 5;
}
enum Type {
BUG = 0;
FEATURE = 1;
OTHER = 2;
}
message Issue {
required int64 id = 1;
optional string category = 2;
optional Status status = 3;
optional string reporter = 4;
optional string assignee = 5;
optional string verifier = 6;
repeated string cc = 7;
optional Type type = 8;
optional int32 priority = 9;
optional uint32 severity = 10;
repeated string note = 11;
repeated KeyValuePair metadata = 12;
message Nested {
optional string x = 1;
optional Nested y = 2;
}
optional Nested nested = 13;
optional float score = 14;
}
Starlark:
# Calling a message Descriptor instantiates a message.
issue = bugtracker.Issue(
id = 12345,
status = bugtracker.Status.ASSIGNED,
reporter = "adonovan",
note = ["A note", "Another note"],
metadata = [bugtracker.KeyValuePair(key = "greeting", value = "hello")],
ext1 = 1,
ext2 = 2,
)
# Fields may be updated:
issue.type = bugtracker.Type.FEATURE
# Unset fields act like an immutable default value.
assert.eq(issue.nested.y.y.y.y.y.x, "")
# Calling a descriptor with a single positional argument makes a shallow copy.
issue2 = bugtracker.Issue(issue)
assert.eq(str(issue2), str(issue))
issue2.reporter = "nobody"
assert.eq(issue2.reporter, "nobody")
assert.eq(issue.reporter, "adonovan") # unchanged
# Submessages can be expressed using dicts.
assert.eq(str(bugtracker.Issue(nested = {"x": "x"})), 'bugtracker.Issue(nested={x="x"})')
# The str function prints the entire structure.
assert.eq(str(issue), 'bugtracker.Issue(id=12345, status=Status.ASSIGNED, reporter="adonovan", type=Type.FEATURE, note=["A note", "Another note"], metadata=[bugtracker.KeyValuePair(key="greeting", value="hello")], ext1=1, ext2=2)')
# marshal, marshal_text encode a message.
data = proto.marshal(issue)
text = proto.marshal_text(issue)
assert.eq(data, "\b\xb9`\x18\x01\"\badonovan@\x01Z\x06A noteZ\fAnother noteb\x11\n\bgreeting\x12\x05hello\xa0\x06\x01\xa8\x06\x02")
assert.eq(text, '''id: 12345
status: ASSIGNED
reporter: "adonovan"
type: FEATURE
note: "A note"
note: "Another note"
metadata {
key: "greeting"
value: "hello"
}
[bugtracker.Outer.ext2]: 2
[bugtracker.ext1]: 1
''')
# unmarshal, unmarshal_text decode a message.
msg = proto.unmarshal(bugtracker.Issue, data)
assert.eq(str(msg), str(issue)) # messages are equal
Thanks! I was planning to upstream some of the Protobuf code from Skycfg into upstream once the go-protobuf reflection API landed, but it looks like that won't be necessary.
How stabilized is your API? There are a couple parts of it that are a bit awkward to work with in the context of config data, which is why Skycfg diverges. If possible, I would love to resolve the API differences in favor of a single unified starlarkproto
package that both codebases could share[0].
Specifically:
- If an unset field is an immutable default value, then
if issue.nested == None
becomes difficult/impossible to express. I assume you have aproto.HasField(issue, "nested")
or similar helper? I found users without a background in*_pb2.py
generated code got very confused when I tried that approach here. - Do you also do shallow copies when assigning to fields? The Ruby generated code does and it's caused some confusion here, so Skycfg does a bit of magic to make
meta = ObjectMeta(); job = CronJob(meta=meta); meta.name = "hellocron"
work. - We allow strings to be assigned to enums, so
status = bugtracker.Status.ASSIGNED
andstatus = "ASSIGNED"
both work. Does your package support this, or if not, would you be willing to have it do so?
[0] Skycfg would need some hooks to handle weird bits of Kubernetes' pseudo-protos.
I've copied the API of the package below; as you can see, it's pretty straightforward. (descpool is Google's internal descriptor pool.)
If an unset field is an immutable default value, then if issue.nested == None becomes difficult/impossible to express. I assume you have a proto.HasField(issue, "nested") or similar helper? I found users without a background in *_pb2.py generated code got very confused when I tried that approach here.
Yes, you need to use proto.has(msg, "fieldname"). Sorry I didn't show that in the example, but it's part of the Starlark module, along with {,un}marshal{,_text}.
Do you also do shallow copies when assigning to fields? The Ruby generated code does and it's caused some confusion here, so Skycfg does a bit of magic to make meta = ObjectMeta(); job = CronJob(meta=meta); meta.name = "hellocron" work.
A proto field is either a scalar, a list (RepeatedField), or a reference to a message. Assigning to a scalar field is trivial. Assigning an iterable to a repeated field causes allocation of a new RepeatedField value and a check that each element of the iterable can be assigned to T. Assigning to an element of a repeated field causes an array element update; RepeatedFields are basically mutable lists of a particular element type. Assigning to a message field x.f=y field causes x.f to alias y. Mutations to one will be observed by the other.
We allow strings to be assigned to enums, so status = bugtracker.Status.ASSIGNED and status = "ASSIGNED" both work. Does your package support this, or if not, would you be willing to have it do so?
Yes, strings, numbers and EnumValueDesciptor values can all be assigned directly to enum-typed message fields.
package starlarkproto
type Descriptor struct{Desc descpool.Descriptor}
method (Descriptor) Attr(name string) (starlark.Value, error)
method (Descriptor) AttrNames() []string
method (Descriptor) CallInternal(thread *starlark.Thread, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)
method (Descriptor) Field(name string) (starlark.Value, error)
...boilerplate Value methods skipped...
type EnumDescriptor struct{Desc descpool.EnumDescriptor}
method (EnumDescriptor) Attr(name string) (starlark.Value, error)
method (EnumDescriptor) AttrNames() []string
method (EnumDescriptor) CallInternal(_ *starlark.Thread, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)
type EnumValueDescriptor struct{Desc descpool.EnumValueDescriptor}
method (EnumValueDescriptor) Attr(name string) (starlark.Value, error)
method (EnumValueDescriptor) AttrNames() []string
method (EnumValueDescriptor) CompareSameType(op syntax.Token, y_ starlark.Value, depth int) (bool, error)
type FileDescriptor struct{Desc descpool.FileDescriptor}
method (FileDescriptor) Attr(name string) (starlark.Value, error)
method (FileDescriptor) AttrNames() []string
type Message struct{...}
method (*Message) Attr(name string) (starlark.Value, error)
method (*Message) AttrNames() []string
method (*Message) Descriptor() descpool.Descriptor
method (*Message) Marshal() ([]byte, error)
method (*Message) MarshalText(w io.Writer) error
method (*Message) SetField(name string, v starlark.Value) error
type RepeatedField struct{...}
method (*RepeatedField) Index(i int) starlark.Value
method (*RepeatedField) Iterate() starlark.Iterator
method (*RepeatedField) Len() int
method (*RepeatedField) SetIndex(i int, v starlark.Value) error
func SetPool func(thread *starlark.Thread, pool descpool.Pool)
func Pool func(thread *starlark.Thread) descpool.Pool
func Load func(thread *starlark.Thread, name string) (starlark.StringDict, error)
func LoadProtoModule func() (starlark.StringDict, error)
func Unmarshal func(pool descpool.Pool, desc descpool.Descriptor, data []byte) (*Message, error)
func UnmarshalText func(pool descpool.Pool, desc descpool.Descriptor, data []byte) (*Message, error)
@alandonovan, any update on this issue? I see that work seems to be progressing on golang/protobuf#364. Is everything on track for open-sourcing your starlark/protobuf bindings? Also, I assume that those bindings will be dynamic (ie. based on descriptors, not on Golang reflection against the generated protobuf code). Could you please confirm that this is correct? Thanks!
I've started to look at applying these API bindings here: https://github.com/afking/starlarkproto . Only have basic message construction but will update to support the full API soon.
@emcfarlane Thanks! Do you use it somewhere?
@alandonovan Any plans/chance that such bindings be available for Bazel?
I'd prefer to write Starlark code rather than pbtxt :D
FWIW, the starlarkproto package was committed to go.starlark.net back in November: google/starlark-go#318
If there's something that it should support, but does not, please open an issue at go.starlark.net.