Question on copy and paste between msgs.
msackman opened this issue · 8 comments
Say I receive a msg and I decode it. I then save a ptr to some part of it:
seg, _, err := capn.ReadFromMemoryZeroCopy(data)
if err != nil {
	return nil, err
}
varCap := ReadRootVarCap(seg)
idCap := varCap.Id()
positions := mt.Positions(idCap.Positions())
return &positions
Then, at some later point, I want to include positions in some output msg:
func AddToSegAutoRoot(seg *capn.Segment, pos *Positions) VarCap {
	vCap := AutoNewVarCap(seg)
	idCap := NewVarIdPosCap(seg)
	idCap.SetPositions(capn.UInt8List(*pos))
	vCap.SetId(idCap)
	return vCap
}
The question is: does the entire segment from which the data was drawn (in the incoming msg), get copied/included with the output segment, or is only the exact data needed used?
Whilst the docs say things about "pointer semantics", and I've read some bits of the spec for encoding, far pointers, that sort of thing, and I've also looked through the code, I can't quite come across a definitive answer.
Basically, I'm trying to avoid the work of decoding things that I only need to relay through. But I don't want the whole of the incoming segment included in the output.
Obviously, I can write a test to check what the code is doing right now, but I'm more curious as to what the design intention is for this.
The intent and the implementation are the same. The pointed-to object tree is copied from one segment to the other.
For reference, the code that does the copying is in capn.go, in routines writePtr() and copyStructHandlingVersionSkew(), which are mutually recursive.
@msackman -- I added a convenience public method that copies objects between two different segments. You can use this to verify that you are getting the behavior you want. Simply make two segments, then copy objects between them as you like. You will have to cast both dest and src structs to capn.Object. See CopyToFrom() at https://github.com/glycerine/go-capnproto/blob/master/capn.go#L1491
@glycerine Many thanks.
I'm sorry if you found the question naive or lazy. There are many things determined by the spec, but much is left open to different implementations, and that has caught me out. For example, I hadn't realised that you need to populate objects from the children upwards - to me the API is very functional in implementation rather than OO-based - i.e. you can't just construct a nest of pointers and expect it all to work out when you write it out. As that had initially tripped me up, I wanted to tread carefully with other aspects of the API/implementation. Thanks once again.
@msackman, no worries. Sorry for my terse earlier reply. I'm just super busy at the moment. It is a good question you asked.
Since you mention usability and getting tripped up, I'll mention my bambam project which aims to make using go-capnproto easier/more convenient.
The idea with bambam is that you simply write the structures you want in Go, and then run the bambam code-generator on the .go file to generate capnproto Save() and Load() methods. In this way the .go file can act as your original schema.
Even if you don't want/use the original Go structs that mirror the capnproto structs (as such a struct is an extra copy of the data, of course, and for performance you may aim for fewer copies), the resulting generated code is instructional for how to use the go-capnproto API. It can also be quite convenient when performance is not critical.
https://github.com/glycerine/bambam
While this goes somewhat against the performance-oriented grain of capnproto, it is still very useful for generating the boilerplate associated with serialization and for understanding the go-capnproto API. I often use bambam-generated code as a starting point for customization. Also note that not all capnproto features are supported by bambam. For example, unions and groups -- space saving features of the capnproto schema language -- aren't available in Go, and so you'll frequently still need to customize further the schema.capnp files generated by bambam.
Lastly, I'll note in response to your comment that I don't believe it should be strictly necessary to construct a capnproto object tree from the leaves up. I frequently allocate my root object first, then children, and then Set the child pointers in the root object to point to the already-allocated-in-the-same-segment child objects. If you were having difficulties that didn't make sense, feel free to discuss them here or in the capnproto forum.
@glycerine Many thanks for your reply and very helpful suggestions - I'll certainly take a look at bambam and learn from it.
With regards to the leaves-up point, after some further testing, I think I've narrowed it down: If you have a list to a custom type:
struct Foo {
  bars @0: List(Bar);
}
struct Bar {
  thing @0: Bool;
}
then the following shows the problem:
bars := []bool{true, false, true}
seg := capn.NewBuffer(nil)
foo := NewRootFoo(seg)
barsList := NewBarList(seg, len(bars))
foo.SetBars(barsList)
for idx, b := range bars {
	bar := NewBar(seg)
	barsList.Set(idx, bar) // set into list before setting the contents of bar
	bar.SetThing(b)
}
buf := new(bytes.Buffer)
seg.WriteTo(buf)
fmt.Println(buf.Bytes())
seg = capn.NewBuffer(nil)
foo = NewRootFoo(seg)
barsList = NewBarList(seg, len(bars))
foo.SetBars(barsList)
for idx, b := range bars {
	bar := NewBar(seg)
	bar.SetThing(b) // this time set the contents first
	barsList.Set(idx, bar) // now set into the list
}
buf2 := new(bytes.Buffer)
seg.WriteTo(buf2)
fmt.Println(buf2.Bytes())
The results are different.
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
But yes - this does seem to be the only case where it appears to go wrong. I'd wrongly concluded that it always had to be child-up and that I'd just happened to write a lot of code where I'd mainly written it child-up and just made the mistake a few times. But you're suggesting that it should never have to be child-up, so the above is a bug?
Assuming for a moment we have only a single segment (which is what go-capnproto will always give you by default unless you make more than one), Capnproto makes a linear flat segment of memory/disk that contains a tree of objects.
Inside the segment, the flattened tree's nodes can be in any order.
The only constraint is that the root appear first, so you can easily find it.
The order of children/pointers after the root is unconstrained.
So there are a large number of possible flattenings of any given tree.
Capnproto will allocate as it needs to, so the data won't look the same if you construct it in a different order. So in general two binaries won't be the same, nor are they expected to be the same. In order to compare two binaries, you would want to write them out to a text format (or do a topological sort). The guys at sandstorm have been working with a canonicalizer to make binary comparison easier (so objects can be signed), but that's not been published.
Thank you for your reply.
However, in the above, the objects are added to the segment in the same order, and the binary result differs only in two bits which correspond to the bool "true" values not being set in the first version.
To show this further, at the bottom of each loop, add
fmt.Printf("desired: %v; actual: %v\n", b, barsList.At(idx).Thing())
Now the output is:
desired: true; actual: false
desired: false; actual: false
desired: true; actual: false
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
desired: true; actual: true
desired: false; actual: false
desired: true; actual: true
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
After some thought, it seems that I'm assuming the semantics of a BarList to be the same as []*Bar. Whereas it actually seems to be a []Bar. That suggests that when you do a Set, it's doing a copy, whereas I had assumed that the members of the list would be pointers, so you're just writing the pointer. That suggests I shouldn't have to create and add a new Bar to the seg in the first place. Changing the first loop to:
for idx, b := range bars {
	barsList.At(idx).SetThing(b)
	fmt.Printf("1: desired: %v; actual: %v\n", b, barsList.At(idx).Thing())
}
does indeed work.
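The same distinction can be reproduced in plain Go, with no capnproto involved. This is only an analogy for the behaviour observed above, using a hypothetical `Bar` struct and helper functions: storing into a `[]Bar` copies the value as it is at that moment, storing into a `[]*Bar` shares the object, and writing through the slice element itself always works.

```go
package main

import "fmt"

// Bar is a plain Go struct used only for this analogy.
type Bar struct{ Thing bool }

// setThenMutateValue stores bar in a []Bar and mutates bar afterwards.
func setThenMutateValue() bool {
	list := make([]Bar, 1)
	bar := Bar{}
	list[0] = bar    // copies bar as it is right now
	bar.Thing = true // changes the local bar only, not the copy in the list
	return list[0].Thing
}

// setThenMutatePointer does the same with a []*Bar.
func setThenMutatePointer() bool {
	list := make([]*Bar, 1)
	bar := &Bar{}
	list[0] = bar    // stores the pointer, not a copy
	bar.Thing = true // visible through the list
	return list[0].Thing
}

// mutateInPlace writes directly into the element that lives in the list.
func mutateInPlace() bool {
	list := make([]Bar, 1)
	list[0].Thing = true
	return list[0].Thing
}

func main() {
	fmt.Println("[]Bar, set then mutate: ", setThenMutateValue())
	fmt.Println("[]*Bar, set then mutate:", setThenMutatePointer())
	fmt.Println("[]Bar, mutate in place: ", mutateInPlace())
}
```

The first case mirrors calling Set before SetThing, and the third mirrors barsList.At(idx).SetThing(b).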
However, I've been back through the docs both at https://godoc.org/github.com/glycerine/go-capnproto and https://capnproto.org/language.html and I can't find any documentation of this behaviour. It's implied by https://capnproto.org/encoding.html#lists with "A list value is encoded as a pointer to a flat array of values.", but it would be great if this behaviour in the Go binding was made clearer. I note that at https://godoc.org/github.com/glycerine/go-capnproto#hdr-Structs, the Set method doesn't appear and isn't documented.
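The "flat array of values" layout can also be read straight out of the byte dumps earlier in the thread. The following is a minimal sketch of a decoder for struct and list pointer words, written from the pointer layout in the encoding spec; the `describe` function is hypothetical and is not part of go-capnproto. Far pointers and capabilities are deliberately not handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// describe decodes one 8-byte little-endian pointer word.
// Struct pointer: bits 0-1 = 0, bits 2-31 = signed offset in words,
// bits 32-47 = data words, bits 48-63 = pointer count.
// List pointer: bits 0-1 = 1, bits 2-31 = signed offset in words,
// bits 32-34 = element size code, bits 35-63 = count.
func describe(word []byte) string {
	w := binary.LittleEndian.Uint64(word)
	offset := int32(uint32(w)) >> 2 // arithmetic shift keeps the sign
	switch w & 3 {
	case 0:
		return fmt.Sprintf("struct offset=%d data=%d ptrs=%d",
			offset, uint16(w>>32), uint16(w>>48))
	case 1:
		return fmt.Sprintf("list offset=%d elemSize=%d count=%d",
			offset, (w>>32)&7, w>>35)
	default:
		return "far pointer or capability (not handled in this sketch)"
	}
}

func main() {
	// Words taken from the dumps in this thread, after the 8-byte header.
	fmt.Println(describe([]byte{0, 0, 0, 0, 0, 0, 1, 0}))  // root struct pointer
	fmt.Println(describe([]byte{1, 0, 0, 0, 31, 0, 0, 0})) // pointer to the list
	fmt.Println(describe([]byte{12, 0, 0, 0, 1, 0, 0, 0})) // List(Bar) tag word
	fmt.Println(describe([]byte{12, 0, 0, 0, 0, 0, 1, 0})) // List(MyPointer) tag word
	fmt.Println(describe([]byte{8, 0, 0, 0, 1, 0, 0, 0}))  // MyPointer's pointer to a Bar
}
```

Element size code 7 marks a composite (struct) list, whose count field is the total size in words and whose first word is a tag laid out like a struct pointer, with the offset field holding the element count. For List(Bar) the tag says three elements of one data word each, stored inline. For List(MyPointer) each element is a single pointer that refers to a Bar allocated elsewhere in the segment.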
@msackman Ah, yes. Lists are always of values. Consider creating a pull request to the capnproto docs with verbiage you would have found helpful. But yes: lists are always values, and this has profound implications. You saw how setting a value on a list results in a copy into the list.
Primarily this is a performance optimization. Capnproto is all about performance. Otherwise you could use some slower system. However, it does have an impact on schema evolution too, since a List(struct1) cannot losslessly hold a List(struct2) if struct2 evolved from struct1 by adding a new field. Those new/extra fields will be truncated off during copy through the older List.
But you should also recognize that it is trivial to add a level of indirection should you desire. Structs are pointers, so structs inside structs are handled by setting pointers, and we can use that.
While doing a direct List(AnyPointer) isn't supported at the moment (no idea why myself), instead you can simply add the indirection through your own new struct that holds only a pointer to another struct. In effect the struct that gets copied into the list is simply a pointer.
Ultimately capnproto provides you the ability to choose: you can design for either performance or flexibility.
Example:
## revised schema file
@0xeab0c53a5087582b;
# schema.capnp
using Go = import "go.capnp";
$Go.package("main");
$Go.import("schema");
## previous schema:
struct Foo {
  bars @0: List(Bar);
}
struct Bar {
  thing @0: Bool;
}
######### new: roll-your-own List of pointers example:
struct HoldList {
  myptrs @0: List(MyPointer);
}
struct MyPointer {
  bar @0: Bar;
}
// in main.go:
func x2() {
	bars := []bool{true, false, true}
	seg := capn.NewBuffer(nil)
	foo := NewRootHoldList(seg)
	myptrsList := NewMyPointerList(seg, len(bars))
	foo.SetMyptrs(myptrsList)
	for idx, b := range bars {
		bar := NewBar(seg)
		myptrsList.At(idx).SetBar(bar) // set MyPointer in the myptrsList before setting the contents of bar
		bar.SetThing(b) // since we used a layer of indirection (MyPointer), now this works too.
		fmt.Printf("desired: %v; actual: %v\n", b, myptrsList.At(idx).Bar().Thing())
	}
	buf := new(bytes.Buffer)
	seg.WriteTo(buf)
	fmt.Println(buf.Bytes())
	seg = capn.NewBuffer(nil)
	foo = NewRootHoldList(seg)
	myptrsList = NewMyPointerList(seg, len(bars))
	foo.SetMyptrs(myptrsList)
	for idx, b := range bars {
		bar := NewBar(seg)
		bar.SetThing(b) // this time set the contents first
		myptrsList.At(idx).SetBar(bar) // now set into the list.
		fmt.Printf("desired: %v; actual: %v\n", b, myptrsList.At(idx).Bar().Thing())
	}
	buf2 := new(bytes.Buffer)
	seg.WriteTo(buf2)
	fmt.Println(buf2.Bytes())
}
with the results now being identical whichever order you do the setting.
~/mypointer $ ./mypointer
desired: true; actual: true
desired: false; actual: false
desired: true; actual: true
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 0 0 1 0 8 0 0 0 1 0 0 0 8 0 0 0 1 0 0 0 8 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
desired: true; actual: true
desired: false; actual: false
desired: true; actual: true
[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 31 0 0 0 12 0 0 0 0 0 1 0 8 0 0 0 1 0 0 0 8 0 0 0 1 0 0 0 8 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
~/mypointer $