StructScan infinite loop

Question

StructScan infinite loop

elwinar opened this issue 11 years ago · 20 comments

I had a case earlier which made me pull my hair for a full day and half :

Let's say we have the following structs :

type Entity struct {
    ID int64 `db:"id"`
    Service *Service
}

type Service struct {
    Entity *Entity
}

func NewEntity(id int64) *Entity {
    e := &Entity{}
    e.Service = &Service{}
    e.Service.Entity = e
    return e
}

When StructScaning, the function tries to establish a field map by exploring the tree of embedded types of the struct. I my case, the exploration never end because of the pointers.

This issue bring the following question to my mind : what is the point of exploring non-embedded fields ? It seem rather non-semantic to me…
And independantly, why making a field map and not simply look for the fields as needed when scanning ?

Answer 1 · 2014-05-14T15:14:44.000Z

Thanks for the bug report; I only saw the title on my way into work this morning and thought right away "hmm, structscan would choke on circular data structures".

Scanning non-embedded structs was a mistake I am going to be fixing in a "API Breaking" release coming up.
Getting the reflect introspection work out of the way up front saves a lot of time. I'd like to change it so that the maps provide a direct way of getting the field. Right now they contain the BFS position in the tree, which means it still must be traversed. It'd be better to store a list positions at each depth.

Does this need to be fixed for embedded structs as well, or does the compiler prevent you from doing embedded struct loops? If not, it'l be fixed soon with the api cleanup release.

Answer 2 · 2014-05-14T16:37:26.000Z

If you don't scan non-embedded structs, I don't think the BFS will be necessary. I didn't researched much the way reflection work in Go, but I think that one way or another, each exported field of the struct should be accessible right away with the right RTTI function.

It doesn't need to be fixed for embedded structs, as the compiler complains about recursive embedded (see http://play.golang.org/p/Nkz7n-1W6m).

Answer 3 · 2014-05-14T16:45:40.000Z

Okay, excellent. I suspected the compiler might complain.

BFS is still sadly required to emulate Go's shadowing rules. reflect.FieldByName works on embedded fields, but it won't utilize tags, so it has to be somewhat re-implemented.

Removing the descent into non-embedded fields is going to simplify things quite a lot. There is a pull request for adding embedded struct support to modl and the code is way simpler because it doesn't have to check for Scanner or Valuer or do any weird stuff.

Answer 4 · 2014-05-14T16:51:25.000Z

I was trying at the same time, but for me the Tags are embedded at the same time… http://play.golang.org/p/PbEtNt_JOh

Answer 5 · 2014-05-14T16:53:23.000Z

Yes but it's backwards; when you scan, you only have the tag name, and you must go and find the field that fits it.

Answer 6 · 2014-05-14T16:55:13.000Z

I think that the StructScan function should just iterate over the given struct fields and populate a map[string]string with the correspondences sql -> struct. No need to do something fancy…

Answer 7 · 2014-05-14T17:01:54.000Z

I'll do some benchmarking to see if Value.FieldByName carries a big penalty compared to just accessing by offset. If not, then you're right, each reachable name in a struct is unique and should get me there. It's still nice to cache the field mapping, they are actually expensive to build.

Answer 8 · 2014-05-14T17:07:19.000Z

Access the value's fields by offset should be better, that's right.
Here is an example of the field mapping loop : http://play.golang.org/p/UzZzrOr1AU

Answer 9 · 2014-05-14T17:11:53.000Z

That doesn't do embedded structs (eg: 'age' is not accessible). More and more I'm thinking that sqlx's reflect stuff can be genericized and pushed into an sqlx/reflect package. I need to reuse some of it in modl anyway.

Answer 10 · 2014-05-14T17:20:45.000Z

You're right, it doesn't do embedded struct. I didn't saw it… but the Field.Anonymous field can differentiate embedded field from others, so it can be resolved with either a recursion or a stack (as you did).

Answer 11 · 2014-05-14T17:22:23.000Z

Yes, I think it will end up looking a bit like https://github.com/sqs/modl/commit/dae428523dbab11adfed450f45782f5c7152027b

Answer 12 · 2014-05-14T17:23:53.000Z

Exactly.
I think that you should externalize your extended reflection methods into another package. That would simplify both sqlx and modl…

Answer 13 · 2014-05-14T17:25:13.000Z

And you could call it reflectx. :)

Answer 14 · 2014-05-16T02:21:49.000Z

@wfreeman that was probably a joke but I can't think of a better way to make it live with the regular reflect package ...

Answer 15 · 2014-05-16T02:29:26.000Z

I was actually entirely serious. And willing to help--I'd use it.

Answer 16 · 2014-05-16T02:48:00.000Z

It seems like all sqlx needs is a GetFieldByName(s) which:

understands embedded structs
can optionally obey a struct tag
caches as much reflect work as it can (whatever benchmarks fastest)

sqlx also needs both the values (for named queries) and the addresses (for scanning) of these fields.

Can you think of anything else that would be useful?

Edit This is all complicated by the NameMapper in sqlx. This ends up being quite a lot like FieldByNameFunc(), so maybe I can make a similar API.

Answer 17 · 2014-05-16T14:48:07.000Z

commit 8aa2977

PASS
BenchmarkFieldNameL1    10000000               261 ns/op
BenchmarkFieldNameL4     1000000              2933 ns/op
BenchmarkFieldPosL1     20000000               112 ns/op
BenchmarkFieldPosL4     20000000               137 ns/op
ok      github.com/jmoiron/sqlx/reflect 11.120s

Lookups by field index are way, way faster than lookups by field name, especially when accessing embedded names.

Answer 18 · 2014-05-16T16:24:37.000Z

Interesting. I tried the FieldByIndex([]int{0,0,0,0}) for the last one and it was slightly slower than doing f = f.Field(0) 4 times, but might be the same in practice. So, are you thinking of caching the indexes?

Answer 19 · 2014-05-16T16:33:05.000Z

It's slightly slower but nowhere near FieldByName:

PASS
BenchmarkFieldNameL1    10000000               258 ns/op
BenchmarkFieldNameL4     1000000              2926 ns/op
BenchmarkFieldPosL1     20000000               112 ns/op
BenchmarkFieldPosL4     20000000               136 ns/op
BenchmarkFieldByIndexL4 10000000               147 ns/op

This is to be expected as there will be some overhead to the iteration that isn't there when just knowing the indexes all up front. Notably it's still nearly twice as fast to descend 4 levels than to get a top level field by name.

Answer 20 · 2014-06-07T18:01:41.000Z

This has all been merged into master now.