Infinite recursion and OOM
cznic opened this issue · 10 comments
Repro
package main
import (
"fmt"
"github.com/cznic/cc"
)
func newModel() *cc.Model {
return &cc.Model{ // 64
Items: map[cc.Kind]cc.ModelItem{
cc.Ptr: {8, 8, 8, nil},
cc.UintPtr: {8, 8, 8, nil},
cc.Void: {0, 1, 1, nil},
cc.Char: {1, 1, 1, nil},
cc.SChar: {1, 1, 1, nil},
cc.UChar: {1, 1, 1, nil},
cc.Short: {2, 2, 2, nil},
cc.UShort: {2, 2, 2, nil},
cc.Int: {4, 4, 4, nil},
cc.UInt: {4, 4, 4, nil},
cc.Long: {8, 8, 8, nil},
cc.ULong: {8, 8, 8, nil},
cc.LongLong: {8, 8, 8, nil},
cc.ULongLong: {8, 8, 8, nil},
cc.Float: {4, 4, 4, nil},
cc.Double: {8, 8, 8, nil},
cc.LongDouble: {8, 8, 8, nil},
cc.Bool: {1, 1, 1, nil},
cc.FloatComplex: {8, 8, 8, nil},
cc.DoubleComplex: {16, 16, 16, nil},
cc.LongDoubleComplex: {16, 16, 16, nil},
},
}
}
func main() {
t, err := cc.Parse("", []string{"lapacke.h"}, newModel())
fmt.Println(err, t)
}
(Include lapacke.h
in the test directory)
cc /@kortschak
@kortschak FTR a "working" version (on my machine). Printing the tree does not work but I don't know if the tree is otherwise valid or not*. At least it can preprocess and parse lapacke.h
. That's all I know ATM. More later, hopefully.
*: I think the error either in the tree being invalid or in the code making a string of it.
package main
import (
"fmt"
"github.com/cznic/cc"
)
func newModel() *cc.Model {
return &cc.Model{ // 64
Items: map[cc.Kind]cc.ModelItem{
cc.Ptr: {8, 8, 8, nil},
cc.UintPtr: {8, 8, 8, nil},
cc.Void: {0, 1, 1, nil},
cc.Char: {1, 1, 1, nil},
cc.SChar: {1, 1, 1, nil},
cc.UChar: {1, 1, 1, nil},
cc.Short: {2, 2, 2, nil},
cc.UShort: {2, 2, 2, nil},
cc.Int: {4, 4, 4, nil},
cc.UInt: {4, 4, 4, nil},
cc.Long: {8, 8, 8, nil},
cc.ULong: {8, 8, 8, nil},
cc.LongLong: {8, 8, 8, nil},
cc.ULongLong: {8, 8, 8, nil},
cc.Float: {4, 4, 4, nil},
cc.Double: {8, 8, 8, nil},
cc.LongDouble: {8, 8, 8, nil},
cc.Bool: {1, 1, 1, nil},
cc.FloatComplex: {8, 8, 8, nil},
cc.DoubleComplex: {16, 16, 16, nil},
cc.LongDoubleComplex: {16, 16, 16, nil},
},
}
}
func main() {
predefined, includePaths, sysIncludePaths, err := cc.HostConfig()
if err != nil {
fmt.Println(err)
return
}
t, err := cc.Parse(
predefined+`
#define __attribute__(...)
#define __extension__
#define __inline
#define __restrict
unsigned __builtin_bswap32 (unsigned x);
unsigned long long __builtin_bswap64 (unsigned long long x);
`,
[]string{"lapacke.h"},
newModel(),
cc.IncludePaths(includePaths),
cc.SysIncludePaths(sysIncludePaths),
)
if err != nil {
fmt.Println(err)
return
}
//TODO Does not work before fixing this issue.
// fmt.Println(t)
_ = t
}
The code above fails for me with
$ go run lapacke.go
/usr/include/stdlib.h:145:29: unexpected char, expected one of [')', ',', ...]
because of
extern double atof (__const char *__nptr)
^
This is on an old machine (ubuntu 12.04 sadly), so I will try to replicate on a newer one when I have a chance. I imagine it is due to an absence of __const
somewhere on this machine, and indeed adding #define __const const
to predefined
fixes it.
In looking into the original problem, I have been using utter.Dump
, which performs a similar function to strutil.PrettyPrint
. utter.Dump
also goes into a recursion cycle and does not terminate in reasonable time (though does not exhaust memory because it writes to the io.Writer
as it goes). This shows that the issue is in the structure of the TranslationUnit.
Now, both utter.Dump
and strutil.PrettyPrint
have protection against cyclic structures, which is clearly being evaded here. The protection in utter.Dump
is based on pointer equality and is only invoked when walking pointer chains (this now obviously needs fixing). This means that the cycles are likely to be the result of walking slice index chains. I can check this by adding checks into the structure walker in utter.Dump
.
Here is the reproducer for utter.Dump
package main
import "github.com/kortschak/utter"
func main() {
r := make([]interface{}, 1)
r[0] = r
utter.Dump(r)
}
I imagine the same value will cause the recursion problem in strutil.PrettyPrint
.
Quite possibly. PrettyPrint
checks only pointer values, including those in interface values but this never come to my mind - and slice is not a pointer.
and slice is not a pointer.
Except when it is :)
I've fixed utter.Dump
s handling of []interface{}
recursion, but the map[T]interface{}
is harder. Trying to do and utter dump on the return TranslationUnit
still does not terminate in reasonable time, so I suspect that the cycle is in a map without pointer elements. This would suggest the type Binding
may be the problem.
@kortschak Some bugs are hard to find. This one was one level harder, because it isn't.
Let's use this slightly modified version of the OP repro.
package main
import (
"fmt"
"github.com/cznic/cc"
)
func newModel() *cc.Model {
return &cc.Model{ // 64
Items: map[cc.Kind]cc.ModelItem{
cc.Ptr: {8, 8, 8, nil},
cc.UintPtr: {8, 8, 8, nil},
cc.Void: {0, 1, 1, nil},
cc.Char: {1, 1, 1, nil},
cc.SChar: {1, 1, 1, nil},
cc.UChar: {1, 1, 1, nil},
cc.Short: {2, 2, 2, nil},
cc.UShort: {2, 2, 2, nil},
cc.Int: {4, 4, 4, nil},
cc.UInt: {4, 4, 4, nil},
cc.Long: {8, 8, 8, nil},
cc.ULong: {8, 8, 8, nil},
cc.LongLong: {8, 8, 8, nil},
cc.ULongLong: {8, 8, 8, nil},
cc.Float: {4, 4, 4, nil},
cc.Double: {8, 8, 8, nil},
cc.LongDouble: {8, 8, 8, nil},
cc.Bool: {1, 1, 1, nil},
cc.FloatComplex: {8, 8, 8, nil},
cc.DoubleComplex: {16, 16, 16, nil},
cc.LongDoubleComplex: {16, 16, 16, nil},
},
}
}
func main() {
t, err := cc.Parse("", []string{"lapacke.h"}, newModel())
fmt.Println(err)
fmt.Println(t.Declarations)
fmt.Println(t.Macros)
fmt.Println(t.Model)
for ; t != nil; t = t.TranslationUnit {
fmt.Println(t.ExternalDeclaration)
}
}
$ go build && time ./0 > log
real 0m6.406s
user 0m9.052s
sys 0m0.244s
$ wc log
1098191 19952113 77039546 log
$
Printing the translation unit list at once instead of walking it as above brings a quadratic change in the size of the output. The reason is that TranslationUnit
is recursive (it's a list) and it's .String()
method will show nice intendantion for all of the ExternalDeclaration
fields (plus more for the respective sub-fields.). For every level of recursion there's one more string of "· "
added to every line for the indentation (the list has 17000+ items so the last ExternalDeclaration
would start with a ~34kB indent string 😃). The output has milion+ lines so OOM is hard to avoid on most computers.
#Working as intended.
Edit 2016-09-24 18:10 CEST: s/lever/level/
Thank you Jan, Sorry for leading you up the garden path.
Absolutely no need for apologies. Finally "getting it" was a big joy! Aha moments are priceless ;-)