cznic/cc

Infinite recursion and OOM

cznic opened this issue · 10 comments

cznic commented

Repro

package main

import (
    "fmt"

    "github.com/cznic/cc"
)

func newModel() *cc.Model {
    return &cc.Model{ // 64
        Items: map[cc.Kind]cc.ModelItem{
            cc.Ptr:               {8, 8, 8, nil},
            cc.UintPtr:           {8, 8, 8, nil},
            cc.Void:              {0, 1, 1, nil},
            cc.Char:              {1, 1, 1, nil},
            cc.SChar:             {1, 1, 1, nil},
            cc.UChar:             {1, 1, 1, nil},
            cc.Short:             {2, 2, 2, nil},
            cc.UShort:            {2, 2, 2, nil},
            cc.Int:               {4, 4, 4, nil},
            cc.UInt:              {4, 4, 4, nil},
            cc.Long:              {8, 8, 8, nil},
            cc.ULong:             {8, 8, 8, nil},
            cc.LongLong:          {8, 8, 8, nil},
            cc.ULongLong:         {8, 8, 8, nil},
            cc.Float:             {4, 4, 4, nil},
            cc.Double:            {8, 8, 8, nil},
            cc.LongDouble:        {8, 8, 8, nil},
            cc.Bool:              {1, 1, 1, nil},
            cc.FloatComplex:      {8, 8, 8, nil},
            cc.DoubleComplex:     {16, 16, 16, nil},
            cc.LongDoubleComplex: {16, 16, 16, nil},
        },
    }
}

func main() {
    t, err := cc.Parse("", []string{"lapacke.h"}, newModel())
    fmt.Println(err, t)
}

(Include lapacke.h in the test directory)

cc /@kortschak

cznic commented

@kortschak FTR a "working" version (on my machine). Printing the tree does not work but I don't know if the tree is otherwise valid or not*. At least it can preprocess and parse lapacke.h. That's all I know ATM. More later, hopefully.

*: I think the error either in the tree being invalid or in the code making a string of it.

package main

import (
    "fmt"

    "github.com/cznic/cc"
)

func newModel() *cc.Model {
    return &cc.Model{ // 64
        Items: map[cc.Kind]cc.ModelItem{
            cc.Ptr:               {8, 8, 8, nil},
            cc.UintPtr:           {8, 8, 8, nil},
            cc.Void:              {0, 1, 1, nil},
            cc.Char:              {1, 1, 1, nil},
            cc.SChar:             {1, 1, 1, nil},
            cc.UChar:             {1, 1, 1, nil},
            cc.Short:             {2, 2, 2, nil},
            cc.UShort:            {2, 2, 2, nil},
            cc.Int:               {4, 4, 4, nil},
            cc.UInt:              {4, 4, 4, nil},
            cc.Long:              {8, 8, 8, nil},
            cc.ULong:             {8, 8, 8, nil},
            cc.LongLong:          {8, 8, 8, nil},
            cc.ULongLong:         {8, 8, 8, nil},
            cc.Float:             {4, 4, 4, nil},
            cc.Double:            {8, 8, 8, nil},
            cc.LongDouble:        {8, 8, 8, nil},
            cc.Bool:              {1, 1, 1, nil},
            cc.FloatComplex:      {8, 8, 8, nil},
            cc.DoubleComplex:     {16, 16, 16, nil},
            cc.LongDoubleComplex: {16, 16, 16, nil},
        },
    }
}

func main() {
    predefined, includePaths, sysIncludePaths, err := cc.HostConfig()
    if err != nil {
        fmt.Println(err)
        return
    }

    t, err := cc.Parse(
        predefined+`
#define __attribute__(...)
#define __extension__
#define __inline
#define __restrict
unsigned __builtin_bswap32 (unsigned x);
unsigned long long __builtin_bswap64 (unsigned long long x);
`,
        []string{"lapacke.h"},
        newModel(),
        cc.IncludePaths(includePaths),
        cc.SysIncludePaths(sysIncludePaths),
    )
    if err != nil {
        fmt.Println(err)
        return
    }

    //TODO Does not work before fixing this issue.
    // fmt.Println(t)
    _ = t
}

The code above fails for me with

$ go run lapacke.go 
/usr/include/stdlib.h:145:29: unexpected char, expected one of [')', ',', ...]

because of

extern double atof (__const char *__nptr)
                            ^

This is on an old machine (ubuntu 12.04 sadly), so I will try to replicate on a newer one when I have a chance. I imagine it is due to an absence of __const somewhere on this machine, and indeed adding #define __const const to predefined fixes it.

In looking into the original problem, I have been using utter.Dump, which performs a similar function to strutil.PrettyPrint. utter.Dump also goes into a recursion cycle and does not terminate in reasonable time (though does not exhaust memory because it writes to the io.Writer as it goes). This shows that the issue is in the structure of the TranslationUnit.

Now, both utter.Dump and strutil.PrettyPrint have protection against cyclic structures, which is clearly being evaded here. The protection in utter.Dump is based on pointer equality and is only invoked when walking pointer chains (this now obviously needs fixing). This means that the cycles are likely to be the result of walking slice index chains. I can check this by adding checks into the structure walker in utter.Dump.

Here is the reproducer for utter.Dump

package main

import "github.com/kortschak/utter"

func main() {
    r := make([]interface{}, 1)
    r[0] = r
    utter.Dump(r)
}

I imagine the same value will cause the recursion problem in strutil.PrettyPrint.

cznic commented

Quite possibly. PrettyPrint checks only pointer values, including those in interface values but this never come to my mind - and slice is not a pointer.

and slice is not a pointer.

Except when it is :)

I've fixed utter.Dumps handling of []interface{} recursion, but the map[T]interface{} is harder. Trying to do and utter dump on the return TranslationUnit still does not terminate in reasonable time, so I suspect that the cycle is in a map without pointer elements. This would suggest the type Binding may be the problem.

cznic commented

@kortschak Some bugs are hard to find. This one was one level harder, because it isn't.

Let's use this slightly modified version of the OP repro.

package main

import (
    "fmt"

    "github.com/cznic/cc"
)

func newModel() *cc.Model {
    return &cc.Model{ // 64
        Items: map[cc.Kind]cc.ModelItem{
            cc.Ptr:               {8, 8, 8, nil},
            cc.UintPtr:           {8, 8, 8, nil},
            cc.Void:              {0, 1, 1, nil},
            cc.Char:              {1, 1, 1, nil},
            cc.SChar:             {1, 1, 1, nil},
            cc.UChar:             {1, 1, 1, nil},
            cc.Short:             {2, 2, 2, nil},
            cc.UShort:            {2, 2, 2, nil},
            cc.Int:               {4, 4, 4, nil},
            cc.UInt:              {4, 4, 4, nil},
            cc.Long:              {8, 8, 8, nil},
            cc.ULong:             {8, 8, 8, nil},
            cc.LongLong:          {8, 8, 8, nil},
            cc.ULongLong:         {8, 8, 8, nil},
            cc.Float:             {4, 4, 4, nil},
            cc.Double:            {8, 8, 8, nil},
            cc.LongDouble:        {8, 8, 8, nil},
            cc.Bool:              {1, 1, 1, nil},
            cc.FloatComplex:      {8, 8, 8, nil},
            cc.DoubleComplex:     {16, 16, 16, nil},
            cc.LongDoubleComplex: {16, 16, 16, nil},
        },
    }
}

func main() {
    t, err := cc.Parse("", []string{"lapacke.h"}, newModel())
    fmt.Println(err)
    fmt.Println(t.Declarations)
    fmt.Println(t.Macros)
    fmt.Println(t.Model)
    for ; t != nil; t = t.TranslationUnit {
        fmt.Println(t.ExternalDeclaration)
    }
}
$ go build && time ./0 > log

real    0m6.406s
user    0m9.052s
sys 0m0.244s
$ wc log
 1098191 19952113 77039546 log
$

Printing the translation unit list at once instead of walking it as above brings a quadratic change in the size of the output. The reason is that TranslationUnit is recursive (it's a list) and it's .String() method will show nice intendantion for all of the ExternalDeclaration fields (plus more for the respective sub-fields.). For every level of recursion there's one more string of "· " added to every line for the indentation (the list has 17000+ items so the last ExternalDeclaration would start with a ~34kB indent string 😃). The output has milion+ lines so OOM is hard to avoid on most computers.

#Working as intended.

Edit 2016-09-24 18:10 CEST: s/lever/level/

Thank you Jan, Sorry for leading you up the garden path.

cznic commented

Absolutely no need for apologies. Finally "getting it" was a big joy! Aha moments are priceless ;-)