oleg-st/ZstdSharp

Heap corruption in release build when decompressing thousands of times

xPaw opened this issue · 13 comments

xPaw commented

See ValveResourceFormat/ValveResourceFormat#454 for context.

I took Sandbox in this repo and put this code in:

        static void Main(string[] args)
        {
            var src = File.ReadAllBytes("test.zst");
            var i = 0;

            while (true)
            {
                var decompressor = new Decompressor();

                var outdec = new Span<byte>(new byte[1555065]);

                decompressor.TryUnwrap(src, outdec, out var written);
                //var uncompressed = decompressor.Unwrap(src);

                if (i++ % 1000 == 0)
                {
                    Console.WriteLine($"{i}");
                }
            }
        }

File: https://s.xpaw.me/zstdsharp_crash.zst

Some runs it takes just 1000 loops to crash, others it takes 300000 loops, it's not deterministic.

I don't think the file particularly matters because we've been getting crashes non deterministically. I just took one random file out.

On my machine decompressor.Unwrap(src); only crashes when multithreading, but on other users machine it crashes in a single thread too.

In event viewer I see the following:

Faulting application name: Sandbox.exe, version: 1.0.0.0, time stamp: 0x63558578
Faulting module name: ntdll.dll, version: 10.0.22621.755, time stamp: 0x8a328c67
Exception code: 0xc0000374
Fault offset: 0x000000000010c249
Faulting process id: 0x0x4F80
Faulting application start time: 0x0x1D8F78F09C1359F
Faulting application path: D:\GitHub\ZstdSharp\src\Sandbox\bin\Release\net6.0\Sandbox.exe
Faulting module path: C:\WINDOWS\SYSTEM32\ntdll.dll
Report Id: c2b3fde0-4654-40e8-995a-1c179d60bda3
Faulting package full name: 
Faulting package-relative application ID: 
Application: Sandbox.exe
CoreCLR Version: 6.0.1122.52304
.NET Version: 6.0.11
Description: The process was terminated due to an internal error in the .NET Runtime at IP 00007FFA4FD6237B (00007FFA4FB90000) with exit code c0000005.

Sometimes it throws an access violation in random parts of decompress code.

image

I have had other users run that code on their machines, and they get crashes too. Both AMD and Intel cpus crash.

I tested master, 0.6.4 and 0.3.0 versions.

Can confirm, release only.
image

@xPaw I think it's related to the garbage collector.
It's looks as use after free.
You can add decompressor.Dispose() to your cycle or reuse the same decompressor.
I'll take a deeper look into this case.

xPaw commented

I think it's related to the garbage collector.

It's plausible because in my code adding GC.Collect did seem to fix the crash. Also indeed my mistake not disposing the decompressor (VS didn't want me for me for it).

Decompressor's finalizer is called during decompression.
If decompression is the last method called and no other access to decompressor's object.
We need to protect the decompressor object from being garbage collected.

xPaw commented

Decompressor's finalizer is called during decompression.

You think it's being called on the current object? I don't understand why that would happen.

@xPaw Yeah, I reproduced it with this sample:

works in release only

internal unsafe class TestClass : IDisposable
{
    private readonly int* _mem;

    public TestClass()
    {
        _mem = (int*) Marshal.AllocHGlobal(sizeof(int));
        *_mem = 0;
    }

    public bool Method()
    {
        Calc(_mem);
        return true;
    }

    // work with memory
    private void Calc(int* mem)
    {
        for (int i = 0; i < 1000; i++)
        {
            if (*mem == -1)
            {
                // mem has been freed during work
                Console.WriteLine("Freed");
            }
        }
    }

    private void ReleaseUnmanagedResources()
    {
        *_mem = -1;
        Marshal.FreeHGlobal((IntPtr) _mem);
    }

    public void Dispose()
    {
        ReleaseUnmanagedResources();
        GC.SuppressFinalize(this);
    }

    ~TestClass()
    {
        ReleaseUnmanagedResources();
    }
}
class Program
{
    static void Main(string[] args)
    {
        var i = 0;

        while (true)
        {
            var testClass = new TestClass();
            // alloc some memory
            var buffer = new Span<byte>(new byte[1555065]);
            testClass.Method();

            if (i++ % 1000 == 0)
            {
                Console.WriteLine($"{i}");
            }
        }
    }
}
xPaw commented

Is there a good reason that requires finalizer, shouldn't dispose be enough here?

Doesn't this sound like a .NET issue?

EDIT: I can confirm your example prints Freed in release build.

Without finalizer there is be a memory leak without dispose called.

.NET have some solutions to manage with issues like this (for P/Invoke) for IntPtr's such as SafeHandle, HandleRef etc.
But I still don't understand how to use them correctly here.

seems like GC.KeepAlive can do the job it makes fake access to the object to make it live longer.

like this:

    public bool Method()
    {
        Calc(_mem);
        GC.KeepAlive(this);
        return true;
    }
xPaw commented

Fixed in 0.6.5
Thank you @xPaw

xPaw commented

Thanks!