Washi1337/AsmResolver

Potential Performance Hotspot

Closed this issue · 13 comments

Describe the bug

For writing my generated assembly, setting the FileStream position accounts for 66.8% of the time executing AssemblyDefinition.Write.

Screenshots

Screenshot 2021-12-19 192408

Platform

  • OS: Windows 10
  • AsmResolver Version: 4.7.1

Additional context
Profiled with tracing on JetBrains Rider 2021.3.1

Would it be possible to get a binary for which this is the case? On my machine I cannot seem to reproduce these types of percentages on set_Position calls. I tried saving AsmResolver.DotNet, as well as writing System.Private.CoreLib.

Here is my project: https://github.com/ds5678/AssemblyDumper/tree/SwitchToAsmResolver

Running it on any json file in this folder produces the issue: https://github.com/ds5678/TypeTreeDumps/tree/main/InfoJson

Tried on both Linux and Windows 11 x64, and still see a hotspot graph similar to the one below:

x

This was tested with both nuget version 4.7.1, as well as the source available on master (which should be in sync with nuget's latest version). Target application was compiled for .NET 6.0, and I tried running it through this json file.

As can be seen from the screenshot, the set_Position calls made by the SegmentBuilder class only account for a very small fraction of all the processing time for me. Compared to ToPEImage, which almost takes all time, which is expected.

Perhaps it is hardware specific? Are you writing to a HDD or an SSD (since it is file IO related probably) ?

HDD

I going to try AsmResolver.DotNet and System.Private.CoreLib like you did.

Is it an old HDD? I am writing to a HDD now as well and still see similar results.

Perhaps, to reduce the amount of set_Position calls, you could perhaps try hotpatching the code at BinaryStreamWriter.cs:22 to something in the lines of :

/// <inheritdoc />
public ulong Offset
{
    get => (uint) _stream.Position;
    set
    {
        if (_stream.Position != (long) value)
            _stream.Position = (long) value;
    }
}

Maybe the underlying SetFilePointer calls are expensive.

My hard drive is about a year old. I have:

Seagate BarraCuda 2TB Internal Hard Drive HDD - 3.5 Inch SATA 6Gb/s 7200 RPM 256MB Cache 3.5-Inch

I used some simple code for testing:

public static void Main(string[] args)
{
	var assembly = AssemblyDefinition.FromFile(@"pathToAssembly");
	assembly.Write("output.dll");
}

I tried AsmResolver.DotNet. It was inline with your experience:
asmresolver_write

However, System.Private.CoreLib was worse than my initial experience:
systemprivatecorlib_write

These were done with the 4.7.1 nuget package. I'm going to try your hotpatch now.

I figured out the problem. For some reason, this is way less efficient when overwriting an existing assembly.

Here is AsmResolver.DotNet when overwriting using the nuget package:
asmresolver_overwrite

Your offset patch does improve performance slightly, but the set_Position calls under BinaryStreamWriter.set_offset are separate from the thousands of set_Position calls under SegmentBuilder.Write.

It's still difficult to get a consistent test for me.

I have another suggestion however that you could perhaps try on your machine and see if it improves anything. Change the implementation of SegmentBuilder.Write with the following:

/// <inheritdoc />
public void Write(IBinaryStreamWriter writer)
{
    for (int i = 0; i < _items.Count; i++)
    {
        var current = _items[i];
        writer.Align(current.Alignment);
        current.Segment.Write(writer);
    }
}

This avoids the set_Position all together and just aligns with zeroes.

That combined with your other patch worked marvelously.
Screenshot 2021-12-24 025424

Perfect, I'll make sure this will be in 4.8 then. Does that mean everything got resolved regarding this issue?

Yeah it's wonderful. I can't notice a difference between overwriting and creating a new file