This is my C# implementation for the One Billion Row Challenge (1BRC) as defined in the gunnarmorling/1brc GitHub repository.
Make sure you have the latest .NET 8 SDK installed. Building the project is done using the following commands (change the runtime identifier from win-x64 to the one for your OS):
dotnet build -c Release
dotnet publish -r win-x64 -c Release
You can then find 1brc.exe inside the bin/Release/net8.0/publish/win-x64 directory. The executable takes a single argument: the file path to the measurements file.
While my solution doesn't require AVX2, it is written on the assumption that AVX2 is available on the machine it runs on. If AVX2 is not available, it falls back to simulating 256-bit vectors using two 128-bit vectors, as provided by the Vector256 type in the standard library.
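As a rough illustration of the kind of Vector256 usage described above, here is a minimal sketch that finds the first newline in a 32-byte chunk. It is not code from this repository: the helper name IndexOfNewline and the sample input are hypothetical, and the real solution may use the vectors quite differently. The same code runs with or without AVX2, since the Vector256 API does the fallback itself.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Text;

// Hypothetical helper (not from the actual solution): returns the index of the
// first '\n' within the first 32 bytes of `chunk`, or -1 if there is none.
// Assumes chunk.Length >= 32; Vector256.Create throws on shorter spans.
static int IndexOfNewline(ReadOnlySpan<byte> chunk)
{
    Vector256<byte> data = Vector256.Create(chunk);
    // Every lane equal to '\n' becomes 0xFF, every other lane becomes 0x00.
    Vector256<byte> matches = Vector256.Equals(data, Vector256.Create((byte)'\n'));
    // Collapse the 32 lanes into a 32-bit mask, one bit per byte.
    uint mask = matches.ExtractMostSignificantBits();
    return mask == 0 ? -1 : BitOperations.TrailingZeroCount(mask);
}

// Reports whether 256-bit operations map directly to hardware instructions.
Console.WriteLine($"Vector256 hardware accelerated: {Vector256.IsHardwareAccelerated}");

byte[] sample = Encoding.UTF8.GetBytes("Hamburg;12.0\nBulawayo;8.9\nPalembang;38.8\n");
Console.WriteLine(IndexOfNewline(sample)); // prints 12
```

The result is the same on any machine; only the speed depends on whether Vector256.IsHardwareAccelerated is true.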
My system has the following specs:
- CPU: AMD Ryzen 9 5950X @ 3.4GHz (default clock speed)
- RAM: 32GB 3600MHz DDR4
- SSD: Samsung 980 PRO
- OS: Windows 11
I generated my input file using the CreateMeasurements script from the original repo. That script was recently changed so that it no longer generates carriage returns on Windows, so my solution assumes that lines are separated only by newlines.
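To show what the newline-only assumption means in practice, here is a small sketch that splits a buffer into station/temperature pairs. The SplitLines helper is hypothetical and far simpler than a real 1BRC parser; it assumes well-formed input in which every line contains a ';'.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the actual solution): splits a UTF-8 buffer into
// (station, temperature) pairs, treating '\n' as the only line separator.
// Assumes every line contains a ';'.
static List<(string Station, string Temperature)> SplitLines(ReadOnlySpan<byte> buffer)
{
    var rows = new List<(string Station, string Temperature)>();
    while (!buffer.IsEmpty)
    {
        int newline = buffer.IndexOf((byte)'\n');
        ReadOnlySpan<byte> line = newline < 0 ? buffer : buffer[..newline];
        int semicolon = line.IndexOf((byte)';');
        rows.Add((Encoding.UTF8.GetString(line[..semicolon]),
                  Encoding.UTF8.GetString(line[(semicolon + 1)..])));
        // Move past the newline, or stop if this was the last line.
        buffer = newline < 0 ? ReadOnlySpan<byte>.Empty : buffer[(newline + 1)..];
    }
    return rows;
}

foreach (var (station, temperature) in SplitLines(Encoding.UTF8.GetBytes("Hamburg;12.0\nBulawayo;8.9\n")))
{
    Console.WriteLine($"{station} -> {temperature}");
}
```

With "\r\n" line endings, a parser like this would leave a trailing carriage return in every temperature field, which is why the line-ending convention of the input file matters.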
I have a Stopwatch that starts when the program starts, and the program prints the elapsed time just before it exits. I am also using pbench to time the whole program invocation, so that the measurement includes the time spent launching the runtime. My application is compiled with NativeAOT, so no JIT is needed.
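The in-process timing pattern looks roughly like the sketch below. DoWork is a hypothetical stand-in for the real processing, not part of this repository. Anything that happens after the stopwatch is stopped, such as runtime shutdown or releasing OS resources, shows up only in the externally measured process time.

```csharp
using System;
using System.Diagnostics;

// Hypothetical placeholder for the real work: sums the integers 0..999,999.
static long DoWork()
{
    long sum = 0;
    for (int i = 0; i < 1_000_000; i++)
    {
        sum += i;
    }
    return sum;
}

// Start timing as early as possible in the program.
Stopwatch stopwatch = Stopwatch.StartNew();

long result = DoWork();

// Stop just before exit and report the elapsed wall-clock time.
stopwatch.Stop();
Console.WriteLine($"Result: {result}");
Console.WriteLine($"Elapsed: {stopwatch.Elapsed.TotalSeconds:F3}s");
```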
After running my program 10 times in a row, these are my measurements on my system:
- Stopwatch: Min=1.318s, Avg=1.335s, Max=1.350s
- Process Time: Min=1.329s, Avg=1.346s, Max=1.361s
I also ran buybackoff's C# solution, which likewise has a Stopwatch that spans the start and end of the program. When running it, I noticed a much larger gap between the Stopwatch time and the process time. I believe this is because the Stopwatch time does not include the time it takes to close/dispose the file handles. I suspect this gap may only show up because I am using Windows, and that the results would be different on Linux.
I also ran royvanrijn's Java solution, which is currently winning the competition in the original repo. I ran it using the latest GraalVM JDK. It does not have a stopwatch in the source code, so I can only measure the whole process time.
buybackoff's C# solution:
- Stopwatch: Min=1.433s, Avg=1.453s, Max=1.503s
- Process Time: Min=2.202s, Avg=2.215s, Max=2.270s
royvanrijn's Java solution:
- Process Time: Min=2.501s, Avg=2.549s, Max=2.597s
Right now I'm still waiting for other people to test my code on their hardware and collect more performance measurements. In particular, I would like to see how it fares on Linux instead of Windows, as this improved performance may not be reproducible there. Until then, I would take these comparisons with a grain of salt.