Performance degradation in recent commits
michaelp opened this issue · 11 comments
I am looking into the performance degradation in the last few commits. Regardless of the results, I would like to share the initial benchmark that I created.
Introduction
There are several test projects in the openmcdf solution whose names suggest they are performance-oriented. I might be missing something, but they seem to be internal tools used by the core maintenance team rather than an attempt at a benchmark test suite of any kind. In addition, the lack of a standard format for benchmark results makes it difficult to reproduce results on different machines and to communicate changes, findings, and improvements.
Assumptions:
- To get stable results, I focus on a file that resides completely in memory
- My first scenario targets reader performance only
Goals:
- Provide a simple way to point out irregularities
- Establish a framework for justifying performance changes
In addition, my initial scenario is very simple:
- Create an in-memory file with a single storage and a few streams of 1 MB each
- Measure reading a stream using preallocated buffers of different sizes (ideally, no additional memory should be allocated and performance should be great)
Here are the starting numbers, right off master.
Notice that we read the same amount of data in every run; the results are sorted from fastest to slowest, and the last lines are the problematic ones.
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.648 (1803/April2018Update/Redstone4)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
Core : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
Job=Core Runtime=Core
Method | BufferSize | TotalStreamSize | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
Test | 1048576 | 1048576 | 309.6 us | 6.064 us | 5.672 us | 188.4766 | 93.7500 | - | 1.11 MB |
Test | 524288 | 1048576 | 333.9 us | 6.485 us | 7.468 us | 195.8008 | 97.6563 | - | 1.12 MB |
Test | 262144 | 1048576 | 403.8 us | 5.643 us | 5.002 us | 213.8672 | 106.9336 | - | 1.16 MB |
Test | 131072 | 1048576 | 436.4 us | 2.850 us | 2.666 us | 210.4492 | 104.9805 | - | 1.22 MB |
Test | 4096 | 1048576 | 5,211.3 us | 58.252 us | 48.643 us | 1000.0000 | 382.8125 | - | 5.42 MB |
Test | 1024 | 1048576 | 20,646.4 us | 409.656 us | 613.153 us | 3343.7500 | 593.7500 | 125.0000 | 18.42 MB |
Test | 512 | 1048576 | 39,732.5 us | 791.207 us | 1,385.735 us | 7000.0000 | 500.0000 | 214.2857 | 35.74 MB |
Quick investigation led to a hypothesis that
private static void EnsureUniqueSectorIndex(int nextSecID, HashSet<int> processedSectors)
in CompoundFile.cs is one of the causes of the slowdown. A quick review of the area shows that a HashSet without its capacity set up front will constantly grow and reallocate its internal arrays, which in turn causes GC pressure and can degrade performance.
To validate my hypothesis I made a single change:
private static void EnsureUniqueSectorIndex(int nextSecID, HashSet<int> processedSectors)
{
    //if (processedSectors.Contains(nextSecID))
    //{
    //    throw new CFCorruptedFileException("The file is corrupted.");
    //}

    //processedSectors.Add(nextSecID);
}
And the results are much better:
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.648 (1803/April2018Update/Redstone4)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
Core : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
Job=Core Runtime=Core
Method | BufferSize | TotalStreamSize | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|
Test | 1048576 | 1048576 | 263.8 us | 5.409 us | 6.012 us | 262.6 us | 181.6406 | 90.3320 | - | 1.09 MB |
Test | 524288 | 1048576 | 290.8 us | 4.712 us | 4.407 us | 291.4 us | 183.1055 | 91.3086 | - | 1.1 MB |
Test | 262144 | 1048576 | 347.4 us | 6.872 us | 9.407 us | 344.8 us | 190.4297 | 94.7266 | - | 1.11 MB |
Test | 131072 | 1048576 | 399.2 us | 7.768 us | 13.399 us | 394.5 us | 199.7070 | 99.6094 | - | 1.13 MB |
Test | 4096 | 1048576 | 3,337.0 us | 66.090 us | 127.332 us | 3,278.9 us | 398.4375 | 199.2188 | - | 2.27 MB |
Test | 1024 | 1048576 | 12,268.9 us | 199.769 us | 186.864 us | 12,289.2 us | 1000.0000 | 453.1250 | - | 5.83 MB |
Test | 512 | 1048576 | 23,864.0 us | 464.640 us | 456.339 us | 23,761.8 us | 1750.0000 | 375.0000 | 125.0000 | 10.57 MB |
Conclusion:
* I would recommend reverting commit [de52b97](https://github.com/ironfede/openmcdf/commit/de52b9783864dceea3ca1c65334863ba63ab65b3). In general, validation should be an external feature that lets the user check the integrity of the storage layer.
One of the reasons for those changes was to avoid problems with the reader getting stuck in an infinite loop and/or going round in circles before throwing an OutOfMemoryException, which doesn't seem like something that should be left to callers?
Possibly it can be done in a better way, though (it seems a shame that HashSet didn't get a means of reserving capacity until the latest .NET runtime versions).
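For what it's worth, a minimal sketch of the capacity idea (the sector count here is an illustrative value, not an actual openmcdf field; the `HashSet<T>(int capacity)` constructor is available on .NET Framework 4.7.2+ and .NET Core 2.0+):

```csharp
using System;
using System.Collections.Generic;

static class CapacityDemo
{
    // Fills both a default-constructed and a pre-sized HashSet with the same
    // sector ids and returns the element count. The contents are identical
    // either way; only the allocation pattern differs: the default-constructed
    // set repeatedly allocates larger backing arrays and rehashes as it grows,
    // while the pre-sized one allocates once up front.
    public static int Fill(int sectorCount)
    {
        var growing = new HashSet<int>();               // rehashes as it grows
        var presized = new HashSet<int>(sectorCount);   // sized once up front

        for (int i = 0; i < sectorCount; i++)
        {
            growing.Add(i);
            presized.Add(i);
        }

        return presized.Count;
    }
}
```

If the sector count can be derived from the file length when the CompoundFile is opened, the lookup set could be sized once and the repeated reallocations (and the GC pressure seen in the benchmark) would mostly disappear.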
In essence, we are facing the standard FAT filesystem consistency check. I don't remember it by heart, but I am confident we can come up with a solution ( :) finding a loop in a linked list is a standard interview question).
Benchmark project committed. @michaelp do you have time to try something different from a HashSet container to check for loops AND keep performance at optimal levels?
@ironfede yes, I will try it this weekend. I will try Floyd's algorithm for loop detection.
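For reference, a minimal sketch of how Floyd's tortoise-and-hare algorithm could apply to a FAT-style sector chain. The array representation and the end-of-chain sentinel are simplified stand-ins, not openmcdf's actual types:

```csharp
static class ChainValidator
{
    // Stand-in for Sector.ENDOFCHAIN (0xFFFFFFFE in the real format).
    public const int EndOfChain = -2;

    // fat[i] holds the index of the sector that follows sector i.
    static int GetNext(int[] fat, int id) =>
        id == EndOfChain ? EndOfChain : fat[id];

    // Returns true if the chain starting at 'start' contains a loop.
    // O(n) time, O(1) extra memory: no lookup set is allocated at all.
    public static bool HasLoop(int[] fat, int start)
    {
        int slow = start, fast = start;
        while (fast != EndOfChain)
        {
            fast = GetNext(fat, fast);            // hare: first of two steps
            if (fast == EndOfChain)
                return false;
            fast = GetNext(fat, fast);            // hare: second step
            slow = GetNext(fat, slow);            // tortoise: one step
            if (slow == fast)
                return true;                      // pointers met inside a cycle
        }
        return false;
    }
}
```

The appeal versus the HashSet approach is that corruption is the rare case, so the common (valid) path pays only two pointer walks and zero allocations.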
@ironfede
Federico, it would be nice if we could discuss a few general topics, like:
- Block management
- Fragmentation
- Decoupling page & IO management from the file system chain logic, which would improve performance and unblock concurrency
As you're looking in this area, and #40 is related to the checks anyway:
If I debug the Test_CorruptedSectorChain_Doc test, then it gets to
openmcdf/sources/OpenMcdf/CompoundFile.cs
Line 1276 in a62d584
With header.DIFATSectorsNumber set to 268435456, which in #40 was causing it to run out of memory before the existing 'validationCount' check failed.
Is it possible in such cases to detect that the value is invalid before even going into the parse loop? (e.g. because there aren't that many sectors available?). (my knowledge of the format is too basic here to be sure of the best way to validate)
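A rough sketch of what such an up-front sanity check might look like (names and the 512-byte header assumption are illustrative, not openmcdf's actual API; the real code throws CFCorruptedFileException rather than InvalidDataException): a DIFAT sector count claimed by the header can never exceed the number of sectors that physically fit in the file, so a wildly corrupted value like 268435456 could be rejected before entering the parse loop at all.

```csharp
using System;
using System.IO;

static class HeaderChecks
{
    // Maximum number of sectors the file can physically contain,
    // assuming the header occupies the first 512 bytes.
    public static long MaxSectors(long fileLength, int sectorSize) =>
        Math.Max(0, (fileLength - 512) / sectorSize);

    // Rejects a header-declared DIFAT sector count that cannot possibly
    // fit in a file of this length, before any allocation happens.
    public static void ValidateDifatCount(long fileLength, int sectorSize, uint difatSectorsNumber)
    {
        long maxSectors = MaxSectors(fileLength, sectorSize);
        if (difatSectorsNumber > maxSectors)
            throw new InvalidDataException(
                $"Header claims {difatSectorsNumber} DIFAT sectors but the file can hold at most {maxSectors}.");
    }
}
```

This only catches values that are impossible for the file size; subtler corruption (a plausible count pointing into a looped chain) would still need the chain-traversal check.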
@michaelp sorry for the delay in responding, but I'm currently really busy with main work activities. I'm really interested in your points about improving performance and architecture, especially the last one. Do you think a separate branch could work to explore some ideas?
@Numpsy, your idea could be useful, but it would only work for local header corruption, not for sector chain corruption, where you can't predict block linking without traversing the linked list. This is the more subtle type of corruption and probably the hardest to detect without a lot of allocation in a lookup data structure (=> I'm really interested in a Floyd algorithm implementation as suggested by @michaelp).
@ironfede I am writing some tests, and I need a few more evenings to come up with a solution to the current task (just keeping you in the loop). I want to deliver what I've started, and then we can talk.
I do realize it has been 5 years, but did anything ever come out of this?
We are currently looking into a performance issue when creating lots of Storages/Streams, and it seems to point to the EnsureUniqueSectorIndex function.
I have used this in the OpenMcdf.PerfTest project:
using System;
using System.Diagnostics;
using System.IO;
using OpenMcdf;

class Program
{
    static int MAX_STORAGE_COUNT = 500;
    static int MAX_STREAM_COUNT = 50;
    static String fileName = "PerfLoad.cfs";

    static void Main(string[] args)
    {
        var stopWatch = Stopwatch.StartNew();
        File.Delete(fileName);
        CreateFile(fileName);
        Console.WriteLine($"Create took {stopWatch.Elapsed}");

        CompoundFile cf = new CompoundFile(fileName);
        stopWatch.Restart();
        CFStream s = cf.RootStorage.GetStream("Test1");
        Console.WriteLine($"Read took {stopWatch.Elapsed}");
        Console.Read();
    }

    private static void CreateFile(String fn)
    {
        CompoundFile cf = new CompoundFile();
        for (int j = 0; j < MAX_STREAM_COUNT; j++)
        {
            cf.RootStorage.AddStream("Test" + j.ToString()).SetData(Helpers.GetBuffer(300));
        }
        for (int i = 0; i < MAX_STORAGE_COUNT; i++)
        {
            var storage = cf.RootStorage.AddStorage("Storage" + i.ToString());
            for (int j = 0; j < MAX_STREAM_COUNT; j++)
            {
                storage.AddStream("Test" + j.ToString()).SetData(Helpers.GetBuffer(300));
            }
        }
        cf.Save(fn);
        cf.Close();
    }
}
Would you accept a PR for an additional configuration flag in the CFSConfiguration enum that disables the unique sector index check?
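Purely as an illustration of the proposed opt-in (the flag name NoUniqueSectorIndexCheck is hypothetical; only CFSConfiguration itself is openmcdf's existing [Flags] enum), usage might look like:

    var cf = new CompoundFile("PerfLoad.cfs",
        CFSUpdateMode.ReadOnly,
        CFSConfiguration.Default | CFSConfiguration.NoUniqueSectorIndexCheck);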
I wanted to have a look at the earlier-suggested algorithm for detecting loops at some point, but then fell over a number of bugs that needed to be fixed first, and never got to it :-(