A collection of libraries that support fast and efficient forward-only reading of various popular archives.
- 🚀 Fast and efficient: only matched files are extracted, access is forward-only, and `Task` is used to offload I/O to separate threads.
- 😀 Licensed under MIT. Similar projects are licensed under GPL.
- 😍 100% test coverage
Format | Package | Documentation |
---|---|---|
Ar | Get Community.Archives.Ar on nuget | Get started |
Cpio | Get Community.Archives.Cpio on nuget | Get started |
Rpm | Get Community.Archives.Rpm on nuget | Get started |
Tar | Get Community.Archives.Tar on nuget | Get started |
Apk | Get Community.Archives.Apk on nuget | Get started |
Supported frameworks:
- .NET Standard 2.1
- .NET 5
- .NET 6

The packages run on any platform supported by the frameworks above, including Windows, Linux and macOS.
Each package exports an implementation of `IArchiveReader`.
var reader = new TarArchiveReader(); // or RpmArchiveReader or ...
await foreach (
var entry in reader
.GetFileEntriesAsync(stream, IArchiveReader.MATCH_ALL_FILES)
) {
// entry.Name
// entry.Content
Console.WriteLine($"Found file {entry.Name} ({entry.Content.Length} bytes)");
}
var reader = new TarArchiveReader(); // or RpmArchiveReader or ...
// use regular expressions to match files (path + file name)
await foreach (
var entry in reader
.GetFileEntriesAsync(stream, "[.]md$", "[.]txt$")
) {
// found a Markdown or text file
}
var reader = new RpmArchiveReader();
var metaData = await reader.GetMetaDataAsync(stream);
Console.WriteLine(metaData.Package); // for example: "gh"
Console.WriteLine(metaData.Version); // for example: "2.4.0"
❗ Only `rpm` archives contain metadata. Check `reader.SupportsMetaData` at runtime, or consult the documentation of the reader, before calling `GetMetaDataAsync`.
The implementations of `IArchiveReader` allow forward-only access to the supported archives.
But why forward-only and not random-access?
None of these archive formats has a central index of files. That means that, in the worst case, the complete archive must be scanned to find a file. In addition, archives like `tar` are usually compressed. Decompressing them is easy, but because the `tar` archive is compressed as a whole rather than file by file, the whole archive has to be decompressed before random access is possible.
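
The point about compressed archives can be illustrated with the standard library alone. The following is a minimal sketch, not part of this library: `ForwardOnlyDemo` and its method names are hypothetical, and only `System.IO.Compression.GZipStream` is used. It shows that a decompressing stream reports itself as non-seekable, so its content can only be consumed from front to back.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration only; not part of Community.Archives.
public static class ForwardOnlyDemo
{
    // Compresses a payload in memory so the example needs no file on disk.
    public static byte[] Gzip(byte[] payload)
    {
        using var buffer = new MemoryStream();
        using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
        {
            gzip.Write(payload, 0, payload.Length);
        } // disposing the GZipStream flushes the gzip footer into the buffer
        return buffer.ToArray();
    }

    // Reports whether a decompressing GZipStream over the data supports seeking.
    public static bool CanSeekCompressed(byte[] gzipped)
    {
        using var gzip = new GZipStream(new MemoryStream(gzipped), CompressionMode.Decompress);
        return gzip.CanSeek;
    }

    // Reads the decompressed content strictly from start to end.
    public static string ReadToEnd(byte[] gzipped)
    {
        using var gzip = new GZipStream(new MemoryStream(gzipped), CompressionMode.Decompress);
        using var reader = new StreamReader(gzip, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}
```

Since the decompressed data arrives as a forward-only stream anyway, a reader that scans the archive once matches how the data is delivered. Passing such a `GZipStream` as the `stream` argument of `GetFileEntriesAsync` is presumably how a `.tar.gz` would be read, but that is an assumption; check the documentation of the respective reader.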
There are many different archive extractors (for example `7z`) that can easily extract any modern archive.
The purpose of `IArchiveReader` is to quickly and efficiently find and extract one or more files, without native or heavyweight dependencies like RecursiveExtractor or SharpZipLib.
`IArchiveReader` will only allocate memory (`byte[]`) for matched files.
You can either register `IArchiveReader` together with a single implementation of it or, if you are using multiple implementations in the same project, register the implementations directly. All implementations use `virtual` functions, so you can easily mock the classes with your favorite mocking framework.
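
The two registration styles could look as follows with `Microsoft.Extensions.DependencyInjection`. This is a sketch under assumptions: the container, the transient lifetime, the helper name `ArchiveReaderRegistration` and the `using` directives for the Community.Archives namespaces are not taken from this document, so adjust them to the packages you reference.

```csharp
using Microsoft.Extensions.DependencyInjection;
// Assumed namespaces; verify them against the packages you installed.
using Community.Archives.Core;
using Community.Archives.Rpm;
using Community.Archives.Tar;

// Hypothetical helper for illustration only.
public static class ArchiveReaderRegistration
{
    // Style 1: a single implementation, consumed through the interface.
    public static IServiceCollection AddSingleReader(this IServiceCollection services)
    {
        return services.AddTransient<IArchiveReader, TarArchiveReader>();
    }

    // Style 2: several implementations, each consumed as its concrete type.
    public static IServiceCollection AddMultipleReaders(this IServiceCollection services)
    {
        services.AddTransient<TarArchiveReader>();
        services.AddTransient<RpmArchiveReader>();
        return services;
    }
}
```

With the first style a consumer asks for `IArchiveReader` in its constructor; with the second it asks for `TarArchiveReader` or `RpmArchiveReader` directly, which avoids ambiguity about which implementation the interface resolves to.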
Please create an issue and attach the file, unless it is confidential or contains personally identifiable information (PII).
Pull requests are always welcome 😍
This software is released under the MIT License.