icsharpcode/SharpZipLib

`WriteEntry(TarEntry, bool)` writes the entry without bytes if the file is used by another process

HofmeisterAn opened this issue · 6 comments

Describe the bug

This issue makes it inconvenient to use SharpZipLib, because disposing the tar archive will throw a TarException that hides the underlying issue. When a file is used by another process, WriteEntry(TarEntry, bool) writes the entry, but obviously not the bytes (which is fine). Due to the dangling tar entry, disposing (leaving the using block) the tar archive throws:

Entry closed at '0' before the 'x' bytes specified in the header were written.

It is not clear which file is responsible for the error. Following line throws an IOException:

using (Stream inputStream = File.OpenRead(entryFilename))

if a file is used by another process, but a few lines before, the entry is already added:

tarOut.PutNextEntry(entry);

Reproduction Code

No response

Steps to reproduce

-

Expected behavior

WriteEntry(TarEntry, bool) fails with a clear error message that considers the context.

Operating System

Windows, macOS, Linux

Framework Version

.NET 7, .NET 6

Tags

Tar

Additional context

As workaround I check in advance if the process can read the file.

When a file is used by another process, WriteEntry(TarEntry, bool) writes the entry, but obviously not the bytes (which is fine).

That doesn't seem right. If the file is locked, it should throw an IO error when adding it. We don't have anything that catches such an error and proceeds without writing.

Perhaps whatever is using the file truncates it (setting the size to 0)?

It throws an IOException first. Due to the exception, it leaves the using block and disposes the tar archiv, which throws TarException:

TarOutputStream.CloseEntryAsync(CancellationToken cancellationToken, Boolean isAsync)
TarOutputStream.FinishAsync(CancellationToken cancellationToken, Boolean isAsync)
TarOutputStream.Finish()
TarOutputStream.Dispose(Boolean disposing)
TarArchive.Dispose(Boolean disposing)
TarArchive.Dispose()

if (currBytes < currSize)
{
string errorText = string.Format(
"Entry closed at '{0}' before the '{1}' bytes specified in the header were written",
currBytes, currSize);
throw new TarException(errorText);
}

What using block are you referring to here? Wouldn't just putting a try { WriteEntry(...) } catch (IOException) { ... } work?

I think what you want is to be using TarOutputStream directly instead of using the TarArchive helper. It's meant for handling all the processing of files/paths etc and only requiring to be pointed at a directory. That also means that it cannot provide separate exceptions for IO errors (since they are opened internally and shouldn't be left open on errors).
Since you are manually iterating files (and fighting the automatic path naming), it's probably much easier to skip the helper.

See https://github.com/icsharpcode/SharpZipLib/wiki/GZip-and-Tar-Samples#-create-a-tar-or-tgz-with-control-over-filenames-and-data-source

What using block are you referring to here? Wouldn't just putting a try { WriteEntry(...) } catch (IOException) { ... } work?

I am referring to the TarArchive instance.

I think what you want is to be using TarOutputStream directly instead of using the TarArchive helper.

Thanks, I will look into it.

It's meant for handling all the processing of files/paths etc and only requiring to be pointed at a directory. That also means that it cannot provide separate exceptions for IO errors (since they are opened internally and shouldn't be left open on errors).

I still think there is room for improvements. Even in the use case you mention here, the TarException hides the underlying exception. The exception is not helpful, developers do not understand what went wrong. The current implementation does not consider CA1065: Do not raise exceptions in unexpected locations too.

I still think there is room for improvements.

It should be possible to just move the tarOut.PutNextEntry(entry) inside the File.Open closure (and the directory case) in this case.