ArchiveFactory fails with this "tar.gz" file
jbrockerville opened this issue · 5 comments
I made the "archive-linux-tar.tar.gz" file with tar -czf archive.tar.gz tar.gz.txt (attachment renamed) and the ArchiveFactory contains an entry with a null Key. I think it might be using GZipArchive instead of TarArchive?
I then made "archive-linux-tar-gzip.tar.gz" with tar -cf archive.tar tar.gz.txt and gzip -5 archive.tar (attachment renamed) and that was fine.
NGL, I'm not a huge Linux guy, so I'm assuming both methods of creating a "tar.gz" are valid.
Tar files are forward-reading archives (think old tape backup). Try out the ReaderFactory; I think it will work much better for you.
I do see that the detection for file.tar. is not really well supported by the ArchiveFactory; this could be improved. I'm not sure what GZipArchive is supposed to do, since gzip is not really an archive format as such.
Perhaps @adamhathcock can give advice on what is expected.
gzip as such may not contain a name for the file you compressed; check the "-N, --name" option. You can run "man gzip" or search the web for more info on that.
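To make the optional name field concrete, here is a minimal sketch in Python (not SharpCompress code) that reads the name out of a gzip header, following the header layout in RFC 1952. The helper names `gzip_original_name` and `make_gzip` are made up for this illustration, and the sample data is built in memory rather than taken from the attached files.

```python
import gzip
import io

# Flag bits in the gzip header (RFC 1952).
FEXTRA, FNAME = 0x04, 0x08


def gzip_original_name(data: bytes):
    """Return the name stored in the gzip header, or None if there is none."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # magic, method, flags, mtime, xfl and os take 10 bytes
    if flags & FEXTRA:
        # Skip the optional extra field: 2-byte little-endian length + payload.
        xlen = int.from_bytes(data[pos:pos + 2], "little")
        pos += 2 + xlen
    if not flags & FNAME:
        return None
    end = data.index(b"\x00", pos)  # the name is zero-terminated
    return data[pos:end].decode("latin-1")


def make_gzip(payload: bytes, name: str) -> bytes:
    """Hypothetical helper: gzip a payload in memory with the given header name."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf) as gz:
        gz.write(payload)
    return buf.getvalue()


named = make_gzip(b"hello", "archive.tar")
unnamed = make_gzip(b"hello", "")  # empty name: no FNAME field is written

print(gzip_original_name(named))
print(gzip_original_name(unnamed))
```

A reader that exposes the gzip payload as a single entry has nothing to use as a key when this field is absent, which would be consistent with the null Key reported above.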
Thanks for replying @Erior. The ReaderFactory works for all the different tar-gz files I made. However, I'm not using the ReaderFactory because using the ArchiveFactory gets me 7Zip support. The TarArchive class should handle the file I made. According to FORMATS.md anyways.
> gzip as such may not contain a name for the file you compressed
Did you maybe get that backwards? The one-step file made with only tar has the null Key. The two-step file made with tar and then gzip is handled just fine.
The problem is that it is not detected as a tar archive. If you skip the name and just open the internal entry stream again, you would get the tar archive.
For the second part, if you did "gzip -5n" you would get the same "no name" scenario for both streams.
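The "open the internal stream again" idea can be sketched in Python (again, not SharpCompress code, and not a description of how the library detects formats). The sketch assumes the inner tar uses the POSIX or GNU header, which carries the magic "ustar" at offset 257; old v7 tar files have no magic and would not be recognized. `build_tar_gz`, `looks_like_tar` and `list_entries` are hypothetical names for this example.

```python
import gzip
import io
import tarfile


def build_tar_gz() -> bytes:
    """Build a tiny .tar.gz in memory whose gzip layer stores no name."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        data = b"hello"
        info = tarfile.TarInfo(name="tar.gz.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    out = io.BytesIO()
    with gzip.GzipFile(filename="", mode="wb", fileobj=out) as gz:
        gz.write(tar_buf.getvalue())
    return out.getvalue()


def looks_like_tar(inner: bytes) -> bool:
    # POSIX and GNU tar headers carry the magic "ustar" at offset 257.
    return inner[257:262] == b"ustar"


def list_entries(tar_gz: bytes):
    """Strip the gzip layer, then list tar entries if the payload is a tar."""
    inner = gzip.decompress(tar_gz)
    if not looks_like_tar(inner):
        return None
    # Mode "r|" reads the tar as a forward-only stream, without seeking.
    with tarfile.open(fileobj=io.BytesIO(inner), mode="r|") as tar:
        return [member.name for member in tar]


entries = list_entries(build_tar_gz())
print(entries)
```

The entry names come from the tar headers inside the stream, so they are available even when the gzip layer itself is unnamed.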
> if you skip the name and just open the internal entry stream again you would get the tar archive.
Yeah, but I don't want to do that. But that's just my preference in this specific scenario--I want all the entries to have names. But a null Key is prolly the same design decision I'd make here. Oh well. I'll work around it.
> If you did "gzip -5n" you would get the same "no name" scenario for both streams.
Hmm... I guess that's what tar is doing with the -z flag. Prolly the other compression options, too. Seems valid then.
Given all that, I guess this isn't a bug. Closing. Thanks for the discussion.
gzip is a compression layer around tar, which is just a file format. You can't really have random access into a tar.gz.
There is a header for gzip that may contain the filename of the tar, but it's not required. I haven't looked at the gzip file format for years.
Hope this helps