bodgit/sevenzip

Fail to read .exe in .7z

hongtw opened this issue · 2 comments

hongtw commented

When extracting a file containing an exe file,
the different reading methods for the reader will give different results and generate panic.

Use io.Copy will be fine.

	reader, _ := sevenzip.OpenReader(fp)
	for _, fileIn7z := range reader.File {
		r, _ := fileIn7z.Open()
		var buffer bytes.Buffer
		io.Copy(&buffer, r)
		defer reader.Close()
		data := buffer.Bytes()

		fmt.Println("Data size:", len(data))
		// use data to do something...
	}

Use ioutil.ReadAll will be ok in most cases.
But if the coming file is .exe, it will cause a panic: panic: runtime error: slice bounds out of range [:5] with capacity 3

	reader, _ := sevenzip.OpenReader(fp)
	for _, fileIn7z := range reader.File {
		r, _ := fileIn7z.Open()
		data, _ := ioutil.ReadAll(r)

		fmt.Println("Data size:", len(data))
		// use data to do something...
	}

Here is the complete test code.

package main

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"path"

	"github.com/bodgit/sevenzip"
)

func checkErr(err error) {
	if err != nil {
		panic(err)
	}
}

func readByIoutil(fp string) {
	fmt.Println("[readByIoutil] start")

	reader, err := sevenzip.OpenReader(fp)
	checkErr(err)
	for _, fileIn7z := range reader.File {
		filename := path.Base(fileIn7z.Name)
		fmt.Println("Processing", filename)
		reader, err := fileIn7z.Open()
		checkErr(err)
		data, err := ioutil.ReadAll(reader)
		checkErr(err)

		fmt.Println("Data size:", len(data))
		// use data to do something...
	}
	fmt.Println("[readByIoutil] finished\n")
}

func readByCopy(fp string) {
	fmt.Println("[readByCopy] start")
	reader, err := sevenzip.OpenReader(fp)
	checkErr(err)
	for _, fileIn7z := range reader.File {
		filename := path.Base(fileIn7z.Name)
		fmt.Println("Processing", filename)
		r, err := fileIn7z.Open()
		checkErr(err)
		var buffer bytes.Buffer
		_, err = io.Copy(&buffer, r)
		defer reader.Close()
		data := buffer.Bytes()

		fmt.Println("Data size:", len(data))
		// use data to do something...
	}
	fmt.Println("[readByCopy] finished\n")
}

func main() {
	fp := "putty.7z"

	readByCopy(fp)
	readByIoutil(fp)
}

And may get output like below

[readByCopy] start
Processing putty.exe
Data size: 1647912
[readByCopy] finished

[readByIoutil] start
Processing putty.exe
panic: runtime error: slice bounds out of range [:5] with capacity 3

goroutine 1 [running]:

I used putty.exe and compressed it into putty.7z as a test file, which is also attached here.
(putty is a common ssh tool used by windows user)

Source: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
64-bit x86: putty.exe

File information:

$ file putty.exe 
putty.exe: PE32+ executable (GUI) x86-64, for MS Windows
$ 7z a putty.7z putty.exe 

7-Zip [64] 17.05 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.05 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)

Open archive: putty.7z
--
Path = putty.7z
Type = 7z
Physical Size = 816140
Headers Size = 122
Method = LZMA2:21 BCJ
Solid = -
Blocks = 1

Scanning the drive:
1 file, 1647912 bytes (1610 KiB)

Updating archive: putty.7z

Items to compress: 1

     
Files read from disk: 1
Archive size: 816140 bytes (798 KiB)
$ file putty.7z
putty.7z: 7-zip archive data, version 0.4

Attached File:
putty.7z.zip

bodgit commented

It's down to calling Read(p) with a small p when using any of the branch converters, i.e. BCJ in this case. io.ReadAll() seems to do this occasionally compared to io.Copy() which is why one works and the other fails.

I've forced it to trip up when incorporating iotest.OneByteReader() into my tests to force small reads so I can figure out how to fix it.

This should now be fixed in the next release. Thanks for the detailed bug report!