isaacs/node-tar

tar.Extract is very slow

Closed this issue · 4 comments

I'm trying to use node-tar to extract a tar file. The code is at:

https://github.com/raymondfeng/node-tar-perf/blob/master/untar.js

The performance is really bad compared to the tar command. For a 152MB tar with some big files, extraction took more than 1 minute.

After debugging, I found out the tar entries are written out in chunks of 512 bytes. That is probably due to the tar format.

I did an experiment to add a buffered stream before sending to fs. The new code is:

https://github.com/raymondfeng/node-tar-perf/blob/master/untar-with-buffer.js

Now it only took around 1 second to extract the tar.
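
A minimal sketch of the buffering idea described above, not the code from the linked untar-with-buffer.js. It assumes the modern Node `stream.Transform` API; the class name `CoalesceStream` and the 64 KiB default block size are hypothetical choices for illustration. It collects small chunks (such as 512-byte tar blocks) and passes them downstream in larger blocks, so the file system sees far fewer writes.

```javascript
const { Transform } = require('stream');

// Hypothetical helper, not part of node-tar or fstream.
// Collects incoming chunks and emits them in blocks of at least `blockSize` bytes.
class CoalesceStream extends Transform {
  constructor(blockSize = 64 * 1024) {
    super();
    this.blockSize = blockSize;
    this.pending = [];      // chunks waiting to be emitted
    this.pendingBytes = 0;  // total bytes held in `pending`
  }

  _transform(chunk, encoding, callback) {
    this.pending.push(chunk);
    this.pendingBytes += chunk.length;
    // Emit one combined buffer once enough data has accumulated.
    if (this.pendingBytes >= this.blockSize) this._emitPending();
    callback();
  }

  _flush(callback) {
    // Emit whatever is left when the input ends.
    if (this.pendingBytes > 0) this._emitPending();
    callback();
  }

  _emitPending() {
    this.push(Buffer.concat(this.pending, this.pendingBytes));
    this.pending = [];
    this.pendingBytes = 0;
  }
}
```

In use, such a stream would sit between the source of small chunks and the file write stream, for example `source.pipe(new CoalesceStream()).pipe(fs.createWriteStream(path))`, where `source` and `path` are placeholders.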

I initially tried to add the buffered stream inside the node-tar code, but I had trouble getting it working. Maybe fstream is a special case.

I also wonder if the Node stream/fs APIs should have options to support buffering.

> I also wonder if Node stream/fs apis should have options to support buffering.

As of 0.10, they do.

fstream and node-tar both need to be rewritten to use the streams2 classes (i.e., stream.Readable and friends).
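
As an illustration only, and assuming the buffering option meant here is the `highWaterMark` setting of the streams2 base classes: the sketch below shows a `stream.Writable` whose internal buffer limit is set to 1024 bytes. It uses the simplified constructor form from later Node releases rather than the 0.10-era subclassing style, and the `sink` name and slow `write` implementation are hypothetical.

```javascript
const { Writable } = require('stream');

// Hypothetical sink with a small internal buffer limit.
const sink = new Writable({
  highWaterMark: 1024, // buffer threshold in bytes
  write(chunk, encoding, callback) {
    // Simulate a slow destination by completing the write asynchronously.
    setImmediate(callback);
  }
});

// write() returns false once the buffered byte count reaches highWaterMark,
// signalling the producer to pause until 'drain' is emitted.
const first = sink.write(Buffer.alloc(512));
const second = sink.write(Buffer.alloc(512));
console.log(first, second);
```

Note that `highWaterMark` governs how much data is queued before backpressure applies; it does not by itself merge small chunks into larger writes.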

@isaacs is this still something that needs to be done?

@terinjokes Yeah, still an issue.

# Get an example file
$ wget https://s3.amazonaws.com/node-webkit/v0.9.2/node-webkit-v0.9.2-linux-x64.tar.gz -O - | gunzip > node-webkit-v0.9.2-linux-x64.tar

# Linux tar utility
$ time tar xf node-webkit-v0.9.2-linux-x64.tar

real    0m0.119s
user    0m0.000s
sys     0m0.119s

# node-tar extracting the same file
$ time node examples/extracter.js 
done

real    2m52.815s
user    2m43.038s
sys     0m13.689s

Native tar is ~1450 times faster.