elastio/ssstar

Extract performance from a local file or stream is poor

Closed this issue · 2 comments

Bug description

@arybitskyi tested extract performance from both an elastio stream restore stream and a local tar file, and reported the findings in a Google Sheet (NOTE: the link is internal to Elastio only)

To Reproduce

Steps to reproduce the behavior:

  1. Extract a tar archive either from a local file with --file, or from the output of elastio stream restore with the --stdin argument
  2. Note that extract performance is much slower than archive create performance
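
A minimal sketch for putting numbers on step 2, assuming GNU date and awk are available (as they are on Amazon Linux 2). The helper names `timed` and `mib_per_sec` are hypothetical, and the bucket, prefix, and archive names in the comments are placeholders. The extract invocation is inferred from the --file argument described in step 1, not copied from a verified command line.

```shell
# mib_per_sec BYTES SECONDS: print throughput in MiB/s with two decimals.
mib_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }
        printf "%.2f\n", b / 1048576 / s
    }'
}

# timed CMD...: run CMD, store wall-clock seconds in the global ELAPSED,
# and return CMD's exit status.
timed() {
    local start end status
    start=$(date +%s.%N)
    "$@"
    status=$?
    end=$(date +%s.%N)
    ELAPSED=$(awk -v a="$start" -v b="$end" 'BEGIN { printf "%.3f", b - a }')
    return "$status"
}

# Example usage (placeholder names, not run here):
#   timed ssstar create s3://my-bucket/src/ --file backup.tar
#   echo "create:  $(mib_per_sec "$(stat -c %s backup.tar)" "$ELAPSED") MiB/s"
#   timed ssstar extract --file backup.tar s3://my-bucket/restored/
#   echo "extract: $(mib_per_sec "$(stat -c %s backup.tar)" "$ELAPSED") MiB/s"
```

Dividing the same archive size by each elapsed time gives directly comparable create and extract figures. For the --stdin case, the whole pipeline can be wrapped in a single `bash -c '...'` command and passed to `timed`.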

Expected behavior

Extract performance should be limited only by the speed of the network connection to S3, which in this case means roughly the same as create performance.

Screenshots

N/A

Environment

ssstar 0.2.0
EC2 instance type m5.2xlarge
Amazon Linux 2

Additional context

N/A

NDVas commented

Retested on first.app.elastio.com - Release: e8c2660fecf21fcb3b5c2732ceb86b570fa7d9d2
with ssstar 0.3.0 and 0.2.0
Elastio CLI: 0.22.28 (22c95e0 2022-12-08 19:46:20)
Account-level stack: 2022-11-21
Region-level stack: 0.22.28

Results are here

Extraction got slightly faster when extracting from a file to S3, but the stream case did not change much.

Also, not sure if this is important, but I got this error once (possibly because of my poor mobile internet connection):

[ec2-user@ip-172-31-29-209 ~]$ ssstar create s3://ssstar-test/ --file backup.tar
⠁           D/L parts (all): file122486 (part 4)                                     [##############>-----] 7.03 GiB/9.54 GiB (210.26 MiB/s)
⠈       D/L parts (ordered): file122486 (part 5)                                     [##############>-----] 7.01 GiB/9.54 GiB (182.18 MiB/s)
⠈        Write parts to tar: file122486 (part 5)                                     [##############>-----] 7.01 GiB/9.54 GiB (187.83 MiB/s)
⠚           D/L parts (all): file122486 (part 7)                                     [##############>-----] 7.05 GiB/9.54 GiB (426.82 KiB/s)
⠚       D/L parts (ordered): file122486 (part 12)                                    [##############>-----] 7.05 GiB/9.54 GiB (427.22 KiB/s)
⠁        Write parts to tar: file122486 (part 8)                                     [##############>-----] 7.03 GiB/9.54 GiB (427.00 KiB/s)
⠖         Tar bytes written: Writing                                                 [##############>-----] 7.02 GiB/9.54 GiB (440.22 MiB/s)
⠁  Tar bytes uploaded to S3:                                                         [--------------------] 0B/9.54 GiB (0B/s)                                                                            
The application panicked (crashed).
Message:  called `Option::unwrap()` on a `None` value
Location: /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/fs/file.rs:642

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
Error:
   0: Error reading byte stream for object 'file122486' in S3 bucket 'ssstar-test'
   1: error reading a body from connection: Connection reset by peer (os error 104)
   2: error reading a body from connection: Connection reset by peer (os error 104)
   3: Connection reset by peer (os error 104)

Location:
   /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/ssstar-cli-0.2.0/src/main.rs:338

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.

@NDVas good catch. That's definitely a bug; please create a separate issue for it. You were running that command on an EC2 instance, so it can't be related to your mobile internet, and even if it were, it's still a bug we need to resolve.

When you create the issue, please include some details about what's in ssstar-test. I think I have access to that bucket so I can try to repro myself.
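
One way to gather those bucket details is to summarize a recursive listing. The sketch below assumes the input comes from `aws s3 ls s3://ssstar-test --recursive`, whose lines have the form date, time, size in bytes, key. `summarize_listing` is a hypothetical helper name, and a key containing spaces is only printed up to its first space.

```shell
# summarize_listing: read `aws s3 ls --recursive` output on stdin and print
# the object count, total size, and largest object.
summarize_listing() {
    awk '
        NF >= 4 {
            n++
            total += $3
            if ($3 + 0 > max) { max = $3 + 0; key = $4 }
        }
        END {
            printf "objects=%d total_bytes=%.0f largest=%s (%.0f bytes)\n", n, total, key, max
        }
    '
}

# Example usage (requires AWS credentials with access to the bucket):
#   aws s3 ls s3://ssstar-test --recursive | summarize_listing
```

The object count and largest object size are worth including in the new issue, since the panic above occurred while downloading parts of a single multi-part object.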