httpie/cli

[macOS] Lag when piping stdout

gsakkis opened this issue · 5 comments

When redirecting httpie output to a pipe, there is a significant lag in the total execution time:

  1. Without pipe
time http http://localhost:8508/api/v1 > /dev/null
http http://localhost:8508/api/v1> /dev/null  0.12s user 0.05s system 59% cpu 0.282 total
  2. With pipe
time http http://localhost:8508/api/v1 | cat > /dev/null
http http://localhost:8508/api/v1  0.12s user 0.04s system 86% cpu 0.193 total
cat > /dev/null  0.00s user 0.00s system 0% cpu 0.973 total

Note that the lag appears to be happening in the cat command after httpie, which doesn't make sense and doesn't happen for curl or wget:

time curl -s http://localhost:8508/api/v1  > /dev/null
curl -s http://localhost:8508/api/v1 > /dev/null  0.01s user 0.01s system 56% cpu 0.024 total

time curl -s http://localhost:8508/api/v1 | cat > /dev/null
curl -s http://localhost:8508/api/v1  0.00s user 0.01s system 44% cpu 0.035 total
cat > /dev/null  0.00s user 0.00s system 9% cpu 0.034 total

Checklist

  • I've searched for similar issues.
  • I'm using the latest version of HTTPie.

I have tried to reproduce your result but so far without meaningful success. Nothing on the scale presented by your result. My results seem to vary by server response time.

⬢[lukas@toolbox ~]$ time http https://restcountries.com/v3.1/alpha/cz?fields=currencies,capital > /dev/null

real	0m0.617s
user	0m0.259s
sys	0m0.050s
⬢[lukas@toolbox ~]$ time http https://restcountries.com/v3.1/alpha/cz?fields=currencies,capital | cat > /dev/null

real	0m0.641s
user	0m0.279s
sys	0m0.056s
⬢[lukas@toolbox ~]$ time http https://restcountries.com/v3.1/alpha/cz > /dev/null

real	0m0.683s
user	0m0.279s
sys	0m0.043s
⬢[lukas@toolbox ~]$ time http https://restcountries.com/v3.1/alpha/cz | cat > /dev/null

real	0m0.626s
user	0m0.271s
sys	0m0.047s

I tested on a fully up-to-date Fedora Workstation 38 with httpie 3.2.2-2 and time 1.9-20, both from the Fedora repositories.

Could you please double-check that the issue still exists and provide some more details about your setup, such as your OS (and version), your httpie version, and what your server response looks like (valid JSON? response length? etc.)?

Thanks for looking into it. I tested it again against an httpbin server running on localhost and hitting http://localhost:8000/get with httpie 3.2.2:

  • On macOS Ventura 13.2.1 (Darwin Kernel Version 22.3.0) the piped version is ~4-5x slower.
  • On Linux (#35~20.04.1-Ubuntu SMP) there is no significant difference.

So it manifests on macOS only; I'll edit the title.

Thanks for taking the time to double-check. Unfortunately, since I don't have access to a Mac, I won't be able to test this or look into it further.

On the other hand, at the very least, you've managed to narrow the problem down a bit and save the next person looking some time figuring out where the problem is 😄

After several hours of debugging, it turns out this is unrelated to potential buffering differences between macOS and Linux, as I had originally suspected; it's this code. Commenting it out (or just the process.communicate() line) does away with the lag.

That's as far as I got. I don't have an explanation of why this affects only the piped version; the interactive non-piped version also calls process.communicate() but it apparently finishes(?) much faster. And I don't know enough about the low-level OS-specific process spawn details in this module.

I have to say, I am kind of shocked to find out that an HTTP client CLI includes a "feature" that:

  • will regularly warn you about the new update by checking the latest version on PyPI.
  • is enabled by default and if you want to disable it you have to add/edit a config file, can't do it from the CLI.
  • requires an ad-hoc (and unsurprisingly buggy) daemons module instead of delegating to a library designed specifically for this.

IMO the whole httpie.internal subpackage needs to go or at least be extracted as a separate project. Checking for new releases has nothing to do with sending requests to HTTP servers.

Wow, that's some wonderful investigation you've done.

Yes, I can reproduce the behavior on Linux if I switch the code to the subprocess behavior used on macOS.

I'm no expert in the low-level OS details either, but if I had to guess (based on some investigation), the difference seems to be that the > version is a plain output redirection, while the piped (|) version seems to wait for the whole process (and its subprocesses) to finish. Popen.communicate() waits for the subprocess to finish (as written here).
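To make that guess concrete, here is a minimal sketch of one general pipe behavior that could produce this pattern: a reader such as cat only sees end-of-file once every process holding the write end of the pipe has closed it. This is not taken from HTTPie's code, and the thread does not confirm it as the cause; the function name `time_until_eof` and its parameters are hypothetical.

```python
import subprocess
import sys
import time


def time_until_eof(inherit_stdout: bool, child_seconds: float = 1.0) -> float:
    """Return how long a pipe reader waits for EOF from a short-lived writer.

    The writer starts a background child that sleeps, prints one line, and
    exits right away without waiting for that child. If the child inherits
    the writer's stdout (the pipe), the reader cannot see EOF until the
    child exits as well.
    """
    # Hypothetical stand-in for a CLI that launches a background task.
    redirect = "" if inherit_stdout else ", stdout=subprocess.DEVNULL"
    writer_code = (
        "import subprocess, sys\n"
        "subprocess.Popen([sys.executable, '-c', "
        f"'import time; time.sleep({child_seconds})']{redirect})\n"
        "print('done')\n"
    )
    start = time.perf_counter()
    writer = subprocess.Popen(
        [sys.executable, "-c", writer_code],
        stdout=subprocess.PIPE,
    )
    # read() returns only at EOF, i.e. once all write ends are closed.
    writer.stdout.read()
    elapsed = time.perf_counter() - start
    writer.wait()
    return elapsed


if __name__ == "__main__":
    print(f"child inherits the pipe: {time_until_eof(True):.2f}s")
    print(f"child detached from it:  {time_until_eof(False):.2f}s")
```

With a file redirection there is no reader waiting for EOF, so a lingering background process would not show up in the timing of the main command.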

Come to think of it, I really don't know why the process.communicate() call is used there - it does not seem to pass any value to the subprocess, and it does not do anything with the result returned from the subprocess. There also seems to be no mention in the method docs that you have to wait for the subprocess to finish, join it back, or anything similar.

The only thing I can see it does is wait for the subprocess to finish (which seems to be against the purpose of daemonizing the update fetcher).
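The blocking behavior is easy to see in isolation. The following is a small standalone sketch, not HTTPie's actual daemon code: the function name `time_spawn` and the sleeping child are assumptions made for illustration. It measures how long the parent is held up when it does or does not call communicate() on a child that takes a while to finish.

```python
import subprocess
import sys
import time


def time_spawn(wait: bool, child_seconds: float = 1.0) -> float:
    """Return the seconds the parent spends starting a slow child.

    The child is a hypothetical stand-in for a background task; it only
    sleeps for `child_seconds`.
    """
    child_code = f"import time; time.sleep({child_seconds})"
    start = time.perf_counter()
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if wait:
        # No input is sent and no output is captured, so this call
        # does nothing except block until the child has exited.
        proc.communicate()
    elapsed = time.perf_counter() - start
    proc.wait()  # reap the child outside the timed region
    return elapsed


if __name__ == "__main__":
    print(f"with communicate():    {time_spawn(True):.2f}s")
    print(f"without communicate(): {time_spawn(False):.2f}s")
```

Without the call, Popen returns as soon as the child has been started, which is the behavior one would expect from a detached background task.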

But feel free to correct me if I am wrong - @isidentical and @jkbrzt (authors of the commit).

Unless you see a use for it, I would suggest simply removing the process.communicate() call; that should fix the problem.