upload_asset pipes HTTP headers into jq
ArturKlauser opened this issue · 9 comments
When running upload_asset, jq reports an error:
$ ./ok.sh upload_asset https://uploads.github.com/repos/ArturKlauser/raspberrypi-rstudio/releases/23039592/assets?name=foo1 ok.sh mime_type=application/octet-stream
parse error: Invalid numeric literal at line 1, column 9
jq parse error; invalid JSON.
The error goes away when -q is specified.
Some digging shows that apparently the HTTP headers are piped into jq before the actual json data:
$ printf ':\ncat -\n' > myjq; chmod 755 myjq
$ OK_SH_JQ_BIN=./myjq ~/Downloads/ok.sh upload_asset https://uploads.github.com/repos/ArturKlauser/raspberrypi-rstudio/releases/23039592/assets?name=foo2 ok.sh mime_type=application/octet-stream
HTTP/1.1 201 Created
Cache-Control: no-cache
Content-Security-Policy: default-src 'none'
Content-Type: application/json; charset=utf-8
Etag: W/"75c6982651342ea7db8fa9949fc7b3b4"
Last-Modified: Wed, 22 Jan 2020 14:59:26 GMT
Strict-Transport-Security: max-age=31557600
Vary: Accept, Authorization, Cookie, X-GitHub-OTP
X-Accepted-Oauth-Scopes: repo
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-Github-Media-Type: github.v3; format=json
X-Oauth-Scopes: repo
X-Xss-Protection: 1; mode=block
Date: Wed, 22 Jan 2020 14:59:26 GMT
Transfer-Encoding: chunked
X-GitHub-Request-Id: CE13:5182:68E29:8648C:5E28634C
{"url":"https://api.github.com/repos/ArturKlauser/raspberrypi-rstudio/releases/assets/17506513","id":17506513,"node_id":"MDEyOlJlbGVhc2VBc3NldDE3NTA2NTEz","name":"foo2","label":"","uploader":{"login":"ArturKlauser","id":4790123,"node_id":"MDQ6VXNlcjQ3OTAxMjM=","avatar_url":"https://avatars0.githubusercontent.com/u/4790123?v=4","gravatar_id":"","url":"https://api.github.com/users/ArturKlauser","html_url":"https://github.com/ArturKlauser","followers_url":"https://api.github.com/users/ArturKlauser/followers","following_url":"https://api.github.com/users/ArturKlauser/following{/other_user}","gists_url":"https://api.github.com/users/ArturKlauser/gists{/gist_id}","starred_url":"https://api.github.com/users/ArturKlauser/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ArturKlauser/subscriptions","organizations_url":"https://api.github.com/users/ArturKlauser/orgs","repos_url":"https://api.github.com/users/ArturKlauser/repos","events_url":"https://api.github.com/users/ArturKlauser/events{/privacy}","received_events_url":"https://api.github.com/users/ArturKlauser/received_events","type":"User","site_admin":false},"content_type":"application/octet-stream","state":"uploaded","size":60861,"download_count":0,"created_at":"2020-01-22T14:59:25Z","updated_at":"2020-01-22T14:59:26Z","browser_download_url":"https://github.com/ArturKlauser/raspberrypi-rstudio/releases/download/untagged-dc669712e7a6ef40f473/foo2"}
Include logs
- Run
ok.sh -x [...command here...] 2> ok-shdebug.log
and attach that file.
Environment (please complete the following information)
ok.sh
release- 0.5.1, same happens on HEAD
- curl version
- curl 7.64.1 (x86_64-apple-darwin19.0) libcurl/7.64.1 (SecureTransport) LibreSSL/2.8.3 zlib/1.2.11 nghttp2/1.39.2, Release-Date: 2019-03-27
- jq version.
- jq-1.6
- What operating system and version.
- MacOS Catalina 10.15.2
- What shell.
- GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin19)
Same problem, different command.
I have the same problem when running create_release
with a body=
beyond a certain length.
- ok.sh Version: 0.5.1
- curl 7.54.0 (x86_64-apple-darwin18.0) libcurl/7.54.0 LibreSSL/2.6.5 zlib/1.2.11 nghttp2/1.24.1
- jq-1.6
- MacOS Mojave
I was unable to find out where the wrong input to jq
slips through.
But I observed that it is related to the response code. From a certain body size on, curl will sent a Expect: 100-continue
header and the response code will be 100, not 200. That is when the error happens.
You can check this e.g. with generating a release with a body of increasing length. I did that with the following script, and with 13 lines, it works fine, with 14, jq
did puke.
awk -v lineno=14 -v linelen=80 'BEGIN { i = 0 } { i++; words[i] = $0 } END { line1 = ""; line2 = ""; srand(); n = 0; w = ""; nw = 0; for (j = 0; j < lineno; j++) { while (n <= linelen) { w = words[int(rand() * i) + 1]; nw = length(w); line1 = line2; if (n > 0) { line2 = line2 " " w; n += 1 + nw } else { line2 = w; n += nw } }; print line1; line2 = w; n = nw } }' /usr/share/dict/words
Oooh, Artur I didn't pay attention at first to what you did there with your myjq
. Very smart, love it.
Tried the same thing which leads me to an explanation.
The normal case with a short enough body. Response cut for brevity.
$ curl -nsSig -H 'Accept: application/vnd.github.v3+json' -H 'Content-Type: application/json' --data-binary @body10 -X POST https://api.github.com/repos/fzs/gitblit/releases
HTTP/1.1 201 Created
Date: Wed, 22 Jan 2020 20:06:24 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2573
Server: GitHub.com
Status: 201 Created
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4981
X-RateLimit-Reset: 1579723794
...
Vary: Accept-Encoding
X-GitHub-Request-Id: 5DCD:3BE42:1D79C8F:238EEA4:5E28AB3F
{
"url": "https://api.github.com/repos/fzs/gitblit/releases/23051499",
"assets_url": "https://api.github.com/repos/fzs/gitblit/releases/23051499/assets",
...
Now a call to the API with a longer body. Response again cut for brevity.
$ curl -nsSig -H 'Accept: application/vnd.github.v3+json' -H 'Content-Type: application/json' --data-binary @body15 -X POST https://api.github.com/repos/fzs/gitblit/releases
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Date: Wed, 22 Jan 2020 20:05:02 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2820
Server: GitHub.com
{
Status: 201 Created
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4982
X-RateLimit-Reset: 1579723794
...
Vary: Accept-Encoding
X-GitHub-Request-Id: 5DDB:3EC99:2680B66:2E8538C:5E28AAED
{
"url": "https://api.github.com/repos/fzs/gitblit/releases/23051438",
"assets_url": "https://api.github.com/repos/fzs/gitblit/releases/23051438/assets",
...
In the second case there is the HTTP/1.1 100 Continue
response with a blank line, which is what ok.sh
triggers on, and treats the rest a the response body.
$ printf ':\ncat -\n' > myjq; chmod 755 myjq
$ OK_SH_JQ_BIN=./myjq .github/ok.sh create_release fzs gitblit tag_name=v123 draft=true body="$(awk -f randtext -v lineno=15 -v linelen=80 /usr/share/dict/words)"
HTTP/1.1 201 Created
Date: Wed, 22 Jan 2020 19:55:58 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 2791
Server: GitHub.com
{
Status: 201 Created
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4983
X-RateLimit-Reset: 1579723794
...
Vary: Accept-Encoding
X-GitHub-Request-Id: 5C9B:2BB14:24CCFD8:2C6B471:5E28A8CD
{
"url": "https://api.github.com/repos/fzs/gitblit/releases/23051178",
"assets_url": "https://api.github.com/repos/fzs/gitblit/releases/23051178/assets",
...
ok.sh
will need a special case handling for the 100 Continue
response.
(And for the weird stray {
in the headers.)
How interesting. I have never encountered 100 Continue
before now. That awk script reproduces it fantastically. Thank you both for the excellent investigation.
Suggestions very welcome if you see a potential fix, otherwise I'll plan to poke at this over weekend. Offhand, it looks like _response
will require a bit of rework.
I initially got wrapped up in reading the HTTP spec, but there's also a bunch of (grumpy) conversations on the web about this behavior.
Clearly both curl and GitHub are doing the right thing by sending and responding to that header, however it's not buying ok.sh much. It seems we can short-circuit that behavior in curl by including the -H 'Expect:'
flag.
I still want to research if there's a more robust or canonical way we should be parsing an HTTP response but avoiding might be a nice, quick fix.
Certainly an option. But I could also see benefit in that behaviour. For example, you upload a large asset but got the authentication wrong. I have no experience and have not tried it out, but it would be better if that lead to an error response immediately, instead of after having sent the 100MB binary.
Is it much more than checking in _response
for the response code 100 and discard everything up to the next empty line, after which the actual response will follow? As ok.sh is not interested in the 100 , which it could consider part of the "request".
Just as a proof of concept I toyed around with _response
reading over the "100 Continue" and skipping to the actual response. It's not beautiful, only covers this one case, but proves the solution.
Feel free to base something better and cleaner on it, if you like to.
efe5c10
It might actually work. I just tried the curl line for uploading a 30MB asset and also only get a single "100 Continue" line for that size.
Yeah, if I understand it correctly, there should be only one, as it is a stop gap between the client sending headers and then the body.
As this client is only dealing with Github, it could be enough.
Sent with GitHawk
I skimmed through the HTTP parser sources for Node and Golang and both special-case the 100 status code and avoid trying to process the request body so it looks to me like your fix is the best possible fix. Thanks again for the stellar work here.