golang/go

x/net/http: PROTOCOL_ERROR with HTTP2

Closed this issue · 25 comments

What version of Go are you using (go version)?

$ go version
go version go1.11.2 linux/amd64

Does this issue reproduce with the latest release?

Yes, which I believe is the one I am running, hence 1.11.2

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/simo/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/simo/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build210197689=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I am downloading tar files from the docker hub, uncompress them using the standard gzip, and passing the result to another software using STDIN.

If is necessary I will try to reproduce this using something simpler like sha256 sum in linux.

What did you expect to see?

I would not expect a PROTOCOL_ERROR

What did you see instead?

A PROTOCOL_ERROR

Below the relevant log using GODEBUG=http2debug=2


INFO[0177] Deleting temporary directory                  action=ingesting target=/tmp/tempCatalog140887158
INFO[0177] Created subcatalog in directory               directory=.layers/bf
INFO[0177] [cvmfs_server ingest --catalog -t - -b .layers/bf/bfb904e99f247ef4b354eab1285ec78bc5bb929dbf134ff69c156b90711216f6/layerfs portals.cern.ch]  action=executing
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
INFO[0180] Process alive
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192

   ... this line repeat for exactly another 450 times, I cut it...

2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=18886
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16393
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=17397
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=12782
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=14630
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8197
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8201
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
ERRO[0180] Error in copying the input into STDIN.        error="stream error: stream ID 13; PROTOCOL_ERROR"
INFO[0210] Process alive
2018/12/06 18:54:36 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:54:36 http2: Transport received RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
INFO[0240] Process alive
INFO[0270] Process alive
INFO[0300] Process alive
2018/12/06 18:56:06 http2: Transport closing idle conn 0xc00047ddc0 (forSingleUse=false, maxStream=27)
2018/12/06 18:56:06 http2: Transport readFrame error on conn 0xc00047ddc0: (*net.OpError) read tcp 128.141.207.159:35866->104.18.123.25:443: use of closed network connection

Maybe is related to #26321 but I don't know how to tell.

Let me add that the problem arises seldom, so some downloads go just fine, while other present this problem.

The overall workflow is to get the tar.gz from the docker hub, pass the byte stream to "compress/gzip" (so something like gzip.NewReader(resp.Body) ) and finally pass the reader created by gzip to the STDIN of another software (using "os.exec" package).

Moreover, if instead of passing the result of the uncompression to the STDIN of another software I just save it on file I cannot reproduce the problem.

Hi @siscia - can you post an independent code sample that we can use to reproduce this issue ?

Hi,

I was not able to re-produce without using cvmfs_server, the custom software that read from STDIN.

Maybe is a problem from our side but I don't know what it could be...

Should I make a small code sample that use cvmfs_server and reproduce the problem or I should just close the issue?

@fraenkel, so you see anything wrong here?

@siscia, which HTTP/2 server is cvmfs_server using?

So I am trying to find all the places we just report a stream error without any cause. There aren't many.
I did notice when we receive a WindowUpdateFrame and the increment is 0.
The other is parseHeadersFrame where we go negative.

Adding a cause to all places that return an error would at least identify where its coming from.

@fraenkel I thought this was some mystery server sending us a PROTOCOL_ERROR because it didn't like our Transport's HTTP/2 behavior.

@bradfitz I see them against the docker registries, in particular I can see them against the standard registries hosted by docker hub itself.

I believe they are hosted on AWS.

Just to clarify, the Docker Registries servers are sending data, a go program is piping those data into the STDIN of another program, the cvmfs_server.

@bradfitz Sorry I missed the RST_STREAM at the top. I find it odd that all we see are just WINDOW_UPDATE. The fact that "all" or many streams hit this at the same time could mean that we exceeded the connection window? I am guessing only because we are missing information.
What might be useful is to compare a successful run vs an unsuccessful run and see where the messages differ. It might just be in the exchange of WINDOW_UPDATEs but its just a guess.

Do you guys need any more info from my side?

It may be same as #31534 ?

It may it may not, I worked around the issue and I can't recal the problem!

Feel free to close though!

How did your workaround, if I may ask @siscia ?

I don't remember honestly, two options:

  1. Just download the file/layers using directly docker instead of an HTTP calls
  2. Swallow the error and just re-trying

I don't remember which option, but nothing technically interesting I am afraid :(

I am seeing something similar to this. On the client side I see stream error: stream ID XXXXXX; PROTOCOL_ERROR and on the server I see stream error: stream ID XXXXXX; NO_ERROR when I try to read the request body with bodyBytes, err := ioutil.ReadAll(r.Body).

I've been able to get around this by using httpClient.Transport.RoundTrip(req) rather than httpClient.Do(req). This seems to reduce occurrences of the issue to basically zero but it also breaks all the timeouts I have set on the client and transport.

FYI I'm using a non tls http2 client:

	httpClient := &http.Client{
		Transport: &http2.Transport{
			AllowHTTP: true,
			DialTLS: func(netw, addr string, cfg *tls.Config) (net.Conn, error) {
				return dialer.Dial(netw, addr)
			},
		},
		Timeout: time.Duration(time.Second * 5),
	}

I'm unable to provide http2 logs because I only see the issue at scale.

Just in case anyone else finds this GitHub issue after making the same mistake as me:

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

stream error: stream ID 1; PROTOCOL_ERROR

I would love to have amore descriptive error message. Shouldn't it be possible to check against a list of valid (according to specification) HTTP verbs and then say the passed one is not one of them?

Anyway, I hope this helps someone.

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

There is http.MethodGet so you don't have to think about it.

2020/08/20 02:26:29 stream error: stream ID 1; PROTOCOL_ERROR
2020/08/20 02:26:29 stream error: stream ID 3; PROTOCOL_ERROR

	client := &http.Client{}
	req, err := http.NewRequest("GET", fmt.Sprintf("https://www.google.com/search?q=renata&source=lnms&tbm=isch"), nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("User-Agent", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0")
	req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
	req.Header.Set("Accept-Language", "en-US,en;q=0.5")
	req.Header.Set("Referer", "https://www.google.com/")
	req.Header.Set("DNT", "1")
	req.Header.Set("Connection", "keep-alive")
	req.Header.Set("Upgrade-Insecure-Requests", "1")
	req.Header.Set("TE", "Trailers")
	req.Header.Set("Cookie", "CGIC=Ikp0ZXh0L2h0bWwsYXBwbGljYXRpb24veGh0bWwreG1sLGFwcGxpY2F0aW9uL3htbDtxPTAuOSxpbWFnZS93ZWJwLCovKjtxPTAuOA; NID=204=QcWiRuWRCGREPOPxM1sA_JgkiIYxtiuZc9xzhvz-fUaVdskJqgjTUsLW6lepRkZ8ieKykY7JZ70cfP8DjArCV-QulLi0oFtzg5r2XwBZRz0jQAS-b_2lVefc9dgcd-DPOkygGclY1YfC__PLq_AWDqrEPjIH_BasiZJ3rbmULkM; 1P_JAR=2020-8-19-19; OTZ=5592678_28_28__28_")
	res, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}

	b, err := ioutil.ReadAll(res.Body)
	if err != nil {
		return result, err
	}

	fmt.Println(string(b))

@batara666 this error is coming because HTTP2 does not use transfer encoding (TE).

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	client := &http.Client{}
	req, err := http.NewRequest("GET", fmt.Sprintf("https://www.google.com/search?q=renata&source=lnms&tbm=isch"), nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("User-Agent", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0")
	req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
	req.Header.Set("Accept-Language", "en-US,en;q=0.5")
	req.Header.Set("Referer", "https://www.google.com/")
	req.Header.Set("DNT", "1")
	req.Header.Set("Connection", "keep-alive")
	req.Header.Set("Upgrade-Insecure-Requests", "1")
	// req.Header.Set("TE", "Trailers")
	req.Header.Set("Cookie", "CGIC=Ikp0ZXh0L2h0bWwsYXBwbGljYXRpb24veGh0bWwreG1sLGFwcGxpY2F0aW9uL3htbDtxPTAuOSxpbWFnZS93ZWJwLCovKjtxPTAuOA; NID=204=QcWiRuWRCGREPOPxM1sA_JgkiIYxtiuZc9xzhvz-fUaVdskJqgjTUsLW6lepRkZ8ieKykY7JZ70cfP8DjArCV-QulLi0oFtzg5r2XwBZRz0jQAS-b_2lVefc9dgcd-DPOkygGclY1YfC__PLq_AWDqrEPjIH_BasiZJ3rbmULkM; 1P_JAR=2020-8-19-19; OTZ=5592678_28_28__28_")
	res, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}

	b, err := ioutil.ReadAll(res.Body)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(string(b))
}

If you uncomment the TE header the remote server will close the stream.

I do not think this is a bug in Go.

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

Currently just a "me too", but I'm running into a problem very similar to @stan-sack where sometimes file uploads are terminated when trying to read request body with stream error: stream ID XXXXXX; NO_ERROR. This happens pretty reliably (though still intermittently) with the following setup:

.NET Framework HttpClient (HTTP/1.1) performing a multipart/form-data upload to a GCP load balancer over HTTPS with a H2C Cloud Run backend. This is with go 1.20.
image

I captured some additional debug details with GODEBUG=http2debug=2 as well:

DEFAULT 2023-09-06T17:24:06.628757Z echo: http2: server read frame RST_STREAM stream=167 len=4 ErrCode=NO_ERROR
ERROR 2023-09-06T17:24:06.628731486Z stream error: stream ID 167; NO_ERROR
DEFAULT 2023-09-06T17:24:06.628730Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000139a40: read RST_STREAM stream=167 len=4 ErrCode=NO_ERROR
DEFAULT 2023-09-06T17:24:06.614934Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.614844Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.614826Z echo: http2: server read frame DATA stream=97 len=4096 data="\x89?\x8a^\xdc\xdb\xf8\xfbrݨ\xffr\xb2\xedN\xc1\xe3Q\xc4\xcaԼ\xc7\xfdE\x95\xc2\u009a\xce^+\xc0\x1ac\x94p\xa4Ʉ\xed\xddW\"\xday&<\x13\xc9XjP\xeeӡ\xa7\xbf\rr\xd5\x10\xec\xfd\xd3\xc8\xe5o\x15\xfbU|\x1ev\xa59\xe0\xa3\x169#R\xcdE\xb9\x01\xdc\x13\xa6=\xb3\x14\xcc[\x00^\xcc\xe7\xb7\x12\xc0m\x9c\xbb\x188\x97'\xcb\x05\x1f\x04\x97\xdc-\xb7\x0ezt{x\x99\xfbg\xe4\x00\x7f\x85\xc0k;$\xeb2`\xb7\x13\xd0y<p\xfb\x16\x18\xff\x18\x98v/\xa3܉\xc4\xfa\x18#HCF1\x82\xda\xf5V {\x11\xbf\xb5\x91iA3\xde\xc7\x18\x18=rF\xdc\xfd\x85\xef%\x1cYQ`-w\x99\x03\xb4\x04J\xb4\x94\xe3\v\x82\xdcWeus\x13.d\x96F'Yg\xe8N\xa8J\x98m\xf7\x18\xd1\xc4;\x18u\r\x16\x00\xf1?\x03L\x1b\xb4\xd3\x12`d\xd0WP\x9b1<ײ\x00+\x8c\"0\xd2\xd8" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.614805Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x89?\x8a^\xdc\xdb\xf8\xfbrݨ\xffr\xb2\xedN\xc1\xe3Q\xc4\xcaԼ\xc7\xfdE\x95\xc2\u009a\xce^+\xc0\x1ac\x94p\xa4Ʉ\xed\xddW\"\xday&<\x13\xc9XjP\xeeӡ\xa7\xbf\rr\xd5\x10\xec\xfd\xd3\xc8\xe5o\x15\xfbU|\x1ev\xa59\xe0\xa3\x169#R\xcdE\xb9\x01\xdc\x13\xa6=\xb3\x14\xcc[\x00^\xcc\xe7\xb7\x12\xc0m\x9c\xbb\x188\x97'\xcb\x05\x1f\x04\x97\xdc-\xb7\x0ezt{x\x99\xfbg\xe4\x00\x7f\x85\xc0k;$\xeb2`\xb7\x13\xd0y<p\xfb\x16\x18\xff\x18\x98v/\xa3܉\xc4\xfa\x18#HCF1\x82\xda\xf5V {\x11\xbf\xb5\x91iA3\xde\xc7\x18\x18=rF\xdc\xfd\x85\xef%\x1cYQ`-w\x99\x03\xb4\x04J\xb4\x94\xe3\v\x82\xdcWeus\x13.d\x96F'Yg\xe8N\xa8J\x98m\xf7\x18\xd1\xc4;\x18u\r\x16\x00\xf1?\x03L\x1b\xb4\xd3\x12`d\xd0WP\x9b1<ײ\x00+\x8c\"0\xd2\xd8" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.599709Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.599669Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.599662Z echo: http2: server read frame DATA stream=97 len=4096 data="\x06\xb0\xab\xef\xebf0\xf1\r\xfcq-\xb0\xe9\xbf\x00\x1e\x8d\xa7\xbdֶ\xa9`\x91\xda\xc2Ō4\x91q\x1c\x7f\n\x1e\xb7\x03\xa8\xfd\xf5\t\x14#\x8e\xc6p?\xf83\x8e\x9e&\xc2q\xe7\x80c\xef\x90\xcdw7\xfdm\x03\xf7\xb7AW\x0eD\xe7\xd7\xee\xdf\t\xec\xcf!\xfa\xa7I\xcas`\x9d:\xf8HZ\xef\x80G^\xcfH\xab\xc0{\xf9\xf7B$\x91\xc3\xc8\a\xf8\x1a\x8b\xfe:\xdam\xa6\xff\xd9\xf4\x7f\x15\U000fd279\\\xc7\xcaL\xe0{\nw\xadF\xbb\xc8\fj\a>\xc4L\xc71\xfahV@\xc9}\x0e\xf4A,\x9f\xcey)\xdfx\xda>\xce\xfa\x9aU<\v\xfa2\x91g_\t\x95Q6\xb8\x96\xa3M<\xe7\x14\xeeU\xb7_\xc5U\xb5&\xb5\x01c\x01Ղ\xacD\xa65\xe5r\xfa3\xfeI\xeb\x80~E5\x80\xab\x19Qj\xab=p\x1c\xf7Nf4\xed\x13\xff@\u07b8\x9c>'3\xb69M\xff\xa8\xbe!\xfe\x93\xd8\x1dhx\x97a\xdduW%" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.599646Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x06\xb0\xab\xef\xebf0\xf1\r\xfcq-\xb0\xe9\xbf\x00\x1e\x8d\xa7\xbdֶ\xa9`\x91\xda\xc2Ō4\x91q\x1c\x7f\n\x1e\xb7\x03\xa8\xfd\xf5\t\x14#\x8e\xc6p?\xf83\x8e\x9e&\xc2q\xe7\x80c\xef\x90\xcdw7\xfdm\x03\xf7\xb7AW\x0eD\xe7\xd7\xee\xdf\t\xec\xcf!\xfa\xa7I\xcas`\x9d:\xf8HZ\xef\x80G^\xcfH\xab\xc0{\xf9\xf7B$\x91\xc3\xc8\a\xf8\x1a\x8b\xfe:\xdam\xa6\xff\xd9\xf4\x7f\x15\U000fd279\\\xc7\xcaL\xe0{\nw\xadF\xbb\xc8\fj\a>\xc4L\xc71\xfahV@\xc9}\x0e\xf4A,\x9f\xcey)\xdfx\xda>\xce\xfa\x9aU<\v\xfa2\x91g_\t\x95Q6\xb8\x96\xa3M<\xe7\x14\xeeU\xb7_\xc5U\xb5&\xb5\x01c\x01Ղ\xacD\xa65\xe5r\xfa3\xfeI\xeb\x80~E5\x80\xab\x19Qj\xab=p\x1c\xf7Nf4\xed\x13\xff@\u07b8\x9c>'3\xb69M\xff\xa8\xbe!\xfe\x93\xd8\x1dhx\x97a\xdduW%" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.593946Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.593942Z echo: http2: server read frame DATA stream=97 len=4096 data="\xc2Q{J\x13<Or\xef)\xb7]\xa8\xd6r՞\xb9\xb6\x04}t\x0e\xec\xdc\x01\xe8\xfe\xa9\xf8\xe5Z\xa0 \xf5$\xcf\xfc\xdd\x1d\x8e\xe8<\xfb\xc1\xb5>~[\x8d<vO\x9f\xfd\xe0<6\f\xb1\x17\x18\x8e\x9c\xb1~\xb65\x1e\xae\xb4p߅\x14\x18\x0e\xbe\xcbɝ\xab\x95\xe8\x8dg\u0093\xfc\xfc\xe0\xf70\x03}\x0e\xe6\xfe\xe7\xa1w\xf8]\xf6\r\xa0\xfe\x17\xa1\xa7\xab\x87\xef\x1d>|\xee3s\x1b0\xa7\xfd\x18V\xcb߇\xd1\xeb=\x90\xd4k!ݗC\x7f.\xc5\xc8\xf7\xea\xd1x\xfa\x1ax\xbd\x17c\x94{\t4\xe3BH\xee\x8d\xd0\xce\x0fC\xa2\x1f\xdf\\7\xf7\x19h\xcd+1\xea\x9f\x0f\x9f\xf6\xe1\xd8\xf3\xbex\xf8dx\xf67\xa1\xf4\xa5Ѝ\xcfa\x1c|=t\x90\xb3\xf0\x9b0\xd6_\x0f<\xaf\xc1\xf3V\xe8\xc2_`\xb4\xbe\x19\xda\xc0\x93z\x9f\x03\xbe\xdf\xc1\xbe\xfd=pr\xff\xf3\x98\x03\xbc\x1d\xda\xf2E\x8c\xe9\xdb0\xfa\xf3]_\x9e\x06\xe4" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.593941Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.593919Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\xc2Q{J\x13<Or\xef)\xb7]\xa8\xd6r՞\xb9\xb6\x04}t\x0e\xec\xdc\x01\xe8\xfe\xa9\xf8\xe5Z\xa0 \xf5$\xcf\xfc\xdd\x1d\x8e\xe8<\xfb\xc1\xb5>~[\x8d<vO\x9f\xfd\xe0<6\f\xb1\x17\x18\x8e\x9c\xb1~\xb65\x1e\xae\xb4p߅\x14\x18\x0e\xbe\xcbɝ\xab\x95\xe8\x8dg\u0093\xfc\xfc\xe0\xf70\x03}\x0e\xe6\xfe\xe7\xa1w\xf8]\xf6\r\xa0\xfe\x17\xa1\xa7\xab\x87\xef\x1d>|\xee3s\x1b0\xa7\xfd\x18V\xcb߇\xd1\xeb=\x90\xd4k!ݗC\x7f.\xc5\xc8\xf7\xea\xd1x\xfa\x1ax\xbd\x17c\x94{\t4\xe3BH\xee\x8d\xd0\xce\x0fC\xa2\x1f\xdf\\7\xf7\x19h\xcd+1\xea\x9f\x0f\x9f\xf6\xe1\xd8\xf3\xbex\xf8dx\xf67\xa1\xf4\xa5Ѝ\xcfa\x1c|=t\x90\xb3\xf0\x9b0\xd6_\x0f<\xaf\xc1\xf3V\xe8\xc2_`\xb4\xbe\x19\xda\xc0\x93z\x9f\x03\xbe\xdf\xc1\xbe\xfd=pr\xff\xf3\x98\x03\xbc\x1d\xda\xf2E\x8c\xe9\xdb0\xfa\xf3]_\x9e\x06\xe4" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.592901Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.592895Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.592888Z echo: http2: server read frame DATA stream=97 len=4096 data="\x0f\xfemp\xe7\xb9\a7\x7f\xd9|\xb6y'<\xff\xdfl~\xbdy7F\xe0w@\v\xff\xb2\xe1\xda\xff\xb7\x9a߆\xae~\bZ\xfaq\xccӛf\x06\xa3\xf8\xfd0ҟ\x84?\xee\x00l\x82\x8f@_\xffl\xd8\x03\xbe\xfd\xbf\x0f\x9a\xbf\x17~\xff\xfe\x91_\xb0\x15\xf3\x82U\xd0\xff]X\xf1\xfb\x1f\xcd\a\x9bOb.\xf1\x0f\xd8U\xfc.\xe0\xfdl\xf3\x16\x8c\xfa\xaf\xc3\xef\x1b\x9a\x977\xafh.m.h^<\x9a\x11\xbc\x1ew\b}\n\xeb\x82?=\xb7j\xf8t\xf8\xf5\xfbFc?\xe7\xdb\xfcv\x95\xf1\x95\xb7\x06p\xe5\xebA\xa3\xd9\xc1)\xb0\x93\xe7\xa3E?\x04\x8d\x7f%\xf67\xdf\f\xbe}\x14\xf3\xa4O\xe3,\xe1[\x11\xfa\x02\xe6\n\xb7`\xb6\xf4xX\x82\x1f\x04\x1c\xbf\x11\xf1P\xf0\x83g\xbf\x88\xcf4h+8f^3o\x8cW\xff\xf1\xbd\x1e\xf3\x12\xa4g&\x01\xb6\xde\xcf\xf2ˀ\xcfW\xff\xc9\x7f\xf6\xab\xa4\xc2\xf0\xd0:\xab\xbc\xd7(\xad\xd7" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.592868Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x0f\xfemp\xe7\xb9\a7\x7f\xd9|\xb6y'<\xff\xdfl~\xbdy7F\xe0w@\v\xff\xb2\xe1\xda\xff\xb7\x9a߆\xae~\bZ\xfaq\xccӛf\x06\xa3\xf8\xfd0ҟ\x84?\xee\x00l\x82\x8f@_\xffl\xd8\x03\xbe\xfd\xbf\x0f\x9a\xbf\x17~\xff\xfe\x91_\xb0\x15\xf3\x82U\xd0\xff]X\xf1\xfb\x1f\xcd\a\x9bOb.\xf1\x0f\xd8U\xfc.\xe0\xfdl\xf3\x16\x8c\xfa\xaf\xc3\xef\x1b\x9a\x977\xafh.m.h^<\x9a\x11\xbc\x1ew\b}\n\xeb\x82?=\xb7j\xf8t\xf8\xf5\xfbFc?\xe7\xdb\xfcv\x95\xf1\x95\xb7\x06p\xe5\xebA\xa3\xd9\xc1)\xb0\x93\xe7\xa3E?\x04\x8d\x7f%\xf67\xdf\f\xbe}\x14\xf3\xa4O\xe3,\xe1[\x11\xfa\x02\xe6\n\xb7`\xb6\xf4xX\x82\x1f\x04\x1c\xbf\x11\xf1P\xf0\x83g\xbf\x88\xcf4h+8f^3o\x8cW\xff\xf1\xbd\x1e\xf3\x12\xa4g&\x01\xb6\xde\xcf\xf2ˀ\xcfW\xff\xc9\x7f\xf6\xab\xa4\xc2\xf0\xd0:\xab\xbc\xd7(\xad\xd7" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.591132Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.591091Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.591072Z echo: http2: server read frame DATA stream=97 len=4096 data="Ӧ\xd8:\xb3\x82\xe2.ybV\x800\xd6[\xe2\x12\x9fҰm#\x8b\xe8\xe70\"\x8c\xee>a\x9aNGʷ\xe6.Z\x84\xed\xfe\x9a\x96\x9f\xd3\x17<\xcfc\x96=\xe3\x14\tʞ\xb4}\x16\xcak\xb5Z\x1d\xa6\x1e\xcb\xfeo\x98\x9f\x01d\f\x8cm\x86\xe6\x9f<\xea\xf9\r\xd0\x05\x9e_]\x8b\x99X\xd7w\xd0\xda号\x94\xbd\xf0+\x0f\xc0\a؍\xb6RGi\xf38֚T\xe4\xd6p\x8d@gߔ\xc3u\x12\x9d\xf2e\x9av\x1d]\xe7$y\x92#B\xb9\x85\xcf}\x9e\xbd\xa7\xe8]\xa8\xbel[\x98\xcaﶨ6ڌ\xbdH\xe3\x9f\xedD\xd2V\xb0\xd7M\xf39RZ.ϖ\xc83\xe0;ϖ*Y\xb3\xb5/\xce98{ \xf65\xe0\xd0\x1e\xac\xef\xbf\x0f\xef\xec>\r3\xf0\xc7\xc0C\xfc\x01h%W\x8f\xb6b\x8e\xbe\x1d\xfb\x7f\xff>\xf8\x87\xe17\xe7>\x83\x95\xfd\x953\xf7\x98\xb97\xc6\xf5\xbbC\xb77@\xbb\xf7\xe1\xf7\xfe" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.591031Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="Ӧ\xd8:\xb3\x82\xe2.ybV\x800\xd6[\xe2\x12\x9fҰm#\x8b\xe8\xe70\"\x8c\xee>a\x9aNGʷ\xe6.Z\x84\xed\xfe\x9a\x96\x9f\xd3\x17<\xcfc\x96=\xe3\x14\tʞ\xb4}\x16\xcak\xb5Z\x1d\xa6\x1e\xcb\xfeo\x98\x9f\x01d\f\x8cm\x86\xe6\x9f<\xea\xf9\r\xd0\x05\x9e_]\x8b\x99X\xd7w\xd0\xda号\x94\xbd\xf0+\x0f\xc0\a؍\xb6RGi\xf38֚T\xe4\xd6p\x8d@gߔ\xc3u\x12\x9d\xf2e\x9av\x1d]\xe7$y\x92#B\xb9\x85\xcf}\x9e\xbd\xa7\xe8]\xa8\xbel[\x98\xcaﶨ6ڌ\xbdH\xe3\x9f\xedD\xd2V\xb0\xd7M\xf39RZ.ϖ\xc83\xe0;ϖ*Y\xb3\xb5/\xce98{ \xf65\xe0\xd0\x1e\xac\xef\xbf\x0f\xef\xec>\r3\xf0\xc7\xc0C\xfc\x01h%W\x8f\xb6b\x8e\xbe\x1d\xfb\x7f\xff>\xf8\x87\xe17\xe7>\x83\x95\xfd\x953\xf7\x98\xb97\xc6\xf5\xbbC\xb77@\xbb\xf7\xe1\xf7\xfe" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.583750Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE stream=213 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.583744Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE stream=213 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.583737Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.583693Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE len=4 (conn) incr=4096

This is the error I see on the client side when it happens:

2023-09-06T17:55:44.892Z ERROR   upstream max stream duration reached
2023-09-06T17:55:44.892Z ERROR   Result Fetcher: RunModelJobs exception:Error parsing undefined value. Path '', line 1, position 1.
2023-09-06T17:55:44.923Z ERROR      at Newtonsoft.Json.JsonTextReader.ParseUndefined()
   at Newtonsoft.Json.JsonTextReader.ParseValue()
   at Newtonsoft.Json.JsonReader.ReadForType(JsonContract contract, Boolean hasConverter)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
   at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
   at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
   at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
   at Product.Client.ProductClientUtils.ThrowIfNotSuccessful(HttpResponseMessage response, ILogger logger) in C:\dev\cc\Product.Client\ProductClientUtils.cs:line 19
   at Product.Client.ProductClientV1.ProductClientV1.<UploadImageChunked>d__43.MoveNext() in C:\dev\cc\Product.Client\ProductClientV1\ProductClientV1.cs:line 310
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<UploadImageData>d__12.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 118
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<UploadImageDataWrapper>d__14.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 149
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<SubmitJobsWrapper>d__16.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 208
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<RunModelJobs>d__29.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 554
2023-09-06T17:59:59.566Z INFO    Uploading Imageset (chunked)...
2023-09-06T18:00:09.030Z INFO    Uploading Imageset (chunked)...
2023-09-06T18:00:46.095Z ERROR   upstream max stream duration reached
2023-09-06T18:00:46.095Z ERROR   Result Fetcher: RunModelJobs exception:Error parsing undefined value. Path '', line 1, position 1.
2023-09-06T18:00:46.096Z ERROR      at Newtonsoft.Json.JsonTextReader.ParseUndefined()
   at Newtonsoft.Json.JsonTextReader.ParseValue()

upstream max stream duration reached leads me to believe some GCP component is resetting the stream? Maybe Envoy, somehow? This may not be go related, but rather GCP related.

@stan-sack, was your issue happening in GCP and/or Envoy proxy?

I was calling non-TLS HTTP/2 (h2c) between two internal services via envoy client load balancing. It was in AWS and there was no actual load balancer.

Seems suspicious though. In my case, envoy (inside GCP load balancer) is also calling non-TLS HTTP/2 (h2c)

This was a long time ago. I dont work at the company where I had this problem anymore.

If I was still trying to solve I'd start playing with TCP keepalive and connection pooling in both go and envoy. I didnt have very much experience with envoy at that time but in hindsight it seems like it could be the source of the problem.