jaegertracing/jaeger-client-go

Out of memory on binary extract

julienmathevet opened this issue · 10 comments

```
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0xca965f, 0x16)
	/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sysMap(0xc004000000, 0x1210000000, 0x1326c38)
	/usr/local/go/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x130e300, 0x1210000000, 0x70109b, 0x1c0020f245)
	/usr/local/go/src/runtime/malloc.go:701 +0x1cd
runtime.(*mheap).grow(0x130e300, 0x908000, 0xffffffff)
	/usr/local/go/src/runtime/mheap.go:1252 +0x42
runtime.(*mheap).allocSpanLocked(0x130e300, 0x908000, 0x1326c48, 0x42cb5c)
	/usr/local/go/src/runtime/mheap.go:1163 +0x291
runtime.(*mheap).alloc_m(0x130e300, 0x908000, 0x7f3458be0100, 0x7f3456897980)
	/usr/local/go/src/runtime/mheap.go:1015 +0xc2
runtime.(*mheap).alloc.func1()
	/usr/local/go/src/runtime/mheap.go:1086 +0x4c
runtime.(*mheap).alloc(0x130e300, 0x908000, 0xc000010100, 0x7f34568c5180)
	/usr/local/go/src/runtime/mheap.go:1085 +0x8a
runtime.largeAlloc(0x1210000000, 0x1320001, 0x7f34568c5180)
	/usr/local/go/src/runtime/malloc.go:1138 +0x97
runtime.mallocgc.func1()
	/usr/local/go/src/runtime/malloc.go:1033 +0x46
runtime.systemstack(0x0)
	/usr/local/go/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
	/usr/local/go/src/runtime/proc.go:1146

goroutine 44 [running]:
runtime.systemstack_switch()
	/usr/local/go/src/runtime/asm_amd64.s:330 fp=0xc0003bda58 sp=0xc0003bda50 pc=0x45a5e0
runtime.mallocgc(0x1210000000, 0xc0f900, 0x1, 0x4)
	/usr/local/go/src/runtime/malloc.go:1032 +0x895 fp=0xc0003bdaf8 sp=0xc0003bda58 pc=0x40df15
runtime.newarray(0xc0f900, 0x11000000, 0xc62f60)
	/usr/local/go/src/runtime/malloc.go:1173 +0x63 fp=0xc0003bdb28 sp=0xc0003bdaf8 pc=0x40e353
runtime.makeBucketArray(0xb8dc00, 0xc0003ec21c, 0x0, 0x4, 0xb5a1e0)
	/usr/local/go/src/runtime/map.go:362 +0x183 fp=0xc0003bdb60 sp=0xc0003bdb28 pc=0x40f243
runtime.makemap(0xb8dc00, 0x61633339, 0xc0003ec210, 0x1324c70)
	/usr/local/go/src/runtime/map.go:328 +0xf8 fp=0xc0003bdba8 sp=0xc0003bdb60 pc=0x40efd8
github.com/uber/jaeger-client-go.(*BinaryPropagator).Extract(0xc0002df170, 0xc7dac0, 0xc0003ec1e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/pkg/mod/github.com/uber/jaeger-client-go@v2.25.0+incompatible/propagation.go:249 +0x393 fp=0xc0003bdcb8 sp=0xc0003bdba8 pc=0x8a67b3
github.com/uber/jaeger-client-go.(*Tracer).Extract(0xc000374000, 0xb58fe0, 0x12b2f20, 0xc7dac0, 0xc0003ec1e0, 0x1, 0x1, 0x0, 0x0)
	/go/pkg/mod/github.com/uber/jaeger-client-go@v2.25.0+incompatible/tracer.go:360 +0xe4 fp=0xc0003bddd8 sp=0xc0003bdcb8 pc=0x8b5494
```

It seems like your OS just refused a request for more memory to the process when it was inside the .Extract method. Do you have any details on whether or not the machine was under memory pressure at the time?

Because this is an extract from a binary carrier, it's also possible that the payload was malformed and contained a size value that was too large (similar to the Thrift bug we recently discovered):

```go
// Handle the baggage items
var numBaggage int32
if err := binary.Read(carrier, binary.BigEndian, &numBaggage); err != nil {
	return emptyContext, opentracing.ErrSpanContextCorrupted
}
if iNumBaggage := int(numBaggage); iNumBaggage > 0 {
	ctx.baggage = make(map[string]string, iNumBaggage)
```

We could add a check that fails the parsing if that number is greater than, say, 128 (which IIRC is the limit the W3C spec imposes on the number of baggage entries), but it's unlikely to precisely catch malformed binary headers.
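A sketch of such a guard (the constant name and the exact bound are assumptions, not necessarily the fix that actually landed):

```go
// Hypothetical cap on the baggage count read from the wire; 128 mirrors the
// W3C baggage limit mentioned above, but the exact value is an assumption.
const maxBinaryBaggage = 128

var numBaggage int32
if err := binary.Read(carrier, binary.BigEndian, &numBaggage); err != nil {
	return emptyContext, opentracing.ErrSpanContextCorrupted
}
if numBaggage < 0 || numBaggage > maxBinaryBaggage {
	// A negative or absurdly large count cannot come from a well-formed
	// header; fail instead of allocating a map of that size.
	return emptyContext, opentracing.ErrSpanContextCorrupted
}
if iNumBaggage := int(numBaggage); iNumBaggage > 0 {
	ctx.baggage = make(map[string]string, iNumBaggage)
	// ... read iNumBaggage key/value pairs as before ...
}
```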

Thanks for your replies. My OS, which is an Alpine container, was not under memory pressure. I will try to put together an example to reproduce it. It seems to happen when I try to extract binary content from a carrier that does not contain any Jaeger information. I will check, @yurishkuro; I am interested in the Thrift bug you discovered.
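A minimal reproducer along those lines might look like the following sketch (the tracer setup is assumed, not the reporter's actual code):

```go
package main

import (
	"bytes"
	"fmt"

	opentracing "github.com/opentracing/opentracing-go"
	jaeger "github.com/uber/jaeger-client-go"
)

func main() {
	tracer, closer := jaeger.NewTracer(
		"repro", jaeger.NewConstSampler(true), jaeger.NewNullReporter(),
	)
	defer closer.Close()

	// A carrier whose bytes were never produced by Inject: Extract reads
	// trace/span IDs from it and then a "baggage count" from whatever bytes
	// happen to follow, which can decode to a huge number.
	payload := bytes.NewReader([]byte("not a span context, just application data padding..."))
	_, err := tracer.Extract(opentracing.Binary, payload)
	// Before the fix this path could attempt a huge allocation; with a bound
	// in place it should fail with opentracing.ErrSpanContextCorrupted.
	fmt.Println(err)
}
```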

etix commented

Are you using Tracer().Inject() by any chance?

We just had this issue with one of our binaries where opentracing.SetGlobalTracer(tracer) was not called, making our span a noopSpan and therefore not injecting the proper binary data before our payload. When the payload was read by our worker using the Extract() function, Extract() was actually reading binary data from inside our payload, triggering a very large memory allocation and resulting in an OOM.
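A sketch of that pitfall on the producer side (names are illustrative; it assumes the worker extracts with a real Jaeger tracer):

```go
package producer

import (
	"bytes"

	opentracing "github.com/opentracing/opentracing-go"
)

// publish is supposed to prepend the span context to the payload. If
// opentracing.SetGlobalTracer was never called, GlobalTracer returns the
// no-op tracer, whose Inject writes nothing and returns nil.
func publish(payload []byte) []byte {
	tracer := opentracing.GlobalTracer()
	span := tracer.StartSpan("publish") // a noopSpan under the no-op tracer
	defer span.Finish()

	buf := new(bytes.Buffer)
	// With the no-op tracer this writes zero bytes, so the message starts
	// directly with the application payload; the worker's Extract() then
	// parses payload bytes as if they were a span context.
	_ = tracer.Inject(span.Context(), opentracing.Binary, buf)
	buf.Write(payload)
	return buf.Bytes()
}
```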

6ecuk commented

We have the same problem: nats-io/not.go#6

There is a similar issue on the server side when receiving spans: jaegertracing/jaeger#2638. We're going to use a patched version of Thrift on the server.

Since the client has explicit code for parsing the binary header, one simple fix is to do what Thrift did and just put an upper bound on the size param being read from the incoming request; values larger than the limit should be treated as an invalid header. The threshold could be configurable if necessary.
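A configurable variant might look roughly like this (the field and helper are hypothetical, not an existing jaeger-client-go API):

```go
// Hypothetical propagator-level knob for the baggage-count limit.
type binaryPropagator struct {
	maxBaggageEntries int32 // 0 means "use the default"
}

func (p *binaryPropagator) baggageCountValid(n int32) bool {
	limit := p.maxBaggageEntries
	if limit <= 0 {
		limit = 128 // assumed default, matching the limit discussed earlier
	}
	return n >= 0 && n <= limit
}
```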

etix commented

Having an upper bound would fix the out-of-memory, but it won't prevent the Extract() function from reading garbage and possibly creating random, unrelated traces.

6ecuk commented

#529 should fix your problem. Waiting for the version bump.

2.26 was released