mostafa/xk6-kafka

Error serializing value when using Avro Union schema

ehardy opened this issue · 2 comments

Hello,

We use Avro union schemas heavily in our environment and services. When running Kafka-based load tests with xk6-kafka, we get errors when serializing objects. I have seen issue 220, but the one we are experiencing seems slightly different.

When we try to serialize objects by calling SchemaRegistry.serialize() and reference a union schema, we essentially get the following error:

ERRO[0053] panic: runtime error: invalid memory address or nil pointer dereference
goroutine 112 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x64
go.k6.io/k6/js/common.RunWithPanicCatching.func1()
	go.k6.io/k6@v0.44.1/js/common/util.go:82 +0x1c0
panic({0x1049bbb80, 0x1058f0330})
	runtime/panic.go:890 +0x26c
github.com/dop251/goja.(*Runtime).runWrapped.func1()
	github.com/dop251/goja@v0.0.0-20230427124612-428fc442ff5f/runtime.go:2516 +0xe0
panic({0x1049bbb80, 0x1058f0330})
	runtime/panic.go:890 +0x26c
github.com/dop251/goja.(*vm).handleThrow(0x14001116ea0, {0x1049bbb80, 0x1058f0330})
	github.com/dop251/goja@v0.0.0-20230427124612-428fc442ff5f/vm.go:788 +0x560
github.com/dop251/goja.(*vm).try.func1()
	github.com/dop251/goja@v0.0.0-20230427124612-428fc442ff5f/vm.go:807 +0x58
panic({0x1049bbb80, 0x1058f0330})
	runtime/panic.go:890 +0x26c
github.com/dop251/goja.(*vm).handleThrow(0x14001116ea0, {0x1049bbb80, 0x1058f0330})
	github.com/dop251/goja@v0.0.0-20230427124612-428fc442ff5f/vm.go:788 +0x560
github.com/dop251/goja.(*vm).runTryInner.func1()
	github.com/dop251/goja@v0.0.0-20230427124612-428fc442ff5f/vm.go:830 +0x58
panic({0x1049bbb80, 0x1058f0330})
	runtime/panic.go:890 +0x26c
github.com/linkedin/goavro/v2.(*Codec).NativeFromTextual(0x0, {0x1400040e050, 0x42, 0x50})
	github.com/linkedin/goavro/v2@v2.12.0/codec.go:500 +0x50
github.com/mostafa/xk6-kafka.(*AvroSerde).Serialize(0x14000574a24, {0x1049adcc0, 0x1400121eb70}, 0x140012141c0)
	github.com/mostafa/xk6-kafka@v0.0.0-00010101000000-000000000000/avro.go:14 +0xdc
github.com/mostafa/xk6-kafka.(*Kafka).serialize(0x140013f71e0, 0x1400121ea50)
	github.com/mostafa/xk6-kafka@v0.0.0-00010101000000-000000000000/serdes.go:46 +0x210
github.com/mostafa/xk6-kafka.(*Kafka).schemaRegistryClientClass.func4({{0x104c21290, 0x14001174cf0}, {0x14000578500, 0x1, 0x1a}})

Debugging through the xk6-kafka and goavro code, it appears the union is correctly identified but the inner record type is not. It seems like xk6-kafka is not recursively identifying union members as records, and thus a nil codec is eventually returned, which results in the above panic.

You can reproduce the above error by registering the following schemas in the schema registry:

subject: test.ProductCreated

{
  "namespace": "test",
  "type": "record",
  "name": "ProductCreated",
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "productName",
      "type": "string"
    }
  ]
}

subject: products.default-value

[
  "test.ProductCreated"
]

And executing the following xk6-kafka test:

import {
    SCHEMA_TYPE_AVRO,
    SchemaRegistry,
} from 'k6/x/kafka';

const registry = new SchemaRegistry({
    url: __ENV.SCHEMA_REGISTRY || 'http://localhost:8081',
});

const subjectName = 'products.default-value';

export default function() {
    const schema = registry.getSchema({
        subject: subjectName,
    });

    const id = '1234567890';

    console.debug('serializing value');
    const value = registry.serialize({
        data: {
            id,
            productName: 'test',
        },
        schema: schema,
        schemaType: SCHEMA_TYPE_AVRO,
    });

    console.debug('value serialized successfully');
}

I'm thinking xk6-kafka should probably recursively load the schemas associated with union members and create Codec objects from those. I'm not entirely sure where the changes should go. Maybe that would not even be the right solution! I would be happy to submit a PR if I can get a couple of pointers.
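
To illustrate the idea, here is a minimal sketch of resolving union members by name and inlining their definitions, so that the union schema no longer depends on a type defined elsewhere. It is plain JavaScript and does not touch xk6-kafka internals. The helper name inlineUnionMembers and the definitions map are hypothetical, and whether xk6-kafka would accept the resulting schema text in place of the registered one is an assumption I have not verified.

```javascript
// Hypothetical helper (not part of xk6-kafka or goavro): replace named
// references in an Avro union with the full definitions they point to.
function inlineUnionMembers(unionSchema, definitions) {
    if (!Array.isArray(unionSchema)) {
        throw new TypeError('expected an Avro union (a JSON array)');
    }
    return unionSchema.map((member) => {
        // Members that are already inline definitions pass through unchanged.
        if (typeof member !== 'string') {
            return member;
        }
        // Names without a known definition (e.g. primitives such as
        // "null" or "string") are left as they are.
        return Object.prototype.hasOwnProperty.call(definitions, member)
            ? definitions[member]
            : member;
    });
}

// The record registered under subject test.ProductCreated.
const productCreated = {
    namespace: 'test',
    type: 'record',
    name: 'ProductCreated',
    fields: [
        { name: 'id', type: 'string' },
        { name: 'productName', type: 'string' },
    ],
};

// The union registered under subject products.default-value.
const selfContained = inlineUnionMembers(
    ['test.ProductCreated'],
    { 'test.ProductCreated': productCreated },
);

// Schema text in which the union carries its member's definition.
const schemaText = JSON.stringify(selfContained);
```

In a real fix, the definitions would presumably come from the schema registry rather than being hard-coded as they are here.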

Cheers

Hey @ehardy,

This seems to be a limitation of the goavro library, as raised here. Still, I suppose you can make it work using a combination of the schema subject and the data, roughly like this or something along the same lines:

const value = registry.serialize({
    data: {
        "test.ProductCreated": {
            id,
            productName: 'test',
        },
    },
    schema: schema,
    schemaType: SCHEMA_TYPE_AVRO,
});
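
The wrapping follows the Avro JSON encoding for unions, where a non-null value is written as an object with a single key naming the member type. As a minimal sketch, it could be factored into a small helper. The name wrapUnionValue is hypothetical and not part of xk6-kafka. Note that this only shapes the data; it does not by itself make the referenced type name resolvable.

```javascript
// Hypothetical helper: wrap a value the way Avro's JSON encoding
// represents a union branch, i.e. { "<type name>": value }.
function wrapUnionValue(typeName, value) {
    // The null branch of a union is encoded as a bare null.
    if (value === null) {
        return null;
    }
    return { [typeName]: value };
}

const id = '1234567890';
const wrapped = wrapUnionValue('test.ProductCreated', {
    id,
    productName: 'test',
});
```

The result would then be passed as the data field of registry.serialize(), as in the snippet above.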

Hello @mostafa ,

Thanks for the quick reply. I believe I had already tried that approach. In any case, I just tried it again, and it fails for the same reason (a panic caused by the nil pointer). Debugging through the code, the error message I get is the following (same as before):

Union item 1 ought to be valid Avro type: unknown type name: "test.ProductCreated"

Given that the goavro question you referred to is more than 5 years old and hasn't been addressed yet, I guess the chances of seeing it addressed are not that high. In any case, we have worked around the problem, but wanted to see if there was a way to fix it.

Thanks!