redpanda-data/kminion

Kafka permission for kminion

duyhieuvo opened this issue · 2 comments

Hello,

Could you help clarify the right set of permissions for KMinion on a secured Kafka cluster? We run Confluent Platform and manage permissions with Confluent predefined roles (https://docs.confluent.io/platform/current/security/rbac/rbac-predefined-roles.html#role-based-access-control-predefined-roles). KMinion didn't work for us in either mode, even after trying different sets of permissions:

In AdminApi mode, after inspecting the Kafka logs, we saw that KMinion tried to describe the consumer groups and the cluster, so we gave it the following permissions:

  • DeveloperRead on all consumer groups
  • Operator on the cluster

but then the KMinion pod crashed with the following log:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7de84d]

goroutine 2447 [running]:
github.com/twmb/franz-go/pkg/kgo.(*describeGroupsSharder).shard(0x400?, {0xcf0750?, 0xc0001b2c80?}, {0xcf4828?, 0xc00040a9f0}, {0x7f5da46d4fff?, 0x429ea5?})
/go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:2544 +0xa2d
github.com/twmb/franz-go/pkg/kgo.(*Client).handleShardedReq.func2({0x0, {0xcf4828, 0xc00040a9f0}, {0x0, 0x0}})
/go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:1730 +0x151
github.com/twmb/franz-go/pkg/kgo.(*Client).handleShardedReq(0xc0002ba000, {0xcf0750?, 0xc0001b2c80}, {0xcf4828?, 0xc00040a9f0})
/go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:1821 +0x9f9
github.com/twmb/franz-go/pkg/kgo.(*Client).shardedRequest(0xc0002ba000, {0xcf07f8?, 0xc000311410?}, {0xcf4828?, 0xc00040a9f0})
/go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:978 +0x691
github.com/twmb/franz-go/pkg/kgo.(*Client).RequestSharded(0xc0004b2000?, {0xcf07f8?, 0xc000311410?}, {0xcf4828?, 0xc00040a9f0?})
/go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:909 +0x3a
github.com/cloudhut/kminion/v2/minion.(*Service).DescribeConsumerGroups(0xc0004b2000, {0xcf07f8, 0xc000311410})
/app/minion/describe_consumer_groups.go:70 +0x15b
github.com/cloudhut/kminion/v2/prometheus.(*Exporter).collectConsumerGroups(0xc000540000, {0xcf07f8?, 0xc000311410?}, 0xcea010?)
/app/prometheus/collect_consumer_groups.go:18 +0x5e
github.com/cloudhut/kminion/v2/prometheus.(*Exporter).Collect(0xc000540000, 0xc00022a760?)
/app/prometheus/exporter.go:235 +0x205
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:456 +0x10d
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:548 +0xbac

In OffsetsTopic mode, we gave it permission to consume from the __consumer_offsets topic and to describe its configuration, but the consumer group lag in the metrics does not seem correct: it always shows 0 lag even though some groups are lagging.

It would be nice to have a summary of the permissions KMinion requires on the Kafka cluster.
Thank you

For me these ACLs work:

ACLs for principal `User:kminion`
Current ACLs for resource `ResourcePattern(resourceType=CLUSTER, name=kafka-cluster, patternType=LITERAL)`:
	(principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
	(principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW)

Current ACLs for resource `ResourcePattern(resourceType=GROUP, name=*, patternType=LITERAL)`:
	(principal=User:kminion, host=*, operation=READ, permissionType=ALLOW)

Current ACLs for resource `ResourcePattern(resourceType=TOPIC, name=*, patternType=LITERAL)`:
	(principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
	(principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW)

Current ACLs for resource `ResourcePattern(resourceType=TOPIC, name=__consumer_offsets, patternType=LITERAL)`:
	(principal=User:kminion, host=*, operation=READ, permissionType=ALLOW)
	(principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
	(principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW)
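
In case it helps anyone reproduce this, here is a rough sketch that turns the listing above into `kafka-acls` command lines. It only prints the commands; it does not run them. The bootstrap address `localhost:9092` is a placeholder I made up, and on a secured cluster you would most likely also need to pass client credentials (for example via `--command-config`), which is left out here. This covers plain Kafka ACLs only, so it may not map directly onto Confluent RBAC role bindings.

```python
import shlex

# ACLs copied from the listing above: (resource flag, resource name, operations).
# A resource name of None means the flag takes no value (--cluster).
KMINION_ACLS = [
    ("--cluster", None, ["Describe", "DescribeConfigs"]),
    ("--group", "*", ["Read"]),
    ("--topic", "*", ["Describe", "DescribeConfigs"]),
    ("--topic", "__consumer_offsets", ["Read", "Describe", "DescribeConfigs"]),
]


def acl_commands(principal="User:kminion", bootstrap="localhost:9092"):
    """Return one kafka-acls command line per resource in KMINION_ACLS."""
    commands = []
    for flag, name, operations in KMINION_ACLS:
        args = ["kafka-acls", "--bootstrap-server", bootstrap, "--add",
                "--allow-principal", principal]
        for op in operations:
            args += ["--operation", op]
        args.append(flag)
        if name is not None:
            args.append(name)
        # shlex.join quotes "*" so the shell does not expand it.
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    for command in acl_commands():
        print(command)
```

Review the printed commands before running them; granting on `*` is broad, and you may prefer to narrow the group and topic names to what your KMinion configuration actually scrapes.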

The required ACLs depend heavily on your KMinion configuration, so it's hard to provide general guidance. The panic you posted should never happen, but I believe it has already been fixed in franz-go.

I believe Confluent hides the consumer offsets topic in many of their product offerings, so that configuration may not be an option for you. In that case you must use the Kafka API scrape mode, which needs permission to run the DescribeGroups Kafka API request. That requires the following ACLs:

  • Describe on Cluster (for ListGroups)
  • Describe on all Groups (for DescribeGroups, FindCoordinator)

For example, if you configure Console to also delete its previously created groups, you also need to add Delete on Group, scoped to your configured group prefix if you want to constrain it further.