cetic/helm-nifi

Unable to locate initial admin - OIDC with Nifi Cluster Mode

kamniphat01 opened this issue · 4 comments

### Unable to bring up NiFi cluster mode with OIDC integration

  • Have Existing Cert-Manager v1.5.3 (Let's Encrypt)
  • Currently using OIDC with Cluster Mode
  • Nifi Image version 1.19.1
  • Pull nifi release v1.1.3
  • Kong gateway/proxy version 3.2
  • AzureAD

My value.yaml (note: only the relevant values are shown, not the full file)

replicaCount: 2

properties:
  externalSecure: false
  isNode: true
  httpsPort: 8443 
  webProxyHost: nifi-cluster.domain.name:443 

service:
  type: ClusterIP
  httpsPort: 8443
  # nodePort: 30236
  annotations: 
    kubernetes.io/ingress.class: kong-nginx
    konghq.com/protocol: "https"

ingress:
  enabled: true
  className: kong-nginx
  annotations: 
    cert-manager.io/cluster-issuer: letsencrypt-xxx
    konghq.com/strip-path: "false"
    konghq.com/plugins: session
    konghq.com/session-affinity-cookie: my_session_cookie
    konghq.com/protocols: "http,https"  # Enable both HTTP and HTTPS
    konghq.com/session-affinity: "true"  # Enable session affinity
  tls: 
    - secretName: nifi-certificate
      hosts:
        - nifi-cluster.domain.name
  hosts: 
    - nifi-cluster.domain.name
  path: /

My authorizers.xml (note: my CoreDNS uses cluster.devops as the cluster domain)

{{- $replicas := int .Values.replicaCount }}
{{- $chart := .Chart.Name }}
{{- $release := .Release.Name }}
{{- $fullname := include "apache-nifi.fullname" . }}
{{- $namespace := .Release.Namespace }}
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at
      http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<!--
    This file lists the userGroupProviders, accessPolicyProviders, and authorizers to use when running securely. In order
    to use a specific authorizer it must be configured here and it's identifier must be specified in the nifi.properties file.
    If the authorizer is a managedAuthorizer, it may need to be configured with an accessPolicyProvider and an userGroupProvider.
    This file allows for configuration of them, but they must be configured in order:
    ...
    all userGroupProviders
    all accessPolicyProviders
    all Authorizers
    ...
-->

<authorizers>
    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./auth-conf/users.xml</property>
        <property name="Legacy Authorized Users File"></property>
        {{- range $i := until $replicas }}
        <property name="Initial User Identity {{ $i }}">CN={{ $fullname }}-{{ $i }}.{{ $fullname }}-headless.{{ $namespace }}.svc.cluster.devops, OU=NIFI</property>
        {{- end }}
    </userGroupProvider>

    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">file-user-group-provider</property>
        <property name="Authorizations File">./auth-conf/authorizations.xml</property>
        <property name="Node Identity"></property>
    </accessPolicyProvider>

    <authorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">file-access-policy-provider</property>
    </authorizer>

    {{- if .Values.auth.oidc.enabled}}
    <userGroupProvider>
        <identifier>aad-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.azure.AzureGraphUserGroupProvider</class>
        <property name="Refresh Delay">1 mins</property>
        <property name="Authority Endpoint">https://login.microsoftonline.com</property>
        <property name="Directory ID">{{.Values.auth.oidc.tenantId}}</property>
        <property name="Application ID">{{.Values.auth.oidc.clientId}}</property>
        <property name="Client Secret">{{.Values.auth.oidc.clientSecret}}</property>
        <property name="Group Filter Prefix">aad-nifi</property>
        <property name="Page Size">100</property>
    </userGroupProvider>

    <userGroupProvider>
        <identifier>composite-configurable-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.CompositeConfigurableUserGroupProvider</class>
        <property name="Configurable User Group Provider">file-user-group-provider</property>
        <property name="User Group Provider 1">aad-user-group-provider</property>
    </userGroupProvider>

    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">composite-configurable-user-group-provider</property>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Initial Admin Identity">{{.Values.auth.oidc.admin}}</property>
        <property name="Legacy Authorized Users File"></property>
        <property name="Node Identity 1"></property>
    </accessPolicyProvider>

    <authorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">aad-user-group-provider</property>
    </authorizer>
    {{- end}}
</authorizers>
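
Reading the template above with `auth.oidc.enabled` set, two things stand out: the identifiers `file-access-policy-provider` and `managed-authorizer` are each declared twice, and the second `managed-authorizer` names `aad-user-group-provider` (a userGroupProvider) as its Access Policy Provider. The Python sketch below shows one way to flag both patterns in a rendered authorizers.xml. The helper name `check_authorizers` and the trimmed `SAMPLE` document are made up for illustration, and whether either finding explains the error reported here is not established in this thread.

```python
import xml.etree.ElementTree as ET

# Property names that reference another provider by identifier, mapped to
# the element type the referenced identifier is expected to belong to.
REFERENCE_KINDS = {
    "Access Policy Provider": "accessPolicyProvider",
    "User Group Provider": "userGroupProvider",
    "Configurable User Group Provider": "userGroupProvider",
}

def check_authorizers(xml_text: str) -> list[str]:
    """Return human-readable problems found in a rendered authorizers.xml."""
    root = ET.fromstring(xml_text)
    problems = []

    # Map each identifier to the tags of all elements that declare it.
    declared = {}
    for elem in root:
        ident = (elem.findtext("identifier") or "").strip()
        declared.setdefault(ident, []).append(elem.tag)

    for ident, tags in declared.items():
        if len(tags) > 1:
            problems.append(f"identifier '{ident}' is declared {len(tags)} times")

    for elem in root:
        owner = (elem.findtext("identifier") or "").strip()
        for prop in elem.findall("property"):
            name = prop.get("name", "")
            # Drop a trailing index such as the "1" in "User Group Provider 1".
            base = name.rstrip("0123456789 ")
            expected = REFERENCE_KINDS.get(base)
            target = (prop.text or "").strip()
            if expected is None or not target:
                continue
            if expected not in declared.get(target, []):
                problems.append(
                    f"'{owner}': property '{name}' points at '{target}', "
                    f"which is not declared as {expected}"
                )
    return problems

# Trimmed, hypothetical stand-in for the rendered file (illustration only).
SAMPLE = """
<authorizers>
  <userGroupProvider><identifier>file-user-group-provider</identifier></userGroupProvider>
  <accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <property name="User Group Provider">file-user-group-provider</property>
  </accessPolicyProvider>
  <authorizer>
    <identifier>managed-authorizer</identifier>
    <property name="Access Policy Provider">file-access-policy-provider</property>
  </authorizer>
  <userGroupProvider><identifier>aad-user-group-provider</identifier></userGroupProvider>
  <authorizer>
    <identifier>managed-authorizer</identifier>
    <property name="Access Policy Provider">aad-user-group-provider</property>
  </authorizer>
</authorizers>
"""

for problem in check_authorizers(SAMPLE):
    print(problem)
```

The check only works on the rendered file (for example the output of `helm template`), since the `{{ ... }}` directives in the source template are not valid XML.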

The following error occurred and the NiFi pods went into a crash loop:

Caused by: org.apache.nifi.authorization.exception.AuthorizerCreationException: org.apache.nifi.authorization.exception.AuthorizerCreationException: Unable to locate initial admin niphat@azure.com to seed policies

org.apache.nifi.NiFi Application Server shutdown started

Expected Result

  • Able to login in cluster mode with oidc integration

I have been stuck on this for 2 weeks and have been unable to solve it. I would appreciate it if someone could help me find where the configuration needs to be corrected.

banzo commented

The chart currently supports Nifi 1.16.3

Maybe you can have a look at #280 for 1.19 support.

Hi @banzo, thanks for the info. I will have a look and investigate further.

In my latest test case:
With NiFi 1.16.3 or 1.19.1 and one replica, OIDC works, but when I set the replica count to two it hits the "Unable to locate initial admin" error.

You're not doing anything wrong; it is the StatefulSet Helm config.
This is due to the odd way the Initial User Identity property name is tied to the replica count.

https://github.com/cetic/helm-nifi/blob/db835032b6e860a2c7a84bbc9ca3ddb74f270453/templates/statefulset.yaml#LL158C27-L158C85

--value "Initial User Identity {{ .Values.replicaCount }}" 

I noticed this just before coming to the Issues. Perhaps this should just be hardcoded to the number '1' and not linked to replicaCount?
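
To make the numbering concrete, here is a small Python sketch of the property names involved for a given replicaCount. It assumes that Sprig's `until n` yields 0 through n-1 (as used in the authorizers.xml template in the original post) and that the quoted `--value` fragment is rendered verbatim. The function names are made up for illustration, and the rest of the statefulset.yaml command is not modelled.

```python
def node_identity_names(replica_count: int) -> list[str]:
    # `range $i := until $replicas` emits one property per pod ordinal: 0..n-1.
    return [f"Initial User Identity {i}" for i in range(replica_count)]

def statefulset_identity_name(replica_count: int) -> str:
    # The quoted fragment interpolates replicaCount itself into the name.
    return f"Initial User Identity {replica_count}"

for n in (1, 2, 3):
    print(n, node_identity_names(n), statefulset_identity_name(n))
```

Under these assumptions, the name written by the statefulset changes whenever replicaCount changes, whereas hardcoding it to '1' would keep it fixed.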

I tested hardcoding it, but it did not work and I hit the same issue. Is yours working fine in cluster mode with OIDC integration?

I have tested a few cases:

Singlelogon

  1. Standalone = working
  2. Cluster-mode = working

OIDC

  1. Standalone (Replica 1) = working
  2. Cluster-mode (more than 1 replica) = Not working