python-zk/kazoo

Bug in ZooKeeper Kerberos auth

tianchao-haohan opened this issue · 5 comments

Expected Behavior

Kerberos auth should pass.

Actual Behavior

Auth failed.

Snippet to Reproduce the Problem

from kazoo.client import KazooClient

client = KazooClient("10.8.8.6:24000,10.8.8.7:24000",
                     sasl_options={"mechanism": "GSSAPI", "service": "zookeeper"})
client.start()

Logs with logging in DEBUG mode

AUTH_FAILED closing: Unknown error: (('Unspecified GSS failure. Minor code may provide more information', 851968), ('Clock skew too great', -1765328347))

Specifications

  • Kazoo version: 2.8.0
  • Python version: 3.6
  • Zookeeper version:
  • OS: CentOS 7

Steps:
1. Log in to Kerberos with kinit -kt keytab_file principal
2. Check the Kerberos auth status with klist; it is OK
3. Run the snippet above

The root cause I found is that the service name sent to Kerberos is "zookeeper@10.8.8.6" (from pure_sasl-0.6.2-py3.6.egg/puresasl/mechanisms.py):

class GSSAPIMechanism(Mechanism):
    name = 'GSSAPI'
    score = 100
    qops = QOP.all

    allows_anonymous = False
    uses_plaintext = False
    active_safe = True

    def __init__(self, sasl, principal=None, **props):
        Mechanism.__init__(self, sasl)
        self.user = None
        self._have_negotiated_details = False
        self.host = self.sasl.host
        self.service = self.sasl.service
        self.principal = principal
        self._fetch_properties('host', 'service')

        krb_service = '@'.join((self.service, self.host))  # self.host here should be the server hostname, not its IP

Actually, the correct krb_service should be "zookeeper/hostName@Realm" or "zookeeper@hostName.Realm", for example zookeeper@hadoop.hadoop.com.

KazooClient should take the serverFQDN as input and propagate it to puresasl, where serverFQDN is the fully qualified domain name of the server (e.g. "serverhost.example.com").
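
As an illustration (not part of the proposed fix), one way a client could derive that FQDN from the address in the connect string is a reverse DNS lookup; the address and resulting name below are only examples taken from this report:

import socket

# Resolve the server address from the connect string back to a hostname.
# socket.getfqdn() returns the input unchanged if no reverse DNS entry exists.
server_ip = "10.8.8.6"
server_fqdn = socket.getfqdn(server_ip)             # e.g. "hw-ker-node1.hadoop.com"
krb_service = "@".join(("zookeeper", server_fqdn))  # "zookeeper@hw-ker-node1.hadoop.com"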

Workaround:

  1. Pass the service to KazooClient as "zookeeper@serverFQDN" (see the sketch after this list)
  2. Change the code:
class GSSAPIMechanism(Mechanism):
    name = 'GSSAPI'
    score = 100
    qops = QOP.all

    allows_anonymous = False
    uses_plaintext = False
    active_safe = True

    def __init__(self, sasl, principal=None, **props):
        Mechanism.__init__(self, sasl)
        self.user = None
        self._have_negotiated_details = False
        self.host = self.sasl.host
        self.service = self.sasl.service
        self.principal = principal
        self._fetch_properties('host', 'service')

        # Use the service as-is when it already embeds the target host
        # ("zookeeper@serverFQDN"); otherwise fall back to the original
        # behaviour of appending self.host.
        krb_service = self.service
        if krb_service.find('@') <= 0:
            krb_service = '@'.join((self.service, self.host))

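For completeness, a minimal sketch of what workaround 1 looks like on the client side, assuming the patched GSSAPIMechanism above is in place and using a hypothetical server FQDN:

from kazoo.client import KazooClient

# Embed the server FQDN in the service name so the patched GSSAPIMechanism
# uses it verbatim instead of appending the IP taken from the connect string.
client = KazooClient(
    "10.8.8.6:24000,10.8.8.7:24000",
    sasl_options={"mechanism": "GSSAPI",
                  "service": "zookeeper@hw-ker-node1.hadoop.com"})
client.start()
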
  • Client / server respective hostnames and IPs
    • client hostname: localhost.localdomain
    • client IP: 10.87.9.15
    • server hostname: hw-ker-node1
    • server IP: 10.8.8.6
  • Kerberos principals as returned by klist on both client and server
    • client:
    [c4dev@localhost ~]$ klist
    Ticket cache: KEYRING:persistent:1000:1000
    Default principal: hiveuser@HADOOP.COM
    
    Valid starting       Expires              Service principal
    01/25/2021 14:33:42  01/26/2021 14:33:23  krbtgt/HADOOP.COM@HADOOP.COM
        renew until 01/25/2021 15:03:42
    
    
    • server: empty
  • ZooKeeper server JAAS config, if possible
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=false
useTicketCache=true
debug=false;
};

@ceache hi, is there any update on this issue?