dask/hdfs3

Mysterious error on a Kerberos-enabled cluster

superbobry opened this issue · 5 comments

Hello,

I'm getting the following error with hdfs3 0.3.0 on CDH5:

HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop python -c "import hdfs3; hdfs3.HDFileSystem()"
2018-02-13 22:05:25.435199, p47628, th139670155556672, ERROR Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020":
RpcChannel.cpp: 483: HdfsRpcException: Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020"
	@	Hdfs::Internal::RpcChannelImpl::invokeInternal(std::shared_ptr<Hdfs::Internal::RpcRemoteCall>)
	@	Hdfs::Internal::RpcChannelImpl::invoke(Hdfs::Internal::RpcCall const&)
	@	Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
	@	Hdfs::Internal::NamenodeImpl::getFsStats()
	@	Hdfs::Internal::NamenodeProxy::getFsStats()
	@	Hdfs::Internal::FileSystemImpl::getFsStats()
	@	Hdfs::Internal::FileSystemImpl::connect()
	@	Hdfs::FileSystem::connect(char const*, char const*, char const*)
    ...
ConnectionError: Connection Failed: HdfsRpcException: Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020"	Caused by: HdfsRpcServerException: org.apache.hadoop.security.authorize.AuthorizationException: User: s.lebedev@CRITEOIS.LAN is not allowed to impersonate H����H����ty��t1H�H��H��H�t>H��H��H�59�

The binary gibberish in the error is surprisingly stable. Could you give any pointers on how I can debug the root cause of this?

I can tell you what it means: the user indicated (your Kerberos principal, I suppose) is not a proxy user. That is not a surprise; usually only cluster services like HDFS and YARN are proxy users. I cannot tell you why this is happening, though. A workaround would be to make that user a proxy user (by adding hadoop.proxyuser.* entries to your core-site.xml), but I very much doubt you want to do that.
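For illustration only, such entries look roughly like this in core-site.xml; USERNAME stands for the principal's short name, and the wildcard values are placeholders, not a recommendation:

<property>
  <!-- hosts the proxy user may submit requests from -->
  <name>hadoop.proxyuser.USERNAME.hosts</name>
  <value>*</value>
</property>
<property>
  <!-- groups whose members the proxy user may impersonate -->
  <name>hadoop.proxyuser.USERNAME.groups</name>
  <value>*</value>
</property>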

For fun, you could try the following package: https://anaconda.org/mdurant/libhdfs3/2.3/download/linux-64/libhdfs3-2.3-0.tar.bz2
Download it, run conda install libhdfs3-2.3-0.tar.bz2, and try again. Just for fun.
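Roughly, assuming wget is available (the file name comes from the link above):

wget https://anaconda.org/mdurant/libhdfs3/2.3/download/linux-64/libhdfs3-2.3-0.tar.bz2
conda install libhdfs3-2.3-0.tar.bz2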

I've tried the version above and it seems to fail with the same error message. Looking for possible causes...

I've reinstalled libhdfs3/hdfs3 from conda-forge into a fresh conda environment and everything just works now. Closing the issue.
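For the record, the fresh environment was set up roughly like this (the environment name is arbitrary):

conda create -n hdfs3-fresh -c conda-forge hdfs3 libhdfs3
source activate hdfs3-fresh
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop python -c "import hdfs3; hdfs3.HDFileSystem()"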

Hurray!
How odd...