Kubectl.apply fails while GenericKubernetesApi works for ClusterRole
agustinventura opened this issue · 16 comments
Describe the bug
When creating a ClusterRole, Kubectl.apply returns an error, while GenericKubernetesApi creates it successfully.
Client Version
18.0.0 - 21.0.0-legacy
Kubernetes Version
1.27 (OpenShift 4.14.16)
Java Version
Java 21
To Reproduce
Given a ClusterRole in a clusterrole.yaml file:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: test-controller-admin
rules:
  - apiGroups:
      - '*'
    resources:
      - '*'
    verbs:
      - '*'
  - nonResourceURLs:
      - '*'
    verbs:
      - '*'
And a kubeconfig for an OpenShift 4.14.16 cluster in a kubeconfig file, load both files and try to apply the ClusterRole:
public void createObjectsInCluster() throws URISyntaxException, IOException, KubectlException {
    String kubeconfig = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("kubeconfig").toURI())));
    String ensManifest = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("clusterrole.yaml").toURI())));
    createObjects(kubeconfig, ensManifest);
}

public void createObjects(String kubeconfig, String manifest) throws IOException, KubectlException {
    KubeConfig kc = KubeConfig.loadKubeConfig(new StringReader(kubeconfig));
    ApiClient apiClient = Config.fromConfig(kc);
    apiClient.setVerifyingSsl(false);
    List<Object> objects = Yaml.loadAll(manifest);
    for (Object object : objects) {
        final KubernetesObject kubernetesObject = (KubernetesObject) object;
        Class<KubernetesObject> objectClass = (Class<KubernetesObject>) kubernetesObject.getClass();
        Kubectl.apply(objectClass).apiClient(apiClient).resource(kubernetesObject).execute();
    }
}
This will return the following error:
io.kubernetes.client.extended.kubectl.exception.KubectlException: io.kubernetes.client.openapi.ApiException: class V1Status {
    apiVersion: v1
    code: 404
    details: class V1StatusDetails {
        causes: null
        group: authorization.openshift.io
        kind: clusterroles
        name: test-controller-admin
        retryAfterSeconds: null
        uid: null
    }
    kind: Status
    message: clusterroles.authorization.openshift.io "test-controller-admin" not found
    metadata: class V1ListMeta {
        _continue: null
        remainingItemCount: null
        resourceVersion: null
        selfLink: null
    }
    reason: NotFound
    status: Failure
}
However, if the ClusterRole is created using GenericKubernetesApi, it succeeds:
public void createObjectsInCluster() throws URISyntaxException, IOException, KubectlException {
    String kubeconfig = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("kubeconfig").toURI())));
    String ensManifest = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("clusterrole.yaml").toURI())));
    createObjects(kubeconfig, ensManifest);
}

public void createObjects(String kubeconfig, String manifest) throws IOException {
    KubeConfig kc = KubeConfig.loadKubeConfig(new StringReader(kubeconfig));
    ApiClient apiClient = Config.fromConfig(kc);
    apiClient.setVerifyingSsl(false);
    List<Object> objects = Yaml.loadAll(manifest);
    for (Object object : objects) {
        final V1ClusterRole clusterRole = (V1ClusterRole) object;
        GenericKubernetesApi<V1ClusterRole, V1ClusterRoleList> clusterRoleClient =
            new GenericKubernetesApi<>(V1ClusterRole.class, V1ClusterRoleList.class, "rbac.authorization.k8s.io", "v1", "clusterroles",
                apiClient);
        KubernetesApiResponse<V1ClusterRole> createClusterRoleResponse = clusterRoleClient.create(clusterRole);
        if (createClusterRoleResponse.getStatus() != null && !createClusterRoleResponse.isSuccess()) {
            if (createClusterRoleResponse.getHttpStatusCode() != 409) {
                log.error("Error creating k8s object {}: {}", clusterRole, createClusterRoleResponse.getStatus());
            } else {
                log.info("k8s object {} already exists", clusterRole);
            }
        }
    }
}
Applying clusterrole.yaml with the kubectl CLI succeeds too:
kubectl --kubeconfig kubeconfig apply -f clusterrole.yaml
Expected behavior
Kubectl.apply should create the ClusterRole.
Server (please complete the following information):
- OS: Linux
- Cloud: OpenShift 4.14.16 on Azure
Additional context
This looks like an odd interaction with OpenShift: we hit it when upgrading from 4.12.25 to 4.14.16, and we have not seen this kind of issue on other platforms such as EKS. The error reports the group authorization.openshift.io instead of rbac.authorization.k8s.io, which points to some OpenShift-specific logic. Still, it is strange that both the kubectl CLI and GenericKubernetesApi work, and that it only fails when using Kubectl.apply from the Java client.
We also tried to add the group in the error to ModelMapper:
ModelMapper.addModelMap("authorization.openshift.io", "v1", "ClusterRole", "clusterroles", false, V1ClusterRole.class);
But it returns the same error.
It also fails for RoleBinding objects, but not for Role or ClusterRoleBinding objects.
You're getting a 404 on the apply. Does this object exist currently in the cluster? Or are you trying to create it for the first time?
My first guess is that Kubectl.Apply doesn't handle object creation correctly; there's probably special-purpose code in the kubectl CLI to handle that case.
Hi Brendan, this is the first time I'm creating it.
Also, I see clusterroles.authorization.openshift.io in the 404 error message, but from your YAML it looks like you are trying to create a rbac.authorization.k8s.io/v1 resource. Is it possible there is a typo in your YAML somewhere?
When you say that it works for Role, does it successfully create new Role resources that didn't previously exist?
The Apply code is just calling ServerSide apply, so we may need to special case that code if it returns a 404.
(sorry for lots of little questions :)
Can you try kubectl apply --server-side ... and see if it works?
No problem Brendan, I've already tried lots of them.
There's no typo in the YAML. I've tried creating a bunch of objects together in one YAML file, independent objects in separate YAML files, and even building the objects in Java instead of reading them from YAML. The result has always been the 404 error, with the group changed from rbac.authorization.k8s.io to clusterroles.authorization.openshift.io.
I can create previously nonexistent Role or ClusterRoleBinding resources. I always test with a newly provisioned cluster, as that is our current use case.
I noticed last week that Apply is using --server-side, so I tested it and it works, creates the ClusterRole. I didn't mention it because I didn't consider it interesting, sorry for any inconvenience.
Ok, thanks for the details, this is odd. But I think I may have figured it out. We only supply the type, not the list type, when we get the generic API in KubectlApply, and then we guess at the list type via the class loader:
My guess is that the class for the openshift ClusterRole is being discovered before the class for the standard ClusterRole and that is confusing things.
If you are willing, it would be super useful if you could recompile the library with some logging around that code location to verify that's what's going on.
If not I can try to add some defensive code around that location that would fix the problem.
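To make the "guess at the list type via the class loader" idea concrete, here is a minimal sketch of one way such a guess can work: append "List" to the item class name and ask the class loader for it. This is an assumption about the general technique, not a copy of the library's code; Widget, WidgetList, and guessListType are hypothetical names used only for illustration.

```java
public class ListTypeGuessDemo {
    // Hypothetical model classes standing in for an item type and its list type
    // (think V1ClusterRole and V1ClusterRoleList).
    static class Widget {}
    static class WidgetList {}

    // Guess the list class by appending "List" to the item class's binary name
    // and loading that name through the item class's own class loader.
    static Class<?> guessListType(Class<?> itemType) {
        String listName = itemType.getName() + "List";
        try {
            return Class.forName(listName, true, itemType.getClassLoader());
        } catch (ClassNotFoundException e) {
            // The guess failed: no class with the derived name is visible.
            throw new IllegalArgumentException("No list type found for " + itemType.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(guessListType(Widget.class).getSimpleName()); // prints "WidgetList"
    }
}
```

A name-based guess like this only depends on the item class, so it would not by itself pick up a different API group; that is consistent with the later finding that the problem sits in the group/version/resource lookup instead.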
Hi Brendan, I'm happy to help any way I can.
I've added the logging, but discovered that the execution path does not go through the getGenericApi method at line 246. It goes through the one at line 269, with the following parameters:
apiTypeClass = class io.kubernetes.client.openapi.models.V1ClusterRole
apiListTypeClass = interface io.kubernetes.client.common.KubernetesListObject
The resolved groupVersionResource seems to be the problem, containing:
resource = clusterroles
group = authorization.openshift.io
version = v1
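The link between this resolved group and the 404 can be illustrated with a small sketch. Kubernetes serves resources in named API groups under /apis/{group}/{version}/{resource}, so a request built from the wrong group targets a different endpoint. The helper below is hypothetical and only assumes that the client derives the request path from the resolved group, version, and resource.

```java
public class ResourcePathDemo {
    // Builds the REST path of a cluster-scoped object in a named API group,
    // following the standard /apis/{group}/{version}/{resource}/{name} layout.
    static String clusterScopedPath(String group, String version, String resource, String name) {
        return "/apis/" + group + "/" + version + "/" + resource + "/" + name;
    }

    public static void main(String[] args) {
        // Path matching the apiVersion in clusterrole.yaml.
        System.out.println(clusterScopedPath(
            "rbac.authorization.k8s.io", "v1", "clusterroles", "test-controller-admin"));
        // Path that results from the group reported in the 404 error.
        System.out.println(clusterScopedPath(
            "authorization.openshift.io", "v1", "clusterroles", "test-controller-admin"));
    }
}
```

The second path matches the group and resource named in the error message (clusterroles.authorization.openshift.io), which is what one would expect if the apply request was sent to that group.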
From what I can see, in ModelMapper's classesByGVR the kvMap has two entries for ClusterRole, one with key rbac.authorization.k8s.io and one with authorization.openshift.io. When the vkMap gets built, only the authorization.openshift.io one is left. The same happens with RoleBinding.
Incidentally, the same happens with ClusterRoleBinding and Role, but there we are lucky enough that rbac.authorization.k8s.io is the one that ends up in the vkMap.
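The collapse described here is what happens whenever a key-to-value map with duplicate values is inverted: only one key per value can survive. The sketch below reproduces that effect with plain Java collections. Gvr, the string class names, and both invert methods are hypothetical stand-ins, and the registration order is an assumption chosen to match the observed result; it is not taken from ModelMapper itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BiMapCollapseDemo {
    // Hypothetical stand-in for a group/version/resource key.
    record Gvr(String group, String version, String resource) {}

    // Naive inversion: a later key silently overwrites an earlier key
    // that maps to the same value.
    static <K, V> Map<V, K> invertLastWins(Map<K, V> forward) {
        Map<V, K> reverse = new LinkedHashMap<>();
        forward.forEach((k, v) -> reverse.put(v, k));
        return reverse;
    }

    // Defensive inversion: keep the first key registered for each value.
    static <K, V> Map<V, K> invertFirstWins(Map<K, V> forward) {
        Map<V, K> reverse = new LinkedHashMap<>();
        forward.forEach((k, v) -> reverse.putIfAbsent(v, k));
        return reverse;
    }

    // Two registrations for the same model class, in an assumed order.
    static Map<Gvr, String> sampleRegistrations() {
        Map<Gvr, String> forward = new LinkedHashMap<>();
        forward.put(new Gvr("rbac.authorization.k8s.io", "v1", "clusterroles"), "V1ClusterRole");
        forward.put(new Gvr("authorization.openshift.io", "v1", "clusterroles"), "V1ClusterRole");
        return forward;
    }

    public static void main(String[] args) {
        Map<Gvr, String> forward = sampleRegistrations();
        // prints "authorization.openshift.io"
        System.out.println(invertLastWins(forward).get("V1ClusterRole").group());
        // prints "rbac.authorization.k8s.io"
        System.out.println(invertFirstWins(forward).get("V1ClusterRole").group());
    }
}
```

With a last-wins inversion, whichever group is registered last decides where every object of that class is sent, which would explain why ClusterRole and RoleBinding fail while Role and ClusterRoleBinding happen to work. A first-wins or explicitly prioritized inversion is one possible direction for a fix, not necessarily the one the library should adopt.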
Any thoughts on how I can address this? Thanks a lot for your help.
Ok, there must be some issue in the map key comparison that is causing a problem. I can try to reproduce this when I get some time, but if you have time to explore and send a PR that'd be great too.
I'd love to send a PR. I'm adding a test to reproduce the bug and will try to fix it.