After the upgrade from v4.0.19 to v5.0.4 the authentication with AzureAD stopped working for some users
Closed this issue · 23 comments
Describe the bug
After the upgrade from v4.0.19 to v5.0.4, authentication with AzureAD stopped working for some users. The web error for these users says "Identity Provider Login failed, no token found", and the pod's console shows "error: invalid algorithm .. error: Failed to obtain access token". It seems to be related to the length of the token, since for the users who can log in the callback URL seems to be shorter (not 100% sure). AF is running as 2 pods inside a k8s cluster and uses MySQL 8.1.
To Reproduce
Steps to reproduce the behavior:
Log in with AzureAD
Expected behavior
Under v4, the same users could log in with AzureAD on the same server with no errors.
Version
ansibleforms v5.0.4 (pod images inside a k8s v 1.29.x)
Deployment
Deployed ansibleforms with :
- kubernetes 1.29.x
Screenshots
Error at pod's console:
....
2024-09-25 12:15:14:1514 debug: Schema check from cache
2024-09-25 12:15:17:1517 debug: Redirect to azure
2024-09-25 12:15:19:1519 debug: Schema check from cache
2024-09-25 12:15:20:1520 debug: Login
2024-09-25 12:15:20:1520 error: invalid algorithm
2024-09-25 12:17:02:172 error: Failed to obtain access token
...
Browser view in Edge (developer tools, F12) when the error occurs: [screenshot not preserved]
Browser view when a user logs in successfully with AzureAD credentials: [screenshot not preserved]
I assume this is a duplicate of #229.
No, sorry, it is a different issue. It started after moving the PROD environment from v4 to v5.0.4 (k8s); even with AzureAD itself working well, some users cannot log in to the new AF, while in v4 they could. My guess is that the token is cut off in the frontend (URL max length or something like that), so depending on the length of the AzureAD token, users with small tokens can log in to AF (using AzureAD credentials) while others cannot. Have you changed anything in how AzureAD tokens are processed between v4 and v5? Otherwise I would review the Node.js config, or maybe the k8s Service that exposes the web portal.
I've had it once before that someone had so many groups that my own token became too long, but that would not explain the behavior after the upgrade. That is why I built in a filter for the group names.
I went through the code; nothing really changed in the authentication. Can you verify that the users for whom it fails have done a hard browser refresh (Ctrl+F5)? With an old cached client version, strange things can happen.
Can you please verify the cache refresh? If the old client connects to the new backend, unexpected things can happen.
That doesn't make any sense; I would have had several reports about this. I have several deployments where Azure AD is working fine, and I can't reproduce this. OIDC is the OpenID Connect connector; it has its own table. Can you check the database? What records are in the tables azuread and oidc?
Thank you for the feedback. I will review ASAP! Regarding the groups: would it mean the group regex is not applied?
Found the bug! I will fix it now and make a new release!
It only happens if I get a "next" tag (an @odata.nextLink in the response).
getGroupsAndLogin(token, url = `${this.azureGraphUrl}/v1.0/me/transitiveMemberOf`, type = 'azuread', allGroups = []) {
  if (type === 'azuread') {
    const config = {
      headers: {
        Authorization: `Bearer ${token}`
      }
    };
    axios.get(url, config)
      .then((res) => {
        const groups = res.data.value.filter(x => x.displayName).map(x => (`azuread/` + x.displayName));
        allGroups = allGroups.concat(groups);
        if (res.data['@odata.nextLink']) {
          // If there's a nextLink, make a recursive call to get the next page of data.
          // BUG: type is not passed, so allGroups is received as type.
          // That makes type !== 'azuread', and oidc is assumed.
          this.getGroupsAndLogin(token, res.data['@odata.nextLink'], allGroups);
        } else {
          // No more nextLink, you have all the groups
          this.tokenLogin(token, allGroups)
        }
      })
      .catch((err) => {
        this.$toast.error("Failed to get group membership");
      });
  }
This also means that anyone with more than 100 groups (which triggers the next tag) hits this bug.
A new beta is pushed: image tag beta (ansibleguy/ansibleforms:beta).
Could you do me a favour and test it?
I could, if you create the image for me to upload to my k8s.
The image is on Docker Hub, tag beta.
Fixed in new release 5.0.7.
Yes, it was fixed! At least in my implementation (k8s + AzureAD and users having 100+ groups).
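For anyone hitting something similar: the snippet above loses the type argument because the recursive call shifts the positional parameters by one. A minimal sketch of one way to avoid that class of bug is to follow @odata.nextLink in a loop, so there is no recursive call whose arguments can drift. This is not the actual fix shipped in 5.0.7; collectGroups and fetchPage are hypothetical names, and fetchPage is assumed to be any function that takes a URL and resolves to the parsed response body (for example a thin wrapper around axios.get that returns res.data).

```javascript
// Hypothetical helper, not part of ansibleforms.
// fetchPage: (url) => Promise<{ value: object[], '@odata.nextLink'?: string }>
async function collectGroups(fetchPage, firstUrl, prefix = 'azuread/') {
  const allGroups = [];
  let url = firstUrl;
  // Keep requesting pages until the response no longer carries a nextLink.
  while (url) {
    const page = await fetchPage(url);
    const names = page.value
      .filter((x) => x.displayName)          // skip entries without a display name
      .map((x) => prefix + x.displayName);   // same "azuread/<name>" shape as above
    allGroups.push(...names);
    url = page['@odata.nextLink'];           // undefined on the last page ends the loop
  }
  return allGroups;
}

// Usage with a fake two-page response (no network involved).
const fakePages = {
  first: { value: [{ displayName: 'admins' }, { id: 'no-name' }], '@odata.nextLink': 'second' },
  second: { value: [{ displayName: 'devs' }] },
};
collectGroups(async (url) => fakePages[url], 'first')
  .then((groups) => console.log(groups)); // [ 'azuread/admins', 'azuread/devs' ]
```

The caller would then log in once with the complete list, whichever provider type is in use, instead of deciding the type again on every page.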