netdata/netdata-cloud

[Bug]: Unable to use OIDC with Entra ID

Closed this issue · 6 comments

Bug description

It's entirely possible this is an issue with my configuration, but after two days of debugging I'm not sure where else to check.

Using the Netdata documentation I configured OIDC auth between our Entra ID tenant and Netdata cloud.
https://learn.netdata.cloud/docs/netdata-cloud/authentication-&-authorization/cloud-authentication-&-authorization-integrations/oidc#setting-up-authorization-server

The authentication flow properly redirects back to Netdata after authenticating with Entra ID; however, I get the error "failed to verify token: could not verify message using any of the signatures or keys".

A full example of the decoded URL with error message is below.
https://app.netdata.cloud/sign-in#error_code=6oNhMy9u3r-243652&error_msg=failed to verify token: could not verify message using any of the signatures or keys&error_msg_key=ErrInternalServerError&metrics_correlation=false&after=-900&before=0&utc=America/Los_Angeles&offset=-7&timezoneName=Pacific Time (US & Canada)&modal=&modalTab=
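For anyone troubleshooting the same redirect, here is a minimal sketch of pulling the error fields out of the URL fragment. It assumes the raw (still percent-encoded) URL is copied from the address bar; the function name `extract_sign_in_error` is hypothetical and not part of any Netdata tooling.

```python
from urllib.parse import parse_qs, urlsplit


def extract_sign_in_error(url: str) -> dict:
    """Return the error_* parameters carried in the URL fragment (after '#')."""
    fragment = urlsplit(url).fragment
    # parse_qs percent-decodes values and returns a list per key.
    params = parse_qs(fragment)
    return {
        key: values[0]
        for key, values in params.items()
        if key.startswith("error_")
    }


if __name__ == "__main__":
    # Shortened, percent-encoded form of the redirect URL shown above.
    example = (
        "https://app.netdata.cloud/sign-in"
        "#error_code=6oNhMy9u3r-243652"
        "&error_msg=failed%20to%20verify%20token"
        "&error_msg_key=ErrInternalServerError"
        "&after=-900&before=0"
    )
    for name, value in extract_sign_in_error(example).items():
        print(f"{name}: {value}")
```

Only the keys starting with `error_` are kept, so the dashboard parameters (after, before, utc, and so on) are ignored.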

I can't find that error string in any of the Netdata or Pulsar repos though so I'm not sure where it's coming from.
I suspect the error is coming from this library but I can't figure out if/where it's being pulled in as a dependency.
https://github.com/lestrrat-go/jwx/blob/ed00dbaaecacb843571f04c219d4ec46549fa4c4/jws/jws.go#L471

I also found that there was a recently fixed bug in the Apache Pulsar repo relating to Entra not sending the alg field but Pulsar requiring it. It may be a total red herring, however.
apache/pulsar#22419

Any help or ideas on how to further troubleshoot this would be greatly appreciated.

Expected behavior

OIDC authentication works between Entra and Netdata.

Steps to reproduce

  1. Set up an Entra ID App Registration for OIDC
  2. Configure the correct URIs using the Netdata documentation
    https://learn.netdata.cloud/docs/netdata-cloud/authentication-&-authorization/cloud-authentication-&-authorization-integrations/oidc#setting-up-authorization-ser
  3. Configure OIDC in Netdata with the appropriate Entra values
  4. Try to authenticate

Greetings @bc-ble!
Sorry to hear you are having trouble configuring OIDC.

After investigating, I found that the problem lies in our server being unable to verify that the access token was signed with any of the signatures or keys provided by your server.

After receiving the access token we verify it was signed by the configured issuer. To get the signatures or keys, we use the following process.

  1. Fetch OpenID configuration from {issuer}/.well-known/openid-configuration
  2. Find jwks_uri on the response payload
  3. Fetch jwks (signatures or keys)
  4. Verify access token with the fetched jwks
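The lookup part of these steps can be sketched roughly as follows. This is an illustration only, not the Netdata Cloud implementation: the function names are hypothetical, matching the key by the token's `kid` header is an assumption, and the signature check itself (step 4) would be delegated to a JWT library and is not shown.

```python
import base64
import json
import urllib.request
from typing import Callable, Optional


def discovery_url(issuer: str) -> str:
    """Step 1: build the OpenID configuration URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def http_fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (used for real requests)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def jwt_header(token: str) -> dict:
    """Decode the (unverified) header segment of a compact JWT."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def find_signing_key(
    issuer: str,
    token: str,
    fetch_json: Callable[[str], dict] = http_fetch_json,
) -> Optional[dict]:
    """Steps 1-3, then pick the candidate key for step 4.

    Assumption: the key is selected by matching the token's `kid` header.
    """
    config = fetch_json(discovery_url(issuer))  # step 1
    jwks = fetch_json(config["jwks_uri"])       # steps 2 and 3
    kid = jwt_header(token).get("kid")
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None
```

Passing `fetch_json` in as a parameter keeps the lookup testable without network access; by default it performs real HTTP requests.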

As mentioned above, the process is failing at step 4.

Could you please confirm that the access token provided by your server is a valid JWT and that it was signed with one of the signatures or keys your server publishes at the jwks_uri?

Here's a doc with a better explanation of the validation process.
https://curity.io/resources/learn/client-assertions-jwks-uri/#assertion-validation-and-public-keys

@car12o thank you for taking a look.

From my side I'm unable to see the JWT that is issued. The last thing I see from the client side is the request to Netdata with the authorization code issued by my IdP in the redirect and then I get the error from Netdata.

Since it's failing in the step where Netdata is exchanging the authorization code for an access token, the request/response isn't exposed to me, unfortunately.
https://curity.io/resources/learn/oauth-code-flow/#token-endpoint

Do you know which property Netdata is looking at to do the verification? If it's trying to use alg, that would explain why it's failing, and it is likely related to the Pulsar bug I linked. As far as I know, the only log I have available from Netdata is the error returned in the URL, so I can't directly validate that.

Looking at the Entra openid-configuration, the keys returned from its jwks_uri don't list the alg property, just the kty property.

https://login.microsoftonline.com/common/.well-known/openid-configuration
https://login.microsoftonline.com/common/discovery/keys

These are the generic endpoints which don't include our tenant information but I verified that the values are the same minus our tenant specific URIs. I can provide you with our tenant specific URIs if needed.
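A quick way to check this on a downloaded keys document is sketched below. It assumes the JWKS JSON has already been fetched and parsed into a dict; the helper name `keys_missing_fields` and the sample data are hypothetical.

```python
def keys_missing_fields(jwks: dict, fields=("alg",)) -> dict:
    """Map each key's kid (or its index) to the listed fields it lacks."""
    report = {}
    for index, key in enumerate(jwks.get("keys", [])):
        missing = [field for field in fields if field not in key]
        if missing:
            report[key.get("kid", f"#{index}")] = missing
    return report


if __name__ == "__main__":
    # Made-up sample shaped like a JWKS response; not real Entra keys.
    sample = {
        "keys": [
            {"kty": "RSA", "kid": "key-a", "n": "...", "e": "AQAB"},
            {"kty": "RSA", "kid": "key-b", "alg": "RS256", "n": "...", "e": "AQAB"},
        ]
    }
    print(keys_missing_fields(sample))
```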

@bc-ble thanks for the detailed information.

Yes, the keys response payload is incomplete: it's missing the alg and use fields.
We have made adjustments to infer the algorithm (alg) when it's missing.
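To illustrate the general idea of inferring the algorithm, here is a minimal sketch. It is not the change we deployed: the function name is hypothetical, and defaulting RSA keys to RS256 is an assumption (an RSA key could equally be used with RS384 or PS256). The EC mapping follows the curve-to-algorithm pairs defined for JWS.

```python
from typing import Optional

# Curve-to-algorithm pairs for EC keys.
EC_ALG_BY_CURVE = {"P-256": "ES256", "P-384": "ES384", "P-521": "ES512"}


def infer_alg(jwk: dict) -> Optional[str]:
    """Return the key's alg, or a best guess from kty/crv when it is absent."""
    alg = jwk.get("alg")
    if alg:
        return alg  # an explicit value always wins
    kty = jwk.get("kty")
    if kty == "RSA":
        return "RS256"  # assumed default; not guaranteed by the key itself
    if kty == "EC":
        return EC_ALG_BY_CURVE.get(jwk.get("crv"))
    return None  # unknown key type: refuse to guess
```

Returning `None` for unrecognised key types keeps the caller from verifying with a guessed algorithm.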

Could you please give it a try now and let us know how it goes?

@car12o I apologize for the delay, I was out of the office.

It looks to be working now, thank you!

When I signed in via SSO by going to the initiate URL directly it authenticated successfully and then sent me an email to click a link to actually navigate to the console. Is that the intended flow rather than authenticating directly via OIDC? It's not clear to me from the docs.

If I navigate from our IdP directly it goes straight to the console.

@bc-ble no problem at all.

Nice, glad to hear that it's working now.

That is intended for a specific case; let me explain.

  1. If a user/email is logging in through an IdP and that email is not associated with any Netdata account yet, an account is created and linked with that specific provider, and then it goes straight to the console.
  2. If a user/email is logging in through an IdP and that email is already associated with a Netdata account but not linked with the provider, we send a verification email to confirm access to the email before linking with the provider.
  3. If a user/email is logging in through an IdP and that email is already associated with a Netdata account and linked with the provider, then it goes straight to the console.
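The three cases above can be summarised as a small decision function. This is a sketch for illustration, not our actual code; the enum and function names are hypothetical.

```python
from enum import Enum


class LoginAction(Enum):
    CREATE_AND_LINK = "create account, link provider, open console"
    SEND_VERIFICATION_EMAIL = "send verification email before linking"
    OPEN_CONSOLE = "open console"


def decide_login_action(account_exists: bool, linked_to_provider: bool) -> LoginAction:
    """Pick the sign-in path for an email arriving from an IdP."""
    if not account_exists:
        return LoginAction.CREATE_AND_LINK          # case 1
    if not linked_to_provider:
        return LoginAction.SEND_VERIFICATION_EMAIL  # case 2
    return LoginAction.OPEN_CONSOLE                 # case 3


if __name__ == "__main__":
    print(decide_login_action(account_exists=True, linked_to_provider=False).value)
```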

Point 2 is a safety measure to prevent Netdata account theft. Without the verification email, it would be easy for an attacker to create an account on an IdP, verify the account on the IdP itself (without having access to the email), and then log in to Netdata with that account/email. That way, an attacker would gain access to the Netdata account without having access to the email.

Awesome, thanks for the explanation.
The additional verification email was the part I was missing.

Appreciate the support!