ory/kratos

Define anti-automation policies with CAPTCHA

aeneasr opened this issue · 41 comments

Is your feature request related to a problem? Please describe.

It should be possible to prevent spam and automation using CAPTCHA during sign up, login, and password reset.

We should deploy a default CAPTCHA strategy with sane defaults.

Policy

  • The user tries to register more than one account within 72 hours.
  • The user fails to provide valid credentials for the third time within 12 hours.
  • The user tries to recover their account for the second time within 72 hours.
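One way to read the three rules above is as sliding-window counters per subject. A minimal sketch in Python, assuming the subject key is an IP address or identity ID; the class, event names, and key are hypothetical and not part of Kratos:

```python
import time
from collections import defaultdict, deque

# Thresholds taken from the policy list above:
# event -> (events allowed before a CAPTCHA is required, window in seconds)
POLICY = {
    "registration": (1, 72 * 3600),  # more than one account within 72 hours
    "failed_login": (2, 12 * 3600),  # third failed attempt within 12 hours
    "recovery": (1, 72 * 3600),      # second recovery within 72 hours
}


class CaptchaPolicy:
    """Hypothetical helper: tracks events per subject in a sliding window."""

    def __init__(self, policy=POLICY, clock=time.time):
        self.policy = policy
        self.clock = clock
        self.events = defaultdict(deque)  # (subject, event) -> timestamps

    def record(self, subject, event):
        """Record one event; return True if it should trigger a CAPTCHA."""
        allowed, window = self.policy[event]
        now = self.clock()
        stamps = self.events[(subject, event)]
        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= window:
            stamps.popleft()
        stamps.append(now)
        return len(stamps) > allowed
```

The store here is in-memory only; a real deployment would need shared state and an answer to the question of what identifies a "user".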

I was put off by https://www.ory.sh/kratos/docs/concepts/security/#bruteforce-attacks and #134 when picking kratos, but I thought "at least they're being honest. We can put a ticket in our backlog and fix it before we ship" and went with it.

After digging a bit deeper, I found https://www.ory.sh/kratos/docs/self-service/flows/user-login-user-registration/username-email-password/#anti-automation . The wording there is a bit misleading: it suggests that the feature is already there. Is there a design in someone's head for this ticket, or would you suggest that we do this in our web app ourselves?


Sorry about that, this is not implemented. We thought the leading paragraph would be clear enough:

"This feature is a work in progress and is tracked as kratos#133."

also given that the linked issue is still open :)

So no, anti-automation is not yet available / implemented / designed!

I see, the wording in #134 is maybe a bit misleading, but as long as #133 and #138 are still open it will also be implemented! What I meant there was that, for example, rate limiting is pretty dependent on context (intranet? no rate limiting. big gaming site? maybe some rate limiting. and so on) and that there are better tools to solve that (e.g. nginx).

Agreed that rate-limiting is context dependent and is best handled upstream.

However, I think it would be good for Kratos to provide some anti-automation protections for the login, signup, and password recovery flows. Because password hashing is resource-intensive, the login and signup endpoints are particularly attractive targets for DDoS attacks. I include the password recovery flow for the same reason listed in the docs: "The goal of such an attack is to send out so many emails or SMS, that your reputation worsens (spam filters) or you're faced with massive costs (carrier fees)." The verification flow probably falls into this category as well, as it's only slightly less exposed than the password recovery flow. Additionally, if we set some sort of anti-automation policy on verification, such as only being able to resend a verification email X number of times, we could allow users to resend without needing to re-enter their email address, which is something that was flagged by our designers in the design review of our implementation.

When you refer to a "user" in your proposed time-based policy (e.g. "The user tries to register more than one account within 72 hours"), how are you defining user? By IP?

I agree with your findings. We need anti-automation and I think we could use some type of CAPTCHA (maybe just reCaptcha?).

The new 0.6 branch has an open PR for refactoring the form payloads. It will enable us to include custom JavaScript so this should be easy to implement once the feature is merged and the tests fixed.

I would refer to a user by their ID - this works for login, verification, recovery. For registration we probably need IP-based?

Another option would be of course advising people to use e.g. Cloudflare.

Hey, could you point me to the open PR you referenced? Just want to watch it so I know when it's merged and work on this issue can begin.

I'd like to toss in a couple of thoughts to consider when forming a spec for this idea:

  • It would be awesome to see support for hCaptcha. Cloudflare posted notes comparing hCaptcha and reCAPTCHA. They have a mostly compatible API (slightly different options on the Enterprise plan), but we found it to be a drop-in replacement. Customizing the front-end script & configuring the backend URL should be most of what it takes to switch between these two?
  • It would be great to have the ability to customize the CAPTCHA policy from your "sane defaults" to "always required". We consistently experience credential stuffing attacks, and we've seen successful attacks with passwords from yet-to-be-identified data breaches. The attacks bounce through a very large pool of proxy servers that limit/distribute the number of calls from any given IP, and they have repeatedly circumvented one large auth provider's expensive bot detection service.
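To illustrate the "drop-in replacement" point: both providers accept a form-encoded POST with `secret` and `response` and answer with JSON containing `success` and `error-codes`, so switching can come down to the URL. A minimal sketch; the function names are hypothetical, and the endpoint URLs are the ones I believe each provider documents, so please check them against the current docs:

```python
import json
import urllib.parse
import urllib.request

# Verification endpoints (assumption: verify against the providers' docs).
VERIFY_URLS = {
    "recaptcha": "https://www.google.com/recaptcha/api/siteverify",
    "hcaptcha": "https://api.hcaptcha.com/siteverify",
}


def build_verify_request(provider, secret, token, remote_ip=None):
    """Build the form-encoded POST that both providers expect."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional for both providers
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URLS[provider], data=data, method="POST")


def verify(provider, secret, token, opener=urllib.request.urlopen):
    """Return (ok, error_codes). `opener` is injectable so tests avoid the network."""
    request = build_verify_request(provider, secret, token)
    with opener(request, timeout=5) as response:
        body = json.load(response)
    return bool(body.get("success")), body.get("error-codes", [])
```

Provider-specific extras (scores, enterprise options) are deliberately left out of this sketch.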

Yup, that definitely makes sense to me! So the form refactoring landed in 0.6, meaning that this could now be added: we can theoretically add JavaScript to the form payload, which the UI would then use to render the captcha!

Or if it's server-side, images :)

Please, please, please do not make captcha compulsory for all users. IMO, this is a terrible idea

There are major concerns about Captchas for two reasons:

  1. Most captchas pose serious usability concerns in terms of accessibility. See for example: https://www.w3.org/WAI/GL/wiki/Captcha_Alternatives_and_thoughts. Even though the document is a bit old, it shows that there is no captcha proven to be usable by all users.
  2. Mainstream solutions like Google's no-captcha have been criticised for their invasion of privacy: https://www.theregister.com/2020/11/02/google_ad_privacy/

If Kratos had a captcha that users can't disable, Kratos may clash with privacy and accessibility policies, thus making it unusable for many projects. Organisations may already have measures in place like rate limiting, bot protection (e.g. Cloudflare), WAFs, etc. that are far more successful than Captcha at catching malicious behaviour anyway. Those users may not need Kratos to re-implement those features.

I'll add my $.02 here. As stated above, CAPTCHAs have quite limited success and can only be used in browser-based flows(?). There are also global rate-limiting solutions like Cloudflare and others which can be applied transparently to Kratos, and more complex solutions that allow application- or request-specific limits. For example, we use rules that can be applied per API endpoint and can use request-specific data (or even data derived from the request, like a username -> user ID transformation). This allows more complex rules to be specified, which are quite hard to implement outside of the target app. I think the solution for that could be a more advanced hook system (one that could actually interrupt/control the app flow). This way we could leave the decision on how this should be implemented to the adopters. Of course this would add an extra external call to the critical path and would impact response times, but on the other hand it would decouple Kratos from that requirement.

Another thing that pure rate limiting/captcha probably won't solve is credential enumeration attacks, which are quite painful to deal with and need more sophisticated tools. I'll check with my colleagues tomorrow what tools they are using, since our Proof of Work solution is not ideal.

What do you think? This decision should probably affect all products in the Ory ecosystem that are potentially exposed to DDoS and other attacks.

That's an interesting approach (PoW) but it requires front end code. I think we should also differentiate between login / sign up, which are most impacted by these types of attacks, and other endpoints like initializing a login flow or something.

Yeah. PoW is ok-ish for frontend flows where one can supply a JS implementation. Things get quite difficult when trying to deal with pure API calls. One could implement some kind of "smart" API client that deals with retries and computes the Proof of Work behind the scenes, but it is quite inconvenient to deal with and adds latency. In the end, when dealing with more sophisticated attacks, captcha/PoW has a moderate success rate (it either lets quite a lot of bots through or blocks a substantial number of legitimate requests). Currently some of our solutions use PerimeterX (a paid solution) as a defence against bots/DDoS and they seem quite happy with it. I'm quite sceptical, and the website is marketing bull^#%it without proper docs, so I need to get some hands-on experience with it working in prod.
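For readers unfamiliar with the idea, here is a hashcash-style sketch of what such a client would compute behind the scenes. This is a generic illustration, not the scheme we run in production; the challenge format and function names are made up:

```python
import hashlib
import itertools


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check(challenge, nonce, difficulty):
    """Server side: a single hash verifies the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Client side: brute-force a nonce, about 2**difficulty hashes on average."""
    for nonce in itertools.count():
        if check(challenge, nonce, difficulty):
            return nonce
```

The asymmetry is the whole point: `solve` costs the client many hashes while `check` costs the server one. It is also the weakness, since the cost is the same for a phone and for an attacker with far more hardware.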

I agree that login/sign up are the most vulnerable to those kinds of attacks and any simple solution (rate-limiting, PoW) will not block more advanced attacks/scans anyway.

I think we have to draw the line between the regular spammy bot that pollutes your users with "free s!x n0w" spam and more sophisticated DDoS attacks. The intention of the issue is to deal with the former (i.e. keep out 95% of spam). If you are under a more sophisticated attack you need more specialized mechanisms in place that are not really related to Ory Kratos.

Instead of using the self-service UI node in Node.js, we are building our own Go API that integrates validation on the endpoints with Google reCAPTCHA Enterprise v3. We don't yet support anti-CSRF on those web calls (but reCAPTCHA helps). The front end can't be generic; it has to integrate smart validations that prevent bot-like behaviour in calls to the UI node. Kratos as an engine doesn't have the context to differentiate legitimate from illegitimate requests. Pure PoW is not great in my opinion, as attackers could have an order of magnitude more computing power than you do; you need device-based action detection, like Google reCAPTCHA Enterprise v3 provides. Also, the Chrome/Chromium browser protection team would be within their rights to ban your webpage for doing unnecessary work.

I think we have to draw the line between the regular spammy bot that pollutes your users with "free s!x n0w" spam and more sophisticated DDoS attacks. The intention of the issue is to deal with the former (i.e. keep out 95% of spam). If you are under a more sophisticated attack you need more specialized mechanisms in place that are not really related to Ory Kratos.

Sure. It is a totally valid option (as long as it is configurable). DDoS attacks nowadays require more sophisticated tooling, and probably some high-level/global solution should be applied to solve some of the problems (multi-DC, multi-cluster, etc.).


To be clear, I'm not advocating for PoW, since we did have issues with it. We didn't get banned, but our site is not a common webpage so different rules apply here. Anyway, I think adopters would be better off with countermeasures tailored to their traffic/needs, with Kratos perhaps applying only some basic filtering to cover basic scenarios, which helps you get up and running and lets you think about more complex solutions down the road. It would be great to have hooks that can interrupt the flows, but that is something we can probably implement regardless.

Yes, and the easiest way to prevent bot sign ups and bruteforce / credential stuffing would be to have a CAPTCHA mechanism in place. This could be enabled in the config and disabled if you have other measures in place. Everything else, like limiting per IP, per user, or per cookie, will probably be pretty useless in the real world and also hard to implement if we think about the backend self-service APIs.

Alternatively we can also just say: use Cloudflare. I think that's also valid 😅 and would definitely save some maintenance and development time (and people coming around saying: but I want captcha-oss-nodejs-alternative-15! :D )

I get the maintenance burden aspect, and there's never a one-size-fits-all solution. Is there an extensibility option (now or planned for the future) that would allow us to plug in our own CAPTCHA logic (or really... any custom workflow... client-side JS + associated server-side logic)? It could be slick if this was something we could do on our own with some config and a server-side plugin of sorts (a webhook flow?). The config & flow could be posted online as a recipe, but the core service wouldn't have to maintain support for various external services.

Any news?

Looked a bit into this, primarily due to #1780 .

First I thought the best idea would be to add CAPTCHAs. However, it looks like most CAPTCHAs that do not require JavaScript are intrusive (always present) and already solvable by machine learning.

I looked into reCAPTCHA v3 as one solution, but the problem here is that we have to be able to execute JavaScript. Some clients might not be able to do that (e.g. native apps), which means we cannot enforce this type of CAPTCHA: reCAPTCHA v3, for example, is only available for iOS as a preview for enterprise customers: https://cloud.google.com/recaptcha-enterprise/docs/instrument-ios-apps

Additionally, there are concerns over the data privacy aspects of Google reCAPTCHA. As far as I have learned, it appears not to be GDPR compliant.

There are some other options like honeypot fields but all of these are geared towards spam bots rather than abuse defined in #1780.

In conclusion, I fear we might need to descope this and have a rate limiting service where one IP can only make so many requests.
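To make "one IP can only make so many requests" concrete, here is a minimal token-bucket sketch. It is in-memory and single-process, and the class and parameter names are hypothetical; it is not a proposal for how such a service would be built:

```python
import time


class IPRateLimiter:
    """Token bucket per IP: `burst` requests at once, refilled at `rate` per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets = {}  # ip -> (tokens, time of last request)

    def allow(self, ip):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[ip] = (tokens, now)
            return False
        self.buckets[ip] = (tokens - 1, now)
        return True
```

As noted earlier in the thread, attackers who spread requests over a large proxy pool stay under any per-IP limit, so this only helps against unsophisticated abuse.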

In case you didn't know - there is a GDPR compliant, Proof of Work captcha, which doesn't require cookies: https://friendlycaptcha.com. It's used by some of Germany's biggest companies.

Here is the (MIT licensed) code: https://github.com/orgs/FriendlyCaptcha/repositories

Can this be used in a purely open source manner or do you need to have an account with them?

The license has been MIT, but it's now 'Friendly Captcha License'. I asked them what that means (see FriendlyCaptcha/friendly-pow#13) and it means we cannot use it for Ory ☹️

Oh, too bad :( I also looked into this approach a bit more and it seems that for reasonable bot protection this would need several seconds on a powerful PC, meaning that it would stall the site for ~30 seconds on old smartphones (think old Android systems for example).

I looked into this again and it appears that this is still problematic on e.g. iOS. On Android, it appears more common to have captchas available though.

Firebase for example requires the use of reCAPTCHA during authentication, see this expo example: https://docs.expo.dev/versions/latest/sdk/firebase-recaptcha/


hCAPTCHA on the other hand does appear to have the ability to show captchas on mobile devices: https://docs.hcaptcha.com/mobile_app_sdks/


Language-level libraries are not secure enough to offer advanced OCR protection, see: https://github.com/dchest/captcha

Here's a bit of research into OCR for captchas:


One possibility though would be to use a math solver captcha, for example https://captcha.mojotv.cn

However, I'm not sure how effective those would be against large scale attacks on a big site (let's say Apple uses Kratos). I think with a bit of tinkering it would be possible to break this relatively easily, as the source code is also public.

Any progress on that front?

Any update on the progress?

I'll share a method we ended up implementing. This one uses reCaptcha specifically but I think it should apply to others as well.

  1. We added reCaptcha to our frontend app (login/registration).
  2. We submit the token for verification with the form (obviously).
  3. We added an extra blocking hook to Kratos which checks the reCaptcha token's validity (via an external service) and can reject the data.

Not great. Not terrible 😄

Interesting approach! 🤔


Nice!
Did you do point 3 directly from Kratos to reCaptcha or did you have to create a service in the middle to translate the reCaptcha response into something the Kratos hooks could process/parse?

We used a service in the middle so we can respond with a meaningful error message to the frontend. I'm not 100% sure, but to verify the response from the reCaptcha API you need some basic logic, and Kratos webhooks cannot inspect arbitrary responses and mostly rely on HTTP status codes.
https://developers.google.com/recaptcha/docs/verify requires checking the JSON response and acting according to the success and error-codes fields (other fields should also be validated).
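Roughly, the translation step in such a middle service could look like the sketch below. It is a simplified illustration rather than our actual service: the function name, the 403 status, the response body shape, and the two fallback reason strings are all assumptions, and what a Kratos blocking hook expects back should be checked against the Kratos docs.

```python
import json


def hook_response(verify_body, expected_hostname=None):
    """Map a parsed siteverify JSON body to (http_status, json_body) for the hook."""
    if not verify_body.get("success"):
        # Pass the provider's error codes through; fall back to a made-up label.
        reasons = verify_body.get("error-codes") or ["verification-failed"]
    elif expected_hostname and verify_body.get("hostname") != expected_hostname:
        # "Other fields should also be validated": here, the hostname.
        reasons = ["hostname-mismatch"]
    else:
        return 200, json.dumps({})
    return 403, json.dumps({"error": "captcha rejected", "reasons": reasons})
```

A non-2xx status is what makes the hook block the flow in this setup; the JSON body is only there so the frontend can show something meaningful.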

Thanks so much for confirming, that's what we were starting to conclude, it's good we're on the right track and even better to know it's worked for someone before us. 🙏🏻

  3. We added an extra blocking hook to Kratos which checks the reCaptcha token's validity (via an external service) and can reject the data.

We've been trying to work on this today but we aren't having any luck working out how to get the token to the hook.

What method did you use for the login/registration hooks to be able to receive and pass on the token?
Did you have to create a trait to send it in or is there something related to transient_payload that can be used (this doesn't seem implemented for logins though)?

Oh damn! I forgot that this is our custom feature to pass some form fields on in the hook context... It is a workaround, and I think we would need something like transient_payload to make it work on a vanilla Kratos build.

That explains it!
It does look like the Kratos way to implement Captcha (and in my case I'm interested in Cloudflare Turnstile) will be using the new transient_payload; that way we can submit a token with login/registration and use the blocking hooks parser to extract and validate it.

A few questions for the Ory team:

  1. Is there a reason the recent ability to parse transient_payload is limited to registration (we can't see it in login)?
  2. Would you be open to a PR to add it to Login too?
  3. Is there an ETA for the next Kratos release as the transient_payload is not yet in a release?

@CaptainStandby for visibility

We are thinking about expanding the transient_payload feature to other flows as well but no concrete plans have been made yet.

The next Kratos release will be by the end of this quarter (https://ory-community.slack.com/archives/C012RJ2MQ1H/p1678364596920199)

Hello contributors!

I am marking this issue as stale as it has not received any engagement from the community or maintainers for a year. That does not imply that the issue has no merit! If you feel strongly about this issue

  • open a PR referencing and resolving the issue;
  • leave a comment on it and discuss ideas on how you could contribute towards resolving it;
  • leave a comment and describe in detail why this issue is critical for your use case;
  • open a new issue with updated details and a plan for resolving the issue.

Throughout its lifetime, Ory has received over 10.000 issues and PRs. To sustain that growth, we need to prioritize and focus on issues that are important to the community. A good indication of importance, and thus priority, is activity on a topic.

Unfortunately, burnout has become a topic of concern amongst open-source projects.

It can lead to severe personal and health issues as well as opening catastrophic attack vectors.

The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone.

If this issue was marked as stale erroneously you can exempt it by adding the backlog label, assigning someone, or setting a milestone for it.

Thank you for your understanding and to anyone who participated in the conversation! And as written above, please do participate in the conversation if this topic is important to you!

Thank you 🙏✌️


We're now looking into this; however, we will first experiment with different captchas in Ory Network. Since all captcha solutions require some type of contract (hCaptcha, ReCaptcha, Cloudflare) with a third party and there is no good open source solution, we're not quite sure how/if this will land in the open source code base.

Therefore, closing this for the time being.