tokio-rs/axum

Rate Limit Layer not Respected

Opened this issue · 10 comments

  • I have looked for existing issues (including closed) about this

Bug Report

Version

v0.7.4

Platform

Linux pop-os 6.6.10-76060610-generic

Description

Quick note: I created an issue in tower as well, since this involves both tower and axum. I can close whichever of the two makes the most sense.

I am running into an issue with using the RateLimitLayer from tower with axum. I noticed that I can keep sending requests and the rate isn't actually limited. Below is a complete example.

# Cargo.toml
[package]
name = "axum-rate-limit-test"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.7.4"
serde = { version = "1.0.193", features = ["derive"] }
serde_json = { version = "1.0.108", features = ["std"] }
tokio = { version = "1.35.0", features = ["full"] }
tower = { version = "0.4.13", features = ["limit", "buffer"] }
// src/main.rs
use axum::{
    error_handling::HandleErrorLayer,
    http::StatusCode,
    routing::get,
    BoxError,
    Router,
    Json,
};
use serde::Serialize;
use tokio::net::TcpListener;
use tower::ServiceBuilder;

#[derive(Debug, Serialize)]
pub struct Health {
    pub ok: bool,
}

pub async fn health() -> Result<Json<Health>, (StatusCode, String)> {
    Ok(Json(Health { ok: true }))
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let app = Router::new()
        .route("/health", get(health))
        .route_layer(
            ServiceBuilder::new()
                .layer(HandleErrorLayer::new(|err: BoxError| async move {
                    (
                        StatusCode::INTERNAL_SERVER_ERROR,
                        format!("Unhandled error: {}", err),
                    )
                }))
                .buffer(1024)
                .rate_limit(1, tokio::time::Duration::from_secs(5))
        );
    let listener = TcpListener::bind("127.0.0.1:3000")
        .await
        .expect("failed to bind TCP listener");
    axum::serve(listener, app)
        .await
        .expect("failed to serve service");

    Ok(())
}

I expected the request rate to be limited to 1 request per 5 seconds, but the script below runs to completion without any rate limiting.

for n in {1..10}; do curl localhost:3000/health; done

I've tested this snippet with #2586 and it doesn't seem to fix the issue; the request rate is still not limited.

I can confirm as well.

Isn't this a problem only because the layer/service gets cloned, which seems to give each clone its own state for tracking the limit? I find that unintuitive, and it's certainly not how I would implement it.

So at the very least, if my theory is correct, @jplatte, wouldn't it be welcome for someone to add a big fat warning to that middleware saying that cloning doesn't behave the way people expect, and that use cases such as axum therefore either need to avoid cloning or provide their own clone-friendly middleware?
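To make the theory concrete, here is a standalone sketch in plain tower terms (not axum's internals; the build_stack closure and the () request type are made up, and it assumes tower's buffer, limit and util features plus a tokio runtime): applying the same ServiceBuilder stack twice builds two independent Buffer + RateLimit pairs, so each one tracks its own rate window.

use std::time::Duration;
use tower::{service_fn, Service, ServiceBuilder, ServiceExt};

#[tokio::main]
async fn main() {
    // Stands in for "the layer stack being applied again", e.g. once per connection.
    let build_stack = || {
        ServiceBuilder::new()
            .buffer(16)
            .rate_limit(1, Duration::from_secs(5))
            .service(service_fn(|_req: ()| async {
                Ok::<_, std::convert::Infallible>("ok")
            }))
    };

    let mut a = build_stack();
    let mut b = build_stack();

    // Both accept a request inside the same 5-second window, because each
    // application of the stack owns its own RateLimit state.
    a.ready().await.unwrap().call(()).await.unwrap();
    b.ready().await.unwrap().call(()).await.unwrap();
    println!("two applications of the stack = two independent rate windows");
}

If the layer really is applied per connection (or per request) inside axum, every curl invocation in the loop above ends up talking to a fresh window, which would explain the observed behavior.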

@GlenDC yes on the theory.

Not sure how reasonable that is from tower's perspective, so I'm not sure what the answer to your second question is. They certainly could have a layer that keeps one shared limit across clones.
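For illustration, a rough sketch of that idea with made-up names (SharedRateLimitLayer / SharedRateLimit): the fixed-window counter lives behind an Arc, so every application of the layer and every clone of the service observes the same limit. Unlike tower's RateLimit, which applies backpressure in poll_ready, this sketch simply answers over-limit requests with 429, which also avoids the Buffer + HandleErrorLayer dance. It assumes axum 0.7 types and is not production-grade.

use std::{
    future::Future,
    pin::Pin,
    sync::{Arc, Mutex},
    task::{Context, Poll},
    time::{Duration, Instant},
};

use axum::{
    body::Body,
    http::{Request, StatusCode},
    response::{IntoResponse, Response},
};
use tower::{Layer, Service};

// Fixed-window counter, shared by every clone through the Arc below.
struct Window {
    started: Instant,
    used: u64,
}

#[derive(Clone)]
pub struct SharedRateLimitLayer {
    state: Arc<Mutex<Window>>,
    max: u64,
    per: Duration,
}

impl SharedRateLimitLayer {
    pub fn new(max: u64, per: Duration) -> Self {
        let window = Window { started: Instant::now(), used: 0 };
        Self { state: Arc::new(Mutex::new(window)), max, per }
    }
}

impl<S> Layer<S> for SharedRateLimitLayer {
    type Service = SharedRateLimit<S>;

    fn layer(&self, inner: S) -> Self::Service {
        // Every layered service gets a handle to the same Arc, hence one shared limit.
        SharedRateLimit { inner, state: Arc::clone(&self.state), max: self.max, per: self.per }
    }
}

#[derive(Clone)]
pub struct SharedRateLimit<S> {
    inner: S,
    state: Arc<Mutex<Window>>,
    max: u64,
    per: Duration,
}

impl<S> Service<Request<Body>> for SharedRateLimit<S>
where
    S: Service<Request<Body>, Response = Response>,
    S::Future: Send + 'static,
{
    type Response = Response;
    type Error = S::Error;
    type Future = Pin<Box<dyn Future<Output = Result<Response, S::Error>> + Send>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Request<Body>) -> Self::Future {
        // Check and update the shared fixed window.
        let allowed = {
            let mut w = self.state.lock().unwrap();
            if w.started.elapsed() >= self.per {
                w.started = Instant::now();
                w.used = 0;
            }
            if w.used < self.max {
                w.used += 1;
                true
            } else {
                false
            }
        };

        // Only forward to the inner service when the request fits the window;
        // otherwise answer 429 directly instead of applying backpressure.
        let inner_fut = allowed.then(|| self.inner.call(req));
        Box::pin(async move {
            match inner_fut {
                Some(fut) => Ok(fut.await?),
                None => Ok(StatusCode::TOO_MANY_REQUESTS.into_response()),
            }
        })
    }
}

In the original example this should, I believe, be usable directly as route_layer(SharedRateLimitLayer::new(1, Duration::from_secs(5))), without buffer or HandleErrorLayer, since the service is Clone and passes the inner error type through unchanged.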

I think what #2568 + RateLimit might do in practice is limit the rate of requests per TCP connection. But for real-world use cases, I wonder when that rather general layer makes sense at all, even if it works across clones.

Generally you want to limit the rate of requests per some origin (e.g. peer IP for HTTP), not globally for a certain service (together with clones or separately). But an actual implementation of that with good performance could quickly get too complex for inclusion in simple helper crates like tower(-http), so I'm not sure it makes sense to even open an issue for that.

Personally, I prefer doing that kind of rate limiting at the ingress/infrastructure level (layer 4 or below) anyway. And yes, I do agree that for most server cases you want it at the application layer, where more information is available. That's why I suggested it might be better as something that axum provides.

The biggest point I tried to bring home here (and I think I mentioned it on Discord before in a reply to someone) is that there should probably be a big warning added to that layer's docs, because I think the behavior is confusing to many and certainly not what you would expect when cloning.

For a future major release of tower it might then even be considered to remove this layer entirely, but that's a different discussion.

Yeah, I agree there should be a change to those docs. Maybe not even so much about cloning (how would people know when something like axum clones the layer internally?) but more about the scope of that layer (it's clearly not the sort of rate limiting people usually need for HTTP backends).

Well I would mention this while also mentioning use cases such as Axum.

That said, I wonder what actual use cases for that middleware exist. Is anyone even using that specific middleware in production, somewhere it works well for them? I'm not asking with any ill intent, but rather so that I can hopefully see code using it and learn a thing or two from it.

Not sure it's worth diving down this rabbit hole yet, but I did some digging into salvo's implementation (link). I think something similar could be added to axum (or a new third-party crate) using a rate store in the app state, and then handled by from_extractor_with_state() (link) with an origin extractor.
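To sketch that direction (hypothetical names such as AppState, MAX_PER_WINDOW and rate_limit_by_ip; this uses middleware::from_fn_with_state rather than from_extractor_with_state, but an extractor-based version would move the same check into a FromRequestParts impl): a per-IP fixed-window store kept in shared state, with the peer address taken from ConnectInfo, which requires serving with into_make_service_with_connect_info.

use std::{
    collections::HashMap,
    net::{IpAddr, SocketAddr},
    sync::{Arc, Mutex},
    time::{Duration, Instant},
};

use axum::{
    extract::{ConnectInfo, Request, State},
    http::StatusCode,
    middleware::{self, Next},
    response::{IntoResponse, Response},
    routing::get,
    Router,
};
use tokio::net::TcpListener;

const MAX_PER_WINDOW: u64 = 1;
const WINDOW: Duration = Duration::from_secs(5);

// Shared rate store: peer IP -> (window start, requests used in that window).
#[derive(Clone, Default)]
struct AppState {
    hits: Arc<Mutex<HashMap<IpAddr, (Instant, u64)>>>,
}

async fn rate_limit_by_ip(
    State(state): State<AppState>,
    ConnectInfo(addr): ConnectInfo<SocketAddr>,
    req: Request,
    next: Next,
) -> Response {
    let allowed = {
        let mut hits = state.hits.lock().unwrap();
        let entry = hits.entry(addr.ip()).or_insert((Instant::now(), 0));
        if entry.0.elapsed() >= WINDOW {
            *entry = (Instant::now(), 0);
        }
        if entry.1 < MAX_PER_WINDOW {
            entry.1 += 1;
            true
        } else {
            false
        }
    };

    if allowed {
        next.run(req).await
    } else {
        StatusCode::TOO_MANY_REQUESTS.into_response()
    }
}

async fn health() -> &'static str {
    "ok"
}

#[tokio::main]
async fn main() {
    let state = AppState::default();
    let app = Router::new()
        .route("/health", get(health))
        .route_layer(middleware::from_fn_with_state(state, rate_limit_by_ip));

    let listener = TcpListener::bind("127.0.0.1:3000").await.unwrap();
    // Connect info is needed so ConnectInfo<SocketAddr> can see the peer IP.
    axum::serve(
        listener,
        app.into_make_service_with_connect_info::<SocketAddr>(),
    )
    .await
    .unwrap();
}

A production version would additionally want eviction of stale entries and a real notion of origin (e.g. honoring a forwarded header behind a trusted proxy), which is roughly where the complexity mentioned above starts.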