danielsz/system

handler: Impossible to declare order of endpoints

plexus opened this issue · 30 comments

When creating a handler with multiple endpoints, there is no way to influence the order in which they will be combined into a single ring/routes.

The current (undocumented) logic is that

  • routes with their own middleware come first
    • those that share the same middleware are combined
  • routes without route-specific middleware always come last

My use case is this:

(defn prod-system [{:keys [web-port]}] 
  (c/system-map
   :app-routes     (-> (new-endpoint app-routes)
                       (c/using [:middleware]))
   :webhook-routes (new-endpoint webhook-routes) 
   :middleware     (new-middleware {:middleware middleware})
   :handler        (-> (new-handler)
                       (c/using [:webhook-routes :app-routes]))
   :jetty          (-> (new-web-server web-port)
                       (c/using [:handler]))))

Most of the functionality sits in app-routes, and uses the default middleware, but a few routes are related to webhooks, and should sit outside that middleware. These do their own authentication, and so they should not care about any CSRF tokens.

Currently they would always come last (because they don't have middleware), but this means that the middleware from app-routes gets checked first. It notices a POST request without CSRF token, and returns 403. The webhook routes are never reached.

To work around this I hacked up my own version so I can order them by priority

(defn prod-system [{:keys [web-port]}] 
  (c/system-map
   :app-routes     (-> (new-endpoint app-routes)
                       (assoc :priority 20)
                       (c/using [:middleware]))
   :webhook-routes (-> (new-endpoint webhook-routes)
                       (assoc :priority 10)) 
   :middleware     (new-middleware {:middleware middleware})
   :handler        (-> (new-handler)
                       (c/using [:webhook-routes :app-routes]))
   :jetty          (-> (new-web-server web-port)
                       (c/using [:handler]))))

I'd be curious what others think of this use case, and what other approaches there could be to handle it.

@plexus I'm really happy you bring this up. I have written a +1000 words blog post regarding how Ring middleware is deficient, but I haven't published it yet.
However, let me suggest the following:

(defn prod-system [{:keys [web-port]}] 
  (c/system-map
   :app-middleware   (new-middleware {:middleware middleware})
   :app-routes     (-> (new-endpoint app-routes)
                             (c/using [:app-middleware]))
   :webhook-routes (new-endpoint webhook-routes)   
   :handler        (-> (new-handler)
                       (c/using [:webhook-routes :app-routes]))
   :jetty          (-> (new-web-server web-port)
                       (c/using [:handler]))))

Wouldn't this address your use case?

Composing middleware in NodeJS is a risky business. They mutate the HTTP request and response objects freely, and are often dependent on each others’ side-effects. There are no guarantees that you have stacked the middleware functions in a sensible order, and it is often the case, in my experience, that misconfigured middleware takes a lot of time and effort to debug.

The fundamental problem is composition in the light of mutation (middleware is not pure).
I would be interested in a principled approach to solve this, but this is outside of the scope of the Duct components.
A principled solution to compose middleware can be found, but not in the Clojure ecosystem.
The above quote comes from hyper, a promising solution in purescript. The following page is well worth a read: http://hyper.wickstrom.tech/

A (weaker) attempt to solve Ring middleware is proposed in macchiato, where metadata is added to coordinate composition. https://yogthos.net/posts/2016-12-20-MacchiatoModules.html

That being said, I think it is actually possible to order endpoints and middleware, since we're following the Ring model and Compojure, and ordering is the same here as it is there. But maybe you can help me explain this better :-)

I don't see the difference between your code and my first example, except for renaming :middleware to :app-middleware, it still has the same problem.

The problem is that the handler component conceptually does this

(apply routes (concat routes-with-middleware routes-without-middleware))

So the routes that don't have route-specific middleware (in this case webhook-routes) always come last. routes will try app-routes, and only if it returns nil will it go on to try webhook-routes, but app-routes is wrapped with middleware which returns 403, and so the webhook routes is never reached.

Ah, I see.
But app-routes should return 403 only if a forbidden request was made to a route it knows about, shouldn't it? That is to say, if there is actually a request being made to a route it owns, otherwise it should return nil.
This sounds like a Compojure gotcha to me. Similar to returning a not-found prematurely.

The 403 is coming from middleware, because it's not finding a CSRF token. This happens before any routes are checked.

OK, let me ask you this way. How would you address your use case in straight Compojure?

(routes
   webhook-routes
   (wrap-mw app-routes))

Currently system only allows doing this

(routes
   (wrap-mw app-routes)
   webhook-routes)

Because routes with middleware are always placed first in the stack.

OK, if you say can't do this currently, I believe you, I'm just surprised.
I'll definitely try to come up with something to address this shortcoming. It will be extremely helpful to have a minimum proof scenario in a temporary github repo to work with. Do you think you can do that?

Sure, I won't be able to do it straight away though, but I'll come up with something.

OK, excellent. Thanks.

Here's an example app. Please have a look at the README, I've tried my best to explain the issue clearly there, with relevant pointers to code.

https://github.com/plexus/handler-demo

Thank you very much, @plexus. I'll study that demo and I'll be back asap.

Problem verified and understood. Thank you indeed, @plexus.
Now on to the fun part: addressing the issue. :-)
The goal is to think hard on the semantics that we need to introduce, if at all.
Before we take the route of late-binding a :priority key to endpoints , a suggestion which is not without merits, I would like to make an observation.
When we compose the handler, It seems to me that we can safely invert the current order of endpoints with middleware and endpoints without middleware. Reference
So instead of:

(wrap-mw (apply compojure/routes (concat handlers routes)))

We do this:

(wrap-mw (apply compojure/routes (concat routes handlers)))

By definition, endpoints without middleware can always come first, right? I say by definition because routes are either matched are not matched. If my reasoning is correct, endpoints without middleware can never short-circuit the routing stack.
Incidentally, this also solves your use case.
So the next question is how to order silos of routes that have middleware, such as the typical use case of api endpoints vs website endpoints.
Now I see two avenues of solving the problem.
1, Ensure that the order of the vector in which the endpoints are passed to the handler is reflected in the way the middleware is composed.
2, Add a priority key to the middleware component. This can be done in typical Component fashion, passed as an option to the component constructor.
For example,

:api-middleware (new-middleware {:middleware [wrap-api] :silo 10})
:web-middleware (new-middleware {:middleware [wrap-web] :silo 20})

At first glance, I like the first solution better because it eschews introduction of new semantics.

Do you have any thoughts?

Swapping the two sets of handlers seems like the simplest solution with the biggest win.

I think I will start experimenting with a different approach though, that is, making the structure of endpoints and middleware explicit, and deriving the component/using vector from that structure, instead of the other way around

This is probably too big a change to retrofit onto the existing components, maybe it needs to be a new component altogether (within or outside of System). I think it would make things easier to reason about. I find the current behavior of grouping and recombining endpoints surprising and unintuitive.

(defn new-handler2 [handler-spec]
  (-> (map->Handler2 {:spec handler-spec})
      (component/using (flatten handler-spec))))

(new-handler2 [:webhook-endpoint
               [:middleware1
                :api-endpoint]
               [:middleware2
               :endpoint-x
               :endpoint-y]])

Making structure explicit is the best approach. If I had to write the components again, that's how I probably would go about it. Please keep me updated about your progress, no matter what you decide to do.
Your contributions are always welcome, and very much appreciated.

FYI: the latest system snapshot has the "swapped handlers, big win fix".

I think we can close this, the solution is workable, and I won't have time any time soon to propose something better :) Thanks @danielsz!

Of course. Thank you, @arne. Your input was very valuable indeed. 👍

Hey, I just came by to raise this exact issue, however, the implemented solution doesn't work for me. So before I just open up a new issue I'd post here.

A simple representation of my issue is that we have 2 endpoints. An api-endpoint and an app-endpoint let's pretend that neither of them contains middleware to avoid the above solution. In the app-endpoint we have a not-found route that handles our 404 page needs. It is quite scary to think what would happen in the app-endpoint were to be checked before the api-endpoint and return 404 errors for all of our API routes.

Currently, it appears that component/using works in a manner that allows ordering based on what I think is just an OrderedHashMap but it is obviously implicit and undependable. So ideally it would still be great to be able to explicitly order the endpoints in the final handler.

Now I know the above solution is simple and there are ways around it but in actuality, the application I am working on is quite large, has several endpoints and is currently suffering from two implicit ordering requirements, one of which has to do with middleware we don't control and isn't seen by system.

Absolutely, composing a Ring stack can be tricky, there is mutable state, then there is the opaqueness of the final, composed function.

I am in favor of exploring new approaches and redesign the web stack. The author of compojure is working on a new library , there is an attempt to produce a monadic Ring library, etc.

The duct components in system don't solve the fundamental problems, they offer one way to play around them. We need to distinguish whether the issue you are experiencing is a limitation of the duct components in system, or if it is a usage problem. For example, the Duct components in system were designed to handle the use case you are describing, with api-endpoints and app-endpoints. The not-found route is composed on top of both, and you will not encounter what you've been dreading. A hint of how this looks can be found here.

The issue that @plexus opened is because there was a hard limitation with the way things were wired up. That particularl issue was solved.

What I am interested in at this point is lifting hard limitations when found. I would need a reproducible scenario in a github repository so that I can look into it. Merely describing problems is not sufficient precisely because the brittleness of our foundation.

Thank you.

I finally went ahead an implemented the component that I had in mind, webstack!

https://gist.github.com/plexus/b19bc7a5148f7df85d7868c5806ea541

At its most basic it just takes a list of handler constructors

(defn app-routes [dependencies]
  (fn [req] ,,,))

(defn api-routes [dependencies]
  (fn [req] ,,,))

(new-web-stack [{:make-handler app-routes}
                {:make-handler api-routes}])

Dependencies can be injected with :using, this will call component/using to get hold of the components, and then will pass the specified ones to each routes constructor function.

(defn api-routes [{:keys [db}]
  (fn [req] ,,,))

(new-web-stack [{:make-handler app-routes}
                {:make-handler api-routes
                 :using [:db]}])

Each "tier" of the stack can have its own middleware

(new-web-stack [{:make-handler app-routes
                 :middleware [wrap-this
                             [wrap-that with-extra-arg]]}])

Middleware can also depend on other components, if you use a vector with the ^:using metadata as argument, then a map with those dependencies will be used as argument.

(defn wrap-this [handler {:keys [db]}] ,,)

(new-web-stack [{:make-handler app-routes
                 :middleware [[wrap-this ^:using [:db]]]}])

It's hot off the press so might still be issues, but I'm using it on Lambda Island already and it seems to work so far :)

Feedback welcome. I'm not sure yet if this should be in system, or a separate library. I'm leaning towards a separate library.

I have the same issue... 2 different set of routes (api and webpage)

tried your solution @plexus without success (I am using bidi).

(defn api-handler
  [request]
  (r/response "Yeahhhhh"))

(def api-routes (br/make-handler ["/" {"api" api-handler}]))

(defn app-system [config]
  (component/system-map

    :web-stack (new-web-stack [{:make-handler api-routes
                                :middleware   [wrap-restful-format]}])
    :http (-> (new-web-server (:http-port config))
              (component/using {:handler :web-stack}))))

Anyway we should be able to manage different middleware for different set of routes. Any ideas on how to implement it @danielsz ?

As a matter of fact, this is the raison-d'être of these components, to separate routes into distinct silos with their own middleware.
To recap, you can have per-endpoint middleware. So define your API middleware and your site middleware, and attribute them to the relevant endpoints. Give the keys names like :api-middleware and :site-middleware, for example.

The following example should be useful because its use case is similar.

Remember that the key :middleware is special, it is preset to be merged with all endpoints, so depending on the specifics you either define it to contain a minimal set of middleware shared by all endpoints, or you leave it out.

If this explanation is not enough, please prepare a minimal scenario so that I can take a closer look at what you're doing.

Update, I resolved it with @plexus solution, I was using it badly => working example =>

(defn api-handler
  [request]
  (r/response {:current-user {:first-name "Test"}}))


(defn api-routes [_]
  (br/make-handler ["/" {"api" api-handler}]))

(defn app-system [config]
  (component/system-map

    :web-stack (new-web-stack [{:make-handler api-routes
                                :middleware   [wrap-transit-json-params
                                               wrap-transit-json-response]}
                               {:make-handler webapp-routes
                                :middleware   (:middleware config)}])
    :http (-> (new-web-server (:http-port config))
              (component/using {:handler :web-stack}))))

@danielsz Okay great!
I'll try to do it the right way .
I did not know about the :middleware special keyword...

In the hope to alleviate the pain points mentioned in this thread, I have prepared two example repos where a web app is composed with separate API and site middleware.

https://github.com/danielsz/system-api-example
https://github.com/danielsz/system-api-immutant-example

The first example works by applying the relevant middleware to the corresponding routes.
The second example is interesting as it exploits the context path feature that Immutant brings to the table, allowing you to actually run multiple handlers separately.

Please let me know if you encounter any difficulty.

@plexus I know I'm a little late with this, but I finally had time to look at your solution. it looks neat! How is holding up the test of time? Have you made it into a library? If there is anything I can do, please let me know. Maybe include a component in system as another option?

I finally came around a write-up of Raamwerk's principles.

Comments welcome!