cloudflare/pingora

Question: how to dynamically route a request to an instance of HttpProxy<SV>?

Hi!

I'm evaluating the possibility of using Pingora to replace a solution based on Nginx and OpenResty. For that, I'm trying to route the HTTP request to one of many possible instances of HttpProxy<SV> based on the request's HTTP headers.

My reason for having multiple instances of HttpProxy<SV> is that I would like one connection pool per load balancer/upstream configuration, and it seems to me that this is not currently possible with a single HttpProxy<SV>. I expect to be able to limit the number of TCP connections to a peer through the TransportConnector, or the ConnectionPool under it, at some point, perhaps by patching them.
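To make the connection-limit idea concrete, here is a minimal sketch of a per-upstream limiter written with the standard library only. `ConnLimiter` and `ConnPermit` are hypothetical names for illustration; they are not Pingora types, and how such a limiter would be wired into TransportConnector or ConnectionPool is left open.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical per-upstream limiter (not a Pingora type).
pub struct ConnLimiter {
    max: usize,
    in_use: AtomicUsize,
}

/// RAII permit: dropping it frees one connection slot.
pub struct ConnPermit {
    limiter: Arc<ConnLimiter>,
}

impl ConnLimiter {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { max, in_use: AtomicUsize::new(0) })
    }

    /// Returns a permit if fewer than `max` connections are in use.
    pub fn try_acquire(self: &Arc<Self>) -> Option<ConnPermit> {
        let mut current = self.in_use.load(Ordering::Acquire);
        loop {
            if current >= self.max {
                return None;
            }
            // Retry the increment if another thread raced us.
            match self.in_use.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(ConnPermit { limiter: Arc::clone(self) }),
                Err(observed) => current = observed,
            }
        }
    }

    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}

impl Drop for ConnPermit {
    fn drop(&mut self) {
        self.limiter.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}
```

One limiter per upstream would keep the limits independent, which is the same isolation I'm after with the pools.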

My other reason for having multiple instances of HttpProxy<SV> is that each one is "statically linked" to its ProxyHttp, so one service uses a single ProxyHttp implementation. Meanwhile, my upstreams might require distinct behaviour, which would be better built as isolated ProxyHttp implementations.

I also intend for these HttpProxy<SV> instances to be created on demand and eventually destroyed when no longer required.
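The lifecycle I have in mind could be sketched roughly like this, using only the standard library. `Registry` is a hypothetical name, and `T` stands in for whatever per-upstream instance is stored; nothing here is Pingora API. An instance is created on first use and evicted once the registry holds the only remaining reference.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of per-upstream instances, keyed by name.
pub struct Registry<T> {
    entries: Mutex<HashMap<String, Arc<T>>>,
}

impl<T> Registry<T> {
    pub fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// Returns the existing instance for `key`, or builds one with `make`.
    pub fn get_or_create(&self, key: &str, make: impl FnOnce() -> T) -> Arc<T> {
        let mut entries = self.entries.lock().unwrap();
        let entry = entries
            .entry(key.to_string())
            .or_insert_with(|| Arc::new(make()));
        Arc::clone(entry)
    }

    /// Drops every instance that no request is holding any more.
    /// Returns how many instances were removed.
    pub fn evict_unused(&self) -> usize {
        let mut entries = self.entries.lock().unwrap();
        let before = entries.len();
        // strong_count == 1 means only the registry still references it.
        entries.retain(|_, instance| Arc::strong_count(instance) > 1);
        before - entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

A real version would probably also want an idle timeout so that an instance is not torn down and rebuilt between two closely spaced requests.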

Initially I thought about implementing the router as a ProxyHttp which would call the specialized ProxyHttp implementation for each upstream. Then I realized that the connection pool is managed at the HttpProxy<SV> level, which implies that all my ProxyHttp implementations and their instances would share the same connection pool. Hence the idea of implementing the router as an HttpServerApp and having it route each request to one of multiple instances of HttpProxy<SV>, each with its own pool. For that I had to duplicate some code from HttpProxy<SV> and change the visibility of some methods in the pingora-proxy crate (see link below).
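Stripped of all Pingora specifics, the shape of that router is roughly the following. This is only a sketch of the idea: `UpstreamProxy` is a hypothetical stand-in for one HttpProxy<SV> instance with its own pool, `Router` stands in for the HttpServerApp, and the "connections" are plain strings.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for one proxy instance with its own pool.
pub struct UpstreamProxy {
    pub name: String,
    // Idle "connections", represented by peer addresses only.
    idle: Mutex<Vec<String>>,
}

impl UpstreamProxy {
    pub fn new(name: &str) -> Arc<Self> {
        Arc::new(Self { name: name.to_string(), idle: Mutex::new(Vec::new()) })
    }

    /// Returns a connection to this instance's private pool.
    pub fn release(&self, conn: String) {
        self.idle.lock().unwrap().push(conn);
    }

    pub fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

/// Hypothetical router: picks a proxy instance from one request header.
pub struct Router {
    header: String,
    routes: HashMap<String, Arc<UpstreamProxy>>,
}

impl Router {
    pub fn new(header: &str) -> Self {
        Self { header: header.to_string(), routes: HashMap::new() }
    }

    pub fn add(&mut self, value: &str, proxy: Arc<UpstreamProxy>) {
        self.routes.insert(value.to_string(), proxy);
    }

    /// Header names are compared case-insensitively; values exactly.
    pub fn route(&self, headers: &[(&str, &str)]) -> Option<Arc<UpstreamProxy>> {
        headers
            .iter()
            .find(|(name, _)| name.eq_ignore_ascii_case(&self.header))
            .and_then(|(_, value)| self.routes.get(*value))
            .cloned()
    }
}
```

Because each `UpstreamProxy` owns its pool, a connection released through one instance is never visible to another, which is the property a single shared HttpProxy<SV> does not give me.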

It crossed my mind that another possible implementation could route the request internally over Unix Domain Sockets. A "front end" ProxyHttp could route the request to one of many Unix Domain Sockets connected to "back end" instances of the specialized ProxyHttp. But it seems to me that this approach would add unnecessary processing.

To illustrate what I'm trying to do, I created a PR that adds a router_app example to the pingora-proxy crate with the required visibility changes.

Finally, my question: is that a reasonable approach? How would you do it instead?

Example: lmalheiro#1

My own implementation looks like this. It uses an LRU and a RegexCache approach, plus an identification ID as the cache change-point, so it doesn't have to search every time a request comes in.

gateway core
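For readers who don't want to open the link, the LRU part of the idea can be sketched in a few lines of standard-library Rust. This is not the linked code: `RouteCache` is a hypothetical name, the regex matching is left out (it would need an external crate), and the cache simply maps an already matched path to an upstream name.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical LRU cache from request path to upstream name.
pub struct RouteCache {
    capacity: usize,
    map: HashMap<String, String>,
    // Front = least recently used, back = most recently used.
    order: VecDeque<String>,
}

impl RouteCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Moves `path` to the most-recently-used position.
    fn touch(&mut self, path: &str) {
        if let Some(pos) = self.order.iter().position(|p| p == path) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }

    pub fn get(&mut self, path: &str) -> Option<String> {
        let hit = self.map.get(path).cloned();
        if hit.is_some() {
            self.touch(path);
        }
        hit
    }

    pub fn put(&mut self, path: &str, upstream: &str) {
        // Overwriting an existing entry only refreshes its position.
        if self.map.insert(path.to_string(), upstream.to_string()).is_some() {
            self.touch(path);
            return;
        }
        self.order.push_back(path.to_string());
        if self.order.len() > self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
    }
}
```

The linear scan in `touch` keeps the sketch short; a production cache would use a linked structure or an existing LRU crate.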

Also, I believe Pingora actually gives the programmer the freedom to implement whatever routing style they want, rather than being locked into a specific Lua style like with nginx.

What do you think? @lmalheiro

@zonblade Thanks for sharing your code and sorry for the late reply. It is a nice implementation that allows for flexible routing, but I'm more concerned with the single connection pool under HttpProxy<SV>. I want to serve multiple independent upstreams, each with its own set of servers and connection pool. If I do the routing in the ProxyHttp implementation, they will all share the same connection pool; that's why I believe I should do the routing at the level of the HttpServerApp implementation. My concern is that it requires changes to the visibility of some methods.

@lmalheiro In my experience, and as you can see, there is an atomic_id() implementation there to track the session, and as far as I can see a new connection always generates a new ID. Or maybe you can implement a per-peer LRU, or spawn more gateways like I do here.

--- edit ---
For the proxy (not the gateway), yes: there can be dangling connections which can override other connections, but only if you try to modify the payload/path/anything before the duplex. So I moved the route, LRU, etc. inside the duplex, as in my Proxy Core code.

Also, my Pingora implementation now serves about 4 different backend upstreams under 1 domain, routed by path: /up1, /up2, etc.
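The path-based part boils down to something like the sketch below. It is a simplified illustration, not the actual gateway code: `PathRouter` and the backend names are made up, and it uses longest-prefix matching on whole path segments instead of regexes.

```rust
/// Hypothetical path router: (prefix, upstream name) pairs.
pub struct PathRouter {
    // Sorted so that the longest prefix is tried first.
    routes: Vec<(String, String)>,
}

impl PathRouter {
    pub fn new(mut routes: Vec<(String, String)>) -> Self {
        routes.sort_by(|a, b| b.0.len().cmp(&a.0.len()));
        Self { routes }
    }

    /// Returns the upstream whose prefix is the longest match for `path`.
    pub fn resolve(&self, path: &str) -> Option<&str> {
        self.routes
            .iter()
            .find(|(prefix, _)| matches_prefix(path, prefix))
            .map(|(_, upstream)| upstream.as_str())
    }
}

/// True when `prefix` matches `path` on a segment boundary,
/// so "/up1" matches "/up1" and "/up1/x" but not "/up10".
fn matches_prefix(path: &str, prefix: &str) -> bool {
    match path.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with('/') || prefix.ends_with('/'),
        None => false,
    }
}
```

Each resolved name would then select the backend (and, if you split them, the pool) that handles the request.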

So true for us: we had to rewrite a ton of proxy logic by moving the ConnectionPool from the proxy into our own upstream module, which works more like NGINX.