jjcollinge/traefik-on-service-fabric

Handle stateful partition keys in Traefik middleware

jjcollinge opened this issue · 15 comments

@jjcollinge What kind of approach are you planning for this? Looking at Traefik, it seems fairly simple to implement another middleware (though not as a plugin). I am curious how generic you were planning to make it, and also when you are planning to do it. We have a stateful service in our setup, and I don't want to fork Traefik if you are just about to add something for this.

@petertiedemann we have not confirmed our approach to this yet, we plan to discuss some proposals with the SF team and the Traefik folks to get their views next week. We then have a spike in 2 weeks to try and tackle a bunch of the items on our backlog, including this.

@petertiedemann I'd be interested in your thoughts on this. Did the APIM model of passing the partition key directly from the client rather than resolving in the router work for you?

@jjcollinge With APIM we mapped the client-provided key (a string tenant id) into a number (for the UniformInt64Partition) as part of an APIM policy and set the backend based on that (as per https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-api-management-overview#send-traffic-to-a-stateful-service ).

So we had a chance to look at this in more detail and have a rough proposal we're looking to implement. Would be great to get your thoughts before we get stuck in @petertiedemann @AviMualem

Proposal

The work is split into two sections.

  1. Add an additional matching rule to Traefik which enables a hashed range match, for example HashedRange: type:header value:x-partitionheader match:0-100 range:0-300 . It would take an input and use a hashing algorithm to convert it to an int evenly distributed across a range. In this case the full range would be 0-300, and the rule would match if the hashed value of the header x-partitionheader fell in the range 0-100.

  2. This matching rule would then be used by the Traefik Service Fabric provider. It would create a frontend for each partition and assign the HashedRange matcher to each frontend with the correct range set for its partition.

The end result for a stateful service with 3 partitions and a KeyMin=0 and KeyMax=300 would be:

  • Frontend for Partition 1 with matcher HashedRange: type:header value:x-partitionheader match:0-100 range:0-300
  • Frontend for Partition 2 with matcher HashedRange: type:header value:x-partitionheader match:100-200 range:0-300
  • Frontend for Partition 3 with matcher HashedRange: type:header value:x-partitionheader match:200-300 range:0-300
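To make the proposal concrete, here is a minimal Go sketch of how such a matcher could behave. It is not the actual implementation: the `HashedRange` struct, the `PartitionRanges` helper, the choice of FNV-1a as the hash, and the treatment of each match range as half-open (so that 100 belongs to exactly one partition) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// HashedRange is a hypothetical matcher: it matches when the hashed key
// falls in [MatchLow, MatchHigh) within the full range [RangeLow, RangeHigh).
type HashedRange struct {
	MatchLow, MatchHigh uint64
	RangeLow, RangeHigh uint64
}

// Bucket maps a key to an integer in [RangeLow, RangeHigh).
// FNV-1a is an assumption; the proposal does not name a hash function.
func (h HashedRange) Bucket(key string) uint64 {
	hasher := fnv.New64a()
	hasher.Write([]byte(key))
	return h.RangeLow + hasher.Sum64()%(h.RangeHigh-h.RangeLow)
}

// Matches reports whether the key hashes into this matcher's sub-range.
func (h HashedRange) Matches(key string) bool {
	b := h.Bucket(key)
	return b >= h.MatchLow && b < h.MatchHigh
}

// PartitionRanges splits [low, high) into n contiguous matchers, one per
// partition. The last matcher absorbs any remainder from integer division.
func PartitionRanges(low, high uint64, n int) []HashedRange {
	width := (high - low) / uint64(n)
	out := make([]HashedRange, 0, n)
	for i := 0; i < n; i++ {
		start := low + uint64(i)*width
		end := start + width
		if i == n-1 {
			end = high
		}
		out = append(out, HashedRange{start, end, low, high})
	}
	return out
}

func main() {
	// Three partitions over the 0-300 key range, as in the example above.
	matchers := PartitionRanges(0, 300, 3)
	key := "tenant-42" // stands in for the x-partitionheader value
	for i, m := range matchers {
		fmt.Printf("partition %d [%d-%d): match=%v\n",
			i+1, m.MatchLow, m.MatchHigh, m.Matches(key))
	}
}
```

Because the sub-ranges are contiguous and non-overlapping, every key matches exactly one frontend, which is the property the provider would rely on when routing to a partition.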

We'd then look to support querystring, header or path.

@lawrencegripper That sounds reasonable enough, but I think it's important to offer a flexible way of determining the key based on the request. In our particular case it is the first part of a URI passed as a query parameter (we stopped having it as a direct part of the URL due to a Swagger incompatibility), but you can easily imagine it being passed as part of the body as well, I suppose.

A regex pattern that could pick the key from the URL, a header or the query string would probably be a good first step, and would cover our use case.

Thanks for taking a look @petertiedemann. So we would have a label which allows a regex to select the value from the URL.

This would give us the following supported types for the hashedRange label:

  • header: Select a header to hash
  • url-regex: Match a section of the url to hash

I can think of lots more, but I'm tempted to say these two cover most use cases without overcomplicating it.

Example of url-regex type

URL: http://example.com/bob/?customerid=jamesnesbit
HashedRange: type:url-regex value:[=].* match:0-100 range:0-300

This would hash jamesnesbit and match if the result was in the range 0-100
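A rough Go sketch of the url-regex idea follows. The function names `extractKey` and `hashInto`, and the use of FNV-1a, are hypothetical. One detail worth noting: taken literally, the pattern `[=].*` matches `=jamesnesbit` including the equals sign, so this sketch assumes a capture group (`=(.*)`) would be used to select only the value; whether the real label would support capture groups is not settled in this thread.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
)

// extractKey applies the pattern to the URL and returns the first capture
// group if the pattern has one, otherwise the whole match.
func extractKey(pattern, rawURL string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	m := re.FindStringSubmatch(rawURL)
	if m == nil {
		return "", fmt.Errorf("pattern %q did not match %q", pattern, rawURL)
	}
	if len(m) > 1 {
		return m[1], nil
	}
	return m[0], nil
}

// hashInto maps a key to an integer in [low, high) using FNV-1a
// (an assumed hash; the proposal does not specify one).
func hashInto(key string, low, high uint64) uint64 {
	hasher := fnv.New64a()
	hasher.Write([]byte(key))
	return low + hasher.Sum64()%(high-low)
}

func main() {
	url := "http://example.com/bob/?customerid=jamesnesbit"

	key, err := extractKey(`=(.*)`, url)
	if err != nil {
		fmt.Println("no key:", err)
		return
	}
	bucket := hashInto(key, 0, 300)
	// The frontend with match:0-100 would be chosen when bucket < 100.
	fmt.Printf("key=%s bucket=%d matches 0-100: %v\n", key, bucket, bucket < 100)
}
```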

@lawrencegripper Sounds good to me. Sorry to ask, but what is your timeline for this feature?

@petertiedemann we are working on a bunch of things this week, including this - if all goes to plan we should have an initial implementation in a custom Traefik build by next week. There are a few things that might get in the way so we can't guarantee that timeline.

@jjcollinge Thanks, that would be pretty snappy. If we are talking a few weeks, then it should be fine for us, we can then start by integrating Traefik in our deployment and making sure all the stateless stuff is working.

@deanmccrae FYI

We need to get agreement from the Containous team that they are happy to accept this additional matcher into the Traefik code and line up releases.

Now we've got an idea of what this looks like I'll create an issue to discuss with Containous

I've made a start on this work here. It needs some tidying but looks like it should work. Next step is to do some more testing and then update the service fabric provider code to generate the matcher labels.

Hi @lawrencegripper, has there been any update on this task or is it still blocked?

Hi, This was proposed to Traefik here - best to have a read through of the issue but the TLDR is that the work proposal wasn't accepted at this time.

Hi @lawrencegripper, thanks for the quick response. Just finished reading the full thread there. It's a shame, but hopefully once they have their plugin model ready it might kick off again. I've subscribed to notifications on it just in case things do start to move again.