containous/traefik-extra-service-fabric

Key Already Exist Error with Stateless Service with Multiple Named Partitions

briandenicola opened this issue · 4 comments

Service Fabric - 6.3.187.9494 on Windows
Traefik - v1.7.4

When a stateless service is deployed with a singleton partition, Traefik sees the frontend/backend and forwards traffic as expected.

If I deploy the same service using named partitions, the following errors are logged to stdout and the service does not appear in the dashboard:

time="2018-10-31T13:03:56Z" level=error msg="Provider connection error: Near line 49 (last key parsed 'backends'): Key 'backends.fabric:/app01/serv001' has already been defined.; retrying in 1.365652418s"

I am not sure if I should open this issue here or in Joni's GitHub repo.

There isn't currently any test coverage in the codebase for NamedPartitions 😭 so it's likely that this is a bug in the handling of these.

You may be able to work around the issue by creating a custom template based off the default template here.

As a first step it would be good to set debug = true in the TOML config and post the full log, as this may give more indication of what is going wrong.

Thanks Lawrence. I also tried int64range partitions but had the same issue.
I updated the toml file to:
debug = true
logLevel = "DEBUG"
traefikLogsFile = "log/traefik.log"

The traefik.log output is attached, but I didn't see much more beyond the "key already defined" error.

I created a very simple go app yesterday (https://github.com/bjd145/ServiceFabric/blob/master/API/main.go) that's based off servicefabric.go. (There's probably a better way to do this).

From that, I believe the call to configuration, err := p.getConfiguration() is successful. The problem appears to be in the mapping of the object to the configuration template, so I'll take a look at your suggestion of creating a custom template.

Thanks!
traefik.log

@bjd145 Good work, if you can share back what you find when using a custom template that would be great as we can roll the fix into the template for others to benefit.

I had a quick read through and I may have found the source of the bug. Backends each need a unique name in Traefik. The code in the template loops through each partition on the stateless service and creates a backend for each partition found; however, it uses the same name for each partition's backend. See the code here:

{{ $backendName := $service.Name }}

This means the template incorrectly assumes that stateless services are always singletons. If they aren't, you see the duplicate key error, as we attempt to add a backend multiple times with the same name.
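The effect can be reproduced with a minimal Go text/template sketch. The types and template strings below are simplified stand-ins, not the provider's real code, but they show why reusing the service name as the backend name produces duplicate TOML keys:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Simplified stand-ins for the provider's service and partition types.
type Partition struct{ ID string }
type Service struct {
	Name       string
	Partitions []Partition
}

// Buggy pattern: every partition emits a [backends.<name>] table using the
// service name alone, so the generated TOML repeats the same key.
const buggy = `{{range .Partitions}}[backends."{{$.Name}}"]
{{end}}`

// Fixed pattern: include the partition ID in the backend name, mirroring
// what getBackendName does for stateful services.
const fixed = `{{range .Partitions}}[backends."{{$.Name}}/{{.ID}}"]
{{end}}`

// render executes a template source against a service and returns the TOML.
func render(src string, svc Service) string {
	var b strings.Builder
	if err := template.Must(template.New("toml").Parse(src)).Execute(&b, svc); err != nil {
		panic(err)
	}
	return b.String()
}

func main() {
	svc := Service{
		Name:       "fabric:/app01/serv001",
		Partitions: []Partition{{ID: "p1"}, {ID: "p2"}},
	}
	fmt.Print(render(buggy, svc)) // same [backends] key twice -> TOML parse error
	fmt.Print(render(fixed, svc)) // one unique key per partition
}
```

With two named partitions, the buggy template emits the same `[backends."fabric:/app01/serv001"]` table twice, which is exactly the "Key ... has already been defined" error in the log above.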

If you look at the same code for stateful services in the template here:

{{ $backendName := getBackendName $service $partition }}

it uses the following function, which creates a backend name out of the service name and partition ID:

func getBackendName(service ServiceItemExtended, partition PartitionItemExtended) string {
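The body of that function isn't shown in the thread; a plausible sketch, with assumed, heavily simplified versions of the provider's types (the field names here are guesses, not the real struct layout), is just a concatenation of the two identifiers:

```go
package main

import "fmt"

// Assumed, simplified versions of the provider's types; the real
// ServiceItemExtended/PartitionItemExtended carry much more state.
type ServiceItemExtended struct{ Name string }
type PartitionItemExtended struct{ PartitionInformation struct{ ID string } }

// getBackendName sketches the idea described above: build a unique backend
// name from the service name plus the partition ID.
func getBackendName(service ServiceItemExtended, partition PartitionItemExtended) string {
	return service.Name + partition.PartitionInformation.ID
}

func main() {
	p := PartitionItemExtended{}
	p.PartitionInformation.ID = "bce46a8c"
	fmt.Println(getBackendName(ServiceItemExtended{Name: "fabric:/app01/serv001"}, p))
}
```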

If you edit the template so the stateless block uses the same function, line #24 would become

{{ $backendName := getBackendName $service $partition }}

You would then also need to update the code in the template which creates the frontend for each service. At the moment it assumes a singleton stateless service here:

backend = "{{ $service.Name }}"

This would need to create a frontend for each stateless service partition, like we do for stateful services here:

{{else if isStateful $service }}

Another option would be to add an additional check to identify whether the stateless service is a singleton, then branch the existing stateless handling with an additional if statement to cover both cases.
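A sketch of that branching option, again using simplified stand-in types and a hypothetical PartitionScheme field (the real template would key off whatever the provider actually exposes to distinguish singleton from named partitioning):

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Simplified stand-ins; PartitionScheme is an assumed field name.
type Partition struct{ ID string }
type Service struct {
	Name            string
	PartitionScheme string // "Singleton" or "Named"
	Partitions      []Partition
}

// Hypothetical frontend block: keep the current single-backend line for
// singletons, and emit one frontend/backend pair per partition otherwise.
const frontendTmpl = `{{if eq .PartitionScheme "Singleton"}}backend = "{{.Name}}"
{{else}}{{range .Partitions}}[frontends."{{$.Name}}/{{.ID}}"]
  backend = "{{$.Name}}/{{.ID}}"
{{end}}{{end}}`

// renderFrontend executes the sketch template against a service.
func renderFrontend(svc Service) string {
	var b strings.Builder
	if err := template.Must(template.New("frontend").Parse(frontendTmpl)).Execute(&b, svc); err != nil {
		panic(err)
	}
	return b.String()
}

func main() {
	single := Service{Name: "fabric:/app01/serv001", PartitionScheme: "Singleton"}
	named := Service{
		Name:            "fabric:/app01/serv001",
		PartitionScheme: "Named",
		Partitions:      []Partition{{ID: "p1"}, {ID: "p2"}},
	}
	fmt.Print(renderFrontend(single))
	fmt.Print(renderFrontend(named))
}
```

The singleton branch preserves today's behaviour, while the named branch gives each partition its own frontend pointing at the matching per-partition backend name.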

@jjcollinge can you have a quick sanity check of this for me?

Thanks @lawrencegripper. I haven't gotten to the template yet. Other client work got in the way but I should have time tomorrow to try this out.