Issues when configuring the same Caddy server in multiple tasks

Question

Issues when configuring the same Caddy server in multiple tasks

Closed this issue 2 years ago · 5 comments

I'm running into some issues when trying to add configuration in multiple tasks under different paths on a single Caddy server.

It seems that when starting with a clean Caddy server with no configuration, I always need to apply the whole config when configuring Caddy via the api.
So I need load the new configuration either by using the caddy_load module and provide the full config, or use caddy_config again with the full config and a blank string for the path parameter.

caddy_config fails if the configuration path does not already exist, both with and without the append option with a error like this: Error durning processing of the request (Internal Server Error): invalid traversal path at: config/apps/http

As a example scenario let's say I have two Ansible roles

One role is for setting up some product, and in that role there is a task that configures Caddy as a reverse proxy for said product.
The other role is for managing internal certificates, which deploys a in hous managed root ca and renews certificates, etc. In that role there is a task that configures the ssl certificates in Caddy.

For role 1. I would like to apply this configuration

apps:
  http:
    servers:
      vault:
        listen:
          - ":443"
        routes:
          - handle:
              - handler: "reverse_proxy"
                upstreams:
                  - dial: "127.0.0.1:8080"

and for the second role lets say I want to apply this configuration:

apps:
  tls:
    automation:
      policies:
        - issuers:
            - module: internal

Now if I load the whole config it works just fine, with either:

- maxhoesel.caddy.caddy_load:
    config:
      apps:
        tls:
          automation:
            policies:
              - issuers:
                  - module: internal
        http:
          servers:
            example:
              listen:
                - ":443"
              routes:
                - handle:
                    - handler: "reverse_proxy"
                      upstreams:
                        - dial: "127.0.0.1:8080"

or:

- maxhoesel.caddy.caddy_config:
    path: ""
    config:
      apps:
        tls:
          automation:
            policies:
              - issuers:
                  - module: internal
        http:
          servers:
            example:
              listen:
                - ":443"
              routes:
                - handle:
                    - handler: "reverse_proxy"
                      upstreams:
                        - dial: "127.0.0.1:8080"

But if I try to do it in different tasks with one config at a time, it will overwrite the previous config

- maxhoesel.caddy.caddy_load:
    config:
      apps:
        tls:
          automation:
            policies:
              - issuers:
                  - module: internal

- maxhoesel.caddy.caddy_load:
    config:
      apps:
        http:
          servers:
            example:
              listen:
                - ":443"
              routes:
                - handle:
                    - handler: "reverse_proxy"
                      upstreams:
                        - dial: "127.0.0.1:8080"

- maxhoesel.caddy.caddy_config:
    path: ""
    config:
      apps:
        tls:
          automation:
            policies:
              - issuers:
                  - module: internal

- maxhoesel.caddy.caddy_config:
    path: ""
    config:
      apps:
        http:
          servers:
            example:
              listen:
                - ":443"
              routes:
                - handle:
                    - handler: "reverse_proxy"
                      upstreams:
                        - dial: "127.0.0.1:8080"

And as I said above if I try to do it by defining the path I get a invalid traversal path error.

- maxhoesel.caddy.caddy_config:
    path: "apps/tls/automation"
    config:
      policies:
        - issuers:
          - module: internal

- maxhoesel.caddy.caddy_config:
    path: apps/http/servers/example
    config:
      listen:
        - ":443"
      routes:
        - handle:
            - handler: "reverse_proxy"
              upstreams:
                - dial: "127.0.0.1:8080"

To achieve what I want I would need to look up the currently running config

I realize this issue is not exactly the fault of the collection but a problem with how the Caddy api behaves caddyserver/caddy#3501
But do you think perhaps it's something that the Ansible collection could work around? By doing a lookup of the currently running config and merging it with the new config that's going to get applied or something like that?

I can obviously do the same workaround with some extra tasks in the the Ansible play/role to lookup the current config and doing a merge of the config and then apply it if needed, etc. but I think that is something that should be avoided in Ansible roles.

Answer 1 · 2022-05-04T17:54:33.000Z

Hm, this one is tricky. I wrote the modules in this collection as thin wrappers around the actual API, so it's not surprising that you are running into this limitation. From what I can tell, the above behavior is expected, but I can see the use case for something "smarter" here.

But if I try to do it in different tasks with one config at a time, it will overwrite the previous config
- maxhoesel.caddy.caddy_load:
    config:
      ....
- maxhoesel.caddy.caddy_load:
    config:
      ....      

caddy_load is designed to replace the running configuration with a new one, so this looks reasonable to me.

- maxhoesel.caddy.caddy_config:
    path: ""
    config:
      ....
- maxhoesel.caddy.caddy_config:
    path: ""
    config:
      ....

The issue with this approach is that caddy doesn't do any recursive merging - a push to a given API endpoint will just overwrite everything below it. So this is basically equivalent to a caddy_load, which causes the changes to be overwritten

- maxhoesel.caddy.caddy_config:
     path: "apps/tls/automation"
     config: ...
-  maxhoesel.caddy.caddy_config:
     path: apps/http/servers/example
     config:

... and this results in a server error because the path doesn't exist yet. That said, this seems like the most "correct" approach to me - the first module call is supposed to only modify the automation config, while the second one is supposed to add a server.

I can think of a few approaches to solve this issue:

The first option would be to add a merge_config argument to the caddy_config module, as you suggested. Ansible already has a combine filter for this purpose so we could model the module arguments after that. On the other hand, merging JSON correctly isn't exactly trivial and it would add a lot of baggage to the config module.

Alternatively, we could make it easier to use Ansible's combine filter with an existing config. This collection already has an config_info module to retrieve the current configuration, but it's a regular module, so any config merge currently involves two module calls. If we converted that module into a lookup module, then we could easily merge and push the config in a single module call like so:

- caddy_config:
    path: ""
    config: "{{ lookup('caddy_config_info', url=https://caddyapi, path='') | combine(our_new_config, recursive=yes, list_merge='replace') }}"

The workaround in the linked issue also seems promising. Before pushing a config, the caddy_config module could check whether that path exists and attempt to create it if it doesn't. This isn't exactly trivial either (we'd need to handle array indices and all that), but it seems doable. That behavior could also be toggled with a single argument such as create_path. Then, your last example should work just fine.

What are your thoughts on this? Personally, I think I prefer the last option because it seems like the most "clean" one - it avoids having to merge any kind of configuration and doesn't mess around with the API semantics too much.

Answer 2 · 2022-05-05T10:50:35.000Z

Hm, this one is tricky. I wrote the modules in this collection as thin wrappers around the actual API, so it's not surprising that you are running into this limitation. From what I can tell, the above behavior is expected, but I can see the use case for something "smarter" here.

caddy_load is designed to replace the running configuration with a new one, so this looks reasonable to me.

If you manage Caddy fully with the ansible collection, that is if you are not modifying some existing configuration, then there is actually not much difference between caddy_load and caddy_config, both of them require the full config.
And currently you can achieve everything that the collection does pretty easily with the ansible uri module, so making the collection smarter would definitely make it more valuable.
With that being said I do appreciate not going far away from the way Caddy works, I feel it's good to try to be as close to the api as possible... But as there are few areas that might be improved on in terms of user friendliness I think it should be ok to add a few "helpers" to make the collection a bit "smarter"

... and this results in a server error because the path doesn't exist yet. That said, this seems like the most "correct" approach to me - the first module call is supposed to only modify the automation config, while the second one is supposed to add a server.

Agreed, this was my first attempt as it felt like natural ansible behavior.

I can think of a few approaches to solve this issue:

The first option would be to add a merge_config argument to the caddy_config module, as you suggested. Ansible already has a combine filter for this purpose so we could model the module arguments after that. On the other hand, merging JSON correctly isn't exactly trivial and it would add a lot of baggage to the config module.

Perhaps it would be easier if the caddy_config module would use a different config adapter than json?
In fact, writing the caddy config with the caddy-yaml config adapter might feel more natural for configuring the ansible modules.
There is even the possibility of creating a custom config adapter for ansible.

Alternatively, we could make it easier to use Ansible's combine filter with an existing config. This collection already has an config_info module to retrieve the current configuration, but it's a regular module, so any config merge currently involves two module calls. If we converted that module into a lookup module, then we could easily merge and push the config in a single module call like so:
- caddy_config:
    path: ""
    config: "{{ lookup('caddy_config_info', url=https://caddyapi, path='') | combine(our_new_config, recursive=yes, list_merge='replace') }}"

Having a lookup module would be a nice addition, but I don't see it as a solution to this problem. Mainly because lookup modules run from localhost and it's not always the case that you are exposing the caddy api on external interfaces.

The workaround in the linked issue also seems promising. Before pushing a config, the caddy_config module could check whether that path exists and attempt to create it if it doesn't. This isn't exactly trivial either (we'd need to handle array indices and all that), but it seems doable. That behavior could also be toggled with a single argument such as create_path. Then, your last example should work just fine.

What are your thoughts on this? Personally, I think I prefer the last option because it seems like the most "clean" one - it avoids having to merge any kind of configuration and doesn't mess around with the API semantics too much.

Yes when a path does not exist then creating a blank path first and then posting the content in that path feels like the most natural solution to me.
With that being said, as stated in the linked issue, It's not guaranteed that it will work in all cases, like when trying to post to the third element of an array and the array does not already exist. But I think that's acceptable, your configuration has to make sense.

I see a few possibilities for how we could use the existing parameters for configuring this behavior, or perhaps it's best to add a new one.

Add a new create parameter that would create the path if it does not exist (should be set to true by default imho).
Leverage the existing append parameter, which would append the changes to a already existing config by creating the necessary paths needed, if set to false then the whole config would be overwritten.
Use the force parameter, when set to false then add to the config, but overwrite it when set to true... Or one could argue to have it the exact opposite where force would force the creation of the path 😕

There is one extra solution I can think of, that would be adding more specific modules. So instead of simply loading the Caddy config the modules would know more about what you are trying to do and how to structure it correctly.
This way the collection would enforce a structure that is known to work, which would take away some of the flexibility of accepting just any config, but in return it would be easier to manage for merging new/old config.

For example a module called caddy_tls which would know the config structure of the caddy tls app, a module called caddy_http for the caddy http app, etc. etc.

---
- maxhoesel.caddy.caddy_tls:
    issuer: internal


- maxhoesel.caddy.caddy_http:
    server: example
    listen:
      - 443
    reverse_proxy:
      - 127.0.0.1:8080

Answer 3 · 2022-05-08T15:56:31.000Z

Yes when a path does not exist then creating a blank path first and then posting the content in that path feels like the most natural solution to me.
With that being said, as stated in the linked issue, It's not guaranteed that it will work in all cases, like when trying to post to the third element of an array and the array does not already exist. But I think that's acceptable, your configuration has to make sense.

Agreed. I've given the create_path parameter approach a go, you can find the implementation here. Let me know if it works for you.

Having separate modules per "endpoint" is a neat idea, but it has its own issues (like, what do we consider an endpoint and how extensive should our module -> API coverage be?), so I think I prefer the "create-path-if-it-doesn't-exist" approach, assuming it works out of course.

Answer 4 · 2022-05-09T13:29:54.000Z

Great, I'll give it a shot!

Having separate modules per "endpoint" would for sure make life easier for the end-user, and the collection it would probably have more users. I can however see that it would require a lot more work in the backend. (Well like always: easier on the front-end, harder on the back-end and vice versa 😃 ).

But for the time being I don't think it's necessary to add a full api coverage, we can begin with the basics and add more features later as you can always use the caddy_config or caddy_load for the more advanced things.
That being said, isn't the create_path parameter you just added a requirement for the dedicated endpoint modules anyways?

Answer 5 · 2022-05-09T23:23:49.000Z

Maybe not a hard requirement, but I'll make things easier for sure. And I agree that having separate modules for the various endpoints would be nice - but to be frank, I just don't have the time or motivation to put in the work right now. Maybe some other time, hah. I'll open a separate issue about this for now.