creativeprojects/resticprofile

[proposal] Improved configuration file format

creativeprojects opened this issue · 26 comments

Proposal for an updated configuration file format

Introduction

The current file format was decided when resticprofile only supported the toml format. Nesting pieces of configuration in blocks is awkward in toml because you have to specify the whole path in each block header:

[profile]

[profile.backup]
...

Since then, I believe the yaml format is preferred over toml.

My proposal is to introduce a version 2 of the configuration file format; the current file format becomes version 1.

Both formats will continue to be valid (like docker-compose):

  • if no version is specified, version 1 is used (the current format)
  • if version 2 is specified, the new format is expected
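A minimal sketch of the version rule above, written in Python purely for illustration. The function name is hypothetical, and accepting the string "2" as well as the number 2 is an assumption (both spellings appear in this thread):

```python
def detect_config_version(config: dict) -> int:
    """Return the configuration format version, defaulting to 1 when absent."""
    raw = config.get("version", 1)   # no version key means the current format
    version = int(raw)               # accepts 2 as well as "2" (assumption)
    if version not in (1, 2):
        raise ValueError(f"unsupported configuration version: {raw!r}")
    return version


print(detect_config_version({"global": {"priority": "low"}}))
print(detect_config_version({"version": 2, "profiles": {}}))
```

A loader could branch on the returned value to pick the version 1 or version 2 parser.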

New format availability

The new format version 2 will be available for:

  • TOML
  • YAML
  • JSON

It won't be available for HCL. This may not be definitive, but it's not widely used and it's becoming more and more difficult to support HCL.

HCL can still be used as is, version = 1

New format specifications

I will show the specification using yaml for the examples, because it's probably the most readable format.

version

---
version: 2

global

The global section does not change. We'll keep all the global configuration in there.

---
global:
    default-command: snapshots
    initialize: false
    priority: low

profiles

All your profiles will be nested under a profiles section. Please note the schedules are no longer described inside the profile, but in a separate section schedules (see following sections).

profiles:
    default:
        env:
            tmp: /tmp
        password-file: key
        repository: /backup

    documents:
        inherit: default
        backup:
            source: ~/Documents
        snapshots:
            tag:
                - documents

groups

The list of profiles will be nested under a profiles section, so we can add more configuration to groups later.

groups:
    full: # name of your group
        profiles:
            - root
            - documents
            - mysql

This format leaves more space for improvements later (like a repos section maybe?)

Does that make sense to you?

Please reply in this issue, thanks :)

Edit:

  • changed numbering as such: 1 is current and 2 is the new format
  • HCL not supported with version 2
  • schedules section removed

I need to think more about it, but I considered a similar approach for a targets specification (#69): define them in one location and reference them elsewhere. So yes, in general I think a cleaner format makes sense.

As a matter of fact, this is the reason why I need to modernise the configuration, to allow for a targets section 😛

(I think you should be able to update the Wiki now?)

Yes, it works. I'll try to contribute asap.

No worries, it's not urgent 🙇🏻

I've decided to drop support of HCL in version 2. Please shout if that's an issue for you 😄

HCL is nice to have but it doesn't come for free as it requires extra workarounds. Support for toml, yaml is enough from my perspective (and json for machine to machine communication if needed)

On top of these new features, I just got a new one in mind (#82)

profiles:
    default:
        password-file: key
        repository: rclone:target:/backup
        rclone:
            program: /usr/local/bin/rclone
            stdio: true
            bwlimit: 1M
            b2-hard-delete: true
            verbose: true

    documents:
        inherit: default
        backup:
            source: ~/Documents

which will generate a command like:

restic backup --option rclone.program=/usr/local/bin/rclone --option rclone.args="serve restic --stdio --bwlimit 1M --b2-hard-delete --verbose" --password-file key --repo rclone:target:/backup ~/Documents
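As a rough illustration of the mapping from the rclone section to the command above, here is a sketch in Python. The function name and the rules (program becomes rclone.program, true booleans become bare flags, other values become flag/value pairs, false values are skipped) are assumptions inferred from the example, not an agreed design:

```python
def rclone_flags(rclone: dict) -> list:
    """Turn a profile's rclone section into restic --option arguments (illustrative)."""
    flags = []
    args = ["serve", "restic"]           # fixed prefix taken from the example command
    for key, value in rclone.items():
        if key == "program":
            flags += ["--option", f"rclone.program={value}"]
        elif value is True:
            args.append(f"--{key}")      # boolean true: bare flag
        elif value is not False and value is not None:
            args += [f"--{key}", str(value)]
    flags += ["--option", "rclone.args=" + " ".join(args)]
    return flags


section = {
    "program": "/usr/local/bin/rclone",
    "stdio": True,
    "bwlimit": "1M",
    "b2-hard-delete": True,
    "verbose": True,
}
print(rclone_flags(section))
```

The result is an argument list, so the quotes around rclone.args in the shell command are not needed here.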

In general a very good idea. Only one concern: how would you control what is considered a restic option? Is there a global entry like option-arguments: ["rclone", "..."]? Or is option always related to the repo URL?

Using go templates is a bit cumbersome, especially for non-developers who have never used a similar templating language before...

I was thinking about something that we could introduce in the config version 2: templates. Basically they would work like building blocks that you can assemble together.

templates:
  source:
    backup:
      source: /

  target:
    repository: /backup

profiles:
  backup_target:
    templates:
      - target
      - source

and that will assemble both templates to generate the profile. We could also introduce variables (although we have to decide on a syntax for the variables)

templates:
  healthchecks:
    send-before:
        method: HEAD
        url: {URL}/start

    send-after:
        method: HEAD
        url: {URL}

    send-after-fail:
        method: POST
        url: {URL}/fail
        body: "${ERROR}\n\n${ERROR_STDERR}"
        headers:
          - name: Content-Type
            value: "text/plain; charset=UTF-8"

profiles:
  with_monitoring:
    backup: /
    templates:
      - name: healthchecks
        vars:
          URL: https://hc-ping/uuid

# or alternatively

  with_monitoring:
    backup: /
    templates:
      healthchecks:
        URL:  https://hc-ping/uuid
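To make the idea concrete, here is a small Python sketch of assembling a profile from templates with variables. Everything in it is an assumption for illustration: the function names, the {NAME} placeholder syntax (the syntax is still undecided above), and the rule that the profile's own keys win over template values:

```python
def substitute(node, variables: dict):
    """Replace {NAME} placeholders in every string of a nested structure."""
    if isinstance(node, str):
        for name, value in variables.items():
            node = node.replace("{" + name + "}", value)
        return node
    if isinstance(node, dict):
        return {k: substitute(v, variables) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute(v, variables) for v in node]
    return node


def deep_merge(base: dict, overlay: dict) -> dict:
    """Merge overlay into a copy of base; nested maps merge, other values replace."""
    result = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result


def build_profile(profile: dict, templates: dict) -> dict:
    """Apply the referenced templates in order, then the profile's own keys."""
    result = {}
    for ref in profile.get("templates", []):
        if isinstance(ref, str):
            name, variables = ref, {}
        else:
            name, variables = ref["name"], ref.get("vars", {})
        result = deep_merge(result, substitute(templates[name], variables))
    own = {k: v for k, v in profile.items() if k != "templates"}
    return deep_merge(result, own)
```

Placeholders without a matching variable, such as ${ERROR}, are left untouched, so values meant to be resolved later survive the substitution.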

Config templates are a missing piece. Go templates are more like macros: they have their use case, but they generate markup rather than a parsed config, which makes them harder to use.

The proposed syntax looks good to me. If implemented on top of the viper API, it could use viper.MergeConfigMap which would automatically handle list and map merging. Since templates have a deterministic order (like includes) this means one could control how properties are merged (or replaced when not a list or map).

I had tried to implement proper list merging for inheritance (using "propertyName+" added to plain "propertyName"), but this is not as easy as it sounds, as it must happen after inheritance and also recursively up to the parent. Templates would also allow extending lists, but in a much simpler way.

I still have the test implementation on top of the viper API around. I could dump it into a branch and turn it into template processing instead. Shall I?

Made a quick working prototype and found that viper.MergeConfigMap actually replaces slices rather than merging them (not sure why I thought it would merge; it looks like I had looked at the wrong things when I tested includes...).

Besides this it works quite well with little code to add.
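The difference between replacing and merging lists can be shown with a short Python sketch. This is not the prototype's code and does not use viper; the function name and the append_lists switch are hypothetical:

```python
def merge(base: dict, overlay: dict, append_lists: bool = False) -> dict:
    """Recursively merge overlay into a copy of base.

    Nested maps are always merged. Lists are replaced by default (the
    behaviour reported above) or concatenated when append_lists is True.
    """
    result = dict(base)
    for key, value in overlay.items():
        current = result.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            result[key] = merge(current, value, append_lists)
        elif append_lists and isinstance(current, list) and isinstance(value, list):
            result[key] = current + value
        else:
            result[key] = value
    return result


base = {"backup": {"source": "/", "tag": ["documents"]}}
overlay = {"backup": {"tag": ["weekly"]}}
print(merge(base, overlay))                     # tag list replaced
print(merge(base, overlay, append_lists=True))  # tag lists concatenated
```

With replacement, a later template fully controls a list property; with concatenation, templates can extend lists, which is the simpler alternative to the "propertyName+" approach mentioned earlier.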

That's good news 😉

I quite like how this new configuration is opening new possibilities. Now I need to go back to it and finish refactoring the scheduling (that was the last part not working in v2 if I remember well)

Also I'm thinking if there's an easy way we could allow both configurations to be valid...

# simple syntax
profiles:
  with_monitoring:
    backup: /
    templates:
      - healthchecks

# full syntax
  with_monitoring:
    backup: /
    templates:
      healthchecks:
        replace: true
        vars:
          URL:  https://hc-ping/uuid

That's how the prototype implements it :) .. simple and full syntax (can even be mixed in the same list)
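One way to accept both syntaxes is to normalise them into a single shape before processing. The Python sketch below is an assumption about how that could look, not the prototype's actual code: it accepts plain names, one-key mappings of name to options, or a top-level mapping, and returns (name, options) pairs:

```python
def normalize_template_refs(templates) -> list:
    """Normalise simple and full template references into (name, options) pairs."""
    if isinstance(templates, dict):
        # full syntax written as a mapping: one entry per template
        items = [{name: options} for name, options in templates.items()]
    else:
        items = list(templates or [])
    refs = []
    for item in items:
        if isinstance(item, str):
            refs.append((item, {}))                    # simple syntax
        elif isinstance(item, dict) and len(item) == 1:
            name, options = next(iter(item.items()))
            refs.append((name, dict(options or {})))   # full syntax
        else:
            raise ValueError(f"unsupported template reference: {item!r}")
    return refs


mixed = ["target", {"healthchecks": {"replace": True, "vars": {"URL": "https://hc-ping/uuid"}}}]
print(normalize_template_refs(mixed))
```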

Doh! sorry I should have checked first 😊

Variables might need an option to define defaults though. Maybe something the unix shell would also support "${var:-default}".
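A possible reading of the shell-style default syntax, sketched in Python. The function name is hypothetical, and leaving unknown variables without a default untouched is an assumption (it keeps placeholders such as ${ERROR} available for later resolution):

```python
import re

# ${NAME} or ${NAME:-default}
_VAR = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def expand(text: str, variables: dict) -> str:
    """Expand ${NAME} and ${NAME:-default} using the given variables."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        value = variables.get(name, "")
        if value:
            return value
        if default is not None:
            return default            # unset or empty: use the default, as in the shell
        if name in variables:
            return value              # set but empty, no default
        return match.group(0)         # unknown variable: leave untouched (assumption)
    return _VAR.sub(repl, text)


print(expand("${URL:-https://hc-ping/uuid}/start", {}))
print(expand("${URL}/fail", {"URL": "https://example.invalid/ping"}))
```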

Good point, that's a nice to have as well 👍🏻

I've been using v2 of the format and noticed the schedules: format in the docs. I tried it out, but it seems like it's not implemented yet? Scheduling with that format didn't give me any output.

Also, it wasn't clear what to use for the profile to run:

sudo resticprofile profiles
2024/01/14 14:53:26 using configuration file: profiles.yaml

Profiles available (name, sections, description):
  default:  (backup, check, forget, prune)
  storj:    (backup, check, forget, prune)

would it be run: storj.backup ?

EDIT: I see from #265 (comment) that it is unfinished.

@creativeprojects , this is a followup on the discussion started in #290, regarding the direction of schedules in V2. I made a comparison on the 2 options:

Option 1: current proposal with separate V2 schedules section in YAML and TOML:

profiles:
  default:
    backup:
      schedule: "10:00,14:00,18:00,22:00"
      schedule-log: tcp://localhost:514

schedules:
  weekly-check:
    schedule: weekly
    profiles: "*"
    run: check
    log: tcp://localhost:514

[profiles.default.backup]
schedule = "10:00,14:00,18:00,22:00"
schedule-log = "tcp://localhost:514"

[schedules.weekly-check]
schedule = "weekly"
profiles = "*"
run = "check"
log = "tcp://localhost:514"

resticprofile schedule weekly-check default

Option 2: discussion in #290 with unified schedule section and schedules in V2 groups in YAML and TOML:

version: "2"

profiles:
  default:
    backup:
      schedule: 
        at: "10:00,14:00,18:00,22:00"
        log: tcp://localhost:514

groups:
  all:
    profiles: "*"
    schedules:
      check:        
        at: weekly
        log: tcp://localhost:514

[profiles.default.backup.schedule]
at = ["10:00,14:00,18:00,22:00"]
log = "tcp://localhost:514"

[groups.all]
profiles = "*"

[groups.all.schedules.check]
at = "weekly"
log = "tcp://localhost:514"

Note: The current syntax for scheduling remains valid and transparently translates to the new Schedule instance. If both are set, the settings would be merged.
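The translation described in the note could look roughly like the following Python sketch. The function name, the mapping of flat schedule-* keys onto nested keys by stripping the prefix, and the rule that nested values win on conflict are all assumptions; the thread does not specify them:

```python
def unified_schedule(section: dict) -> dict:
    """Translate legacy flat schedule settings into the unified nested form."""
    schedule = section.get("schedule")
    if isinstance(schedule, dict):
        unified = dict(schedule)          # already the new nested form
    elif schedule is not None:
        unified = {"at": schedule}        # legacy: schedule holds the times directly
    else:
        unified = {}
    # Legacy flat keys such as schedule-log fill in anything not set in the nested form.
    for key, value in section.items():
        if key.startswith("schedule-"):
            unified.setdefault(key[len("schedule-"):], value)
    if isinstance(unified.get("at"), str):
        unified["at"] = [unified["at"]]   # normalise to a list of expressions
    return unified


legacy = {"schedule": "10:00,14:00,18:00,22:00", "schedule-log": "tcp://localhost:514"}
print(unified_schedule(legacy))
```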

resticprofile schedule default all

maybe:
resticprofile schedule all.check and/or check@all
resticprofile schedule default.backup and/or backup@default
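If both notations were accepted, the argument could be split as in this Python sketch. The function name is hypothetical, and it assumes profile and group names contain neither "." nor "@":

```python
from typing import Optional, Tuple


def parse_schedule_ref(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'owner.command' or 'command@owner' into (owner, command).

    A bare name such as 'default' returns (name, None), meaning every
    schedule of that profile or group.
    """
    if "@" in ref:
        command, _, owner = ref.partition("@")
    elif "." in ref:
        owner, _, command = ref.partition(".")
    else:
        return ref, None
    if not owner or not command:
        raise ValueError(f"invalid schedule reference: {ref!r}")
    return owner, command


for ref in ("all.check", "check@all", "default.backup", "default"):
    print(ref, "->", parse_schedule_ref(ref))
```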

Thanks @jkellerer for making it clearer from the examples 👍🏻

Originally, for v2 I imagined having only the schedules section. I was going to remove the option to schedule inside a profile.

But now that I waited so long to finish v2 I guess it is going to break people's configurations. You're also right about inheritance.

If we want to allow both, I'm now more inclined to use your second example, simply because it makes more sense to have those in a similar place in the configuration. Having the option to add them both to a schedules section and in the profile could be quite confusing? It might also make our life more difficult by merging the configurations from different places 🤔

@creativeprojects , yes I also agree. Option 2 fits more naturally in the config format.

I guess we're getting closer to releasing v2 now, finishing scheduling won't be such a big task any more 😆

Yes, let me finish #290, then I can refactor the implementations and introduce the unified schedule in a separate pr.

Changed my mind. I have a mostly working unified schedule config implemented in #333. It should be possible to merge this prior to #290 so that we don't have to do any refactoring later.

Do you have plans to add run-after/run-before options for groups?

Not for the moment. Not sure if it would be helpful. What might be better is advanced options for run hooks to let them run once when used in a group.

I would chime in for run-before or send-before (and after) for groups or schedules. I have a group of copy jobs which I want to bundle in one schedule and I also want to have an integration with healthchecks.io to find out if something went wrong for my copies.