Xilonz/trellis-backup-role

Check credentials - fresh install

Closed this issue · 14 comments

Hello,

Sorry to open another issue but you made the other only open to collaborators. I did actually collab with this role around a year and a half ago. Anyway...

Expected Behavior

Backups should run without error

Actual Behavior

Credentials error - msg #31

Steps to Reproduce the Problem.

I followed the instructions on this repo exactly. I have other servers where all backups still work correctly, so I know nothing changed for the S3 login on DigitalOcean. The only things that have changed over the last 18 months or so are this repo and https://github.com/lafranceinsoumise/ansible-backup

  1. Set up a new trellis/bedrock site
  2. Follow this repo to set up the backup role for an S3 bucket
  3. Check the logs for local and/or staging sites at /var/log/duply and see if there are any errors

Specifications

  • Version: backup v2.1.5 - lafranceinsoumise.backup v5.1.3
  • Platform: Trellis

my vault file:

vault_wordpress_sites:
  site.com:
    env:
      db_password: "password"
      # Generate your keys here: https://roots.io/salts.html
      auth_key: "xxx"
      secure_auth_key: "xxx"
      logged_in_key: "xxx"
      nonce_key: "xxx"
      auth_salt: "xxx"
      secure_auth_salt: "xxx"
      logged_in_salt: "xxx"
      nonce_salt: "xxx"
      backup_target_user: "XXXXXXXXXXXXX"
      backup_target_pass: "XXXXXXXXXXXXX"

When you have time, please try creating a new vanilla trellis site and see that something is not working correctly... you can recreate on your local box as well. I believe the backup target will need to be an AWS S3 or DigitalOcean Spaces. Please try before closing.

Thank you,
Josh

Ok, figured this out. I am not sure how I am the only person experiencing this, as I hit a wall on both my current servers and brand new ones. Maybe there is an easy fix or the documentation just needs to be updated.

The issue is the duplicity repo location, the duplicity version AND the duply version.

Duply v2 seems to break it, and lafranceinsoumise.backup v3.9 seems to be the last version that includes duply itself in the repo. You must start there, otherwise it will install duply v2.

Duplicity 0.8 also seems to have problems, and if 0.7 is not on the system, the script will install the latest version. Some versions of lafranceinsoumise.backup are set to keep the installed version as is, and some are set to install the latest.

SO, this is how I fixed it all (for now):
In your galaxy.yml, downgrade by putting:

- name: backup
  src: xilonz.trellis_backup
  version: 2.1.4

- name: lafranceinsoumise.backup
  version: 3.9.0

This will downgrade you and get the correct version of duply.

Install downgrades on local system with: ansible-galaxy install -r galaxy.yml --force

Then go to trellis/vendor/roles/lafranceinsoumise.backup/tasks/install.deb.yml and change the "Add duplicity repository" task to:

- name: Add duplicity repository
  apt_repository:
    repo: ppa:duplicity-team/duplicity-release-git
    state: present
    update_cache: yes
  when: backup_duplicity_ppa

This tells the older version of the backup repo to get duplicity from its new location.
Also change the "state=latest" in the last line of the file to "state=present" as this will stop any server provisioning from installing the newest version of duplicity.

You can then run: ansible-playbook server.yml -e env=production --tags=backup

You may get an error here about fetching PPA information if this is not a new trellis server. This is because there is a file still telling the server to check the old duplicity repo location.

To fix, ssh into the server then cd /etc/apt/sources.list.d/ and sudo rm the duplicity ppa file
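
Before deleting anything, it may help to see which source files actually mention duplicity. This is only a sketch: the function name list_duplicity_sources is made up for illustration, and it lists files without removing them.

```shell
# Hypothetical helper: print apt source files that mention duplicity.
# Defaults to /etc/apt/sources.list.d; pass another directory to try it safely.
list_duplicity_sources() {
  dir="${1:-/etc/apt/sources.list.d}"
  for f in "$dir"/*; do
    # Skip the unexpanded glob (empty directory) and anything that is not a file
    [ -f "$f" ] || continue
    if grep -q 'duplicity' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

Running list_duplicity_sources with no argument on the server should print the file to remove, if one is still there.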

SSH into the server and check duply --version and duplicity --version
You want duply 1.9.1 and duplicity 0.7.xx

If you have duply 2, run apt-get autoremove duply, then re-run the galaxy.yml install and provision the server from your local system again.

If you have duplicity 0.8, you must remove that version with apt-get remove duplicity and then install the older version with apt-get install duplicity=0.7.17-0ubuntu1.1
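
The version checks above could be scripted roughly like this. A minimal sketch: both function names are hypothetical, and they take a bare version string (for example 0.7.17) rather than parsing the output of either tool, since I have not checked the exact output format.

```shell
# Hypothetical helper: succeed when a duplicity version string is in the 0.7 series.
is_duplicity_07() {
  case "$1" in
    0.7|0.7.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeed when a duply version string is in the 1.x series.
is_duply_1x() {
  case "$1" in
    1.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use with made-up inputs:
#   is_duplicity_07 0.8.16 || echo "duplicity needs the 0.7 downgrade"
#   is_duply_1x 2.0.3      || echo "duply needs the 1.x downgrade"
```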

At this point you may have both old and new cron jobs active for the website. cd /etc/cron.d and see if there is both a backup and a duply_backup file. If so, sudo rm duply_backup
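
A quick way to test for that situation, as a sketch only. The function name is hypothetical and it just reports whether both files exist; it removes nothing.

```shell
# Hypothetical helper: succeed when both cron files named above exist.
# Defaults to /etc/cron.d; pass another directory to try it without touching the server.
has_duplicate_backup_cron() {
  dir="${1:-/etc/cron.d}"
  [ -f "$dir/backup" ] && [ -f "$dir/duply_backup" ]
}

# Example:
#   has_duplicate_backup_cron && echo "both cron files present, remove duply_backup"
```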

Good to go until the next vagrant provision or galaxy update as your changes to the lafranceinsoumise.backup install file will be overwritten. I will probably fork the old version and make the changes to use that instead.

Hopefully the issue is just that I am using the wrong variable names in the vault file or something.

Thanks,
Josh

I've opened a pull request to fix this issue. #33
I haven't had the time to check if this resolves the issue, but if I'm correct we should version lock lafranceinsoumise.backup and that should fix the issue.

@morcth I am experiencing the same issue as you.

The workaround you have described works for me.

An alternative workaround is to ssh into the server and add your S3 credentials to ~/.aws/credentials. Adding this change to the root user worked for me. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html for more information.
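
For reference, a minimal sketch of that credentials-file workaround. The function name and the placeholder values are made up for illustration; the file layout is the standard AWS shared credentials format with a [default] profile.

```shell
# Hypothetical helper: write an AWS shared credentials file with a [default] profile.
# Arguments: target file, access key ID, secret access key.
write_aws_credentials() {
  file="$1"
  mkdir -p "$(dirname "$file")"
  # The subshell keeps the stricter umask local; a newly created file
  # ends up readable and writable by its owner only.
  ( umask 077
    printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' "$2" "$3" > "$file" )
}

# Example (as root, with placeholders instead of real keys):
#   write_aws_credentials "$HOME/.aws/credentials" "YOUR_KEY_ID" "YOUR_SECRET"
```

Note that this overwrites an existing file, so it is only suitable when the server has no other AWS profiles configured.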

If you put in the credentials for backup_env in /vendor/roles/backup/tasks/main.yml like below it also works.
Obviously not a solution, but maybe you could just add another default for env.

---
- name: List backup jobs
  set_fact:
    backup_user: root
    backup_env:
      AWS_ACCESS_KEY_ID: *****************
      AWS_SECRET_ACCESS_KEY: ****************************

The docs also say backup_env or the profile's env, but I was unable to use

backup_profiles:
   env:
      AWS_ACCESS_KEY_ID: *****************
      AWS_SECRET_ACCESS_KEY: ****************************

Thanks for chiming in everyone... good to know it wasn't just my setup that was busted.

@Xilonz - Your fix probably won't work because v3.9 still points to the old duplicity repo, which is dead.
@smaboshe - Your fix won't help keep it simple and automatic
@dylanlawrence - If that works for you, then we can probably put:

- name: List backup jobs
  set_fact:
    backup_user: root
    backup_env:
      AWS_ACCESS_KEY_ID: "{{ site_env.backup_target_user | default(false) }}"
      AWS_SECRET_ACCESS_KEY: "{{ site_env.backup_target_pass | default(false) }}"

And it will work with all current versions. I have 2 sites I have to make in the next few days. I will test this out.

If this is just about defining the AWS_ keys you should be able to define them in your vault.yml like so:

backup_env:
  AWS_ACCESS_KEY_ID: aws_access_key
  AWS_SECRET_ACCESS_KEY: aws_secret

Also make sure you have the latest version of lafranceinsoumise/ansible-backup (5.3.2 at the time of writing) by running ansible-galaxy install -r galaxy.yml --force. To speed things up and avoid running through the whole provisioning, you can use --tags backup in your command (for example: trellis provision --tags backup production).

After provisioning your server this variable should then be visible in /etc/duply/sitename_task/conf
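
One way to check this after provisioning, as a sketch. The function name is hypothetical, and it only looks for the variable name anywhere in the file because I am not assuming how the role formats the line.

```shell
# Hypothetical helper: succeed when a duply conf file mentions a variable name.
# Arguments: path to the conf file, variable name to look for.
conf_mentions_var() {
  grep -qF -- "$2" "$1"
}

# Example, keeping the placeholder profile name used above:
#   conf_mentions_var /etc/duply/sitename_task/conf AWS_ACCESS_KEY_ID && echo "found"
```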

If the resulting duply configuration is wrong, please open an issue at https://github.com/lafranceinsoumise/ansible-backup

Yes, I'm thinking this will work.

I believe I had tried env: instead of backup_env:, the way it works for backup_target_user and backup_target_pass. If that works fine, then maybe adding that to the readme would help S3 users. Or we could leave it out, add the backup_env AWS variables in the task and copy the backup_user values in there? Either way, it is awesome that the fix will be easy. Will confirm next week that it works.

Note, this should also work with other platforms that also require special authentication schemes.
Reference - S3, Azure, Cloudfiles...

I was not able to make env work myself (maybe just something on my end),
but it seems the most correct to do:

backup_profiles:
   env:
      AWS_ACCESS_KEY_ID: *****************
      AWS_SECRET_ACCESS_KEY: ****************************
or
      GS_ACCESS_KEY_ID: *****************
      GS_SECRET_ACCESS_KEY: ****************************


Along with some documentation that also covers the other backends:

Azure: AZURE_ACCOUNT_NAME, AZURE_ACCOUNT_KEY
Cloudfiles: CLOUDFILES_USERNAME, CLOUDFILES_APIKEY, CLOUDFILES_AUTHURL
Google Cloud Storage: GS_ACCESS_KEY_ID, GS_SECRET_ACCESS_KEY
Pydrive: GOOGLE_DRIVE_ACCOUNT_KEY, GOOGLE_DRIVE_SETTINGS
S3: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
Swift: SWIFT_USERNAME, SWIFT_PASSWORD, SWIFT_AUTHURL, SWIFT_TENANTNAME OR SWIFT_PREAUTHURL, SWIFT_PREAUTHTOKEN
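
To make the list above easier to act on, here is a rough sketch that reports which of those variables are missing from the exported environment. The function names and the short backend labels (s3, gs, azure, cloudfiles) are made up for illustration, and Pydrive and Swift are left out to keep it short.

```shell
# Hypothetical helper: print the variable names listed above for a backend label.
backup_vars_for() {
  case "$1" in
    s3)         echo "AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY" ;;
    gs)         echo "GS_ACCESS_KEY_ID GS_SECRET_ACCESS_KEY" ;;
    azure)      echo "AZURE_ACCOUNT_NAME AZURE_ACCOUNT_KEY" ;;
    cloudfiles) echo "CLOUDFILES_USERNAME CLOUDFILES_APIKEY CLOUDFILES_AUTHURL" ;;
    *)          return 1 ;;
  esac
}

# Hypothetical helper: print each variable for the backend that is not
# exported (or is empty) in the current environment.
missing_backup_vars() {
  vars=$(backup_vars_for "$1") || return 1
  for v in $vars; do
    [ -n "$(printenv "$v")" ] || printf '%s\n' "$v"
  done
}

# Example:
#   missing_backup_vars s3
```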

Ok, I believe I almost got this working with a few ideas from this thread. I am not getting a credentials error but I am getting a no boto module error:

Start duply v2.0.3, time is 2020-10-09 15:11:01.
Using profile '/etc/duply/xxxxxxxx.com_database'.
Using installed duplicity version 0.8.16, python 2.7.17, gpg 2.2.4 (Home: /root/.gnupg), awk 'GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)$
Checking TEMP_DIR '/tmp' is a folder and writable (OK)
Test - En/Decryption skipped. (GPG disabled)

--- Start running command PURGE at 15:11:02.414 ---
BackendException: Could not initialize backend: No module named 'boto'
15:11:03.501 Task 'PURGE' failed with exit code '23'.
--- Finished state FAILED 'code 23' at 15:11:03.501 - Runtime 00:00:01.086 ---

--- Start running command PRE at 15:11:03.575 ---
Running '/etc/duply/xxxxxxxx.com_database/pre' - OK
--- Finished state OK at 15:11:03.674 - Runtime 00:00:00.099 ---

--- Start running command BKP at 15:11:03.738 ---
BackendException: Could not initialize backend: No module named 'boto'
15:11:04.762 Task 'BKP' failed with exit code '23'.
--- Finished state FAILED 'code 23' at 15:11:04.762 - Runtime 00:00:01.023 ---

--- Start running command POST at 15:11:04.834 ---
Running '/etc/duply/xxxxxxxxx.com_database/post' - OK
--- Finished state OK at 15:11:04.867 - Runtime 00:00:00.032 ---

Before I go manually installing some python modules, does anyone know why this dependency wasn't installed? Did anybody else get this in the logs? boto handles the AWS connection, but I would assume it should be installed with lafranceinsoumise.backup or this backup role?

Thanks!

Ok, I was able to get this to work.

No change in the vault would do it. I tried

site.com:
  env:
    AWS_ACCESS_KEY_ID: *****************
    AWS_SECRET_ACCESS_KEY: ****************************

AND

site.com:
  env:
    backup_env:
      AWS_ACCESS_KEY_ID: *****************
      AWS_SECRET_ACCESS_KEY: ****************************

AND

site.com:
  env:
    other passwords
  backup_env:
    AWS_ACCESS_KEY_ID: *****************
    AWS_SECRET_ACCESS_KEY: ****************************

Should any of these have worked?
What did end up working was adding

- name: List backup jobs
  set_fact:
    backup_user: root
    backup_env:
      AWS_ACCESS_KEY_ID: "{{ site_env.backup_target_user | default(false) }}"
      AWS_SECRET_ACCESS_KEY: "{{ site_env.backup_target_pass | default(false) }}"

to
/vendor/roles/backup/tasks/main.yml

Also, to get rid of boto error, I needed to install it on the server with: sudo apt-get install python3-boto
Shouldn't boto have been a dependency? This is an older staging server so maybe a few things are not updated? Should this have worked out of the box or should it be added somewhere as a dependency?
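
To check for the module before and after installing the package, something like this should do. It is a sketch: the function name is hypothetical, and which interpreter duplicity actually runs under on a given server is an assumption you would need to confirm (python3 is used here because python3-boto was the package that fixed it).

```shell
# Hypothetical helper: succeed when the given interpreter can import a module.
# Arguments: interpreter command (for example python3), module name.
python_can_import() {
  "$1" -c "import $2" >/dev/null 2>&1
}

# Example:
#   python_can_import python3 boto || echo "boto is missing for python3"
```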

I lied... it does work in the vault, but you must put the backup_env section outside of vault_wordpress_sites, as a sibling of it.

This gets everything working with the most updated versions BUT if you have multiple sites on one server, this stops them having different AWS credentials. So, you could not save different sites to different account storage spaces. Before, you could have a backup_target_user and password variable in each site section.

To sum up how I got this working with a few last questions:

  • sudo apt-get install python3-boto (Is this automatically installed on newer servers or should this be listed and installed as dependency?)
  • Add
backup_env:
  AWS_ACCESS_KEY_ID: "********************"
  AWS_SECRET_ACCESS_KEY: "***************************"

to top level of vault.yml. (Is there a way to add this in env section for each site if you have multiple sites on one server?)
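
Since the placement is what tripped me up, here is a small sketch for checking it. The function name is hypothetical; it only tests that backup_env: starts at column 0 in plain-text YAML, so an encrypted vault has to be decrypted first (for example with ansible-vault view).

```shell
# Hypothetical helper: succeed when backup_env is a top-level key.
# Reads the named file, or standard input when no file is given.
backup_env_is_top_level() {
  if [ "$#" -gt 0 ]; then
    grep -q '^backup_env:' "$1"
  else
    grep -q '^backup_env:'
  fi
}

# Example with an encrypted vault:
#   ansible-vault view group_vars/production/vault.yml | backup_env_is_top_level && echo "ok"
```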

Once we are sure of these answers, just the documentation for the backup role will need to be updated a bit.

I had this issue with AWS S3, and having only the following in trellis/group_vars/production/vault.yml resolved it:

...

vault_wordpress_sites:
  example.com:
    env:
      db_password: ...

backup_env:
  AWS_ACCESS_KEY_ID: "********************"
  AWS_SECRET_ACCESS_KEY: "********************"

I checked and tried a thousand things before finding this ticket, but I believe my environment right now has no other changes than just that.

Is this still an issue @morcth?

Closing due to inactivity.