pivotal/credhub-release

bbr fails when using bosh links for the credhub job

Closed this issue · 3 comments

What version of the CredHub server are you using?
2.0.0

What version of the CredHub CLI are you using?
2.0.0

If you were attempting to accomplish a task, what was it you were attempting to do?
When using bosh links for the DB in credhub, the bbr-credhubdb backup fails with:

1 error occurred:
error 1:
1 error occurred:
error 1:
Error attempting to run backup for job bbr-credhubdb on web/8c1276b4-e408-48cf-95a4-044783fd5921: + JOB_PATH=/var/vcap/jobs/bbr-credhubdb
+ SDK_PATH=/var/vcap/jobs/database-backup-restorer
+ BBR_ARTIFACT_FILE_PATH=/var/vcap/store/bbr-backup/bbr-credhubdb//credhubdb_dump
+ CONFIG_PATH=/var/vcap/jobs/bbr-credhubdb/config/bbr.json
+ /var/vcap/jobs/database-backup-restorer/bin/backup --config /var/vcap/jobs/bbr-credhubdb/config/bbr.json --artifact-file /var/vcap/store/bbr-backup/bbr-credhubdb//credhubdb_dump
psql: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
2018/08/28 19:43:21 Unable to check version of Postgres: exit status 2

psql: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/tmp/.s.PGSQL.5432"? - exit code 1

My current manifest for credhub looks like this (sorry, it also includes UAA and Concourse settings for the DB):

instance_groups:
- name: db
  instances: 1
  azs: [z1]
  networks:
  - name: ((network_name))
  stemcell: trusty
  vm_type: ((db_vm_type))
  persistent_disk_type: ((db_persistent_disk_size))
  jobs:
  - release: postgres
    name: postgres
    provides:
      postgres: {as: db}
    properties:
      databases:
        port: 5432
        databases:
        - name: *concourse_db_name
        - name: *uaa_db_name
        - name: *credhub_db_name
        roles:
        - *concourse_db_role
        - name: *uaa_db_username
          password: *uaa_db_password
        - name: *credhub_db_username
          password: *credhub_db_password
        tls: 
          ca: ((/database-tls.ca))
          certificate: ((/database-tls.certificate))
          private_key: ((/database-tls.private_key))

  - name: credhub
    release: credhub
    consumes:
      postgres: {from: db}
    properties:
      credhub:
        log_level: warn
        tls: ((/credhub-tls))
        authentication:
          uaa:
            url: *uaa-url
            verification_key: ((uaa-jwt.public_key))
            ca_certs:
            - ((/uaa-tls.ca)) 
        data_storage:
          type: postgres
          port: 5432
          username: &credhub_db_username credhub
          password: &credhub_db_password ((postgres_credhub_password))
          database: &credhub_db_name credhub
          require_tls: true
          tls_ca: ((/database-tls.ca))

  - name: bbr-credhubdb
    release: credhub
    properties:
      release_level_backup: true

The credhub.data_storage.host property is empty because the host is provided by the BOSH link, so the value passed to the bbr-credhubdb job is also empty. When bbr runs, it therefore tries to connect to localhost (over the local Unix domain socket, as the log above shows).
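
For context, the socket path in the log matches libpq's documented fallback: when the host is unset or empty, psql connects to a local Unix domain socket instead of using TCP. The sketch below only illustrates that rule; it is not code from the database-backup-restorer. The helper name resolve_pg_target and the hostname db.internal are hypothetical, and the /tmp socket directory is assumed from the log above (it is a build-time default and can differ).

```python
def resolve_pg_target(host, port=5432, socket_dir="/tmp"):
    """Describe where a libpq client such as psql would try to connect.

    Hypothetical helper mirroring libpq's documented rule: an unset or
    empty host means "use the local Unix domain socket".
    """
    if not host:
        # Empty host: fall back to the default socket directory (assumed /tmp).
        return f"unix:{socket_dir}/.s.PGSQL.{port}"
    if host.startswith("/"):
        # A host that starts with a slash is treated as a socket directory.
        return f"unix:{host}/.s.PGSQL.{port}"
    # Anything else is a hostname or IP address reached over TCP.
    return f"tcp:{host}:{port}"


# Host left empty, as in the rendered bbr config described above.
print(resolve_pg_target(""))             # unix:/tmp/.s.PGSQL.5432
# Host filled in, e.g. from the link (hostname is made up for illustration).
print(resolve_pg_target("db.internal"))  # tcp:db.internal:5432
```

If this is what happens here, any fix needs to make sure the bbr-credhubdb config ends up with a non-empty host, whether it comes from the property or from the link.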
The UAA team worked around this issue by consuming two BOSH links in the bbr-uaadb job. See:

What did you expect to happen?
The backup should work when BOSH links are used to connect the DB to credhub.

What was the actual behavior?
It fails.

Please confirm where necessary:

  • I have included a log output
  • My log includes an error message
  • I have included steps for reproduction

If you are a PCF customer using Operations Manager (PCF Ops Manager), please direct your questions to support (https://support.pivotal.io/).

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/160166686

The labels on this GitHub issue will be updated when the story is started.

Sorry for the delay and thanks for reporting. The 2.0.0 release of CredHub had a few issues, but we'll try to reproduce this on our end with 2.0.2 to see if we're still affected.

@RomRider We believe we've addressed this issue in the latest 2.1, 2.0, and 1.9 releases. The bbr scripts in 1.7 differ significantly from those in 1.9 and up, so we didn't backport the fix there. Please reopen this issue if you continue to see this error. Thanks for reporting!