CAPI 1.185.0 - Docker Push bug
ChrisMcGowan opened this issue · 7 comments
Thanks for submitting an issue to cloud_controller_ng. We are always trying to improve! To help us, please fill out the following template.
Issue
When CAPI was upgraded from 1.183.0 to 1.185.0, certain cf push commands using a docker image path with a SHA256 digest failed. Rolling back to 1.183.0 resolved the issue temporarily for our users.
Context
After moving to the latest cf-deployment release, which contained CAPI 1.185.0, pushing a docker image referenced by its sha256 digest returned an error:
failed to create container: running image plugin create: fetching image reference: creating image: parsing url failed: invalid reference format
Steps to Reproduce
While running CAPI 1.185.0, do a simple push of the following public image, referenced by its SHA256 digest:
cf push foo --docker-image gsatts/usagov-2021@sha256:b6a34d1afc391dfff44a43aa10c4a0e80c50f8c11b63df9485aa607320c6e7d2
Expected result
A running image on CF
Current result
It fails to stage:
failed to create container: running image plugin create: fetching image reference: creating image: parsing url failed: invalid reference format
I was able to reproduce this issue on a bbl test environment by comparing CAPI 1.183.0 and CAPI 1.185.0. It might be related to the introduction of Cloud Native Buildpacks in #3778 (ping @modulo11 @pbusko @c0d1ngm0nk3y @nicolasbender).
I'll add a warning to the release notes.
@johha @ChrisMcGowan Using Cloud Controller commit 706b043, we can see that the Cloud Controller successfully sends a TaskDefinition request to Diego for the staging process (this exact case is also tested here: https://github.com/cloudfoundry/cloud_controller_ng/blob/main/spec/unit/lib/utils/uri_utils_spec.rb#L161-L163):
{
  "task_definition": {
    "rootfs": "preloaded:cflinuxfs4",
    "action": {
      "timeout": {
        "action": {
          "emit_progress": {
            "action": {
              "run": {
                "path": "/tmp/lifecycle/builder",
                "args": [
                  "-outputMetadataJSONFilename=/tmp/result.json",
                  "-dockerRef=gsatts/usagov-2021@sha256:b6a34d1afc391dfff44a43aa10c4a0e80c50f8c11b63df9485aa607320c6e7d2"
                ],
                "env": [
                  {
                    "name": "VCAP_APPLICATION",
                    "value": "{\"cf_api\":\"http://localhost\",\"limits\":{\"fds\":16384,\"mem\":1024,\"disk\":1024},\"application_name\":\"foo\",\"application_uris\":[\"foo.customer-app-domain1.com\"],\"name\":\"foo\",\"space_name\":\"test\",\"space_id\":\"68d395be-ebb9-4293-ac14-2ac45af2cdfc\",\"organization_id\":\"b4b0f1ed-b659-4889-81b7-13cd9dc03a2b\",\"organization_name\":\"the-system_domain-org-name\",\"uris\":[\"foo.customer-app-domain1.com\"],\"users\":null,\"application_id\":\"7fc4f327-5f3d-4088-86d9-8dd0ec70edbf\",\"version\":\"f26b9b56-c46c-43ab-953b-f98a94922b2f\",\"application_version\":\"f26b9b56-c46c-43ab-953b-f98a94922b2f\"}"
                  },
                  {
                    "name": "MEMORY_LIMIT",
                    "value": "1024m"
                  },
                  {
                    "name": "VCAP_SERVICES",
                    "value": "{}"
                  }
                ],
                "resource_limits": {
                  "nofile": 42
                },
                "user": "vcap",
                "suppress_log_output": false
              }
            },
            "start_message": "Staging...",
            "success_message": "Staging Complete",
            "failure_message_prefix": "Staging Failed"
          }
        },
        "timeout_ms": 42000
      }
    },
    "disk_mb": 1024,
    "memory_mb": 1024,
    "cpu_weight": 50,
    "privileged": false,
    "log_source": "STG",
    "log_guid": "7fc4f327-5f3d-4088-86d9-8dd0ec70edbf",
    "metrics_guid": "",
    "result_file": "/tmp/result.json",
    "completion_callback_url": "https://api.internal.cf:8182/internal/v3/staging/df485df4-432b-4ff2-878d-fba030c013cb/build_completed?start=false",
    "cached_dependencies": [
      {
        "name": "",
        "from": "http://file-server.service.cf.internal:8080/v1/static/docker_app_lifecycle/docker_app_lifecycle.tgz",
        "to": "/tmp/lifecycle",
        "cache_key": "docker-lifecycle",
        "log_source": ""
      }
    ],
    "legacy_download_user": "vcap",
    "trusted_system_certificates_path": "/etc/cf-system-certificates",
    "network": {
      "properties": {
        "app_id": "7fc4f327-5f3d-4088-86d9-8dd0ec70edbf",
        "container_workload": "staging",
        "org_id": "b4b0f1ed-b659-4889-81b7-13cd9dc03a2b",
        "policy_group_id": "7fc4f327-5f3d-4088-86d9-8dd0ec70edbf",
        "ports": "8080",
        "space_id": "68d395be-ebb9-4293-ac14-2ac45af2cdfc"
      }
    },
    "max_pids": 2048,
    "certificate_properties": {
      "organizational_unit": [
        "organization:b4b0f1ed-b659-4889-81b7-13cd9dc03a2b",
        "space:68d395be-ebb9-4293-ac14-2ac45af2cdfc",
        "app:7fc4f327-5f3d-4088-86d9-8dd0ec70edbf"
      ]
    },
    "image_username": "",
    "image_password": "",
    "log_rate_limit": {
      "bytes_per_second": 1048576
    },
    "metric_tags": {
      "app_id": {
        "static": "7fc4f327-5f3d-4088-86d9-8dd0ec70edbf"
      },
      "app_name": {
        "static": "foo"
      },
      "organization_id": {
        "static": "b4b0f1ed-b659-4889-81b7-13cd9dc03a2b"
      },
      "organization_name": {
        "static": "the-system_domain-org-name"
      },
      "source_id": {
        "static": "7fc4f327-5f3d-4088-86d9-8dd0ec70edbf"
      },
      "space_id": {
        "static": "68d395be-ebb9-4293-ac14-2ac45af2cdfc"
      },
      "space_name": {
        "static": "test"
      }
    }
  },
  "task_guid": "df485df4-432b-4ff2-878d-fba030c013cb",
  "domain": "cf-app-staging"
}
Also, the error looks very similar to the errors thrown by the go-containerregistry Golang library. Could the error originate from other components that were also updated as part of the cf-deployment bump?
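For anyone who wants to check a captured staging request without reading the nested JSON by hand, here is a minimal Python sketch that pulls the -dockerRef argument out of a payload shaped like the one above. The helper names (find_run_actions, docker_refs) are hypothetical and not part of Cloud Controller or Diego. The sketch only assumes the wrapper actions visible in this payload (timeout, emit_progress), each holding a single nested "action".

```python
import json


def find_run_actions(action):
    """Yield every "run" action nested inside single-child wrapper actions.

    Assumption: wrappers look like the "timeout" and "emit_progress"
    entries in the payload above, i.e. a dict with an "action" key.
    """
    for kind, body in action.items():
        if kind == "run":
            yield body
        elif isinstance(body, dict) and "action" in body:
            yield from find_run_actions(body["action"])


def docker_refs(payload):
    """Return the values of all -dockerRef= arguments in a staging request."""
    prefix = "-dockerRef="
    refs = []
    for run in find_run_actions(payload["task_definition"]["action"]):
        for arg in run.get("args", []):
            if arg.startswith(prefix):
                refs.append(arg[len(prefix):])
    return refs


# Trimmed copy of the payload above; only the fields needed here are kept.
sample = json.loads("""
{
  "task_definition": {
    "action": {
      "timeout": {
        "action": {
          "emit_progress": {
            "action": {
              "run": {
                "path": "/tmp/lifecycle/builder",
                "args": [
                  "-outputMetadataJSONFilename=/tmp/result.json",
                  "-dockerRef=gsatts/usagov-2021@sha256:b6a34d1afc391dfff44a43aa10c4a0e80c50f8c11b63df9485aa607320c6e7d2"
                ]
              }
            },
            "start_message": "Staging..."
          }
        },
        "timeout_ms": 42000
      }
    }
  }
}
""")

print(docker_refs(sample))
```

If the printed reference still contains the full @sha256:... suffix, the digest survived at least up to the staging request, which is what the payload above shows.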
I used a bbl environment with cf-deployment v41.0.0, which comes with CAPI 1.185.0. There, cf push with a sha256 reference fails. After that I downgraded only CAPI to 1.183.0, and cf push succeeds.
Correct @johha, that is the same configuration we used. The only addition was some manual CCDB schema cleanup for the DB migration scripts that were part of CAPI 1.184/1.185, so that CAPI 1.183 would run. The rest of the cf-deployment components in v41.0.0 were left as is.
I could reproduce the problem with the CATs docker_lifecycle test. It fails with capi-release 1.185.0 and succeeds with 1.183.0.
I've proposed https://github.com/cloudfoundry/relint-envs/pull/40 as a regression test for the cf-deployment validation.
I created some dev releases and can confirm that the issue is related to the introduction of CNB. With commit 60e06534481d9e584f9490d05aa651d8e751047a, just before #3778, cf push foo --docker-image images@sha256:1234 succeeds. After the CNB commits were merged (the last one is daffedca6dd499c2b8edef61d396c42c92714353), cf push foo --docker-image images@sha256:1234 fails.
If a tag is used instead of the sha256 reference, cf push is successful for all versions.
I couldn't find any differences in the database (package, droplet) between the successful and failing apps. The Cloud Controller logs did not provide any further information either.
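Since only digest references fail while tag references work, it may help to spell out how the two forms differ. The Python sketch below is an illustration only: split_image_reference is a hypothetical helper, it covers just a simplified subset of the Docker reference grammar, and it is not the code used by Cloud Controller or Diego. It also requires a full 64-character sha256 digest, so the shortened images@sha256:1234 placeholder used above would be rejected.

```python
import re

# Simplified: "sha256:" followed by exactly 64 lowercase hex characters.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def split_image_reference(ref):
    """Split an image reference into (repository, tag, digest).

    Hypothetical helper for illustration; it does not implement the
    complete Docker reference grammar.
    """
    repository, digest = ref, None
    if "@" in ref:
        # Everything after "@" is the content digest.
        repository, digest = ref.split("@", 1)
        if not DIGEST_RE.match(digest):
            raise ValueError("invalid reference format: bad digest %r" % digest)

    tag = None
    # A colon in the last path segment separates the tag; a colon in an
    # earlier segment belongs to a registry host:port.
    last_segment = repository.rsplit("/", 1)[-1]
    if ":" in last_segment:
        repository, tag = repository.rsplit(":", 1)

    return repository, tag, digest


digest_ref = (
    "gsatts/usagov-2021@sha256:"
    "b6a34d1afc391dfff44a43aa10c4a0e80c50f8c11b63df9485aa607320c6e7d2"
)
print(split_image_reference(digest_ref))
print(split_image_reference("images:latest"))  # "latest" is a made-up example tag
```

Any component that rebuilds the reference from its parts has to join the digest with "@" rather than with the tag separator, otherwise the result is no longer a valid reference. Whether that is what went wrong here is not established in this thread.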
Fixed with #3889 and shipped with CAPI 1.186.0.