bug: conda cache does not work
gaocegege opened this issue · 5 comments
gaocegege commented
Are you using the envd server?
- Yes, I am using the envd server.
- No, I am not using the envd server.
Describe the bug
def build():
    config.repo(url="https://github.com/tensorchord/envd", description="gnn")
    base(os="ubuntu20.04", language="python3.7")
    install.cuda(version="11.3.1")
    install.python_packages(name=[
        "dgllife",
    ])
    install.conda_packages(
        name=[
            "pytorch",
            "cudatoolkit=11.3",
            "rdkit",
            "dgl-cuda11.3",
        ],
        channel=[
            "pytorch",
            "conda-forge",
            "dglteam",
        ],
    )
    shell("bash")
The conda packages cannot be cached.
To Reproduce
- Build with the build.envd file above.
Expected behavior
No response
The docker info
output
None
The envd version
output
v0.3.11
Additional context
No response
gaocegege commented
Could you please have a look?
Electronic-Waste commented
Maybe I can give it a try.
Electronic-Waste commented
I can't download the dependencies... I wonder if it's due to my OS (macOS).
#32 [internal] /opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3 pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3
#32 73.23 done
#32 73.23 Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
#32 410.1 Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
#32 1866.9 Collecting package metadata (repodata.json): ...working... WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
#32 2157.3 done
#32 2157.3 Solving environment: ...working... DEBU[2024-01-09T13:15:48+08:00] stopping session
#32 ERROR: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
------
> importing cache manifest from docker.io/tensorchord/python-cache:envd-v0.3.43-cuda-11.3.1-cudnn-8:
------
------
> [internal] /opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3 pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3:
#32 2157.3 Solving environment: ...working...
#0 1.682 Collecting package metadata (current_repodata.json): ...working... WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
#32 73.23 done
failed with initial frozen solve. Retrying with flexible solve.
WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
#32 2078.4 WARNING conda.models.version:get_matcher(537): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
#32 2157.3 done
------
ERRO[2024-01-09T13:15:48+08:00] Buildkit error: failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
(1) attached stack trace
-- stack trace:
| github.com/tensorchord/envd/pkg/builder.generalBuilder.build.func1
| /home/runner/work/envd/envd/pkg/builder/build.go:265
| golang.org/x/sync/errgroup.(*Group).Go.func1
| /home/runner/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75
| runtime.goexit
| /opt/hostedtoolcache/go/1.19.10/x64/src/runtime/asm_arm64.s:1172
Wraps: (2) Buildkit error
Wraps: (3) failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
| (1) failed to solve: process "/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3" did not complete successfully: exit code: 137
| Error types: (1) *builder.BuildkitdErr
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *builder.BuildkitdErr
ERRO[2024-01-09T13:15:48+08:00] error="failed to load docker image: Post \"http://%2Fvar%2Frun%2Fdocker.sock/v1.43/images/load?quiet=1\": context canceled" language-version=v0 tag="envd-quick-start:dev"
FATA[2024-01-09T13:15:48+08:00] exit app=envd error="failed to build the image: failed to build: failed to wait error group: Buildkit error: failed to solve: process \"/opt/conda/bin/conda install -n envd -c pytorch -c conda-forge -c dglteam pytorch cudatoolkit=11.3 rdkit dgl-cuda11.3\" did not complete successfully: exit code: 137" version=v0.3.43
My docker info:
Client:
Version: 24.0.2
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.0
Path: /Users/x/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.19.1
Path: /Users/x/.docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /Users/x/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.20
Path: /Users/x/.docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.6
Path: /Users/x/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/x/.docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /Users/x/.docker/cli-plugins/docker-scan
scout: Command line tool for Docker Scout (Docker Inc.)
Version: 0.16.1
Path: /Users/x/.docker/cli-plugins/docker-scout
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 3
Server Version: 24.0.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc version: v1.1.7-0-g860f061
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.15.49-linuxkit-pr
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 5
Total Memory: 7.667GiB
Name: docker-desktop
ID: 0a1c4432-d01a-4090-b9da-8cf7b4464c9d
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
kemingy commented
@Electronic-Waste can you provide your envd build file?
Electronic-Waste commented
My envd build file is the one provided by @gaocegege at the beginning of this issue.