`suggestion-enas` rock permission denied error
Closed · 1 comment
misohu commented
Bug Description
When comparing the `suggestion-enas` rock to the upstream Docker image, I found that the rock reports a permission error on startup. The upstream Docker image does not have this problem.
To Reproduce
- docker run -ti "charmedkubeflow/suggestion-enas:v0.17.0-92cd6d9" -v
Environment
Docker
Relevant Log Output
2024-09-03T07:12:31.066Z [suggestion-enas] 2024-09-03 07:12:31.066701: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-03T07:12:31.067Z [suggestion-enas] 2024-09-03 07:12:31.066996: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-09-03T07:12:31.069Z [suggestion-enas] 2024-09-03 07:12:31.069109: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-09-03T07:12:31.095Z [suggestion-enas] 2024-09-03 07:12:31.095143: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
2024-09-03T07:12:31.095Z [suggestion-enas] To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-09-03T07:12:31.542Z [suggestion-enas] 2024-09-03 07:12:31.542531: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-09-03T07:12:31.837Z [suggestion-enas] WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
2024-09-03T07:12:31.837Z [suggestion-enas] I0000 00:00:1725347551.837823 95 config.cc:230] gRPC experiments enabled: call_status_override_on_cancellation, event_engine_dns, event_engine_listener, http2_stats_fix, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
2024-09-03T07:12:31.838Z [suggestion-enas] ENAS Suggestion Service
2024-09-03T07:12:31.838Z [suggestion-enas] Traceback (most recent call last):
2024-09-03T07:12:31.838Z [suggestion-enas] File "/opt/katib/cmd/suggestion/nas/enas/v1beta1/main.py", line 45, in <module>
2024-09-03T07:12:31.838Z [suggestion-enas] serve()
2024-09-03T07:12:31.838Z [suggestion-enas] File "/opt/katib/cmd/suggestion/nas/enas/v1beta1/main.py", line 31, in serve
2024-09-03T07:12:31.838Z [suggestion-enas] service = EnasService()
2024-09-03T07:12:31.838Z [suggestion-enas] File "/opt/katib/pkg/suggestion/v1beta1/nas/enas/service.py", line 161, in __init__
2024-09-03T07:12:31.838Z [suggestion-enas] os.makedirs("ctrl_cache/")
2024-09-03T07:12:31.838Z [suggestion-enas] File "/usr/lib/python3.10/os.py", line 225, in makedirs
2024-09-03T07:12:31.838Z [suggestion-enas] mkdir(name, mode)
2024-09-03T07:12:31.838Z [suggestion-enas] PermissionError: [Errno 13] Permission denied: 'ctrl_cache/'
2024-09-03T07:12:32.036Z [pebble] Service "suggestion-enas" stopped unexpectedly with code 1
2024-09-03T07:12:32.036Z [pebble] Service "suggestion-enas" on-failure action is "restart", waiting ~2s before restart (backoff 3)
Additional Context
The problem is that we skipped this section of the upstream Dockerfile while rewriting the rock, so the service cannot create its `ctrl_cache/` directory.
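To check whether a fix works, something like the sketch below could be run inside the container, from the directory the service starts in. It is only a diagnostic sketch, not code from Katib or the rock: `describe_dir_access` and `try_create_cache` are hypothetical helper names, and the only detail taken from the traceback is the relative `ctrl_cache/` path passed to `os.makedirs`. It assumes a Unix-like environment, because it uses `os.getuid`.

```python
import os


def describe_dir_access(path="."):
    """Report who owns a directory and whether this process can write to it."""
    st = os.stat(path)
    return {
        "path": os.path.abspath(path),
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "mode": oct(st.st_mode & 0o777),
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        # Creating an entry in a directory needs write and search permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }


def try_create_cache(dirname="ctrl_cache/", base="."):
    """Attempt the same kind of directory creation that fails in the traceback.

    Returns True on success and False on PermissionError.
    """
    target = os.path.join(base, dirname)
    try:
        # exist_ok=True is a choice made for this sketch so it can be re-run.
        os.makedirs(target, exist_ok=True)
        return True
    except PermissionError:
        return False


if __name__ == "__main__":
    # Print ownership and permission details for the current working directory.
    for key, value in describe_dir_access(".").items():
        print(f"{key}: {value}")
```

If `writable` is reported as `False` for the service's working directory, `try_create_cache()` should also return `False`, which would match the `PermissionError` in the log above.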
syncronize-issues-to-jira commented
Thank you for reporting your feedback to us!
The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6198.
This message was autogenerated