Logs not showing up in CloudWatch
ibegleris opened this issue · 6 comments
When deploying a question-answering model with later versions of Python, transformers, and PyTorch, the logs do not appear in CloudWatch. If I deploy the same model with older Python and module versions, the logs are visible in CloudWatch.
Steps to recreate the issue
Case 1, No logs seen
Code
from sagemaker.huggingface import HuggingFaceModel

hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-distilled-squad',
    'HF_TASK': 'question-answering'
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.12.3",
    pytorch_version="1.9.1",
    py_version="py38",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

data = {
    "inputs": {
        "question": "What is used for inference?",
        "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
    }
}

# request
predictor.predict(data)
logs:
Downloading: 100%|██████████| 2.74k/2.74k [00:00<00:00, 3.36MB/s]
Downloading: 100%|██████████| 451/451 [00:00<00:00, 387kB/s]
Downloading: 100%|██████████| 265M/265M [00:04<00:00, 62.1MB/s]
Downloading: 100%|██████████| 466k/466k [00:00<00:00, 1.52MB/s]
Downloading: 100%|██████████| 28.0/28.0 [00:00<00:00, 20.7kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 757kB/s]
WARNING - Overwriting /.sagemaker/mms/models/distilbert-base-uncased-distilled-squad ...
Warning: MMS is using non-default JVM parameters: -XX:-UseContainerSupport
Model server started.
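As an aside, `predictor.predict(data)` for the question-answering task returns the transformers QA pipeline output, a dict with `score`, `start`, `end`, and `answer` keys. A minimal helper for consuming that response might look like the following (hypothetical; the confidence threshold is an arbitrary choice, not part of the API):

```python
def extract_answer(prediction, min_score=0.5):
    """Return the answer string if the model is confident enough, else None.

    `prediction` is the dict returned by the question-answering pipeline,
    e.g. {'score': 0.97, 'start': 68, 'end': 77, 'answer': 'sagemaker'}.
    """
    if prediction.get("score", 0.0) >= min_score:
        return prediction["answer"]
    return None
```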
Case 2, logs visible
Code
from sagemaker.huggingface import HuggingFaceModel

hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-distilled-squad',
    'HF_TASK': 'question-answering'
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

data = {
    "inputs": {
        "question": "What is used for inference?",
        "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
    }
}

# request
predictor.predict(data)
logs:
This is an experimental beta features, which allows downloading model from the Hugging Face Hub on start up. It loads the model defined in the env var `HF_MODEL_ID`
Downloading: 100%|██████████| 2.74k/2.74k [00:00<00:00, 2.74MB/s]
Downloading: 100%|██████████| 451/451 [00:00<00:00, 473kB/s]
Downloading: 100%|██████████| 265M/265M [00:03<00:00, 70.8MB/s]
Downloading: 100%|██████████| 466k/466k [00:00<00:00, 1.23MB/s]
Downloading: 100%|██████████| 28.0/28.0 [00:00<00:00, 19.5kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 762kB/s]
WARNING - Overwriting /.sagemaker/mms/models/distilbert-base-uncased-distilled-squad ...
2022-03-23 14:31:16,829 [INFO ] main com.amazonaws.ml.mms.ModelServer -
MMS Home: /opt/conda/lib/python3.6/site-packages
Current directory: /
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 1
Max heap size: 3234 M
Python executable: /opt/conda/bin/python3.6
Config file: /etc/sagemaker-mms.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8080
Model Store: /.sagemaker/mms/models
Initial Models: ALL
Log dir: /logs
Metrics dir: /logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Preload model: false
Prefer direct buffer: false
2022-03-23 14:31:16,921 [WARN ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-distilbert-base-uncased-d
2022-03-23 14:31:17,049 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_huggingface_inference_toolkit.handler_service --model-path /.sagemaker/mms/models/distilbert-base-uncased-distilled-squad --model-name distilbert-base-uncased-distilled-squad --preload-model false --tmp-dir /home/model-server/tmp
2022-03-23 14:31:17,051 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
2022-03-23 14:31:17,051 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 35
2022-03-23 14:31:17,051 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
2022-03-23 14:31:17,051 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.6.13
2022-03-23 14:31:17,052 [INFO ] main com.amazonaws.ml.mms.wlm.ModelManager - Model distilbert-base-uncased-distilled-squad loaded.
2022-03-23 14:31:17,057 [INFO ] main com.amazonaws.ml.mms.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2022-03-23 14:31:17,074 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
2022-03-23 14:31:17,151 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080
Model server started.
2022-03-23 14:31:17,175 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
2022-03-23 14:31:17,176 [WARN ] pool-2-thread-1 com.amazonaws.ml.mms.metrics.MetricCollector - worker pid is not available yet.
2022-03-23 14:31:18,810 [INFO ] W-9000-distilbert-base-uncased-d-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model distilbert-base-uncased-distilled-squad loaded io_fd=daebe6fffed0ff6d-00000017-00000001-e136dd8f47306764-937c7c58
2022-03-23 14:31:18,816 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1563
2022-03-23 14:31:18,818 [WARN ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-distilbert-base-uncased-d-1
2022-03-23 14:31:19,265 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 31
2022-03-23 14:31:24,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:31:29,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:31:34,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:31:39,201 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:31:44,201 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:31:49,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:31:54,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:31:57,949 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Preprocess time - 0.028371810913085938 ms
2022-03-23 14:31:57,949 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Predict time - 1648045917948.6992 ms
2022-03-23 14:31:57,949 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 70
2022-03-23 14:31:57,949 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Postprocess time - 0.07843971252441406 ms
2022-03-23 14:31:57,950 [INFO ] W-9000-distilbert-base-uncased-d ACCESS_LOG - /169.254.178.2:51978 "POST /invocations HTTP/1.1" 200 74
2022-03-23 14:31:59,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:32:04,201 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:32:09,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:32:14,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:32:19,202 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 1
2022-03-23 14:32:24,201 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
2022-03-23 14:47:01,381 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Preprocess time - 0.029802322387695312 ms
2022-03-23 14:47:01,381 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Predict time - 1648046821381.0054 ms
2022-03-23 14:47:01,381 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 61
2022-03-23 14:47:01,381 [INFO ] W-9000-distilbert-base-uncased-d ACCESS_LOG - /169.254.178.2:51978 "POST /invocations HTTP/1.1" 200 62
2022-03-23 14:47:01,381 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Postprocess time - 0.08940696716308594 ms
2022-03-23 14:47:02,644 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Preprocess time - 0.027418136596679688 ms
2022-03-23 14:47:02,644 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 61
2022-03-23 14:47:02,644 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Predict time - 1648046822644.274 ms
2022-03-23 14:47:02,645 [INFO ] W-9000-distilbert-base-uncased-d ACCESS_LOG - /169.254.178.2:51978 "POST /invocations HTTP/1.1" 200 62
2022-03-23 14:47:02,645 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Postprocess time - 0.08702278137207031 ms
2022-03-23 14:47:03,714 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Preprocess time - 0.02288818359375 ms
2022-03-23 14:47:03,715 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Predict time - 1648046823714.5115 ms
2022-03-23 14:47:03,715 [INFO ] W-9000-distilbert-base-uncased-d com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 61
2022-03-23 14:47:03,715 [INFO ] W-distilbert-base-uncased-d-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Postprocess time - 0.08249282836914062 ms
2022-03-23 14:47:03,715 [INFO ] W-9000-distilbert-base-uncased-d ACCESS_LOG - /169.254.178.2:51978 "POST /invocations HTTP/1.1" 200 61
2022-03-23 14:47:04,201 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:51978 "GET /ping HTTP/1.1" 200 0
You say that with an older version you see logs but not with a newer one? The cases you attached are the other way around. Which is correct?
Yeah sorry, fixed it.
I can see the logs with the old versions but not with the new ones.
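One way to check whether anything reached CloudWatch at all is to list the endpoint's log streams directly. SageMaker writes real-time endpoint logs under the log group `/aws/sagemaker/Endpoints/<endpoint-name>`. A sketch, assuming boto3 is installed and configured with credentials that allow `logs:DescribeLogStreams` (the endpoint name and region below are placeholders):

```python
def endpoint_log_group(endpoint_name):
    """CloudWatch log group SageMaker creates for a real-time endpoint."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"

def list_endpoint_log_streams(endpoint_name, region="us-east-1"):
    """Return the endpoint's log stream names (empty if nothing was logged)."""
    import boto3  # requires AWS credentials with logs:DescribeLogStreams
    logs = boto3.client("logs", region_name=region)
    resp = logs.describe_log_streams(logGroupName=endpoint_log_group(endpoint_name))
    return [s["logStreamName"] for s in resp["logStreams"]]
```

If the list comes back empty (or the log group does not exist) right after a deploy, the container never shipped its stdout/stderr to CloudWatch, which matches the Case 1 behaviour above.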
@ibegleris I got in touch with the SageMaker team. I'll come back when I know something.
Looks like the logs are present in the HF DLC with PT 1.8 as well.
I can confirm that with transformers_version = '4.17' and pytorch_version = '1.10' the logs appear again.
Update: The issue is that the HF PT 1.9 containers do not have the latest version of sagemaker-inference. We are working internally to resolve this.