[Bug Report] Status: Failed when host Llama2-70B on Amazon SageMaker using TRT-LLM LMI container
NarenZen opened this issue · 0 comments
NarenZen commented
Link to the notebook
Notebook
Describe the bug
A clear and concise description of what the bug is.
**To reproduce**
import time
resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)
while status == "Creating":
time.sleep(60)
resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)
print("Arn: " + resp["EndpointArn"])
print("Status: " + status)
Running the above code, gives the below error
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Failed
Arn: arn:aws:sagemaker:us-west-2:884893011661:endpoint/llama2-djl-2023-11-30-05-48-40-806-endpoint1
Status: Failed
Logs
If applicable, add logs to help explain your problem.
You may also attach an .ipynb
file to this issue if it includes relevant logs or output.