SageMaker serverless - ValidationError: oversize body
sid8491 opened this issue
Describe the bug
I have an image classification model stored in an S3 location.
If I deploy the model as a real-time endpoint, I can run predictions on images of any size; however, if I deploy it as a serverless endpoint, predictions fail for images larger than 400x400.
I get the following error:
ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Request {request-id} has oversize body.
To reproduce
Deploy the model as a serverless endpoint:
from sagemaker.tensorflow import TensorFlowModel
from sagemaker import get_execution_role
from sagemaker import Session
import boto3
from sagemaker.serverless import ServerlessInferenceConfig

print('starting ...')

model_data = "s3://datascience--sagemaker/model_repository/Reimbursement Flow/blur_classifier/model.tar.gz"
role = get_execution_role()
sess = Session()
bucket = sess.default_bucket()
region = boto3.Session().region_name
tf_framework_version = '2.0.0'

sm_model = TensorFlowModel(model_data=model_data,
                           framework_version=tf_framework_version,
                           role=role)

predictor = sm_model.deploy(
    endpoint_name='blur-classifier-serverless',
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=2048,
        max_concurrency=1,
    ),
)
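
Once deploy() returns, a quick sanity check (a sketch using the boto3 SageMaker client; the endpoint name is taken from the deploy call above) confirms the endpoint really got a serverless config:

import boto3

sm = boto3.client("sagemaker")
desc = sm.describe_endpoint(EndpointName="blur-classifier-serverless")
cfg = sm.describe_endpoint_config(EndpointConfigName=desc["EndpointConfigName"])
# Prints the ServerlessConfig (memory size, max concurrency) for the
# variant, or None if the endpoint is instance-backed.
print(cfg["ProductionVariants"][0].get("ServerlessConfig"))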
Predict as:
from PIL import Image
import numpy as np
import boto3
import json

runtime = boto3.client("sagemaker-runtime")

img_file = "doc_classifier_images/012_0205.jpg"
img_file = "0ed2d3e1-fe50-4026-b3bf-9aef535f48cc.jpg"

img = Image.open(img_file)
size = 600
img = img.resize((size, size))
img = np.array(img)
img = img.reshape((1, size, size, 3))
img = img / 255.
img = np.around(img, decimals=3)
payload = json.dumps(np.asarray(img).astype(float).tolist())

model_name = "blur-classifier-serverless"
content_type = "application/json"
response = runtime.invoke_endpoint(
    EndpointName=model_name,
    ContentType=content_type,
    Body=payload,
)
pred = json.load(response['Body'])
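
For context on the error itself: Serverless Inference caps the InvokeEndpoint request body at 4 MB, while real-time endpoints accept up to 6 MB, and the JSON tensor above grows quadratically with the image side. A minimal check, reusing the payload built above:

# The serialized body is what InvokeEndpoint validates; serverless
# rejects anything over 4 MB with the "oversize body" error reported here.
print(f"{size}x{size} payload: {len(payload.encode('utf-8')) / 1e6:.2f} MB")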
Expected behavior
The prediction should succeed.
With size = 400 it works fine, and if I deploy the model as a real-time endpoint it works for both 400x400 and 600x600 images.
Additional context
Deployment script used for the real-time endpoint:
from sagemaker.tensorflow import TensorFlowModel
from sagemaker import get_execution_role
from sagemaker import Session

print('starting ...')

model_data = "s3://datascience--sagemaker/model_repository/Reimbursement Flow/blur_classifier/model.tar.gz"
role = get_execution_role()
sess = Session()
bucket = sess.default_bucket()
instance_type = 'ml.m4.xlarge'
tf_framework_version = '2.0.0'

sm_model = TensorFlowModel(model_data=model_data,
                           framework_version=tf_framework_version,
                           role=role)

# Now to deploy the model
tf_predictor = sm_model.deploy(endpoint_name="blurclassifier-server",
                               initial_instance_count=1,
                               instance_type=instance_type)
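
One possible workaround, sketched here under stated assumptions rather than taken from this report: stay under the 4 MB serverless cap by sending the raw JPEG bytes and decoding them server-side. The SageMaker TensorFlow Serving container supports an optional inference.py with input/output handlers (packaged with the model or passed via entry_point); the handler below assumes the model takes a (1, 600, 600, 3) float tensor, and application/x-image is an arbitrary content type chosen for this sketch.

# inference.py -- hypothetical pre/post-processing script for the
# SageMaker TensorFlow Serving container (not tested here).
import io
import json

import numpy as np
from PIL import Image

SIZE = 600  # assumed model input size


def input_handler(data, context):
    # Decode a JPEG request body into the JSON payload TF Serving expects.
    if context.request_content_type == "application/x-image":
        img = Image.open(io.BytesIO(data.read())).convert("RGB").resize((SIZE, SIZE))
        arr = np.asarray(img, dtype=np.float32) / 255.0
        return json.dumps({"instances": arr[np.newaxis, ...].tolist()})
    raise ValueError(f"Unsupported content type: {context.request_content_type}")


def output_handler(response, context):
    # Pass TF Serving's JSON response straight back to the caller.
    return response.content, context.accept_header

The client then sends a few hundred KB of JPEG instead of several MB of JSON:

with open(img_file, "rb") as f:
    response = runtime.invoke_endpoint(
        EndpointName=model_name,
        ContentType="application/x-image",
        Body=f.read(),
    )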