aws/amazon-sagemaker-examples

[Bug Report] XGBoost in SageMaker gives wildly different (and incorrect) results from my local model

pfan-well opened this issue · 0 comments

Describe the bug
I trained a binary classification model with SageMaker's container for XGBoost 1.7-1.
I had previously developed an XGBoost model locally for the same dataset.
The positive rate in the dataset is generally below 4%, so positive cases are rare.

When I compared the predicted probabilities from the SageMaker built-in model with those from my local model, the results were essentially opposite.
Given the low positive rate, I believe the SageMaker model's outputs are incorrect.
See the screenshots below.

I checked the training inputs and they are identical, except that locally I fed the model a CSV file, whereas the SageMaker XGBoost job required the data in libsvm format. After double-checking, the training data were the same. I also passed the same hyperparameters.
[Screenshots comparing the predicted probabilities from the SageMaker model and from my local model]
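
A minimal sketch of the kind of CSV-to-libsvm conversion and sanity check involved (assuming the label is the first CSV column and using sklearn's dump_svmlight_file; the file names are placeholders):

import numpy as np
import pandas as pd
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# Assumption: the label is the first column of the CSV (the layout the
# SageMaker built-in XGBoost container expects for CSV input).
df = pd.read_csv("train.csv", header=None)   # placeholder file name
y = df.iloc[:, 0].to_numpy()
X = df.iloc[:, 1:].to_numpy()

# Write the libsvm file that gets uploaded to S3 for the SageMaker job.
dump_svmlight_file(X, y, "train.libsvm")

# Sanity check: reload the libsvm file and confirm labels and features
# round-trip unchanged; a flipped or shifted label column here could
# produce roughly "opposite" predicted probabilities.
X_back, y_back = load_svmlight_file("train.libsvm")
assert np.array_equal(y_back, y)
assert np.allclose(X_back.toarray(), X)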

To reproduce
For SageMaker:

from sagemaker.xgboost.estimator import XGBoost

# Version 1: script-mode training with the XGBoost framework estimator
xgb_script_mode_estimator = XGBoost(
    entry_point=script_path,          # training script
    framework_version="1.7-1",
    # hyperparameters=hyperparameters,
    role=role,
    instance_count=2,                 # two-instance (distributed) training
    instance_type=instance_type,
    output_path=output_path,
    code_location=output_path,
)
# then call fit() on the estimator, e.g.:
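# (a sketch; the S3 path and channel name are placeholders, and I am
#  assuming the training data was uploaded to S3 in libsvm format)
from sagemaker.inputs import TrainingInput
train_input = TrainingInput("s3://my-bucket/xgb/train.libsvm",
                            content_type="text/libsvm")
xgb_script_mode_estimator.fit({"train": train_input})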

# Version 2: the built-in XGBoost algorithm via its image URI
import boto3
import sagemaker
from sagemaker.amazon.amazon_estimator import get_image_uri

# note: get_image_uri is deprecated in SageMaker SDK v2 in favor of sagemaker.image_uris.retrieve
container = get_image_uri(boto3.Session().region_name, "xgboost", "1.7-1")

xgb = sagemaker.estimator.Estimator(
    container,
    role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path="s3://{}/{}/{}/output".format(s3_bucket, key, "no-show-xgb"),
    sagemaker_session=sess,
)
# then set hyperparameters and call fit(), e.g.:
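# (a sketch; the hyperparameter values and S3 path are placeholders,
#  not necessarily the exact values I used)
from sagemaker.inputs import TrainingInput
xgb.set_hyperparameters(
    objective="binary:logistic",  # same objective as the local model
    num_round=100,
)
train_input = TrainingInput("s3://my-bucket/xgb/train.libsvm",
                            content_type="text/libsvm")
xgb.fit({"train": train_input})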

Is there any way to debug this issue?
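
For example, one check I plan to try is pulling the trained model artifact out of S3 and scoring the same rows with it locally, to see whether the trained booster itself disagrees with my local model or only the inference path does. A rough sketch (bucket, key, and file names are placeholders; I am also assuming the artifact contains a booster file that xgboost.Booster.load_model can read, since some container/script combinations pickle the booster instead):

import tarfile

import boto3
import xgboost as xgb

# Download and unpack the model artifact written by the training job
# (bucket and key are placeholders).
boto3.client("s3").download_file("my-bucket", "xgb/output/model.tar.gz", "model.tar.gz")
with tarfile.open("model.tar.gz") as tar:
    tar.extractall(".")

# The built-in container typically saves the booster as "xgboost-model";
# if this fails, the file may be a pickled Booster instead.
booster = xgb.Booster()
booster.load_model("xgboost-model")

# Score the same libsvm data used in the comparison above.
dmat = xgb.DMatrix("train.libsvm")
print(booster.predict(dmat)[:10])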