A .Net model building job that builds several models using MLB Baseball data from 1876 - 2023.
The outcome are two classification supervised learning predictions:
- On Hall Of Fame Ballot - whether a batter will be on the Hall of Fame Ballot, based on their career statistics
- Inducted To Hall Of Fame - whether a batter will be inducted to the Hall of Fame, based on their career statistics
The model building job includes the following features:
- Builds multiple ML.NET binary classification models in a single C# "script" (training job)
- Dynamic Feature Selection - Select features from a configuration array to adjust model input dynamically
- Dynamic Supervised Learning - Includes two label fields in a single data set, that can be switched dynamically
- Base data transformer pipeline that is re-used for all trained models as a base
- Reports various performance metrics using a pre-defined holdout set
- Persists the trained models in two different formats: native ML.NET and ONNX
- Loads the persisted models from storage and performs model explainability
- Applies simple perscriptive/rules engine to select the "best model"
- Selected "best model" is used for inference on new ficticious baseball player careers (to verify overall performance)
Current Requirements:
- Visual Studio 2022 IDE (Community SKU+), .NET 9.x, ML.NET v4.0
Requirements (what the solution has been developed, compiled with orginally through current):
- Visual Studio 2019 - 2022 IDE (Community SKU+), .NET Core 3.x - .NET 8.x, ML.NET v1.1 - v3.0.1