AI: hyperparameter, metric, metricDecisionThreshold - can be multiple. Property talks about only one.
VenkatTechnologist opened this issue · 11 comments
There could be multiple hyperparameters, metrics, and metricDecisionThresholds for an AI model.
The properties 'hyperparameter', 'metric' and 'metricDecisionThreshold' talk about only one value, and their class
is DictionaryEntry, which can hold only one key-value pair. How can we accommodate multiple hyperparameters,
metrics, and metricDecisionThresholds of an AI model in the AI profile of SPDX?
Some real-world, open-source examples where multiple hyperparameters are used at once:
Machine Learning:
- XGBoost (eXtreme Gradient Boosting) : n_estimators, learning_rate, max_depth, L1/L2 regularization parameters
- TensorFlow/Keras Convolutional Neural Network (CNN): No. of convolution layers & filters, Kernel size, Pooling layer settings, Optimizer and learning rate
- Scikit-learn Support Vector Machine (SVM): Regularization parameter C, Kernel, Gamma (for RBF Kernel)
Generative-AI:
- StyleGAN2 - TensorFlow/Keras (Generative Adversarial Network): No. of layers & filters, Learning rates for generator and discriminator, Noise input dimension, Regularization hyperparameters
- BicycleGAN - PyTorch (Conditional GAN): Network architectures for generator & discriminator, Loss functions, Weight normalization
- OpenAI Gym with TensorFlow/Keras (Reinforcement Learning for Generative Models): Reward function design, Exploration vs exploitation, RL algorithm hyperparameters
A model can also be assessed with multiple metrics, each with its own decision threshold, in machine learning as well as generative AI algorithms.
Real-world, open-source examples:
Generative AI:
- StyleGAN2 - PyTorch (Generative Adversarial Networks): Fréchet Inception Distance (FID), Inception Score (IS), Human Evaluation
- MelNet - TensorFlow (WaveNet-based Text-to-Speech): Mel-Cepstral Distortion (MCD), Log mel Spectrogram Similarity
- OpenAI GPT-2 (Large Language Model): Perplexity, BLEU
Machine Learning:
- Scikit-learn Multi-Class Classification: Accuracy, Confusion Matrix, Precision and Recall, F1-Score
- TensorFlow/Keras Object Detection: Mean Average Precision, Intersection over Union
- XGBoost Regression with Feature Importance: Mean Squared Error, R-squared, Feature Importance
On that note, it looks like we need a class called 'Dictionary' to list multiple hyperparameters, metrics, and metric thresholds.
See #773.
In AIPackage, all of them (hyperparameter, metric, metricDecisionThreshold) have minCount = 0 and no maxCount, which means one AIPackage can have multiple of these properties (that is, multiple entries in an array).
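As a rough illustration of that cardinality, here is a minimal Python sketch. It is not the SPDX reference implementation or its serialization format: the two dataclasses are hypothetical stand-ins that only mirror the shape discussed in this thread (each DictionaryEntry holds one key-value pair, and each property holds zero or more entries). The package name and all values are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DictionaryEntry:
    """Hypothetical stand-in: exactly one key-value pair per entry."""
    key: str
    value: str


@dataclass
class AIPackage:
    """Hypothetical stand-in: each property is a list (cardinality 0..*)."""
    name: str
    hyperparameter: List[DictionaryEntry] = field(default_factory=list)
    metric: List[DictionaryEntry] = field(default_factory=list)
    metricDecisionThreshold: List[DictionaryEntry] = field(default_factory=list)


def as_dict(entries: List[DictionaryEntry]) -> Dict[str, str]:
    """Collapse a list of single-pair entries into one lookup table."""
    return {entry.key: entry.value for entry in entries}


# Placeholder values only; they do not describe any real model.
model = AIPackage(
    name="example-xgboost-model",
    hyperparameter=[
        DictionaryEntry("n_estimators", "200"),
        DictionaryEntry("learning_rate", "0.1"),
        DictionaryEntry("max_depth", "6"),
    ],
    metric=[
        DictionaryEntry("mean_squared_error", "0.042"),
        DictionaryEntry("r_squared", "0.91"),
    ],
    metricDecisionThreshold=[
        DictionaryEntry("r_squared", "0.85"),
    ],
)

# minCount = 0: a package with no entries at all is also valid.
empty_model = AIPackage(name="example-empty-model")
```

So the "one key-value pair" limit applies to each entry, not to the property: repeating the property gives the list of hyperparameters, metrics, and thresholds asked about above.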
I think we can close this one, as it is very clear that the current model can accommodate the expressed concern.
It will be through relationships like contains and dependsOn.
See possible relationship types here: https://spdx.github.io/spdx-spec/v3.0/model/Core/Vocabularies/RelationshipType/
@goneall I think we can close this one as it is a non-issue: the cardinality of this property in AIPackage is 0..* -- @VenkatTechnologist has agreed on this.
Closing per above suggestion