Primary Language: Jupyter Notebook

GDR3

This study examines how the choice of sampling technique affects anomaly detection in astronomical data, using the Gaia space telescope dataset. With a Random Forest classifier as the model, two sampling methods (random sampling and uncertainty sampling) are evaluated for their effect on the class distribution of the sampled dataset and, consequently, on the classifier's accuracy in distinguishing normal from anomalous stellar objects. Uncertainty sampling significantly outperforms random sampling on both measures: the class distribution of the sampled dataset and the classifier's accuracy. These findings highlight the importance of choosing an appropriate sampling method when handling imbalanced datasets and offer insights for improving anomaly detection in astronomy.
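
The two sampling methods described above can be illustrated with a minimal sketch. This is not the notebook code from this repository. It assumes scikit-learn is installed, uses synthetic imbalanced data from `make_classification` as a stand-in for Gaia features, and defines "uncertainty" as how close the predicted anomaly probability is to 0.5, which is one common choice rather than necessarily the one used in the study. The helper names `random_sampling` and `uncertainty_sampling`, the seed-set sizes, and the batch size are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def random_sampling(n_pool, n_samples, rng):
    """Pick n_samples pool positions uniformly at random, without replacement."""
    return rng.choice(n_pool, size=n_samples, replace=False)


def uncertainty_sampling(model, X_pool, n_samples):
    """Pick the pool positions whose predicted P(anomaly) is closest to 0.5."""
    proba = model.predict_proba(X_pool)[:, 1]
    uncertainty = -np.abs(proba - 0.5)  # 0 is most uncertain, -0.5 is least
    return np.argsort(uncertainty)[-n_samples:]


rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumption): roughly 5% of the
# objects are labelled anomalous (class 1), the rest normal (class 0).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)

# Small labelled seed set containing both classes (sizes are illustrative).
normal_idx = np.flatnonzero(y == 0)
anomaly_idx = np.flatnonzero(y == 1)
seed_idx = np.concatenate([
    rng.choice(normal_idx, size=40, replace=False),
    rng.choice(anomaly_idx, size=10, replace=False),
])
pool_idx = np.setdiff1d(np.arange(len(y)), seed_idx)

# Train the initial classifier on the seed set only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[seed_idx], y[seed_idx])

# Draw one batch from the unlabelled pool with each method.
batch = 100
rand_pick = pool_idx[random_sampling(len(pool_idx), batch, rng)]
unc_pick = pool_idx[uncertainty_sampling(model, X[pool_idx], batch)]

# Compare the class distribution of the two sampled batches.
print(f"anomaly fraction, random sampling:      {y[rand_pick].mean():.3f}")
print(f"anomaly fraction, uncertainty sampling: {y[unc_pick].mean():.3f}")
```

To compare classifier accuracy as well as class distribution, each batch could be added to the seed set, the model retrained, and both models scored on a held-out test split. On real data, `X` and `y` would be replaced by the Gaia feature table and its anomaly labels.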