Hybrid CNN and Vision Transformer (ViT) with Asymmetric Loss (ASL) for Multi-label TB Classification

Tuberculosis (TB), caused by Mycobacterium tuberculosis, is a leading cause of death worldwide. Early detection of TB relies on bacterial isolation and chest X-ray (CXR) imaging, but CXR interpretation is time-consuming and subjective. To address this, we present an approach that uses artificial intelligence (AI) and deep learning for TB diagnosis, monitoring, and early detection. Our study focuses on a hybrid model that combines the Vision Transformer (ViT) and Convolutional Neural Network (CNN) architectures for efficient analysis of CXR images. We designed a multi-label classification model trained on imbalanced CXR data covering 14 specific TB-related anomalies. Our hybrid of ViT and EfficientNet, trained with Asymmetric Loss, achieved good performance.
vanya2v/Hybrid-CNN-ViT-ASL
Hybrid CNN and Vision Transformer with Asymmetric Loss for Multi-label TB Classification
Jupyter Notebook, MIT license
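
The abstract names Asymmetric Loss as the training objective for imbalanced multi-label data but does not spell it out. The sketch below shows the commonly used form of that loss for a single image with several independent labels. It is not taken from the repository: the function name `asymmetric_loss` and the values `gamma_pos=0.0`, `gamma_neg=4.0`, and `margin=0.05` are illustrative assumptions, and the repository's own settings may differ. It is written in plain Python for readability, whereas a training notebook would more likely use a tensor library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one sample.

    logits  -- raw model outputs, one per label
    targets -- 0/1 ground truth, one per label
    The hyperparameter defaults are assumptions for illustration only.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        if y == 1:
            # Positive label: focal-style weighting with gamma_pos.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin so
            # that very easy negatives contribute exactly zero, then apply
            # the stronger focusing exponent gamma_neg.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


# Hypothetical example: 14 labels, of which only the first is present.
example_logits = [2.0] + [-3.0] * 13
example_targets = [1] + [0] * 13
loss = asymmetric_loss(example_logits, example_targets)
```

With both focusing exponents and the margin set to zero, the function reduces to ordinary binary cross-entropy. Raising `gamma_neg` above `gamma_pos` down-weights the many easy negative labels, which is the property that makes the loss a natural fit for imbalanced multi-label CXR data.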