/transunet_pytorch

Primary language: Python. License: MIT.

Panoramic Dental X-Ray Image Semantic Segmentation with TransUNet

An unofficial PyTorch implementation of TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.

Output of my implementation. (A) Original X-Ray Image; (B) Merged Image of the Predicted Segmentation Map and Original X-Ray; (C) Ground Truth; (D) Predicted Segmentation Map
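Panel (B) merges the predicted segmentation map with the original X-ray. One common way to produce such an image is alpha blending, sketched below. This is an illustration only: the function name `overlay_mask`, the tint color, and the `alpha` value are assumptions, and the repository may build its merged image differently.

```python
import numpy as np


def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.4):
    """Blend a binary mask onto a grayscale image (hypothetical helper).

    image: 2-D array of gray values in [0, 255]
    mask:  2-D array of the same shape; nonzero marks predicted pixels
    Returns an (H, W, 3) uint8 RGB image.
    """
    image = np.asarray(image)
    mask = np.asarray(mask).astype(bool)
    if image.ndim != 2 or image.shape != mask.shape:
        raise ValueError("image and mask must be 2-D arrays of the same shape")

    # Replicate the gray channel so the tint can be applied in RGB.
    rgb = np.stack([image] * 3, axis=-1).astype(np.float32)
    tint = np.array(color, dtype=np.float32)

    # Blend only where the mask is set; other pixels keep their gray value.
    rgb[mask] = (1.0 - alpha) * rgb[mask] + alpha * tint
    return np.clip(rgb.round(), 0, 255).astype(np.uint8)
```

A lower `alpha` keeps more of the underlying X-ray visible, which helps when comparing the prediction against anatomical detail.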

TransUNet

  • On various medical image segmentation tasks, the U-shaped architecture, also known as U-Net, has become the de facto standard and achieved tremendous success. However, due to the intrinsic locality of convolution operations, U-Net generally demonstrates limitations in explicitly modeling long-range dependency. [1]
  • TransUNet employs a hybrid CNN-Transformer architecture to leverage both detailed high-resolution spatial information from CNN features and the global context encoded by Transformers. [1]

Model Architecture

TransUNet Architecture Figure from Official Paper
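To make the hybrid idea concrete, here is a deliberately small PyTorch sketch: a CNN produces multi-scale features, a Transformer encoder adds global context to the coarsest feature map, and a decoder upsamples while concatenating the CNN skip features. It is not this repository's model and is far smaller than the network in the paper; the class names, channel widths, depth, and head count are all assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBlock(nn.Module):
    """3x3 convolution + batch norm + ReLU (hypothetical building block)."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class TinyHybridSegmenter(nn.Module):
    """Toy CNN-Transformer encoder with a U-Net style decoder (illustrative only)."""

    def __init__(self, in_ch=1, num_classes=1, dim=64, depth=2, heads=4, max_tokens=1024):
        super().__init__()
        # CNN encoder: each stage halves the spatial resolution.
        self.enc1 = ConvBlock(in_ch, 16, stride=2)  # 1/2 resolution
        self.enc2 = ConvBlock(16, 32, stride=2)     # 1/4 resolution
        self.enc3 = ConvBlock(32, dim, stride=2)    # 1/8 resolution
        # Learned positional embedding, one vector per token position.
        self.pos = nn.Parameter(torch.zeros(1, max_tokens, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Decoder: upsample, concatenate the matching skip feature, convolve.
        self.dec2 = ConvBlock(dim + 32, 32)
        self.dec1 = ConvBlock(32 + 16, 16)
        self.head = nn.Conv2d(16, num_classes, kernel_size=1)

    def forward(self, x):
        s1 = self.enc1(x)
        s2 = self.enc2(s1)
        f = self.enc3(s2)

        b, c, h, w = f.shape
        if h * w > self.pos.shape[1]:
            raise ValueError("input too large for the positional embedding")
        # Treat every spatial location of the coarsest map as one token.
        tokens = f.flatten(2).transpose(1, 2)  # (B, h*w, C)
        tokens = self.transformer(tokens + self.pos[:, : h * w])
        f = tokens.transpose(1, 2).reshape(b, c, h, w)

        f = F.interpolate(f, size=s2.shape[-2:], mode="bilinear", align_corners=False)
        f = self.dec2(torch.cat([f, s2], dim=1))
        f = F.interpolate(f, size=s1.shape[-2:], mode="bilinear", align_corners=False)
        f = self.dec1(torch.cat([f, s1], dim=1))
        f = F.interpolate(f, size=x.shape[-2:], mode="bilinear", align_corners=False)
        return self.head(f)  # per-pixel logits, same height and width as the input


model = TinyHybridSegmenter().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 1, 64, 96))
print(logits.shape)
```

The skip connections carry the high-resolution detail that the Transformer tokens no longer hold, which is the division of labor the description above attributes to the hybrid design.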

Dependencies

  • Python 3.6+
  • pip install -r requirements.txt

Dataset

  • The UFBA_UESC_DENTAL_IMAGES [2] dataset was used for training.
  • The dataset is available on request [3].

Training

  • Training can be started with the following command:
    • python main.py --mode train --model_path ./path/to/model --train_path ./path/to/trainset --test_path ./path/to/testset

Inference

  • After the model has been trained, inference can be run with the following command:
    • python main.py --mode inference --model_path ./path/to/model --image_path ./path/to/image
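The training and inference commands share one entry point and differ only in `--mode` and the path flags. The sketch below shows one way such a command line could be parsed with `argparse`. The flag names come from the commands above; everything else (the function names, the help text, and which flags are required in which mode) is an assumption and may not match the repository's main.py.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the commands above (hypothetical)."""
    parser = argparse.ArgumentParser(description="TransUNet training and inference")
    parser.add_argument("--mode", choices=["train", "inference"], required=True)
    parser.add_argument("--model_path", required=True,
                        help="where the model is saved to or loaded from")
    parser.add_argument("--train_path", help="training set directory (train mode)")
    parser.add_argument("--test_path", help="test set directory (train mode)")
    parser.add_argument("--image_path", help="input image (inference mode)")
    return parser


def parse_args(argv=None):
    """Parse argv and check that the selected mode has the paths it needs."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed rule: each mode requires only the paths shown in its command.
    if args.mode == "train" and not (args.train_path and args.test_path):
        parser.error("--mode train requires --train_path and --test_path")
    if args.mode == "inference" and not args.image_path:
        parser.error("--mode inference requires --image_path")
    return args


args = parse_args([
    "--mode", "inference",
    "--model_path", "./path/to/model",
    "--image_path", "./path/to/image",
])
print(args.mode, args.image_path)
```

Validating the mode-specific paths up front means a missing flag is reported immediately instead of after the model has been loaded.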

Other Implementations

References