/UNeXt

Reproducing UneXt: MLP-based Rapid Medical Image Segmentation Network

Primary LanguagePython

UNeXt: Paper Implementation for EN3160 - Image Processing and Machine Vision

Table of Contents

Introduction

The UNeXt is lightweight, more efficient segmentation architecture that maintains state of the art accuracy. The UNeXt paper proposes a hybrid model that combines the strengths of convolutional layers with MLP layers to enhance feature extraction, all while reducing computational costs. This project aims to implement and replicate the results of the the UNeXt model as a part of EN3160 Image Processing and Machine Vision module.

Architecture

The following diagram shows the architecture of the UNeXt model.

model_image

Installation

To set up the environment, follow these steps:

  1. Clone this repository:

    git clone https://github.com/nadunnr/UNeXt.git
    cd UNeXt
  2. Set up the dataset as specified in Dataset:

  3. Run the train.py file for model training by giving a model name:

    python3 train.py --name model_1
  4. Run the val.py file for model inferencing by specifying the model name:

    python3 val.py --name experiment1 --load_model model_1
  5. Run the export.py file to export an optimized model as ONNX format:

    python3 export.py --input model_1.pth --output model_1

Dataset

For this project, we have used the BUSI dataset, which provides annotated ultrasound images. The dataset should be organized in the following structure. Note that multiple annotation masks can be provided under nested folders inside masks folder. Rename them with class indeces.:

data/
├── images/
│   ├── 001.png
│   ├── 002.png
├── masks/
│   ├── 0/
│   |   ├── 001.png
|   |   ├── 002.png

Refer to the dataset provider’s terms and conditions for use.

Results

These results were obtained from the best model we trained in our reimplementation of the UNeXt paper.

Metric Authors' Results (BUSI) Our Results
Dice Coefficient (F1 Score) 79.37 ± 0.57 85.89
IoU (Intersection over Union) 66.95 ± 1.22 75.48
Parameter Count 1.47 M 1.4719 M
Inference Speed (per image) 25 ms 21.45 ms
GFLOPs 0.57 0.573
Dataset BUSI BUSI

Video-Segmentation

Using the UNeXt's lightweight structure we inferenced on Breast Ultrasound video footage to achieve real time segmentation at 40 fps. We also tested this on a Raspberry Pi 4 model B with 2 GB RAM, which resulted in a segmentation at 5 fps. We used the Breast Ultrasound Video Dataset for video data. The result is as follows.

References