UNeXt: Paper Implementation for EN3160 - Image Processing and Machine Vision

Introduction
Architecture
Installation
Dataset
Results
Video Segmentation
References

Introduction

UNeXt is a lightweight, efficient segmentation architecture that maintains state-of-the-art accuracy. The UNeXt paper proposes a hybrid model that combines convolutional layers with MLP layers to improve feature extraction while reducing computational cost. This project implements the UNeXt model and replicates its results as part of the EN3160 Image Processing and Machine Vision module.

Architecture

The following diagram shows the architecture of the UNeXt model.

Installation

To set up the environment, follow these steps:

Clone this repository:

git clone https://github.com/nadunnr/UNeXt.git
cd UNeXt

Set up the dataset as described in the Dataset section.
Train the model by running train.py with a model name:
```
python3 train.py --name model_1
```
Run val.py for inference, specifying an experiment name and the trained model to load:
```
python3 val.py --name experiment1 --load_model model_1
```
Run export.py to export an optimized model in ONNX format:
```
python3 export.py --input model_1.pth --output model_1
```

Dataset

For this project, we used the BUSI dataset, which provides annotated ultrasound images. Organize the dataset in the following structure. Multiple annotation masks can be provided in nested folders inside the masks folder; name each folder with its class index:

data/
├── images/
│   ├── 001.png
│   ├── 002.png
├── masks/
│   ├── 0/
│   │   ├── 001.png
│   │   ├── 002.png
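
As a rough illustration of this layout, the sketch below pairs each image with the masks found for it in every class folder. The helper name `pair_images_with_masks` is hypothetical, and the assumption that a mask shares its image's file name is inferred from the tree above, not taken from the repository's data loader.

```python
from pathlib import Path


def pair_images_with_masks(root):
    """Map each image path to {class_index: mask_path} for the masks that exist.

    Assumed layout (inferred from the tree above, not from the repository code):
    root/images/<name>.png and root/masks/<class_index>/<name>.png.
    """
    root = Path(root)
    # Class folders are the sub-directories of masks/ whose names are integers.
    class_dirs = sorted(
        (d for d in (root / "masks").iterdir() if d.is_dir() and d.name.isdigit()),
        key=lambda d: int(d.name),
    )
    pairs = {}
    for image in sorted((root / "images").glob("*.png")):
        masks = {}
        for class_dir in class_dirs:
            candidate = class_dir / image.name
            if candidate.exists():
                masks[int(class_dir.name)] = candidate
        pairs[image] = masks
    return pairs
```

Calling `pair_images_with_masks("data")` on a folder laid out as above would return one entry per image; an image with no mask in a given class folder simply lacks that class index in its dictionary, which makes missing annotations easy to spot before training.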

Refer to the dataset provider’s terms and conditions for use.

Results

These results were obtained from the best model we trained in our reimplementation of the UNeXt paper.

| Metric | Authors' Results (BUSI) | Our Results |
| --- | --- | --- |
| Dice Coefficient (F1 Score) | 79.37 ± 0.57 | 85.89 |
| IoU (Intersection over Union) | 66.95 ± 1.22 | 75.48 |
| Parameter Count | 1.47 M | 1.4719 M |
| Inference Speed (per image) | 25 ms | 21.45 ms |
| GFLOPs | 0.57 | 0.573 |
| Dataset | BUSI | BUSI |
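
For readers unfamiliar with the two segmentation metrics in the table, here is a minimal, dependency-free sketch of their standard definitions on binary masks. The function name `dice_and_iou` and the smoothing term `eps` are illustrative assumptions; the thresholding, smoothing, and averaging actually used in this repository may differ.

```python
def dice_and_iou(pred, target, eps=1e-7):
    """Return (dice, iou) for two binary masks given as flat sequences of 0/1.

    Standard definitions:
      dice = 2 * |P ∩ T| / (|P| + |T|)
      iou  = |P ∩ T| / |P ∪ T|
    eps is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou
```

For example, with `pred = [1, 1, 0, 0]` and `target = [1, 0, 1, 0]` the masks overlap in one pixel, so Dice is about 0.5 and IoU is about 0.333. Multiplying by 100 gives values on the same scale as the table.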

Video Segmentation

Using UNeXt's lightweight structure, we ran inference on breast ultrasound video footage and achieved real-time segmentation at 40 fps. We also tested it on a Raspberry Pi 4 Model B with 2 GB of RAM, which achieved segmentation at 5 fps. We used the Breast Ultrasound Video Dataset for video data. The result is as follows.
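
For context, 40 fps corresponds to a budget of 25 ms per frame and 5 fps to 200 ms per frame. The sketch below shows one way such a throughput figure could be measured. Both helper names and the `process_frame` callable are hypothetical stand-ins for the full per-frame pipeline (preprocessing, model inference, post-processing); this is not the measurement script used in this project.

```python
import time


def latency_ms_to_fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def measure_fps(process_frame, frames, warmup=2):
    """Return the average frames per second of process_frame over frames.

    process_frame is a placeholder for whatever handles a single frame.
    The first `warmup` frames are processed once without timing so that
    one-off start-up costs do not distort the average.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

Timing the whole per-frame pipeline, not only the forward pass, gives a number comparable to what a viewer sees, which may explain why a video frame rate can be lower than the per-image inference speed alone would suggest.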

References

UNeXt: MLP-based Rapid Medical Image Segmentation Network
Official UNeXt Implementation: GitHub Repository

devnithw/UNeXt