DiTMoS: Delving into Diverse Tiny-Model Selection on Microcontrollers

Description

This is an example implementation of DiTMoS (server side). DiTMoS is a framework that uses a set of tiny models, combined through model selection, to boost accuracy on time-series mobile applications under comparable memory and latency constraints.
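The model-selection idea can be illustrated with a minimal sketch, assuming the selector scores every classifier for an input and only the top-scoring classifier is run. The function name `select_and_classify` and the random linear maps standing in for the trained CNNs are hypothetical and are not taken from the repository.

```python
import numpy as np

def select_and_classify(x, selector, classifiers):
    """Route one sample to the classifier that the selector scores highest."""
    scores = selector(x)                 # one score per tiny classifier
    chosen = int(np.argmax(scores))      # index of the selected classifier
    logits = classifiers[chosen](x)      # only the selected classifier runs
    return chosen, int(np.argmax(logits))

# Hypothetical stand-ins: random linear maps instead of trained CNNs.
rng = np.random.default_rng(0)
num_classifiers, num_classes, num_features = 3, 17, 8
selector_w = rng.normal(size=(num_classifiers, num_features))
classifier_ws = [rng.normal(size=(num_classes, num_features))
                 for _ in range(num_classifiers)]

selector = lambda x: selector_w @ x
classifiers = [lambda x, w=w: w @ x for w in classifier_ws]

sample = rng.normal(size=num_features)
chosen, predicted = select_and_classify(sample, selector, classifiers)
print(f"classifier {chosen} predicted class {predicted}")
```

Because only one classifier is evaluated per input in this sketch, the per-sample cost is one selector pass plus one classifier pass, regardless of how many classifiers exist.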

Our paper won the Mark Weiser Best Paper Award at PerCom 2024.

Requirements

The DiTMoS code is implemented with:

PyTorch (listed as v3.10.11, which is a Python release number, not a PyTorch version)
SciPy v1.10.1
scikit-learn v1.2.2

How to Run DiTMoS

The code is a full example of a DiTMoS implementation on the UniMiB-SHAR dataset, which has 17 classes for human activity recognition. The dataset can be found in the datasets folder. Since the UniMiB-SHAR dataset was originally processed in Matlab, we provide a pre-processing module that converts the Matlab files to a Python format and splits the full dataset into training and test sets at an 80%:20% ratio.
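The 80%:20% split can be sketched as follows. This is an illustration only, not the contents of UniMiB-preprocessing.py: the function name `split_80_20`, the random shuffle, the seed, and the synthetic array shapes are all assumptions. In practice the Matlab files would first be read into NumPy arrays, for example with `scipy.io.loadmat`.

```python
import numpy as np

def split_80_20(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))          # random sample order
    num_test = int(round(len(features) * test_fraction))
    test_idx, train_idx = order[:num_test], order[num_test:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])

# Synthetic stand-in data: 100 samples with 6 values each, labels 0..16.
features = np.arange(100 * 6, dtype=np.float32).reshape(100, 6)
labels = np.arange(100) % 17

x_train, y_train, x_test, y_test = split_80_20(features, labels)
print(x_train.shape, x_test.shape)
```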

Run UniMiB-preprocessing.py from the pre-processing folder to create the training and test datasets.
Run example.py to implement DiTMoS on the server and report the final accuracy.
To test the accuracy of DiTMoS, run inference.py and check the performance saved in the saved_model folder.

The Workflow of DiTMoS

Implementing the DiTMoS framework takes four steps.

Pre-train a strong model, which reaches nearly 95% accuracy on UniMiB-SHAR, to guide the data splitting. (The pre-trained model is saved to the pretrained_model folder.)
Extract sample features from the strong model and use K-Means clustering to split the dataset into several subsets.
Train the classifiers and the selector adversarially. The classifiers and the selector are trained in alternating iterations.
Run inference to test the performance of DiTMoS and save the models to the saved_model folder.
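The clustering step (step two above) can be sketched as below. This is a dependency-light illustration, not the repository's code: it replaces a library K-Means (the requirements list scikit-learn) with plain Lloyd iterations in NumPy, and the function name `kmeans_split`, the synthetic features, and the cluster count are assumptions.

```python
import numpy as np

def kmeans_split(features, num_clusters, num_iters=50, seed=0):
    """Cluster feature vectors with Lloyd's algorithm; return labels and centers."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=num_clusters, replace=False)
    centers = features[start].copy()                 # initialise from data points
    for _ in range(num_iters):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        assignment = dists.argmin(axis=1)            # nearest center per sample
        new_centers = centers.copy()
        for k in range(num_clusters):
            members = features[assignment == k]
            if len(members) > 0:                     # keep old center if cluster is empty
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute assignments so they match the final centers.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Synthetic "strong model" features: two well-separated groups of 20 samples.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 4))
group_b = rng.normal(loc=10.0, scale=0.1, size=(20, 4))
features = np.vstack([group_a, group_b])

assignment, centers = kmeans_split(features, num_clusters=2)
# One index array per subset; each subset would train one tiny classifier.
subsets = [np.flatnonzero(assignment == k) for k in range(2)]
print([len(s) for s in subsets])
```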

We implement the clustering, adversarial training, and testing components in the DiTMoS folder. The architectures of the strong model, the selector, and the classifiers are defined in model.py. The strong model is a 6-layer CNN, while the selector and the classifiers are 3-layer CNNs that include a feature aggregation module. Compared to the best baseline (79% accuracy), the default DiTMoS example achieves more than 86% accuracy. Users can modify the code for their own applications.

Hardware Implementation

We tested the system performance of DiTMoS on an STM32F767ZI board using the STM32CubeIDE software and the STM32Cube.AI package. To build the hardware version, convert each model to ONNX format with the torch.onnx.export command, then load the classifiers and the selector into STM32Cube.AI manually, one by one. To use our network slicing strategy, first define the sliced model components and transfer the model parameters to them, then deploy the sliced models to the hardware.
