MouseSIS: Space-Time Instance Segmentation of Mice

This is the official repository for MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice by Friedhelm Hamann, Hanxiong Li, Paul Mieske, Lars Lewejohann and Guillermo Gallego, accepted at the Workshop on Neuromorphic Vision held in conjunction with ECCV 2024.

👀 The test set of this dataset is currently withheld in preparation for a challenge (see the split in Tab. 2 of the paper). You can still run our baseline method on the validation set, and we will soon provide access to an evaluation server. Stay tuned, or contact us if you have questions!

Installation
Data Preparation
Evaluation
- Evaluation of ModelMixSORT
- Evaluating Your Own Method
Training
Acknowledgements
Citation
Additional Resources
License

Installation

Clone the repository:

git clone https://github.com/tub-rip/MouseSIS_dev.git
cd MouseSIS_dev

Set up the environment:

conda create --name MouseSIS python=3.8
conda activate MouseSIS

Install PyTorch (choose a command compatible with your CUDA version from the PyTorch website):
```
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
```
Install other dependencies:
```
pip install -r requirements.txt
```

Data Preparation

Create a folder for the original data:
```
cd <project-root>
mkdir -p data/orig
```

Download the data and annotations and save them in <project-root>/data/orig. The data/orig folder should be organized as follows:

data/orig/
│
├── top/
│   ├── seq_01.hdf5
│   ├── seq_02.hdf5
│   ├── ...
│   └── seq_33.hdf5
│
├── dataset_info.csv
└── annotations.json
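
Before preprocessing, it can help to confirm that the download matches the layout above. The following is a minimal sketch, not part of the repository: the function name `find_missing` is hypothetical, and it assumes the sequences are numbered contiguously from seq_01.hdf5 to seq_33.hdf5.

```python
from pathlib import Path

# Expected names are taken from the directory listing above.
# Assumption: the 33 sequences are numbered contiguously, 01 through 33.
EXPECTED_SEQUENCES = [f"seq_{i:02d}.hdf5" for i in range(1, 34)]
EXPECTED_FILES = ["dataset_info.csv", "annotations.json"]


def find_missing(data_root):
    """Return the relative paths expected under data_root that are absent."""
    root = Path(data_root)
    missing = [
        f"top/{name}"
        for name in EXPECTED_SEQUENCES
        if not (root / "top" / name).is_file()
    ]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    missing = find_missing("data/orig")
    if missing:
        print(f"{len(missing)} expected file(s) missing, e.g. {missing[:3]}")
    else:
        print("data/orig matches the expected layout")
```

Run it from <project-root> so that the relative path data/orig resolves correctly.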

top/: This directory contains the frame and event data of the mouse dataset captured from a top view. It holds 33 individual .hdf5 files, each containing approximately 20 seconds of data (around 600 frames) along with temporally aligned events.
dataset_info.csv: This CSV file contains metadata for each sequence, such as the recording date.
annotations.json: The top-view annotation file follows a structure similar to the MSCOCO JSON format, with some modifications. Its structure is:

{
    "info": {
        "description": "string",  
        "version": "string",  
        "date_created": "string"  
    },
    "videos": [
        {
            "id": "string", // video_id from "01" to "33"
            "width": 1280,  // Width of the video in pixels
            "height": 720,  // Height of the video in pixels
            "length": "int"  // Number of frames in the video
        }
    ],
    "annotations": [
        {
            "id": "int",  // Instance number for the mouse
            "video_id": "string",  // Corresponding video_id from "01" to "33"
            "category_id": 1,  // The category ID for the object
            "segmentations": [
                {
                    "size": [720, 1280],  // Size of the segmentation mask
                    "counts": "RLE encoded string or null"  // RLE encoded segmentation or null
                }
            ],
            "areas": [0.0],  // Area of the object (can be null)
            "bboxes": [[0.0, 0.0, 0.0, 0.0]],  // Bounding box for the object [x_min, y_min, width, height]
            "iscrowd": 0  
        }
    ],
    "categories": [
        {
            "id": 1,  
            "name": "mouse", 
            "supercategory": "animal"  
        }
    ]
}
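
To get a quick overview of the annotations, the file can be read with the standard library alone. The sketch below is illustrative and not part of the repository: `summarize_annotations` and `bbox_to_corners` are hypothetical helpers, and the code assumes the file on disk is plain JSON (without the explanatory comments shown above) and that a missing mask appears either as a null entry or as an entry whose "counts" is null.

```python
import json
from collections import defaultdict
from pathlib import Path


def summarize_annotations(data):
    """Count, per video, the mouse instances and their non-null masks."""
    summary = defaultdict(lambda: {"instances": 0, "annotated_masks": 0})
    for ann in data["annotations"]:
        entry = summary[ann["video_id"]]
        entry["instances"] += 1
        # Assumption: a missing mask is a null entry or has "counts" set to null.
        entry["annotated_masks"] += sum(
            1
            for seg in ann["segmentations"]
            if seg is not None and seg.get("counts") is not None
        )
    return dict(summary)


def bbox_to_corners(bbox):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


if __name__ == "__main__":
    path = Path("data/orig/annotations.json")
    if path.is_file():
        with open(path) as f:
            stats = summarize_annotations(json.load(f))
        for video_id in sorted(stats):
            print(video_id, stats[video_id])
```

Decoding the RLE strings into binary masks needs a COCO-style mask library and is left out here to keep the sketch dependency-free.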

To evaluate the ModelMixSORT method or train the YOLO model used within it, you first need to convert the original dataset into YOLO format. For grayscale frames, run the following command:
```
python3 scripts/preprocess.py --data_root data/orig --data_format frame
```
For reconstructed e2vid images, run the following command:
```
python3 scripts/preprocess.py --data_root data/orig --data_format e2vid 
```
You can find the preprocessed data under data/preprocessed.

Evaluation

Evaluation of ModelMixSORT

Download the model weights:

mkdir models
# Download yolo_e2vid.pt, yolo_frame.pt, and XMem.pth from the provided link
# and place them in the models directory

Run inference:
```
python3 scripts/inference.py --config configs/predict/combined.yaml
```
We provide several config files in configs/predict for the different inference settings. The inference script produces per-sequence predictions and visualizations. All predictions are summarized in final_results.json. Each prediction follows this structure:
```
[
    {
        "video_id": int, 
        "category_id": int, 
        "segmentations": [
                {
                    "size": [int, int],
                    "counts": "RLE encoded string or null"
                },
                ...
            ],
        "score": float
    },
    ...
]
```
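
If you produce predictions with your own code, a structural check against the format above can catch mistakes before running the evaluation. This is a minimal sketch under stated assumptions: `check_prediction` is a hypothetical helper, not part of the repository or of TrackEval, and it only checks the fields and types listed above. Note that video_id is an int here, whereas annotations.json uses string IDs.

```python
def check_prediction(pred):
    """Return a list of problems in one prediction entry (empty if none found)."""
    problems = []
    for key in ("video_id", "category_id"):
        if type(pred.get(key)) is not int:
            problems.append(f"'{key}' should be an int")

    score = pred.get("score")
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        problems.append("'score' should be a number")

    segmentations = pred.get("segmentations")
    if not isinstance(segmentations, list):
        problems.append("'segmentations' should be a list")
        return problems

    for i, seg in enumerate(segmentations):
        if not isinstance(seg, dict):
            problems.append(f"segmentations[{i}] should be an object")
            continue
        size = seg.get("size")
        if not (
            isinstance(size, list)
            and len(size) == 2
            and all(type(v) is int for v in size)
        ):
            problems.append(f"segmentations[{i}]['size'] should be [int, int]")
        # A missing or null "counts" is treated as "no mask for this entry".
        counts = seg.get("counts")
        if counts is not None and not isinstance(counts, str):
            problems.append(
                f"segmentations[{i}]['counts'] should be a string or null"
            )
    return problems
```

Calling `check_prediction` on every entry of the loaded JSON list and printing any non-empty result is usually enough to spot malformed entries.
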
The final_results.json file is also saved under the src/TrackEval/data/trackers folder for use with the TrackEval evaluation tool.
Evaluate the results (based on TrackEval). The general command is:
```
python src/TrackEval/run_mouse_eval.py --TRACKERS_TO_EVAL <tracker_name> --SPLIT_TO_EVAL <split_name>
```
For example, after running inference with configs/predict/combined.yaml, the command is:
```
python src/TrackEval/run_mouse_eval.py --TRACKERS_TO_EVAL combined_0.1 --SPLIT_TO_EVAL test_wo17
```
This setting corresponds to the result reported in the paper in Tab. 4, line 3 (w/o 1 & 7).

Evaluating Your Own Method

To evaluate your own method, generate its output in JSON format, following the structure of final_results.json described in the section above. Place this JSON file in src/TrackEval/data/trackers/<your_tracker_name>/test, where <your_tracker_name> is the name of your tracker. Then run the evaluation with:

python src/TrackEval/run_mouse_eval.py --TRACKERS_TO_EVAL <your_tracker_name> --SPLIT_TO_EVAL <split_name>
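
The two steps above can be scripted. The sketch below is an illustration, not part of the repository: `save_predictions` and `eval_command` are hypothetical helpers, and the output file name final_results.json is an assumption based on the name used by the inference script.

```python
import json
from pathlib import Path

# Tracker directory as given in the instructions above.
TRACKERS_DIR = Path("src/TrackEval/data/trackers")


def save_predictions(predictions, tracker_name, trackers_dir=TRACKERS_DIR,
                     filename="final_results.json"):
    """Write predictions to <trackers_dir>/<tracker_name>/test/<filename>.

    Assumption: the evaluation expects the same file name that the
    inference script produces (final_results.json).
    """
    out_dir = Path(trackers_dir) / tracker_name / "test"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    with open(out_path, "w") as f:
        json.dump(predictions, f)
    return out_path


def eval_command(tracker_name, split_name):
    """Build the evaluation command shown above as an argument list."""
    return [
        "python", "src/TrackEval/run_mouse_eval.py",
        "--TRACKERS_TO_EVAL", tracker_name,
        "--SPLIT_TO_EVAL", split_name,
    ]
```

The list returned by `eval_command` can be passed to `subprocess.run` from <project-root>, or joined with spaces and pasted into a shell.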

Training

To train the YOLO model used in ModelMixSORT on the preprocessed grayscale mouse data, run:

python scripts/train.py --config configs/train/frame.yaml

To train the YOLO model on the preprocessed e2vid mouse data, run:

python scripts/train.py --config configs/train/e2vid.yaml

Acknowledgements

We gratefully acknowledge the following repositories and thank the authors for their excellent work:

Citation

If you find this work useful in your research, please consider citing:

@inproceedings{hamann2024mousesis,
  title={MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice},
  author={Hamann, Friedhelm and Li, Hanxiong and Mieske, Paul and Lewejohann, Lars and Gallego, Guillermo},
  booktitle={Proceedings of the European Conference on Computer Vision Workshops (ECCVW)},
  year={2024}
}

Additional Resources

License

This project is licensed under the MIT License - see the LICENSE file for details.
