This project investigates the feasibility of motion detection (especially for distant objects) through supervised learning combined with optical flow. Overall, the optical flow of target objects is fed into a neural network as input, while the outputs give the confidence of each object's status, i.e., moving or still. For details, please refer to the associated paper posted on arXiv.
The project is built on several open-source resources, listed below:
- Dataset: nuScenes
- Optical flow algorithms: FastFlowNet, RAFT
- Model: ResNet18
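As a rough illustration of the pipeline above (not the repository's actual training code), the classifier can be thought of as a standard ResNet18 over per-object flow crops; rendering the flow as a 3-channel image and the 224x224 crop size below are assumptions:

```python
# Minimal sketch, assuming the flow crop is rendered as a 3-channel image;
# this is not the repo's code, just the shape of the idea.
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=2)           # two classes: still (0), moving (1)
flow_crop = torch.randn(1, 3, 224, 224)   # one cropped flow patch, batch of 1
model.eval()
with torch.no_grad():
    confidence = torch.softmax(model(flow_crop), dim=1)
print(confidence)  # e.g. tensor([[p_still, p_moving]])
```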
Here are our classification scores with different optical flow algorithms:

| Method | F1 (%) | Precision (%) | Recall (%) |
|---|---|---|---|
| FastFlowNet (KITTI) | 92.9 | 94.3 | 91.7 |
| RAFT (KITTI) | 89.5 | 89.7 | 89.9 |
Visualization videos can be found here. The predictions are shown in different colors: red boxes represent moving objects, while blue boxes represent still objects.
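For reference, a minimal sketch of this color convention using OpenCV; the helper below is hypothetical and not part of the repo:

```python
# Hypothetical helper illustrating the red = moving / blue = still convention.
import cv2

def draw_prediction(image, box, is_moving):
    """Draw one prediction box; OpenCV uses BGR order, so red is (0, 0, 255)."""
    color = (0, 0, 255) if is_moving else (255, 0, 0)
    x1, y1, x2, y2 = box
    cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness=2)
    return image
```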
The code has been tested with Python 3.8, PyTorch 1.6, and CUDA 10.2.
If you would like a quick try at inference and visualization with the nuScenes dataset, simply run the following commands to generate a `demo_visual.mp4`:
```
mkdir demo_infer
python demo_model_visual.py
ffmpeg -r 2 -pattern_type glob -i './demo_infer/*.png' -pix_fmt yuv420p -b 8000k demo_visual.mp4
```
The procedure below helps you apply our model to your own images of a video:

```
sh custom_mkdir.sh
```

- Put all of your images under the `custom_demo/custom_raw` directory. The images should be named in chronological order, e.g. `001.png`, `002.png`.
- Generate the corresponding labels by yourself and save them into `custom_demo/custom_label.csv`. Please refer to `demo_label.csv` for the way of organizing labels; an illustrative sketch also follows this list. Note that `motionFlag` is actually not needed here.
- Clone the repository of FastFlowNet and/or RAFT and set them up. To create the conda environment needed for FastFlowNet, run `source ./dev/env_fastflow.sh`.
- Move `custom_fastflow.py` and `custom_raft.py` to the root directory of FastFlowNet or RAFT, like `{FILE_PATH}/FastFlowNet/`.
- Run `python custom_fastflow.py --path REPO_PATH` or `python custom_raft.py --repo_path REPO_PATH --model=models/raft-kitti.pth`, where `REPO_PATH` is the path of the MotionDetection repository, to generate the optical flow data.
- Run `python custom_model_visual.py` for inference and visualization.
- Run `ffmpeg -r 2 -pattern_type glob -i './custom_demo/custom_infer/*.png' -pix_fmt yuv420p -b 8000k custom_visual.mp4` and you will get your own video!
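As promised above, here is a purely illustrative sketch of writing a label file. The authoritative schema is in `demo_label.csv`; every column name below except `motionFlag` is an assumption:

```python
# Hypothetical example only: check demo_label.csv for the real schema.
# motionFlag is the only column name taken from this README, and it is
# not needed for custom data anyway, so it is omitted here.
import csv

rows = [
    {"filename": "001.png", "x1": 100, "y1": 80, "x2": 180, "y2": 160},
    {"filename": "002.png", "x1": 104, "y1": 82, "x2": 184, "y2": 162},
]

with open("custom_demo/custom_label.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```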
Some preparations are needed before starting.

```
cd dev ; sh mkdir.sh
```

- Download the nuScenes dataset and install nuscenes-devkit.
- Clone the repository of FastFlowNet and/or RAFT and set them up. To create the conda environment needed for FastFlowNet, run `source env_fastflow.sh`.
- Move `flow_fastflow.py` and `flow_raft.py` to the root directory of FastFlowNet or RAFT, like `{FILE_PATH}/FastFlowNet/`.
- Follow the instructions to complete `config.yaml`. Also, modify the path of `config.yaml` in `flow_fastflow.py` and `flow_raft.py`:

```python
# Please modify the path of config.yaml
with open("{FILE_PATH}/config.yaml") as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
```

Now you are ready to go!
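Since a wrong path in `config.yaml` is an easy mistake at this point, a quick sanity check like the one below can help. It is not part of the repo, and the assumption that the config's string values are filesystem paths is purely illustrative:

```python
# Optional sanity check (assumed, not part of the repo): print whether each
# path-like string value in config.yaml exists before running the pipeline.
import os
import yaml

with open("config.yaml") as f:  # adjust to your config.yaml location
    config = yaml.load(f, Loader=yaml.FullLoader)

for key, value in config.items():
    if isinstance(value, str) and os.sep in value:
        status = "OK" if os.path.exists(value) else "MISSING"
        print(f"{key}: {value} -> {status}")
```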
- `python scene_filter.py`
  Select target scenes from nuScenes and save their tokens into `use_scene.json`.
- `python select_dataset.py`
  Iterate over the selected scenes to find frame pairs that contain target object(s) and save their paths and tokens into `rawlst_train.json` or `rawlst_valid.json`.
- `python flow_fastflow.py` or `python flow_raft.py --model=models/raft-kitti.pth`
  Generate optical flow graphs of the selected raw images. Note that the scripts are set to use FastFlowNet by default, so replace the word "fastflow" with "raft" in the scripts of the following steps if you would like to use RAFT instead.
- `python generate_label.py`
  Estimate the velocity of each object using two frames and label it as 1 (moving) or 0 (still); a hedged sketch of this rule follows the list. All label information is then saved into `label_train.csv` and `label_valid.csv`.
- `python model_train.py`
  Find objects in frames and cut them out, followed by some preprocessing. The cropped pieces are then fed into the network for training.
- `python model_visual.py`
  Visualize the predictions in the format mentioned above.
- `ffmpeg -r 2 -pattern_type glob -i 'visual/seq_20/*.png' -pix_fmt yuv420p -b 8000k visual.mp4`
  Generate a video. You now have a video, but in a simpler form that contains only keyframes. To generate a complete video that also includes non-keyframes and nearby objects, there are several more steps:

  ```
  mv flow_fastflow_expand.py {FILE_PATH}/FastFlowNet/
  python {FILE_PATH}/FastFlowNet/flow_fastflow_expand.py
  python generate_label_expand.py
  python model_visual_expand.py
  ffmpeg -r 12 -pattern_type glob -i 'visual_expand/scene_6/*.png' -pix_fmt yuv420p -b 8000k visual_expand.mp4
  ```

- Enjoy! ;)
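As referenced in the `generate_label.py` step, here is a minimal sketch of the labeling rule described there. The 0.5 m/s threshold and the use of 3D box centers in a shared world frame are assumptions, not the repo's exact logic:

```python
# Sketch of the moving/still rule: estimate an object's speed from its
# positions in two frames and threshold it. Threshold value is assumed.
import numpy as np

def motion_label(center_t0, center_t1, dt, threshold=0.5):
    """Return 1 (moving) or 0 (still) from two timestamped 3D box centers."""
    velocity = (np.asarray(center_t1) - np.asarray(center_t0)) / dt
    return 1 if np.linalg.norm(velocity) > threshold else 0

# Example: ~2 m/s displacement over 0.5 s is labeled as moving.
print(motion_label([10.0, 2.0, 0.0], [11.0, 2.1, 0.0], dt=0.5))  # -> 1
```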
Some of the scripts, namely `flow_fastflow.py`, `flow_fastflow_expand.py` and `flow_raft.py`, are based on the code of their original projects, FastFlowNet and RAFT.