# 2022 AI City Challenge - Track 3: Naturalistic Driving Action Recognition

This project was developed by Cybercore AI for the 2022 AI City Challenge, Track 3. See the details in our paper. The project is built on the open-source mmaction2 and ActionFormer code bases.
## Installation

Please follow the steps in docs/Env_setup.md. We provide both Docker and Conda environment setups.
## Training

### 1. Pretrain the backbone to extract features

Please follow the steps in docs/Pretrain_backbone.md.

### 2. Train the second-stage classifier

Please follow the steps in docs/second_stage_classifier.md.

### 3. Train the detector and generate pseudo labels on A2

Please follow the steps in docs/Proposal_Generation.md.
## Inference on the test dataset

### Step 1. Run the detector to create proposals

#### 1.1 Extract raw frames from the new videos (B)

```bash
IN_DIR='path/to/videos'
OUT_DIR="data/raw_frames/full_video/B"
./reproduce_scripts/detector/extract_rawframes_fullvideo.sh $IN_DIR $OUT_DIR
```

where:

- `IN_DIR` is the directory of test videos. For the A2 test set, set `IN_DIR=data/raw_video/A2`; for the B test set, set `IN_DIR=data/raw_video/B`.
- The extracted frames for each video are saved in the `OUT_DIR` folder and are processed in step 1.2.
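As a quick sanity check (not part of the official pipeline), you can count how many frames were extracted per video. This is a minimal sketch, assuming the script writes one sub-directory of frames per video under `OUT_DIR`:

```bash
# Hypothetical sanity check: report the number of extracted frames per video.
# Assumes one sub-directory per video under OUT_DIR; adjust if the layout differs.
OUT_DIR="data/raw_frames/full_video/B"
for d in "$OUT_DIR"/*/; do
    echo "$(basename "$d"): $(find "$d" -type f | wc -l) frames"
done
```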
#### 1.2 Extract features on the new dataset (B)

```bash
IN_DIR='path/to/raw_frames/full_video/B'
OUT_DIR="tsp_features/round_b/"
CKPT="http://118.69.233.170:60001/open/AICity/track3/detector/ckpt/round2_tsp_67.5.pth"
./reproduce_scripts/detector/extract_tsp_b.sh $CKPT $IN_DIR $OUT_DIR
```

where:

- `IN_DIR` is the directory of test raw frames. For the A2 test set, set `IN_DIR=data/raw_frames/full_video/A2`; for the B test set, set `IN_DIR=data/raw_frames/full_video/B`.
- `CKPT` is our pretrained checkpoint.
- The extracted features for each video are saved in the `OUT_DIR` folder and are processed in step 1.3.
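Before moving on, you may want to verify that a feature file was produced for every video. A minimal sketch, assuming one feature file per video under `OUT_DIR` (the exact file format and extension depend on the extraction script):

```bash
# Hypothetical check: list the extracted feature files with their sizes
# and count them. Assumes one feature file per video in OUT_DIR.
OUT_DIR="tsp_features/round_b/"
find "$OUT_DIR" -type f -exec du -h {} +
echo "Total feature files: $(find "$OUT_DIR" -type f | wc -l)"
```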
#### 1.3 Generate proposals

Using the features extracted in step 1.2, run the following command to create proposals:

```bash
CONFIG="configs/aicity/actionformer/track3_actionformer_B.yaml"
CKPT="http://118.69.233.170:60001/open/AICity/track3/detector/ckpt/round2_map_31.55.pth.tar"
PROPOSAL_RESULT="proposals.json"
./reproduce_scripts/detector/val_actionformer.sh $CONFIG $CKPT $PROPOSAL_RESULT
```

where:

- `CONFIG` is the model's config file. We assume the features extracted in step 1.2 are saved under `tsp_features/round_b/`, which is the path specified in the config yaml. If you change the directory or the test videos, adjust the config file accordingly.
- `CKPT` is the model's checkpoint.
- `PROPOSAL_RESULT` is the output file, which is used in step 2.
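The exact schema of `proposals.json` is determined by the detector's evaluation script; if you want a quick look at its structure before step 2, a small sketch using `jq` (assuming it is installed) works without knowing the schema:

```bash
# Hypothetical inspection of the proposal file (not part of the pipeline).
jq 'type' proposals.json               # top-level type: "object" or "array"
jq 'keys | .[0:5]' proposals.json      # first few keys or indices
```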
### Step 2. Run classification on the ActionFormer proposals

```bash
TEST_VIDEO_DIR=<path/to/test_video>
PROPOSAL_RESULT="proposals.json"
OUTPUT="result_submission.json"
./reproduce_scripts/second_stage_classifier/inference.sh $TEST_VIDEO_DIR $PROPOSAL_RESULT $OUTPUT
```

For example:

```bash
./reproduce_scripts/second_stage_classifier/inference.sh \
    ./data/raw_video/A2 \
    proposals.json \
    result_submission.json
```

After running this script, we have the result file result_submission.json, which is used for post-processing in the next step.
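Optionally, you can confirm that the classifier produced a well-formed result file before post-processing. A one-line sketch, assuming `jq` is available (`jq empty` exits non-zero on malformed JSON):

```bash
# Hypothetical validity check for the classification output.
jq empty result_submission.json && echo "result_submission.json is valid JSON"
```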
### Step 3. Post-process and generate the submission file on the server

```bash
INPUT_FILE=result_submission.json
./reproduce_scripts/gen_submit.sh $INPUT_FILE
```

where:

- `INPUT_FILE` is the output from step 2.
- This will generate the file submit.txt for submission.
- NOTE: For submission, we need the video_ids file, such as the one in tools/detector/video_ids_A2clean.csv. To run inference on the B dataset (unknown at this time), we need a similar file. See the input of tools/detector/convert_json2txt.py.
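For illustration only, here is a sketch of how a video_ids file for the B set might be generated. The header and column layout below are assumptions; mirror tools/detector/video_ids_A2clean.csv and the input expected by tools/detector/convert_json2txt.py before using it:

```bash
# Hypothetical sketch: enumerate the B videos into a CSV of (id, filename).
# The "video_id,video_files" header is an assumption; check the A2 file.
VIDEO_DIR="data/raw_video/B"
OUT_CSV="tools/detector/video_ids_B.csv"
{
    echo "video_id,video_files"
    i=1
    for f in "$VIDEO_DIR"/*.MP4; do
        echo "$i,$(basename "$f")"
        i=$((i + 1))
    done
} > "$OUT_CSV"
```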
## Credits

We thank the mmaction2, ActionFormer, and ConvNeXt code bases. Please cite their work if you find our code helpful.