Branches `approach_0`, `approach_1`, and `approach_2` contain the code for the corresponding approaches described on the project website. The SIFT feature matching baseline can be found under `sift` and can be run with `eval_sift.py`.
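For example, `git checkout approach_0` switches to the first approach, and `python sift/eval_sift.py` runs the baseline (any arguments `eval_sift.py` expects are defined in the script itself; the `sift/eval_sift.py` path is inferred from the layout described above).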
From the repo's root directory, run `python process_data.py <path/to/scene/data> <full/path/to/target/images> <path/to/data>`, where `<path/to/scene/data>` is the YCB data directory, `<full/path/to/target/images>` is the generated target image directory, and `<path/to/data>` is the processed dataset directory that stores the images and bounding boxes.
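For example, with purely illustrative directories: `python process_data.py ~/YCB_Video_Dataset ~/ycb_targets ~/processed_data` (these paths are placeholders, not paths shipped with the project).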
Each scene corresponds to:

- a scene image (e.g. `000000_scene.png`)
- a bounding box file (e.g. `000000.npy`)
- ten target images (e.g. `000000_target_0000.png` through `000000_target_0009.png`)
`<path/to/data>/class.npy` contains the class label for each scene image.
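As a rough sketch of what this layout implies, the snippet below loads one sample with NumPy and Pillow. It assumes the files sit directly in `<path/to/data>`; the exact contents of the `.npy` files (bounding box format, class label encoding) are not documented here and are treated as opaque arrays.

```python
import numpy as np
from PIL import Image

data_dir = "path/to/data"   # processed dataset directory (placeholder)
idx = "000000"              # sample index (placeholder)

scene = Image.open(f"{data_dir}/{idx}_scene.png")
targets = [Image.open(f"{data_dir}/{idx}_target_{i:04d}.png") for i in range(10)]
bbox = np.load(f"{data_dir}/{idx}.npy")      # bounding box; layout is an assumption
classes = np.load(f"{data_dir}/class.npy")   # class label for each scene image

print(scene.size, len(targets), bbox.shape, classes.shape)
```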
- Run `moments.py --data-dir=<path/to/data>`, where `<path/to/data>` is where the images and bounding boxes are stored. This script generates a `moments.npz` file used during training to normalize the images.
- Run `train.py <path/to/data> ...`, where `<path/to/data>` is where the images, bounding boxes, and `moments.npz` file are stored. Specify any other arguments, replacing `...` with them; the available arguments are defined in `train.py`. You can also set them in `constants.py` if you don't want to deal with command line arguments.
- Training will produce tensorboard logs and save the model in `nets/run_id`, where `run_id` is a timestamp of the experiment. An example end-to-end sequence is shown below this list.
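For example, an end-to-end run might look like `python moments.py --data-dir=<path/to/data>` followed by `python train.py <path/to/data>`, after which the logs can be inspected with `tensorboard --logdir nets` (the `python` prefix and using `nets` as the log directory are assumptions based on the descriptions above).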
You can modify the feature extractor network in `feature_net.py`. The bounding box network can be found in `bb_net.py` and predicts the bounding box from the combined features of the scene and target images.
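For intuition only, here is a minimal sketch of the combined-features idea, not the repo's actual `bb_net.py`: a toy module that concatenates a scene feature vector with a target feature vector and regresses four box coordinates. The layer sizes, feature dimensions, and box parameterization are all assumptions.

```python
import torch
import torch.nn as nn

class ToyBBNet(nn.Module):
    """Illustrative only: regress a box (x, y, w, h) from concatenated
    scene and target features. All sizes here are made up."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 4),  # (x, y, w, h)
        )

    def forward(self, scene_feat, target_feat):
        # combine the two feature vectors and predict the box
        combined = torch.cat([scene_feat, target_feat], dim=1)
        return self.head(combined)

# hypothetical usage with random features for a batch of 8 images
net = ToyBBNet()
boxes = net(torch.randn(8, 256), torch.randn(8, 256))  # shape: (8, 4)
```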
Target images for an object model can be generated with `data_gen.py`, e.g.:

`python data_gen.py --model-dir ../YCB_Video_Models/models/002_master_chef_can --output-dir ../YCB_Video_Models/models/002_master_chef_can/target/ --image-size <width> <height>`
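For instance, `--image-size 224 224` generates 224×224 target images; the size chosen here is arbitrary and should match whatever resolution the rest of the pipeline expects.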
The dataset can be downloaded from Google Drive with `gdown`:

`gdown https://drive.google.com/drive/folders/11QVBhEkmpfFzsDrs5Dj59f9HOijNtCnH -O <dataset-dir> --folder`