We apply ∂4 to a visual question answering problem; the model is learned jointly, end-to-end, on the CLEVR dataset. See here for more details.
Requirements:
- ∂4 interpreter
- Python 3
- PyTorch 1.10.0
- TensorFlow 0.11.0 (install via the wheel below)

pip3 install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp35-cp35m-linux_x86_64.whl
Generate the CLEVR dataset from here.
Use this template for generating questions.
Save the rendered images and the generated CLEVR_questions.json to the vqa/data directory.
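If the target directory does not exist yet, create it first (a one-liner, assuming the repository root as the working directory):

```shell
# Create the data directory the rendered images and question JSON go into.
mkdir -p vqa/data
```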
We patch ∂4's extensible_dsm.py at line 275 (in create_alg_op_matrix), changing the array dtype to float32: ret = np.zeros([size, size, size], dtype=np.float32).
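The effect of the patch can be seen with a small NumPy sketch (the `size` value here is illustrative, not the one ∂4 actually uses):

```python
import numpy as np

size = 8  # illustrative; in ∂4 this is set by the machine configuration

# Default allocation vs the patched float32 allocation:
ret64 = np.zeros([size, size, size])                    # dtype defaults to float64
ret32 = np.zeros([size, size, size], dtype=np.float32)  # patched version

# float32 halves the memory footprint of the operator tensor
assert ret32.nbytes == ret64.nbytes // 2
print(ret32.dtype, ret32.nbytes)  # float32 2048
```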
python3 scripts/extract_features.py \
--input_image_dir data/images \
--output_h5_file data/train_features.h5
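Once extraction finishes, the HDF5 file can be inspected with h5py. This is a sketch of the layout we expect (the `features` dataset name follows the usual clevr-iep convention and is an assumption; the shape below is illustrative — it writes and reads a small demo file rather than the real one):

```python
import h5py
import numpy as np

# Write a tiny demo file with the assumed layout of train_features.h5.
with h5py.File("demo_features.h5", "w") as f:
    f.create_dataset("features",
                     data=np.zeros((2, 1024, 14, 14), dtype=np.float32))

# Read it back the same way you would inspect data/train_features.h5.
with h5py.File("demo_features.h5", "r") as f:
    shape, dtype = f["features"].shape, f["features"].dtype
print(shape, dtype)  # (2, 1024, 14, 14) float32
```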
Use this vocab.json for the vocabulary.
python3 scripts/preprocess_questions.py \
--input_questions_json data/CLEVR_questions.json \
--input_vocab_json data/vocab.json \
--output_h5_file data/train_questions.h5
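A minimal sketch of the structure vocab.json is assumed to have (the key names follow the clevr-iep convention and are an assumption here; it builds a small demo vocabulary rather than reading the real file):

```python
import json

# Demo vocabulary with the assumed clevr-iep key layout.
vocab = {
    "question_token_to_idx": {"<NULL>": 0, "<START>": 1, "<END>": 2, "what": 3},
    "answer_token_to_idx": {"<UNK>": 0, "yes": 1, "no": 2},
}
with open("vocab_demo.json", "w") as f:
    json.dump(vocab, f)

# Reload and map a tokenized question to indices, as preprocessing would.
with open("vocab_demo.json") as f:
    v = json.load(f)
print([v["question_token_to_idx"].get(t, 0) for t in ["<START>", "what", "<END>"]])  # [1, 3, 2]
```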
python3 scripts/trainer.py