Preprocessing steps were tested on a machine with 64GB RAM.
Training Graph Neural Networks requires a CUDA-capable GPU.
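A quick way to check the GPU requirement before starting a training run is sketched below. It assumes the NVIDIA driver utility nvidia-smi is installed alongside the GPU driver; the GPU_STATUS variable is a hypothetical name used only for illustration and is not part of the project.

```shell
# Minimal sketch: report whether an NVIDIA GPU is visible to the driver.
# Assumption: nvidia-smi ships with the installed NVIDIA driver.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    GPU_STATUS="available"
else
    GPU_STATUS="missing"
fi
echo "CUDA GPU: $GPU_STATUS"
```

This only confirms that the driver sees a GPU. It does not verify that the installed deep learning framework was built with CUDA support.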
The project uses nbdev to create Python files from Jupyter notebooks.
To build the project, run
nbdev_build_lib; pip install -e .
in the root directory.
We use ploomber for managing training and data preprocessing.
For example, to create CSV files with extracted READMEs, run
ploomber build --partial make_readmes --skip-upstream --force
Relevant definitions can be found in pipeline.yaml and env.yaml.
Ploomber step:
run_gnn_experiment
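Following the pattern of the make_readmes command above, this step can presumably be run on its own with ploomber build --partial. The sketch below wraps that call in a small helper; run_step is a hypothetical name, not something the project defines, and whether --skip-upstream or --force should be added for this step is left open because it depends on which upstream outputs already exist.

```shell
# Hypothetical helper (not part of the project): build one ploomber task by name.
run_step() {
    step="$1"
    # Fail early with a clear message if ploomber is not installed.
    if ! command -v ploomber >/dev/null 2>&1; then
        echo "ploomber not found on PATH" >&2
        return 127
    fi
    # --partial builds the pipeline up to the named task, including its upstream tasks.
    ploomber build --partial "$step"
}
```

Call it from the root directory as: run_step run_gnn_experiment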