This repository includes models and datasets for training proxies and using them as part of a GFlowNet pipeline. The aim is to construct/sample crystal objects with a probability that is proportional to the reward calculated from the output of the proxy model. Possible targets include 'formation energy per atom' or 'ionic conductivity'.
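To make "probability proportional to the reward" concrete, here is a minimal sketch of the target distribution over a small set of candidates. The Boltzmann transform from proxy output to reward, the temperature, and the example energies are all assumptions made for illustration; the repository may define its reward differently, and a GFlowNet learns a sampling policy rather than enumerating candidates as done here.

```python
import math
import random


def boltzmann_reward(energy, temperature=1.0):
    """Map a proxy output where lower is better (e.g. formation energy) to a positive reward.

    The exponential form is an assumption for this sketch, not the repository's reward.
    """
    return math.exp(-energy / temperature)


def target_probabilities(proxy_outputs, temperature=1.0):
    """Normalise rewards so that each candidate's probability is proportional to its reward."""
    rewards = [boltzmann_reward(e, temperature) for e in proxy_outputs]
    total = sum(rewards)
    return [r / total for r in rewards]


# Hypothetical proxy predictions (eV/atom) for three candidate crystals.
energies = [-1.2, -0.4, 0.3]
probs = target_probabilities(energies)

# Draw candidate indices according to the target distribution.
rng = random.Random(0)
samples = rng.choices(range(len(energies)), weights=probs, k=1000)
```

Candidates with lower predicted energy receive a larger share of the samples, which is the behaviour a trained GFlowNet is meant to reproduce over the full space of crystals.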
You can look at training results here
This package should always be installable as
pip install .
# or
pip install -e .
The code runs on python>=3.9,<3.12, and the required packages can be installed using
pip install -r requirements_materials.txt
If you are experiencing dependency issues, here is a working configuration:
python -m pip install torch==2.0.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
python -m pip install torch-scatter torch-geometric -f https://data.pyg.org/whl/torch-2.0.1+cu117.html
pip install torchmetrics torchvision lightning lightning-cloud lightning-utilities black click flake8 matplotlib numpy oauthlib pandas pandocfilters Pillow pymatgen scikit-learn scipy setuptools sympy wandb wheel phast minydra faenet pyxtal
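After installing, it can help to check which of the main dependencies are importable. The sketch below uses only the standard library; the list of module names is a hand-picked subset of the packages above (using their import names, e.g. `torch_scatter` for torch-scatter), not something defined by this repository.

```python
from importlib import util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in module_names if util.find_spec(name) is None]


# Import names for a few of the packages listed above (illustrative subset).
CORE_MODULES = [
    "torch",
    "torch_scatter",
    "torch_geometric",
    "lightning",
    "pymatgen",
    "wandb",
]

if __name__ == "__main__":
    missing = missing_modules(CORE_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be located; it does not verify versions or CUDA compatibility.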
The dataset can be downloaded using the functions in utils/mp.py. The model can then be trained using utils/ic_run.py. The dataset class in utils/crystal_data.py is being modified to match the CDVAE pipeline.
The CDVAE paper and repository provide 3 datasets that we will use as baselines for GFlowNet training. The dataset code is under proxies/data.py, and models can be trained on these datasets using run.py.
The 'config' folder will contain configurations/hyperparameter dictionaries to search over or to use for training.
- Create a yaml file ('sweep_wandb.yml') following the instructions given in https://docs.wandb.ai/guides/sweeps/configuration. It contains the parameters we shall sweep over.
- Initiate a wandb sweep (manually from terminal) with the command:
wandb sweep path_to_file/sweep_wandb.yml --name='test'
Store the sweep_id.
- Launch a sweep agent using a slurm script with
sbatch sweep_mlp.sh
which contains
wandb agent --count 5 mila-ocp/ocp/sweep_id
The count specifies the number of hyperparameter settings to test. To launch several agents (i.e. GPUs), use
sbatch --array=0-5
- Visualise the results in the sweeps section of wandb, under the ActiveLearningMaterials repo.
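The same sweep workflow can also be driven from Python with `wandb.sweep` and `wandb.agent` instead of the CLI. This is a sketch under stated assumptions, not the setup used by this repository: the metric name `val_loss` and the parameters `lr` and `hidden_channels` are hypothetical placeholders, and `train_fn` stands for whatever training entry point you use. The entity and project default to the `mila-ocp/ocp` path from the agent command above.

```python
# Hypothetical sweep configuration; the metric and parameter names are placeholders.
SWEEP_CONFIG = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "lr": {"values": [1e-4, 1e-3, 1e-2]},
        "hidden_channels": {"values": [64, 128, 256]},
    },
}


def grid_size(config):
    """Count the distinct combinations of the listed parameter values."""
    size = 1
    for spec in config["parameters"].values():
        size *= len(spec["values"])
    return size


def launch(train_fn, count=5, entity="mila-ocp", project="ocp"):
    """Register the sweep, then run `count` hyperparameter settings with one agent."""
    import wandb  # imported lazily so the config can be inspected without wandb installed

    sweep_id = wandb.sweep(sweep=SWEEP_CONFIG, entity=entity, project=project)
    wandb.agent(sweep_id, function=train_fn, count=count, entity=entity, project=project)
    return sweep_id
```

Comparing `grid_size(SWEEP_CONFIG)` with `count` multiplied by the number of agents gives a rough idea of how much of the search space a sweep will cover.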
Currently incompatible with Python >= 3.12 because
mendeleev (0.14.0) requires Python >=3.8.1,<3.12
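A small guard can surface this constraint before installation or at the start of a script. The sketch below encodes the supported range stated in this document (python>=3.9,<3.12); the function and constant names are illustrative and not part of the package.

```python
import sys

# Supported range taken from this document: python>=3.9,<3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported(version_info=None):
    """Return True if the (major, minor) version lies in the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    if not is_supported():
        current = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python {current} is outside the supported range >=3.9,<3.12")
```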