/materials_data_api_scripts

Scripts to load all data from ICSD, Materials Project, and OQMD.

Primary language: Jupyter Notebook. License: MIT.

Materials Data API Scripts

This repository contains Python scripts and Jupyter notebooks to download and analyse data from:

  • Materials Project (MP)
  • Open Quantum Materials Database (OQMD)
  • Inorganic Crystal Structure Database (ICSD)

A periodic table combining data from mendeleev, ase, pymatgen and other custom descriptors is provided.


Requirements

The download scripts require

pip install requests
pip install pandas
pip install jupyter
pip install pymatgen

The notebooks require

pip install seaborn
pip install scikit-learn
pip install umap-learn
pip install mendeleev
pip install ase

Note that the same version of pandas must be used to save and load the .pkl binary files; otherwise loading them may fail with errors.
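One way to catch a version mismatch early is to record the pandas version next to each pickle. The sketch below is not part of this repository: the helper names `save_with_version` and `load_checked` and the `.version` sidecar file are assumptions made for illustration.

```python
import json
from pathlib import Path

import pandas as pd


def save_with_version(df: pd.DataFrame, path: str) -> None:
    """Pickle df and record the pandas version in a JSON sidecar file."""
    df.to_pickle(path)
    # Hypothetical convention: "<pickle path>.version" holds the version used.
    sidecar = Path(str(path) + ".version")
    sidecar.write_text(json.dumps({"pandas": pd.__version__}))


def load_checked(path: str) -> pd.DataFrame:
    """Load a pickle, refusing if it was written by another pandas version."""
    sidecar = Path(str(path) + ".version")
    if sidecar.exists():
        saved = json.loads(sidecar.read_text())["pandas"]
        if saved != pd.__version__:
            raise RuntimeError(
                f"{path} was saved with pandas {saved}, "
                f"but pandas {pd.__version__} is installed"
            )
    # Without a sidecar file there is nothing to compare, so just load.
    return pd.read_pickle(path)
```

A pickle written without the sidecar file still loads, so the check only applies to files saved through `save_with_version`.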

Materials Project (MP)

Add your API key by creating the file mp/api_key.json:

echo '{"api_key":"******************"}' > mp/api_key.json
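How mp/download.py reads this file is not shown here, so the following is only a minimal sketch of loading and sanity-checking the key. The function name `load_api_key` and the placeholder check are assumptions, not the repository's code.

```python
import json
from pathlib import Path


def load_api_key(path="mp/api_key.json"):
    """Return the string stored under "api_key" in the given JSON file."""
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"{key_file} not found; create it first")
    key = json.loads(key_file.read_text()).get("api_key", "")
    # Reject an empty value or a placeholder made only of asterisks.
    if not key or set(key) == {"*"}:
        raise ValueError(f"{key_file} does not contain a real API key")
    return key
```

Failing early on a missing or placeholder key avoids starting a long download that would only be rejected by the server.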

Download all MP data with

python mp/download.py

A pandas.DataFrame object will be saved as mp/materials_project.pkl.

See the example_mp.ipynb and example_pca.ipynb Jupyter notebooks for usage examples.

Inorganic Crystal Structure Database (ICSD)

Add your ICSD credentials by creating the file icsd/icsd_credentials.json:

echo '{"loginid":"**********","password":"****************"}' > icsd/icsd_credentials.json

Download all ICSD cif strings with

python icsd/download.py

A pandas.DataFrame object will be saved in binary format in the file icsd/icsd_cifs.pkl. Extract information from the cif strings it contains with

python icsd/augment.py

which extracts the following columns:

  • id
  • _database_code_ICSD
  • _chemical_formula_structural
  • _chemical_formula_sum
  • _cell_length_a
  • _cell_length_b
  • _cell_length_c
  • _cell_angle_alpha
  • _cell_angle_beta
  • _cell_angle_gamma
  • _cell_volume

in a new pandas.DataFrame saved as icsd/all_icsd_cifs_augmented.pkl. The data is also saved as a .csv file, icsd/icsd_formulas_all.csv, without the cif column. Two additional files are saved, icsd_formula_structural_integer.csv and icsd_formula_sum_integer.csv, which contain stoichiometric compounds only.
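To illustrate the kind of extraction involved, here is a minimal sketch that pulls single-line tags out of a CIF string with regular expressions. It is not the code in icsd/augment.py: the helper names are hypothetical, multi-line CIF values are ignored, and reading "stoichiometric" as "no fractional counts in the formula" is a guess based on the `_integer` file names.

```python
import re

# CIF tags from the column list ("id" is left out because it is not a CIF tag).
TAGS = [
    "_database_code_ICSD",
    "_chemical_formula_structural",
    "_chemical_formula_sum",
    "_cell_length_a",
    "_cell_length_b",
    "_cell_length_c",
    "_cell_angle_alpha",
    "_cell_angle_beta",
    "_cell_angle_gamma",
    "_cell_volume",
]


def extract_tags(cif: str) -> dict:
    """Return {tag: raw string value} for every tag found on a single line."""
    values = {}
    for tag in TAGS:
        # The tag must start a line and be followed by spaces or tabs.
        pattern = rf"^{re.escape(tag)}[ \t]+(.+?)[ \t]*$"
        match = re.search(pattern, cif, flags=re.MULTILINE)
        if match:
            values[tag] = match.group(1).strip("'\"")
    return values


def to_float(value: str) -> float:
    """Convert a CIF number, dropping a trailing uncertainty like '(1)'."""
    return float(re.sub(r"\(\d+\)$", "", value))


def has_integer_counts(formula: str) -> bool:
    """True when the formula contains no decimal numbers (assumed criterion)."""
    return re.search(r"\d\.|\.\d", formula) is None
```

Applying `extract_tags` to each entry of the cif column and collecting the results into a DataFrame would yield one column per tag; a full CIF parser such as the one in pymatgen is the more robust choice for anything beyond these simple fields.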

See the example_icsd.ipynb Jupyter notebook for usage examples (along with example_mp_vs_icsd.ipynb if you have downloaded the MP data).

Open Quantum Materials Database (OQMD)

Download all OQMD materials (this can take up to a few days) with

python oqmd/download.py

A pandas.DataFrame object will be saved in binary format in the file oqmd/oqmd.pkl.
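For a download that runs for days, being able to resume after an interruption matters. Whether oqmd/download.py does this is not stated here, so the following is only a sketch of one possible approach: records are appended to a JSON Lines checkpoint after every page, and a restart continues from the number of records already saved. The function `download_all`, the callable `fetch_page(offset, limit)`, and the checkpoint file are all hypothetical.

```python
import json
from pathlib import Path


def download_all(fetch_page, checkpoint, page_size=100):
    """Collect records page by page, appending each page to a checkpoint file.

    fetch_page(offset, limit) is a hypothetical callable that returns a list
    of JSON-serialisable records, and an empty list once data is exhausted.
    """
    path = Path(checkpoint)
    records = []
    if path.exists():
        # Resume: reload everything saved by a previous, interrupted run.
        with path.open() as fh:
            records = [json.loads(line) for line in fh if line.strip()]
    with path.open("a") as fh:
        while True:
            page = fetch_page(len(records), page_size)
            if not page:
                break
            for record in page:
                fh.write(json.dumps(record) + "\n")
            fh.flush()  # make the page durable before requesting the next one
            records.extend(page)
    return records
```

Appending one line per record keeps each checkpoint write proportional to the page size rather than to the total downloaded so far. The returned list can then be passed to `pandas.DataFrame` and pickled.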

Periodic table

Build the periodic table with

python ptable/build.py

A pandas.DataFrame object will be saved in binary format in the file ptable/ptable.pkl.

See the example_descriptors.ipynb Jupyter notebook for usage examples.

