Formula graph self-attention network for representation-domain independent materials discovery (Finder)
This project requires Python 3.8 or above. Please make sure pip is installed. We recommend installing the Finder package inside a virtual environment, as follows.
python -m venv Finder_env
source Finder_env/bin/activate
Inside the root directory, execute pip install -r requirements.txt to install the dependencies. This should install all packages required to run Finder. Please open an issue if there are installation errors.
Finder is built using the Spektral graph deep learning library; see the Spektral documentation for details. Note that the current version of Finder requires spektral-1.1.0.
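If installation succeeds but Finder still fails at import time, a mismatched spektral version is one thing worth ruling out. The following is a minimal sketch, not part of the Finder codebase, that compares the installed version against the one named above. The helper names `installed_version` and `spektral_ok` are hypothetical, and the check assumes the package is registered under the distribution name `spektral`.

```python
from importlib import metadata

# Version stated in this README; adjust if requirements.txt pins another one.
REQUIRED_SPEKTRAL = "1.1.0"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def spektral_ok(found, required=REQUIRED_SPEKTRAL):
    """True only when the found version equals the required one exactly."""
    return found == required


if __name__ == "__main__":
    found = installed_version("spektral")
    if spektral_ok(found):
        print("spektral", found, "matches the required version")
    else:
        print("expected spektral", REQUIRED_SPEKTRAL, "but found", found)
```

Run it inside the activated virtual environment so that it inspects the same packages Finder will use.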
Please download the Materials Project data used in this work from figshare. Extract the zip file and place the MP_2021_July_no_polymorph directory inside data/databases/. Note that each data file should have three columns: ID, formula and target. An additional cif column is required for crystal-structure-based predictions.
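Before training on your own data, it may help to confirm that a file has the columns listed above. The sketch below is an illustration only, not part of Finder. It assumes each data file is a CSV whose first row is a header containing the column names exactly as written here, and the function name `missing_columns` is hypothetical.

```python
import csv

# Column names taken from this README.
REQUIRED = ["ID", "formula", "target"]


def missing_columns(csv_path, use_crystal_structure=False):
    """Return the required column names that are absent from the CSV header."""
    required = REQUIRED + (["cif"] if use_crystal_structure else [])
    with open(csv_path, newline="") as handle:
        # Read only the header row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

An empty list means the file has every required column. Pass `use_crystal_structure=True` to also require the cif column.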
Navigate to the main directory (Finder/) and execute python trainer.py --help to see the allowed arguments.
You can train and evaluate the structure-agnostic Finder model on the formation energy database by running the following.
python trainer.py --train-path data/databases/MP_2021_July_no_polymorph/formation_energy/train.csv --val-path data/databases/MP_2021_July_no_polymorph/formation_energy/val.csv --test-path data/databases/MP_2021_July_no_polymorph/formation_energy/test.csv --epochs 800 --batch-size 128 --train --test
An additional --use-crystal-structure flag is required to train the structure-based Finder model. To train it for bandgap, run:
python trainer.py --train-path data/databases/MP_2021_July_no_polymorph/bandgap/train.csv --val-path data/databases/MP_2021_July_no_polymorph/bandgap/val.csv --test-path data/databases/MP_2021_July_no_polymorph/bandgap/test.csv --epochs 1200 --batch-size 128 --train --test --use-crystal-structure
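The two training commands differ only in the property directory, the epoch count and the structure flag, so they can be assembled programmatically when training several properties in a row. This is a minimal sketch rather than a utility shipped with Finder. The function name `build_train_command` is hypothetical; it uses only the flags shown in the commands above and assumes each property directory contains train.csv, val.csv and test.csv.

```python
import sys

# Location of the extracted Materials Project data, as described in this README.
DATA_ROOT = "data/databases/MP_2021_July_no_polymorph"


def build_train_command(prop, epochs, batch_size=128, use_crystal_structure=False):
    """Build the trainer.py argument list for one property directory."""
    split_dir = f"{DATA_ROOT}/{prop}"
    cmd = [
        sys.executable, "trainer.py",
        "--train-path", f"{split_dir}/train.csv",
        "--val-path", f"{split_dir}/val.csv",
        "--test-path", f"{split_dir}/test.csv",
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--train", "--test",
    ]
    if use_crystal_structure:
        # Only the structure-based model needs this flag.
        cmd.append("--use-crystal-structure")
    return cmd
```

From the Finder/ directory, the returned list can be passed to `subprocess.run(cmd, check=True)`. For example, `build_train_command("bandgap", 1200, use_crystal_structure=True)` reproduces the bandgap command above.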
Once you train a Finder model, a directory named saved_models/best_model_gnn containing the best model will be created. You may then run the following to make predictions using this trained model. Add the --use-crystal-structure flag if this is a structure-based Finder model. If the target property value is unknown, please fill in the target column of your data file with dummy values.
python trainer.py --model-path saved_models/best_model_gnn/ --test-path data/databases/MP_2021_July_no_polymorph/formation_energy/test.csv --test
Prediction results will be saved in the results/ directory.
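When the target values are unknown, the input file still needs a target column filled with placeholders, as noted above. The following sketch shows one way to write such a file for a structure-agnostic model. It is an illustration under stated assumptions: the data file is a CSV with a header row, the function name `write_prediction_input` is hypothetical, and the placeholder value of 0.0 is arbitrary.

```python
import csv


def write_prediction_input(rows, out_path, dummy_target=0.0):
    """Write (ID, formula) pairs to a CSV with a placeholder target column."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Column names follow the data format described in this README.
        writer.writerow(["ID", "formula", "target"])
        for identifier, formula in rows:
            writer.writerow([identifier, formula, dummy_target])
```

The resulting path can be given to --test-path together with --model-path and --test. Any error metric computed against the placeholder targets is meaningless and should be ignored.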
You may download the pre-trained Finder models for the MP property prediction tasks from figshare. Assuming that the zip file is extracted in the root directory, you may run the following command to evaluate, for example, the structure-based refractive index model.
python trainer.py --model-path Finder_pre-trained/Structure-based/best_model_gnn_refractive_index_SB/ --test-path data/databases/MP_2021_July_no_polymorph/refractive_index/test.csv --test --use-crystal-structure
We acknowledge funding from The Institution of Engineering and Technology (IET) under the AF Harvey Research Prize. This work is supported in part by the EPSRC Software Defined Materials for Dynamic Control of Electromagnetic Waves (ANIMATE) grant (No. EP/R035393/1).
Consider citing our paper if you find the Finder model and the codebase useful.
@article{Ihalage_2022_Adv_Sci,
author = {Ihalage, Achintha and Hao, Yang},
title = {Formula Graph Self-Attention Network for Representation-Domain Independent Materials Discovery},
journal = {Advanced Science},
volume = {9},
number = {18},
pages = {2200164},
keywords = {attention, epsilon-near-zero, graph-network, machine-learning, materials-informatics},
doi = {https://doi.org/10.1002/advs.202200164},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/advs.202200164},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/advs.202200164},
year = {2022}
}
