We have released the training code and all trained model weights. If you find them helpful, please cite our article!
Kang, Q.; Fang, P.; Zhang, S.; Qiu, H.; Lan, Z. Deep Graph Convolutional Network for Small-Molecule Retention Time Prediction. Journal of Chromatography A. 2023, 464439, ISSN 0021-9673.
Article URL: https://authors.elsevier.com/a/1hy7U4-ggV0Qt (https://doi.org/10.1016/j.chroma.2023.464439)
Model performance was evaluated using mean absolute error (MAE), median absolute error (MedAE), mean absolute percentage error (MAPE), mean squared error (MSE), and the coefficient of determination (R2). The table reports the mean and standard deviation of each metric.
| Model | Depth | MAE (mean) | MAE (std) | MedAE (mean) | MedAE (std) | MAPE (mean) | MAPE (std) | R2 (mean) | R2 (std) | MSE (mean) | MSE (std) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepGCN-RT | 3 | 27.97 | 0.20 | 14.01 | 0.07 | 0.035 | 0.000 | 0.892 | 0.002 | 3303 | 55 |
| DeepGCN-RT | 5 | 27.00 | 0.19 | 12.91 | 0.18 | 0.034 | 0.000 | 0.892 | 0.001 | 3288 | 33 |
| DeepGCN-RT | 8 | 26.61 | 0.09 | 12.44 | 0.05 | 0.034 | 0.000 | 0.892 | 0.001 | 3286 | 31 |
| DeepGCN-RT | 16 | 26.55 | 0.17 | 12.38 | 0.12 | 0.033 | 0.000 | 0.892 | 0.001 | 3299 | 45 |
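The five metrics in the table can be computed from paired true and predicted retention times as sketched below. This is a minimal plain-Python illustration, not the evaluation code used in this repository; the function name `regression_metrics` is hypothetical. MAPE is returned as a fraction rather than a percentage, which is an assumption based on the scale of the values in the table.

```python
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Return MAE, MedAE, MAPE, MSE and R2 for paired true/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    abs_err = [abs(p - t) for t, p in zip(y_true, y_pred)]
    sq_err = [e * e for e in abs_err]

    # Total sum of squares around the mean of the true values (for R2).
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)

    return {
        "MAE": mean(abs_err),
        "MedAE": median(abs_err),
        # Fraction, not percent (assumption; true values must be non-zero).
        "MAPE": mean([e / abs(t) for e, t in zip(abs_err, y_true)]),
        "MSE": mean(sq_err),
        "R2": 1.0 - sum(sq_err) / ss_tot,
    }


# Toy values for illustration only, not data from the paper.
print(regression_metrics([100.0, 200.0, 300.0, 400.0],
                         [110.0, 190.0, 300.0, 420.0]))
```

MedAE is less sensitive to a few badly predicted molecules than MAE, which is why the two can differ substantially on the same predictions.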
This repository contains GNN models for retention time prediction, including DeepGCN-RT and a plain GCN model. The source code is in models.py, dataset.py, and train.py. The model weights are included in the model_path folder. The transfer learning code (10-fold cross-validation) is in transfer_learning.py. The full transfer learning results for all models are in the result folder.
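To illustrate what 10-fold cross-validation means for the transfer learning data sets, here is a minimal sketch of how sample indices can be split into folds. It is an assumption-laden illustration and not the code in transfer_learning.py, which may shuffle or split differently; the function name `k_fold_indices` and its parameters are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=1):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Shuffle once with a fixed seed so the folds are reproducible.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Strided slicing spreads samples over folds as evenly as possible.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Each sample is used for testing exactly once across the k folds.
for fold, (train_idx, test_idx) in enumerate(k_fold_indices(25, k=10)):
    print(fold, len(train_idx), len(test_idx))
```

In each fold, a model would be fine-tuned on `train_idx` and evaluated on `test_idx`, and the metrics would then be averaged over the folds.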
The environment dependencies for Linux are listed in environment.yaml. Use conda to build the environment from this file (for example, `conda env create -f environment.yaml`).
To run the training code, use the following command:
python train.py \
--model_name "DeepGCN-RT" \
--dataset "SMRT" \
--num_layers 16 \
--hid_dim 200 \
--epochs 200 \
--lr 0.001 \
--batch_size 64 \
--early_stop 30 \
--seed 1
Alternatively, the training and transfer learning processes can be started with shell scripts that use a for loop. To run the training process on the SMRT data set, use the following command:
sh scripts/train.sh
To run the transfer learning on nine transfer learning data sets, use:
sh scripts/transfer_learning.sh
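For readers who prefer to drive such a loop from Python, the sketch below builds one train.py command per configuration using the flags shown in the training command above. What the shipped scripts actually loop over is not documented here, so sweeping over the depths from the results table and over seeds is an assumption, and `build_train_commands` is a hypothetical helper.

```python
import shlex
from itertools import product


def build_train_commands(depths=(3, 5, 8, 16), seeds=(1,)):
    """Return one argument list per (depth, seed) combination."""
    commands = []
    for depth, seed in product(depths, seeds):
        commands.append([
            "python", "train.py",
            "--model_name", "DeepGCN-RT",
            "--dataset", "SMRT",
            "--num_layers", str(depth),  # assumed to be the swept setting
            "--hid_dim", "200",
            "--epochs", "200",
            "--lr", "0.001",
            "--batch_size", "64",
            "--early_stop", "30",
            "--seed", str(seed),
        ])
    return commands


# Print the commands instead of running them.
for cmd in build_train_commands():
    print(shlex.join(cmd))
```

To launch the runs one after another, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.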
To run the inference code, use the following command:
python inference.py \
--SMILES "demo_SMILES" \
--model_path "model path"