21 changes: 17 additions & 4 deletions README.md
@@ -8,6 +8,12 @@ DRGS-Net implements a hybrid molecular representation model that concatenates:

The combined embedding is passed to a small prediction head for downstream molecular property prediction (classification or regression).

## TL;DR (Quick usage)
1) Install dependencies (see "Quick start" below for full setup commands).
2) Copy the default config: `cp DRGS-Net.yaml config_concatenate.yaml`
3) Edit `config_concatenate.yaml` with your dataset paths and ChemBERTa checkpoint.
4) Run: `python DRGS-Net_finetune.py`

This repository contains training/finetuning scripts, dataset wrappers, model definitions, and utilities. The hybrid model is implemented in `models/concatenate_model.py` (class `HybridModel`). Finetuning is orchestrated by `DRGS-Net_finetune.py`, which loads a configuration file named `config_concatenate.yaml` by default. If you only have `DRGS-Net.yaml`, copy or rename it to `config_concatenate.yaml` before running (see "Quick start").
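
The copy step described above can also be done from Python, for example in a setup script. The following is a minimal sketch, not part of the repository: the helper name `ensure_config` and its `workdir` parameter are hypothetical, and only the two file names come from this README. It refuses to overwrite an existing `config_concatenate.yaml` so that local edits are preserved.

```python
import shutil
from pathlib import Path


def ensure_config(default="DRGS-Net.yaml",
                  expected="config_concatenate.yaml",
                  workdir="."):
    """Return the path to the expected config, copying the default if needed.

    Hypothetical helper: the file names come from the README, everything
    else is an assumption made for illustration.
    """
    workdir = Path(workdir)
    src = workdir / default
    dst = workdir / expected
    if dst.exists():
        # Never overwrite a config the user may already have edited.
        return dst
    if not src.exists():
        raise FileNotFoundError(f"neither {dst} nor {src} exists")
    shutil.copyfile(src, dst)
    return dst
```

Calling `ensure_config()` from the repository root before launching `DRGS-Net_finetune.py` would have the same effect as the `cp` command.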
## Methodology
![DRGS-Net Architecture](./fig/Methodology.png)
@@ -50,7 +56,14 @@ See the model card: https://huggingface.co/DeepChem/ChemBERTa-77M-MLM

## Quick start

1) Prepare config file
1) Download the repository

```bash
git clone https://github.com/thkim-01/DRGS-Net.git
cd DRGS-Net
```

2) Prepare config file

The finetune script (`DRGS-Net_finetune.py`) by default loads `config_concatenate.yaml`. If you only have `DRGS-Net.yaml`, create a copy with the expected name:

@@ -60,7 +73,7 @@ cp DRGS-Net.yaml config_concatenate.yaml

Edit the YAML to set `task_name`, dataset paths, `fine_tune_from` (pretrained GNN checkpoint folder under `./ckpt/`) and `hybrid_specific.chemberta_model_name` (local path or Hugging Face model id).
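
Before launching a run, it can help to confirm that the fields named above are actually filled in. The sketch below is illustrative only and is not part of the repository: `missing_keys` is a hypothetical helper, and it assumes that `hybrid_specific.chemberta_model_name` denotes a `chemberta_model_name` key nested under `hybrid_specific`. It works on an already-parsed config dictionary, which you could obtain with PyYAML's `yaml.safe_load`.

```python
# Keys the README says must be set; the dotted form is assumed to mean nesting.
REQUIRED_KEYS = (
    "task_name",
    "fine_tune_from",
    "hybrid_specific.chemberta_model_name",
)


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the dotted keys that are absent, None, or empty in `config`."""
    missing = []
    for dotted in required:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
        else:
            # The key exists; treat None and "" as "not set".
            if node is None or node == "":
                missing.append(dotted)
    return missing


# Placeholder values for illustration; "pretrained_gnn" is a made-up folder name.
example = {
    "task_name": "BBBP",
    "fine_tune_from": "pretrained_gnn",
    "hybrid_specific": {"chemberta_model_name": "DeepChem/ChemBERTa-77M-MLM"},
}
print(missing_keys(example))
```

An empty list means every field listed above has a value; it does not check that the paths or model id are valid.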

2) Create environment & install dependencies (suggested)
3) Create environment & install dependencies (suggested)

This project requires PyTorch and Hugging Face Transformers. RDKit and NVIDIA Apex (for mixed precision) are optional.

@@ -80,11 +93,11 @@ pip install tensorboard scikit-learn pandas numpy tqdm pyyaml

Note: Install compatible `torch-geometric` packages for your PyTorch/CUDA setup if you use PyG layers in the GNN models.
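
To see which of the packages mentioned in this section are importable in the current environment, a small check like the following may be useful. It is a sketch rather than an official script: the `REQUIRED` and `OPTIONAL` groupings are an assumption based on the text above, and the names are Python import names (for example `sklearn` for scikit-learn and `yaml` for pyyaml). It only tests that each module can be located and does not verify versions or CUDA compatibility.

```python
from importlib.util import find_spec

# Import names for the packages mentioned in this section (grouping is assumed).
REQUIRED = ["torch", "transformers", "tensorboard", "sklearn",
            "pandas", "numpy", "tqdm", "yaml"]
OPTIONAL = ["rdkit", "apex", "torch_geometric"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", find_missing(REQUIRED))
    print("missing optional:", find_missing(OPTIONAL))
```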

3) Prepare data
4) Prepare data

Place downstream datasets under `data/<dataset_name>/`, following the MoleculeNet CSV formats. The default config uses paths such as `data/bbbp/BBBP.csv`. See `DRGS-Net_finetune.py` for the mapping from `task_name` to dataset.
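
A quick way to confirm that a dataset landed in the expected folder is to list the CSV files under `data/<dataset_name>/`. The helper below is a hypothetical sketch: `find_dataset_csvs` and its `data_root` parameter are not part of the repository, and it does not reproduce the `task_name` mapping used by `DRGS-Net_finetune.py`.

```python
from pathlib import Path


def find_dataset_csvs(dataset_name, data_root="data"):
    """Return the sorted CSV files found under data_root/dataset_name.

    Raises FileNotFoundError if the dataset folder does not exist.
    """
    folder = Path(data_root) / dataset_name
    if not folder.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {folder}")
    return sorted(folder.glob("*.csv"))
```

For the default layout described above, `find_dataset_csvs("bbbp")` would be expected to include `data/bbbp/BBBP.csv` once the file is in place.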

4) Run finetuning
5) Run finetuning

```bash
python DRGS-Net_finetune.py