gridfm · romeokienzler · Mar 9, 2026 · Aug 29, 2025 · Oct 7, 2025 · Oct 31, 2025
diff --git a/.github/workflows/ci-build.yaml b/.github/workflows/ci-build.yaml
@@ -67,15 +67,19 @@ jobs:
           python3 -m pip install --upgrade pip
           python3 -m pip install --upgrade "git+https://github.com/ibm/detect-secrets.git@master#egg=detect-secrets"
           python3 -m pip install boxsdk
+
       - name: Scan repository & write snapshot
         run: |
           mkdir -p security-outputs
-          # Run the scan; send stderr to a log; stdout to JSON
+
+          # Run detect-secrets while skipping binary files
           detect-secrets scan \
-            --exclude-files '.*\.ipynb$' \
+            --exclude-files '.*\.ipynb$|.*\.(png|jpg|jpeg|gif|pdf|onnx|pt|pth|bin|zip)$' \
             --exclude-lines '"(outputs|image/\w+|id|hash)":.*' \
-            > security-outputs/ds-results.json 2> security-outputs/.secrets.new
-
+            --force-use-all-plugins \
+            . \
+            > security-outputs/ds-results.json \
+            2> security-outputs/.secrets.new || true
 
 
       - name: Upload scan artifacts (for triage)
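
Review note: the widened `--exclude-files` pattern in the hunk above can be checked locally before relying on it in CI. The sketch below copies the pattern from the workflow and tests it against a few sample paths. It assumes detect-secrets applies the pattern with search-style matching, which is worth confirming for the IBM fork. The helper name `is_excluded` and the sample paths are hypothetical and not part of this repository.

```python
import re

# Pattern copied verbatim from the --exclude-files argument in the workflow.
EXCLUDE_FILES = r'.*\.ipynb$|.*\.(png|jpg|jpeg|gif|pdf|onnx|pt|pth|bin|zip)$'


def is_excluded(path: str) -> bool:
    """Return True if the scan would skip this path.

    Assumption: the scanner uses search-style (unanchored) matching.
    """
    return re.search(EXCLUDE_FILES, path) is not None


# Hypothetical sample paths, for illustration only.
samples = [
    "notebooks/demo.ipynb",
    "models/best_model_state_dict.pt",
    "docs/figure.png",
    "src/train.py",
    "docs/FIGURE.PNG",
]

for sample in samples:
    status = "skipped" if is_excluded(sample) else "scanned"
    print(f"{sample}: {status}")
```

The pattern is case-sensitive, so a file such as `FIGURE.PNG` would still be scanned. Separately, the trailing `|| true` makes the step succeed even when the scan exits non-zero, so a failed scan only shows up in the uploaded artifacts.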

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -12,7 +12,7 @@ If you just need help or have a question, refer to [SUPPORT.md](SUPPORT.md).
 
 ## How to Contribute a Bug Fix or Change
 
-To contribute code to the project, first read over the [governance policies] page to understand the roles involved. 
+To contribute code to the project, first read over the [governance policies] page to understand the roles involved.
 
 Each contribution must meet the [PEP 8] and include..
 

diff --git a/README.md b/README.md
@@ -20,22 +20,28 @@ This library is brought to you by the GridFM team to train, finetune and interac
 
 # Installation
 
-You can install `gridfm-graphkit` directly from PyPI:
+Create and activate a virtual environment. Use Python 3.10, 3.11, or 3.12 (3.12 is recommended):
+```bash
+python -m venv venv
+source venv/bin/activate
+```
 
+Install `gridfm-graphkit` in editable mode (run from the repository root):
 ```bash
-pip install gridfm-graphkit
+pip install -e .
 ```
 
-To contribute or develop locally, clone the repository and install in editable mode:
+Get the installed PyTorch version string, including its CPU/CUDA suffix, to select the matching torch-scatter wheel:
+```bash
+TORCH_CUDA_VERSION=$(python -c "import torch; v = torch.__version__; print(v + ('+cpu' if torch.version.cuda is None and '+' not in v else ''))")
+```
 
+Install the torch-scatter wheel that matches this PyTorch build:
 ```bash
-git clone git@github.com:gridfm/gridfm-graphkit.git
-cd gridfm-graphkit
-python -m venv venv
-source venv/bin/activate
-pip install -e .
+pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION}.html
 ```
 
+
 For documentation generation and unit testing, install with the optional `dev` and `test` extras:
 
 ```bash
@@ -45,15 +51,15 @@ pip install -e .[dev,test]
 
 # CLI commands
 
-An interface to train, fine-tune, and evaluate GridFM models using configurable YAML files and MLflow tracking.
+An interface to train, fine-tune, evaluate, and run inference with GridFM models using YAML configuration files and MLflow tracking.
 
 ```bash
 gridfm_graphkit <command> [OPTIONS]
 ```
 
 Available commands:
 
-* `train` – Train a new model from scrathc
+* `train` – Train a new model from scratch
 * `finetune` – Fine-tune an existing pre-trained model
 * `evaluate` – Evaluate model performance on a dataset
 * `predict` – Run inference and save predictions
@@ -70,11 +76,11 @@ gridfm_graphkit train --config path/to/config.yaml
 
 | Argument         | Type   | Description                                                      | Default |
 | ---------------- | ------ | ---------------------------------------------------------------- | ------- |
-| `--config`       | `str`  | **Required**. Path to the training configuration YAML file.    | `None`  |
-| `--exp_name`     | `str`  | **Optional**. MLflow experiment name.                            | `timestamp`  |
-| `--run_name`     | `str`  | **Optional**. MLflow run name.                                   | `run`  |
-| `--log_dir  `    | `str`  | **Optional**. MLflow logging directory.                              | `mlruns`  |
-| `--data_path`    | `str`  | **Optional**. Root dataset directory.                            | `data`  |
+| `--config`       | `str`  | **Required**. Path to the training configuration YAML file.      | `None`       |
+| `--exp_name`     | `str`  | MLflow experiment name.                                           | `timestamp`  |
+| `--run_name`     | `str`  | MLflow run name.                                                  | `run`        |
+| `--log_dir`      | `str`  | MLflow tracking/logging directory.                                | `mlruns`     |
+| `--data_path`    | `str`  | Root dataset directory.                                           | `data`       |
 
 ### Examples
 
@@ -89,15 +95,15 @@ gridfm_graphkit train --config examples/config/case30_ieee_base.yaml --data_path
 ## Fine-Tuning Models
 
 ```bash
-gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model.pth
+gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model.pt
 ```
 
 ### Arguments
 
 | Argument       | Type  | Description                                     | Default   |
 | -------------- | ----- | ----------------------------------------------- | --------- |
 | `--config`     | `str` | **Required**. Fine-tuning configuration file.   | `None`    |
-| `--model_path` | `str` | **Required**. Path to a pre-trained model file. | `None`    |
+| `--model_path` | `str` | **Required**. Path to a pre-trained model state dict. | `None`    |
 | `--exp_name`   | `str` | MLflow experiment name.                         | timestamp |
 | `--run_name`   | `str` | MLflow run name.                                | `run`     |
 | `--log_dir`    | `str` | MLflow logging directory.                       | `mlruns`  |
@@ -109,38 +115,63 @@ gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model
 ## Evaluating Models
 
 ```bash
-gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.pth
+gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.pt
 ```
 
 ### Arguments
 
-| Argument       | Type  | Description                              | Default   |
-| -------------- | ----- | ---------------------------------------- | --------- |
-| `--config`     | `str` | **Required**. Path to evaluation config. | `None`    |
-| `--model_path` | `str` | Path to the trained model file.          | `None`    |
-| `--exp_name`   | `str` | MLflow experiment name.                  | timestamp |
-| `--run_name`   | `str` | MLflow run name.                         | `run`     |
-| `--log_dir`    | `str` | MLflow logging directory.                | `mlruns`  |
-| `--data_path`  | `str` | Dataset directory.                       | `data`    |
+| Argument              | Type  | Description                                                                                                   | Default   |
+| --------------------- | ----- | ------------------------------------------------------------------------------------------------------------- | --------- |
+| `--config`            | `str` | **Required**. Path to evaluation config.                                                                      | `None`    |
+| `--model_path`        | `str` | Path to the trained model state dict.                                                                         | `None`    |
+| `--normalizer_stats`  | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics instead of re-fitting on the current data split. | `None`    |
+| `--exp_name`          | `str` | MLflow experiment name.                                                                                       | timestamp |
+| `--run_name`          | `str` | MLflow run name.                                                                                              | `run`     |
+| `--log_dir`           | `str` | MLflow logging directory.                                                                                     | `mlruns`  |
+| `--data_path`         | `str` | Dataset directory.                                                                                            | `data`    |
+| `--compute_dc_ac_metrics` | `flag` | Compute ground-truth AC/DC power balance metrics on the test split.                                      | `False`   |
+| `--save_output`       | `flag` | Save predictions as `<grid_name>_predictions.parquet` under MLflow artifacts (`.../artifacts/test`).       | `False`   |
+
+### Example with saved normalizer stats
+
+When evaluating a model on a dataset, you can pass the normalizer statistics from the original training run to ensure the same normalization parameters are used:
+
+```bash
+gridfm_graphkit evaluate \
+  --config examples/config/HGNS_PF_datakit_case118.yaml \
+  --model_path mlruns/<experiment_id>/<run_id>/artifacts/model/best_model_state_dict.pt \
+  --normalizer_stats mlruns/<experiment_id>/<run_id>/artifacts/stats/normalizer_stats.pt \
+  --data_path data
+```
+
+> **Note:** The `--normalizer_stats` flag only affects normalizers with `fit_strategy = "fit_on_train"` (e.g. `HeteroDataMVANormalizer`). Per-sample normalizers (`HeteroDataPerSampleMVANormalizer`) always recompute their statistics from the current dataset regardless of this flag.
 
 ---
 
 ## Running Predictions
 
 ```bash
-gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model.pth
+gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model.pt
 ```
 
 ### Arguments
 
-| Argument        | Type  | Description                                   | Default   |
-| --------------- | ----- | --------------------------------------------- | --------- |
-| `--config`      | `str` | **Required**. Path to prediction config file. | `None`    |
-| `--model_path`  | `str` | Path to the trained model file.               | `None`    |
-| `--exp_name`    | `str` | MLflow experiment name.                       | timestamp |
-| `--run_name`    | `str` | MLflow run name.                              | `run`     |
-| `--log_dir`     | `str` | MLflow logging directory.                     | `mlruns`  |
-| `--data_path`   | `str` | Dataset directory.                            | `data`    |
-| `--output_path` | `str` | Directory where predictions are saved.        | `data`    |
+| Argument              | Type  | Description                                                                                                   | Default   |
+| --------------------- | ----- | ------------------------------------------------------------------------------------------------------------- | --------- |
+| `--config`            | `str` | **Required**. Path to prediction config file.                                                                 | `None`    |
+| `--model_path`        | `str` | Path to the trained model state dict.                                                                         | `None`    |
+| `--normalizer_stats`  | `str` | Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics. | `None`    |
+| `--exp_name`          | `str` | MLflow experiment name.                                                                                       | timestamp |
+| `--run_name`          | `str` | MLflow run name.                                                                                              | `run`     |
+| `--log_dir`           | `str` | MLflow logging directory.                                                                                     | `mlruns`  |
+| `--data_path`         | `str` | Dataset directory.                                                                                            | `data`    |
+| `--output_path`       | `str` | Directory where predictions are saved as `<grid_name>_predictions.parquet`.                                  | `data`    |
+
+Use built-in help for full command details:
+
+```bash
+gridfm_graphkit --help
+gridfm_graphkit <command> --help
+```
 
 ---
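
Review note: the torch-scatter step in the README hunk builds a wheel index URL from the installed PyTorch version. Below is a minimal Python sketch of that derivation, written as a pure function so it can be checked without PyTorch installed. The function name `torch_scatter_index_url` is hypothetical. The `cu<major><minor>` tag used when a CUDA build reports no local suffix is an assumption based on the usual wheel naming, and the sketch does not check that a wheel page exists for the resulting tag.

```python
from typing import Optional

# Base URL taken from the install command in the README hunk.
PYG_WHEEL_BASE = "https://data.pyg.org/whl"


def torch_scatter_index_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the wheel index URL for a given PyTorch build (hypothetical helper).

    torch_version: value of torch.__version__, e.g. "2.3.0" or "2.3.0+cu121".
    cuda_version:  value of torch.version.cuda, e.g. "12.1", or None for CPU builds.
    """
    if "+" in torch_version:
        # The version already carries a local suffix (+cpu, +cu121, ...); keep it.
        tag = torch_version
    elif cuda_version is None:
        # CPU-only build without a suffix.
        tag = f"{torch_version}+cpu"
    else:
        # Assumption: CUDA "12.1" maps to the tag "cu121".
        tag = f"{torch_version}+cu{cuda_version.replace('.', '')}"
    return f"{PYG_WHEEL_BASE}/torch-{tag}.html"


# Illustrative inputs only; these are not versions pinned by the project.
for version, cuda in [("2.3.0", None), ("2.3.0+cpu", None), ("2.3.0+cu121", "12.1")]:
    print(torch_scatter_index_url(version, cuda))
```

With PyTorch installed, the two arguments would come from `torch.__version__` and `torch.version.cuda`. Checking for an existing `+` suffix avoids producing a doubled tag such as `+cpu+cpu` on builds whose version string already ends in `+cpu`.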
diff --git a/RELEASE.md b/RELEASE.md
@@ -1 +0,0 @@
-