PAGE-DT: Digital Twin Framework for SDS-PAGE Outcome Prediction

Predict gel outcomes before running the experiment.

Overview

PAGE-DT is a research-grade digital twin of SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis). It combines:

Physics engine — Ferguson equation, diffusion, smearing physics
ML predictor — Random Forest + XGBoost + Neural Network ensemble
Synthetic gel image generator — OpenCV-based realistic gel rendering
Failure diagnosis — Rule-based expert system with confidence scoring
Parameter optimizer — Grid-search + hill-climbing for target separation
AI Assistant — Claude-powered expert chatbot

Quick Start

1. Install dependencies

pip install -r requirements.txt

2. Train all models (generates 12,000 experiments, trains 3 model types per target)

python train_all.py

3. Launch dashboard

python src/dashboard/app.py

Open http://localhost:7860 in your browser.

Architecture

PAGE-DT/
├── src/
│   ├── simulator/
│   │   └── physics_engine.py        # Phase 1: Ferguson equation simulator
│   ├── data/
│   │   └── dataset_generator.py     # Phase 2: 12,000 virtual experiments
│   ├── models/
│   │   └── ml_predictor.py          # Phase 3: RF + XGBoost + MLP
│   ├── imaging/
│   │   └── gel_generator.py         # Phase 4: OpenCV gel image generator
│   ├── diagnosis/
│   │   └── failure_engine.py        # Phase 5: Failure diagnosis engine
│   ├── optimizer/
│   │   └── param_optimizer.py       # Phase 6: Parameter optimizer
│   └── dashboard/
│       └── app.py                   # Phase 7: Gradio dashboard
├── data/
│   ├── raw/                         # Generated CSV dataset
│   ├── processed/
│   └── models/                      # Saved ML models (.pkl)
├── outputs/
│   └── images/                      # Generated gel images
├── train_all.py                     # Master training script
├── requirements.txt
└── README.md

Modules

Phase 1: Physics Engine (`physics_engine.py`)

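Implements the Ferguson equation: log(Rf) = log(Y0) - KR × T, where Rf is the relative mobility, Y0 is the free mobility extrapolated to a 0% gel, KR is the retardation coefficient (larger for larger proteins), and T is the total acrylamide concentration (gel percentage).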

Predicts:

Migration distance (cm)
Relative mobility (Rf)
Band width (mm) — from diffusion + overloading
Smearing score (0–1) — voltage, concentration, time effects
Band intensity (0–1) — Beer-Lambert approximation
Separation quality (0–1) — composite score
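The Ferguson relation above can be sketched in a few lines. This is a minimal illustration, not the code in `physics_engine.py`: the function names are hypothetical, the logarithm is assumed to be base 10, and the Y0 and KR values in the usage lines are placeholders rather than fitted constants.

```python
import math

def ferguson_rf(gel_pct, y0, k_r):
    """Relative mobility from log10(Rf) = log10(Y0) - KR * T (base 10 assumed)."""
    return 10 ** (math.log10(y0) - k_r * gel_pct)

def estimate_kr(gel_pct_a, rf_a, gel_pct_b, rf_b):
    """Retardation coefficient from two (T, Rf) points: the negative slope
    of log10(Rf) plotted against gel percentage."""
    return (math.log10(rf_a) - math.log10(rf_b)) / (gel_pct_b - gel_pct_a)

# Placeholder constants for illustration only.
rf_8 = ferguson_rf(8.0, y0=1.0, k_r=0.05)
rf_12 = ferguson_rf(12.0, y0=1.0, k_r=0.05)
print(f"Rf at 8% = {rf_8:.3f}, Rf at 12% = {rf_12:.3f}")
print(f"Recovered KR = {estimate_kr(8.0, rf_8, 12.0, rf_12):.3f}")
```

A denser gel (higher T) lowers Rf, and the slope of log10(Rf) against T recovers KR, which is how a Ferguson plot is normally read.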

Phase 2: Dataset Generator (`dataset_generator.py`)

Generates 12,000 synthetic experiments via:

70% uniform random sampling
20% Latin Hypercube stratified sampling
10% edge/stress cases

Each row: 5 inputs + 6 physics outputs + 4 derived ML features = 15 columns
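As a rough sketch of the 70/20/10 sampling mix, the following standard-library code builds a design of input points. The parameter bounds and all function names are assumptions made for illustration; the actual ranges and the way edge cases are chosen are defined in `dataset_generator.py` and may differ.

```python
import random

# Hypothetical bounds for the five inputs (not taken from the repository).
BOUNDS = {
    "protein_mw": (10.0, 250.0),     # kDa
    "gel_percentage": (6.0, 18.0),   # %
    "voltage": (50.0, 200.0),        # V
    "run_time": (20.0, 120.0),       # min
    "concentration": (0.1, 5.0),     # µg/µL
}

def uniform_samples(n, rng):
    """Independent uniform draws for every parameter."""
    return [{k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
            for _ in range(n)]

def latin_hypercube_samples(n, rng):
    """Split each parameter range into n strata and use each stratum once."""
    columns = {}
    for name, (lo, hi) in BOUNDS.items():
        strata = list(range(n))
        rng.shuffle(strata)
        columns[name] = [lo + (s + rng.random()) / n * (hi - lo) for s in strata]
    return [{k: columns[k][i] for k in BOUNDS} for i in range(n)]

def edge_samples(n, rng):
    """Stress cases: every parameter pinned to its lower or upper bound."""
    return [{k: rng.choice(bounds) for k, bounds in BOUNDS.items()}
            for _ in range(n)]

def build_design(total=12_000, seed=0):
    rng = random.Random(seed)
    n_uniform = total * 70 // 100
    n_lhs = total * 20 // 100
    n_edge = total - n_uniform - n_lhs
    return (uniform_samples(n_uniform, rng)
            + latin_hypercube_samples(n_lhs, rng)
            + edge_samples(n_edge, rng))

design = build_design()
print(len(design), "input rows")
```

Each input row would then be passed through the physics engine to fill in the output and derived-feature columns.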

Phase 3: ML Predictor (`ml_predictor.py`)

Trains 3 models per target (band_position, band_intensity, smearing_score):

| Model | Architecture |
|---|---|
| Random Forest | 200 trees, max_depth=12 |
| XGBoost | 300 estimators, lr=0.05, subsample=0.8 |
| Neural Network | MLP [128→64→32], ReLU, early stopping |

Saves best model per target. Reports MAE, RMSE, R².
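The three reported metrics and a per-target model selection can be sketched as follows. The README does not say which metric decides the "best" model, so choosing by lowest RMSE is an assumption here, and the function names and toy predictions are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R² for one model on one target."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }

def pick_best(y_true, predictions_by_model):
    """Score every model and return the name with the lowest RMSE (assumed criterion)."""
    scores = {name: regression_metrics(y_true, pred)
              for name, pred in predictions_by_model.items()}
    best = min(scores, key=lambda name: scores[name]["rmse"])
    return best, scores

# Toy held-out values for one target, e.g. band_position.
y_true = [1.0, 2.0, 3.0, 4.0]
best, scores = pick_best(y_true, {
    "random_forest": [1.0, 2.0, 3.0, 5.0],
    "xgboost": [1.0, 2.0, 3.0, 4.5],
    "mlp": [2.0, 3.0, 4.0, 5.0],
})
print(best, scores[best])
```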

Phase 4: Gel Image Generator (`gel_generator.py`)

OpenCV-based rendering with:

Gaussian band profiles (vertical + horizontal)
Smearing trails (exponential decay)
Shot noise + background gradient
MW ladder with auto-computed positions
Lane labels and MW axis
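To make the band and smear shapes concrete, here is a one-dimensional sketch of a single lane's intensity profile along the migration axis. It uses only the standard library rather than OpenCV, and the function name, the decay length, and the choice to draw the trail on the well side of the band are illustrative assumptions, not the behaviour of `gel_generator.py`.

```python
import math

def lane_profile(length_px, band_center_px, band_sigma_px,
                 smear_score, smear_decay_px=15.0):
    """Intensity (0-1) per pixel row: Gaussian band plus exponential smear trail."""
    profile = []
    for y in range(length_px):
        # Gaussian band centred on the predicted migration position.
        band = math.exp(-0.5 * ((y - band_center_px) / band_sigma_px) ** 2)
        # Trail behind the band (toward the well, y < centre), decaying with distance.
        trail = 0.0
        if y < band_center_px:
            trail = smear_score * math.exp(-(band_center_px - y) / smear_decay_px)
        profile.append(min(1.0, band + trail))
    return profile

clean = lane_profile(200, band_center_px=100, band_sigma_px=4.0, smear_score=0.0)
smeared = lane_profile(200, band_center_px=100, band_sigma_px=4.0, smear_score=0.6)
print(f"20 px behind the band: clean={clean[80]:.3f}, smeared={smeared[80]:.3f}")
```

A full renderer would repeat such a profile across the lane width with a horizontal Gaussian, then add noise and the background gradient.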

Phase 5: Failure Diagnosis (`failure_engine.py`)

10 failure modes with confidence scoring:

OVERLOADED_SAMPLE
VOLTAGE_TOO_HIGH
VOLTAGE_TOO_LOW
POOR_GEL_CONCENTRATION
SMEARING_OVERCONCENTRATION
INCOMPLETE_SEPARATION
RUN_TOO_SHORT
RUN_TOO_LONG
BAND_NEAR_STACKING
OPTIMAL (no failure detected)
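A rule-based diagnosis with confidence scoring might look like the sketch below. Only the mode names come from this README; every threshold, the linear confidence scaling, and the `diagnose` signature are hypothetical and chosen purely for illustration.

```python
def diagnose(voltage, run_time, smearing_score, band_rf):
    """Return (mode, confidence) pairs, highest confidence first."""
    findings = []

    def add(mode, excess, scale):
        # Confidence grows linearly with how far a value is past its threshold.
        if excess > 0:
            findings.append((mode, round(min(1.0, excess / scale), 2)))

    # All thresholds below are placeholders, not values from failure_engine.py.
    add("VOLTAGE_TOO_HIGH", voltage - 150.0, 100.0)
    add("VOLTAGE_TOO_LOW", 60.0 - voltage, 60.0)
    add("RUN_TOO_SHORT", 30.0 - run_time, 30.0)
    add("SMEARING_OVERCONCENTRATION", smearing_score - 0.5, 0.5)
    add("BAND_NEAR_STACKING", 0.1 - band_rf, 0.1)

    if not findings:
        return [("OPTIMAL", 1.0)]
    return sorted(findings, key=lambda f: f[1], reverse=True)

print(diagnose(voltage=250.0, run_time=60.0, smearing_score=0.75, band_rf=0.5))
```

Returning every triggered rule with its own confidence, rather than a single label, lets a dashboard show several suspected causes side by side.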

Phase 6: Parameter Optimizer (`param_optimizer.py`)

Runs a coarse grid search (N values per parameter, N³ combinations) followed by a fine local search around the best grid point.

Objective function:

50% minimum band gap
20% mean separation quality
−15% smearing penalty
−15% runoff penalty
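The weighted objective and the two-stage search can be sketched as below. The weights are taken from the list above; everything else is assumed: that each term is already normalized to 0-1, the function names, the step-halving local search, and the toy score function standing in for a real simulation.

```python
from itertools import product

def objective(min_band_gap, mean_quality, smearing, runoff):
    """Weighted score; all four terms are assumed to be normalized to 0-1."""
    return (0.50 * min_band_gap + 0.20 * mean_quality
            - 0.15 * smearing - 0.15 * runoff)

def linspace(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def coarse_then_fine(score_fn, bounds, n=5, fine_steps=20):
    """Coarse grid over all parameters, then coordinate-wise hill climbing."""
    axes = [linspace(lo, hi, n) for lo, hi in bounds]
    best = max(product(*axes), key=score_fn)  # n**3 evaluations for 3 parameters
    steps = [(hi - lo) / (n - 1) / 2 for lo, hi in bounds]
    for _ in range(fine_steps):
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for delta in (-steps[i], steps[i]):
                cand = list(best)
                cand[i] = min(max(cand[i] + delta, lo), hi)  # stay inside bounds
                cand = tuple(cand)
                if score_fn(cand) > score_fn(best):
                    best, improved = cand, True
        if not improved:
            steps = [s / 2 for s in steps]  # shrink the neighbourhood
    return best

# Toy score with a known optimum at (10.3 %, 120 V, 47 min).
def toy_score(p):
    gel, volts, minutes = p
    return -((gel - 10.3) ** 2 + (volts - 120.0) ** 2 / 100 + (minutes - 47.0) ** 2 / 10)

best = coarse_then_fine(toy_score, [(6.0, 18.0), (50.0, 200.0), (20.0, 120.0)])
print("best (gel %, V, min):", tuple(round(v, 2) for v in best))
```

In practice `score_fn` would run the simulator for every target protein at the candidate settings and feed the resulting gap, quality, smearing, and runoff terms into `objective`.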

Phase 7: Dashboard (`app.py`)

6 Gradio tabs:

Predict Outcome — single protein prediction
Virtual Gel — multi-lane gel generation
What-If Simulator — parameter sweep charts
Failure Diagnosis — health gauge + confidence bars
Parameter Optimizer — sensitivity analysis + optimal gel preview
AI Assistant — Claude-powered expert chatbot

Example Usage (Python API)

from src.simulator.physics_engine import SDSPAGEPhysicsEngine

engine = SDSPAGEPhysicsEngine()
result = engine.simulate(
    protein_mw=50.0,      # kDa
    gel_percentage=10.0,  # %
    voltage=100.0,        # V
    run_time=60.0,        # min
    concentration=1.0     # µg/µL
)
print(f"Rf = {result.relative_mobility:.4f}")
print(f"Position = {result.migration_distance:.3f} cm")
print(f"Smearing = {result.smearing_score:.4f}")

from src.optimizer.param_optimizer import ParameterOptimizer

opt = ParameterOptimizer()
result = opt.optimize(target_proteins=[50.0, 60.0])
print(f"Use {result.optimal_gel_pct}% gel at {result.optimal_voltage}V for {result.optimal_runtime} min")

Scientific Basis

| Effect | Model |
|---|---|
| Mobility vs MW | Ferguson equation + Stokes radius |
| Band position | Rf × gel length × run completeness |
| Band broadening | Einstein-Smoluchowski diffusion |
| Smearing | Concentration overload + voltage + time |
| Intensity | Beer-Lambert approximation |
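Two of the relations in this table reduce to one-line formulas, sketched here as a worked example. The band-position product is taken from the table; the broadening uses the one-dimensional Einstein-Smoluchowski result σ² = 2Dt. The function names, the 8 cm gel length, and the diffusion coefficient are placeholders, and run completeness is treated as a given fraction between 0 and 1 because the README does not define how it is computed.

```python
import math

def band_position_cm(rf, gel_length_cm, run_completeness):
    """Band position = Rf × gel length × run completeness."""
    return rf * gel_length_cm * run_completeness

def diffusion_sigma_cm(diffusion_cm2_per_s, run_time_min):
    """Standard deviation of a diffusing band: sigma = sqrt(2 * D * t)."""
    return math.sqrt(2.0 * diffusion_cm2_per_s * run_time_min * 60.0)

position = band_position_cm(rf=0.5, gel_length_cm=8.0, run_completeness=1.0)
sigma = diffusion_sigma_cm(diffusion_cm2_per_s=1e-7, run_time_min=60.0)
print(f"Band at {position:.2f} cm, diffusion sigma = {sigma * 10:.3f} mm")
```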

Requirements

Python 3.9+
4 GB RAM (8 GB recommended for training)
No GPU required — runs on laptop CPU

License

MIT License — research use encouraged.
