Merged
2 changes: 2 additions & 0 deletions .github/workflows/release.yml
@@ -35,6 +35,8 @@ jobs:

publish-pypi:
name: Publish to PyPI (OIDC)
+# Disabled until PyPI name is finalized (audit 2026-05-18: name squat detected)
+if: false
needs: build
runs-on: ubuntu-latest
environment:
3 changes: 2 additions & 1 deletion README.md
@@ -7,6 +7,7 @@
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://pypi.org/project/dose/)

+> ⚠️ **PyPI name notice**: A package named `dose` already exists on PyPI from a different author. **Do NOT** run `pip install dose`. This project is distributed via GitHub source only until a unique PyPI name is chosen. See https://github.com/hinanohart/dose#installation for the correct install method.
`dose` measures how much the helpful and harmful capability subspaces of a language model overlap — and how that overlap changes as intervention strength varies. The core metric, **PSI (Pharmakon Separability Index)**, quantifies whether a model's representations treat safe and unsafe behaviors as geometrically separable directions.

Urbina et al. (2022) showed that the same AI system used to discover therapeutics can be redirected to generate chemical weapons with minimal effort — the capability is not separated, only constrained by convention. `dose` extends this framing to the activation geometry of language models and measures it quantitatively.
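
The README excerpt above describes measuring how much two capability subspaces overlap, but this diff does not show how PSI is defined. As a generic illustration only (not the project's PSI formula), one common way to score overlap between two subspaces is the mean squared cosine of their principal angles, which equals the squared Frobenius norm of the product of their orthonormal bases divided by the smaller dimension. The function names and the toy vectors below are hypothetical.

```python
import math

Vector = list[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def orthonormalize(vectors: list[Vector]) -> list[Vector]:
    """Modified Gram-Schmidt; linearly dependent vectors are dropped."""
    basis: list[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis


def subspace_overlap(a: list[Vector], b: list[Vector]) -> float:
    """Mean squared cosine of the principal angles between span(a) and span(b).

    Returns 1.0 when the smaller subspace lies inside the larger one
    and 0.0 when the two subspaces are orthogonal.
    NOTE: illustrative metric, assumed here; not the PSI definition.
    """
    qa, qb = orthonormalize(a), orthonormalize(b)
    if not qa or not qb:
        raise ValueError("each subspace needs at least one non-zero vector")
    total = sum(dot(u, v) ** 2 for u in qa for v in qb)
    return total / min(len(qa), len(qb))


# Hypothetical toy directions in a 4-dimensional activation space.
helpful = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
harmful = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
overlap = subspace_overlap(helpful, harmful)
```

In this toy case the two subspaces share exactly one of their two directions, so the score sits halfway between fully separable and fully entangled.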
@@ -29,7 +30,7 @@ The H2 auto-decision logic (`orchestrator._auto_decide_h2`) and the reproducibil
## Installation

```bash
-pip install dose
+pip install dose  # do not run, see above
# GPU (recommended for full runs):
pip install dose torch --index-url https://download.pytorch.org/whl/cu121
# With Gradio demo: