dx_chat_entropy

dx_chat_entropy is a Python-first research repository for estimating, auditing, and comparing likelihood ratios (LRs) in clinical diagnostic-reasoning tasks. It supports transcript-to-assessment workflows, pairwise differential LR estimation, and one-vs-rest LR estimation with a Bayes-coherent projection step.

The manuscript associated with this work is under journal review. It is not accepted or published, and there is no public article DOI, PMID, PMCID, volume, issue, or page/article number to cite yet. Until a public scholarly record exists, cite the repository software and the commit or release used.

Start Here

Prerequisites:

Python 3.11+
uv
OPENAI_API_KEY in your environment or .env for model-backed runs
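
Before a model-backed run, it can help to confirm that the key is actually visible to Python. The sketch below checks the process environment first and then falls back to a simple KEY=VALUE parse of a .env file. The helper name `find_api_key` and the parsing rules are illustrative assumptions, not part of this repository, which may load .env differently.

```python
import os
from pathlib import Path


def find_api_key(name="OPENAI_API_KEY", env_file=".env", environ=None):
    """Return the key from the environment, else from a simple KEY=VALUE .env file.

    Hypothetical helper for illustration; the repository may load .env differently.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value:
        return value
    path = Path(env_file)
    if not path.is_file():
        return None
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            return val.strip().strip('"').strip("'") or None
    return None


if __name__ == "__main__":
    # Report presence only; never print the secret itself.
    print("OPENAI_API_KEY found" if find_api_key() else "OPENAI_API_KEY missing")
```

The check reports only whether a key was found, so it is safe to run in shared logs or notebooks.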

Initial setup:

make uv-sync-notebooks
make notebook-kernel
make fmt
make lint
make test
make audit

If you use VS Code notebooks, select the kernel named Python (dx-chat-entropy). Most notebook import problems in this repo are kernel-selection problems rather than missing packages.

Project Status

This public repository contains research software, public workflow inputs, and generated aggregate/model-output artifacts needed to inspect the current analysis workflow. It does not contain an accepted manuscript, a public preprint, publisher text, private reviewer materials, API keys, or local machine state.

The internal preprint/manuscript draft is intentionally not mirrored here. When a public preprint, accepted manuscript, or final article appears, update README.md, llms.txt, CITATION.cff, and GitHub metadata in the same pull request.

Authors, Affiliations, Funding, And COI

Maintainer and corresponding repository contact:

Brian W. Locke, MD; ORCID: 0000-0002-3588-5238; GitHub: @reblocke

Author order, affiliations, funding, acknowledgments, and conflicts of interest should be taken from the eventual public manuscript record when it exists. Do not infer or publish unverified publication metadata from private drafts.

Repository Shape

config/: scenario registries and workflow configuration
data/raw/: active public source spreadsheets, transcript PDFs, and assessment templates
data/processed/: generated manifests, intermediate workbooks, and model outputs retained for review of the current workflow
notebooks/: interactive entry points, QA notebooks, and older analysis notebooks
scripts/: canonical batch runtimes, audit tools, and packaging tools
src/dx_chat_entropy/: shared parsing, runtime, audit, and bundle logic
docs/: specifications, design decisions, pipeline notes, and data-management policy
archive/: historical source code, run notes, and legacy provenance material that is not part of the active workflow unless explicitly named by a script

In general, the scripts are the canonical batch entry points. The notebooks are for interactive execution, QA, or older workflows that are still kept for reference.

Data And Privacy Boundaries

Committed active inputs are clinical-reasoning scenarios, LR matrices, assessment templates, and transcript artifacts used by the current workflow. Do not add private manuscript drafts, private reviewer materials, personal correspondence, API outputs containing secrets, local machine state, or third-party/publisher PDFs.

For machine-readable variable and artifact documentation, see:

data_dictionary.md
data_dictionary.csv
docs/SPECIFICATION.md
docs/DATA_MANAGEMENT.md

If a raw source has a defect, do not edit it in place. Preserve the raw file and correct the issue in code or in a generated output layer.
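
A minimal sketch of that rule, assuming a CSV source for simplicity (the active raw inputs here are mostly spreadsheets and PDFs). The function name `write_corrected_copy`, the corrections mapping, and the paths a caller would pass are all hypothetical. The point is only that the raw file is read, never written, and the fix lands in a generated layer.

```python
import csv
from pathlib import Path


def write_corrected_copy(raw_path, processed_path, corrections):
    """Copy a raw CSV to a generated layer, applying cell-level corrections.

    `corrections` maps (row_index, column_name) -> corrected value.
    The raw file is opened read-only and is never modified.
    Hypothetical helper for illustration; not part of the repository.
    """
    raw_path, processed_path = Path(raw_path), Path(processed_path)
    if raw_path.resolve() == processed_path.resolve():
        raise ValueError("refusing to overwrite the raw source in place")
    with raw_path.open(newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        rows = list(reader)
    # Apply each documented correction to the in-memory copy only.
    for (row_index, column), value in corrections.items():
        rows[row_index][column] = value
    processed_path.parent.mkdir(parents=True, exist_ok=True)
    with processed_path.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return processed_path
```

Keeping the corrections as explicit data, rather than hand edits, leaves a reviewable record of what changed and why.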

Choose A Workflow

If you are not sure which path you need:

use the assessment workflow to turn transcripts into assessment workbooks,
use the differential workflow when each finding should compare two diagnoses at a time,
use the one-vs-rest workflow when you want a full diagnosis-by-finding LR table plus a coherent version of that table.

1. Assessment Feature + LR Labeling

Use this when you want to extract findings from transcript PDFs and populate the assessment workbook.

Run order:

notebooks/10_assessment_extract_features.ipynb
notebooks/11_assessment_estimate_lrs.ipynb

Inputs:

data/raw/chatbot_transcripts/*.pdf
data/raw/assessment_templates/asssessment_template_new.xlsx

Outputs:

data/processed/assessments/answers_*.xlsx
data/processed/assessments/completed_lrs.xlsx

2. Differential LR Pipeline

Use this when each scenario should be broken into all diagnosis-pair comparisons and each finding should receive a differential LR for that pair.
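
To make "all diagnosis-pair comparisons" concrete, the sketch below enumerates pairs for one scenario. It assumes pairs are unordered (A vs B is the same comparison as B vs A), which this README does not state. The diagnosis names and the helper `diagnosis_pairs` are hypothetical.

```python
from itertools import combinations


def diagnosis_pairs(diagnoses):
    """Return every unordered pair of distinct diagnoses, in input order."""
    unique = list(dict.fromkeys(diagnoses))  # drop duplicates, keep order
    return list(combinations(unique, 2))


# Hypothetical scenario with four candidate diagnoses.
pairs = diagnosis_pairs(
    ["pneumonia", "pulmonary embolism", "heart failure", "COPD exacerbation"]
)
for first, second in pairs:
    print(f"{first} vs {second}")
```

Under this assumption a scenario with n diagnoses yields n(n-1)/2 pairs, so the four diagnoses above give six comparisons, and each finding would need one differential LR per pair.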

Run order:

notebooks/20_differential_build_inputs.ipynb
scripts/run_differential_batch.py or notebooks/21_differential_estimate_lrs.ipynb
scripts/audit_differential_outputs.py

Example run:

DX_MODEL_ID=gpt-5.3-chat-latest \
DX_RESUME_MODE=skip_passing \
uv run --group notebooks python scripts/run_differential_batch.py

uv run --group notebooks python scripts/audit_differential_outputs.py \
  --manifest data/processed/lr_differential/manifests/pairs_manifest.csv \
  --outputs-root data/processed/lr_differential/outputs_by_model \
  --summary-out data/processed/lr_differential/manifests/quality_summary.csv \
  --invalid-out data/processed/lr_differential/manifests/invalid_rows.csv

Inputs:

canonical raw LR matrices under data/raw/lr_matrices/
scenario registry in config/lr_differential_scenarios.yaml

Outputs:

pairwise input workbooks in data/processed/lr_differential/inputs/
model-scoped filled outputs in data/processed/lr_differential/outputs_by_model/
manifests, ledgers, logs, and audit CSVs in data/processed/lr_differential/manifests/

Notes:

The model ID in the example is illustrative; swap in the model you actually want to run.
notebooks/21_differential_estimate_lrs.ipynb is the interactive wrapper around the same runtime logic used by the script.
22_differential_prepare_inputs_qa.ipynb is for inspection and QA, not the canonical transformation step.

3. One-vs-Rest LR + Coherence Pipeline

Use this when you want a full LR table for each diagnosis versus all others in a scenario, followed by a coherence step that converts independently estimated one-vs-rest LRs into a Bayes-coherent multiclass version.

Run order:

scripts/build_one_vs_rest_inputs.py
scripts/run_one_vs_rest_batch.py
scripts/project_one_vs_rest_coherent_lrs.py
scripts/audit_one_vs_rest_outputs.py

Example run:

uv run --group notebooks python scripts/build_one_vs_rest_inputs.py \
  --config config/lr_differential_scenarios.yaml

uv run --group notebooks python scripts/run_one_vs_rest_batch.py \
  --manifest data/processed/lr_one_vs_rest/manifests/inputs_manifest.csv \
  --model-id gpt-5.3-chat-latest

uv run --group notebooks python scripts/project_one_vs_rest_coherent_lrs.py \
  --model-id gpt-5.3-chat-latest

# audit raw outputs
uv run --group notebooks python scripts/audit_one_vs_rest_outputs.py \
  --manifest data/processed/lr_one_vs_rest/manifests/inputs_manifest.csv \
  --outputs-root data/processed/lr_one_vs_rest/outputs_by_model \
  --summary-out data/processed/lr_one_vs_rest/manifests/quality_summary_gpt-5.3-chat-latest.csv \
  --invalid-out data/processed/lr_one_vs_rest/manifests/invalid_cells_gpt-5.3-chat-latest.csv \
  --model-id gpt-5.3-chat-latest

# audit coherent outputs
uv run --group notebooks python scripts/audit_one_vs_rest_outputs.py \
  --manifest data/processed/lr_one_vs_rest/manifests/inputs_manifest.csv \
  --coherence-mode \
  --priors-manifest data/processed/lr_one_vs_rest/manifests/schema_priors.csv \
  --raw-outputs-root data/processed/lr_one_vs_rest/outputs_by_model \
  --outputs-root data/processed/lr_one_vs_rest/coherent_outputs_by_model \
  --summary-out data/processed/lr_one_vs_rest/manifests/coherent_quality_summary_gpt-5.3-chat-latest.csv \
  --invalid-out data/processed/lr_one_vs_rest/manifests/coherent_invalid_cells_gpt-5.3-chat-latest.csv \
  --coherence-summary-out data/processed/lr_one_vs_rest/manifests/coherence_quality_summary_gpt-5.3-chat-latest.csv \
  --coherence-invalid-out data/processed/lr_one_vs_rest/manifests/coherence_invalid_rows_gpt-5.3-chat-latest.csv \
  --model-id gpt-5.3-chat-latest

Inputs:

raw LR matrices in data/raw/lr_matrices/
scenario registry in config/lr_differential_scenarios.yaml

Outputs:

normalized one-vs-rest input workbooks in data/processed/lr_one_vs_rest/inputs/
raw model outputs in data/processed/lr_one_vs_rest/outputs_by_model/
coherent outputs in data/processed/lr_one_vs_rest/coherent_outputs_by_model/
manifests and quality summaries in data/processed/lr_one_vs_rest/manifests/

Notes:

The model ID in the example is illustrative; swap in the model you actually want to run.
The coherence step is a separate local projection stage. It does not overwrite the raw one-vs-rest outputs.
notebooks/32_one_vs_rest_project_coherent_lrs.ipynb is a notebook wrapper for the coherence projection over existing raw outputs.
notebooks/30_one_vs_rest_estimate_lrs.ipynb and notebooks/31_one_vs_rest_compare_lr_estimates.ipynb belong to the older comparison workflow and are not the canonical batch runtime.
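
To illustrate why a coherence step is needed at all: each one-vs-rest LR, combined with a prior, implies a posterior for its diagnosis, and for a single finding those posteriors should sum to 1 across mutually exclusive, exhaustive diagnoses. Independently estimated LRs usually do not satisfy this. The sketch below checks the sum and applies one simple fix, rescaling the implied posteriors and mapping them back to LRs. This normalization is an assumption chosen for illustration; it is not necessarily the projection implemented in scripts/project_one_vs_rest_coherent_lrs.py. The priors and LRs are made-up numbers.

```python
def posterior_from_ovr(prior, lr):
    """Posterior P(D | finding) implied by a prior and a one-vs-rest LR."""
    odds = prior / (1.0 - prior) * lr
    return odds / (1.0 + odds)


def project_coherent(priors, lrs):
    """Rescale implied posteriors to sum to 1, then map back to one-vs-rest LRs.

    Illustrative normalization only; not necessarily the repository's projection.
    Assumes at least two diagnoses, priors strictly between 0 and 1, and
    positive finite LRs.
    """
    posts = [posterior_from_ovr(p, lr) for p, lr in zip(priors, lrs)]
    total = sum(posts)
    coherent_posts = [post / total for post in posts]
    # Invert the odds form of Bayes' rule: LR = posterior odds / prior odds.
    coherent_lrs = [
        (post / (1.0 - post)) / (p / (1.0 - p))
        for p, post in zip(priors, coherent_posts)
    ]
    return coherent_posts, coherent_lrs


priors = [0.5, 0.3, 0.2]   # hypothetical scenario priors (sum to 1)
raw_lrs = [2.0, 1.0, 1.0]  # hypothetical independently estimated LRs
raw_total = sum(posterior_from_ovr(p, lr) for p, lr in zip(priors, raw_lrs))
coherent_posts, coherent_lrs = project_coherent(priors, raw_lrs)
print(f"raw posterior total: {raw_total:.3f}")
print("coherent LRs:", [round(lr, 3) for lr in coherent_lrs])
```

In this example the raw posteriors total about 1.167, so the three LRs cannot all hold at once. After rescaling, the first LR drops below 2 and the other two fall below 1, which reflects that evidence favoring one diagnosis must count against the rest.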

Review Bundles

For external review or handoff:

Differential pipeline: scripts/run_differential_and_package.sh
One-vs-rest pipeline: scripts/package_one_vs_rest_review_bundle.py

These packages are meant to ship the relevant code, manifests, and outputs for a workflow. They are not full-repository snapshots and should not include private manuscript drafts, API keys, or local system state.

Dependencies

Core package dependencies are declared in pyproject.toml. Use the notebooks dependency group for model-backed notebook and batch runs.

| Use | Command | Notes |
| --- | --- | --- |
| Core development | `uv sync` | Installs package, tests, and lint tooling. |
| Notebook/model workflows | `make uv-sync-notebooks` | Installs the `notebooks` dependency group. |
| Notebook kernel | `make notebook-kernel` | Registers `Python (dx-chat-entropy)`. |
| Repository checks | `make fmt && make lint && make test && make audit` | Run before PRs. |

Documentation Map

Use the document that matches the question:

llms.txt: compact machine-readable index for LLMs and search systems
README.md: what this repo does, which workflow to choose, and how to run it
data_dictionary.md and data_dictionary.csv: source, derived, and output artifact dictionary
docs/SPECIFICATION.md: detailed pipeline contracts, artifact paths, manifests, and review-bundle scope
docs/PIPELINES.md: short index of current pipelines and notebook/script order
docs/DATA_MANAGEMENT.md: active-vs-archive data placement and provenance rules
docs/DECISIONS.md: non-obvious design and policy decisions
AGENTS.md: project-specific coding-agent instructions

Citation

Until a public paper, preprint, or conference record exists, cite the repository software:

Locke BW. dx_chat_entropy: Clinical reasoning entropy and likelihood-ratio workflows.
GitHub. https://github.com/reblocke/dx_chat_entropy. Commit or release used.

Machine-readable citation metadata are in CITATION.cff. Do not add a preferred-citation for the manuscript until the public scholarly record exists.

License

Repository code is released under the MIT License. Third-party materials, clinical source documents, private drafts, publisher artifacts, and externally supplied data remain under their original terms and should not be copied into the public branch unless their public license and provenance are documented.

Common Problems

Notebook imports fail even after syncing dependencies

Usually the notebook is attached to the wrong interpreter.

Fix:

Run make uv-sync-notebooks
Run make notebook-kernel
In VS Code, switch to Python (dx-chat-entropy)
Restart the kernel
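
To confirm which interpreter a notebook is attached to, a cell like the following can be run before and after switching kernels. It uses only the standard library. The import name `dx_chat_entropy` is assumed from the src/dx_chat_entropy/ directory, and `kernel_report` is a hypothetical helper.

```python
import importlib.util
import sys


def kernel_report(package="dx_chat_entropy"):
    """Report which interpreter is running and whether `package` is importable."""
    return {
        "executable": sys.executable,
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "package_importable": importlib.util.find_spec(package) is not None,
    }


print(kernel_report())
```

If `executable` points outside the project environment, or `package_importable` is false, the notebook is most likely still on the wrong kernel.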

A script or notebook is still reading from `archive/`

Treat that as a migration or legacy-path issue. Active workflows should read from data/raw/ and write to data/processed/ unless the workflow is explicitly labeled archival.
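
One way to locate such references is a plain text search over the code directories. The sketch below is a hypothetical helper, not a repository tool. It assumes that a literal `archive/` substring is a good enough signal, so it will also match comments and docstrings.

```python
from pathlib import Path


def find_archive_references(root, patterns=("*.py",), needle="archive/"):
    """Return (path, line_number, line) for each line under `root` containing `needle`."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = []
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the scripts directory; pass other roots or patterns as needed.
    for path, number, line in find_archive_references("scripts"):
        print(f"{path}:{number}: {line}")
```

Passing `patterns=("*.py", "*.ipynb")` extends the search to notebooks, since notebook files store cell source as text.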

Contact

Use GitHub issues or pull requests for repository-specific questions. For publication or data-access questions, contact Brian W. Locke through the contact route listed on his public GitHub profile or CV.
