Feature Request
Remove the internal/proprietary Oak Health Insurance benchmark data from GitHub — including from git history — and replace it with an externally-sourced (public/shareable) dataset.
Motivation / Problem
The Oak Health Insurance benchmark (benchmarks/oak_health_insurance/) currently ships internal data (e.g. oak_data.json, oak_health_test_suite_v1.json, oak_policies.py and the oak-* policy definitions) that should not live in this repository or its history. This needs to be scrubbed from the git history (not just removed in a new commit) and replaced with an equivalent dataset sourced externally/publicly so the benchmark continues to function for all contributors.
Use Case
- Contributors without access to the internal Oak data should still be able to clone the repo, run benchmarks/oak_health_insurance/eval.sh, and reproduce Oak benchmark results using the external dataset.
- The repo can be shared/open-sourced without exposing internal/proprietary content, including in its commit history.
Proposed Solution
- Identify all internal Oak data files and references (oak_data.json, oak_health_test_suite_v1.json, oak_policies.py, and any other internal oak-* content under benchmarks/oak_health_insurance/).
- Source or construct an equivalent external/public dataset that exercises the same capabilities, and swap it in as the benchmark's data source (update loaders/config/registry as needed so eval.sh/compare.sh keep working).
- Rewrite git history to remove the internal data (e.g. via git filter-repo or BFG Repo-Cleaner). This is a destructive, history-rewriting operation that requires coordination (force-push, all clones/forks need to be re-fetched), so plan and communicate accordingly.
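The first and third steps above could be supported by a small helper along these lines. This is a minimal sketch, not the project's tooling. The function names `find_references` and `filter_repo_command` are hypothetical, the file list is only the one named in this issue, and it assumes the files have always lived under benchmarks/oak_health_insurance/ (copies that were ever committed under another path would need their own `--path` entries).

```python
from pathlib import Path

# File names taken from this issue; extend the tuple as the audit finds more.
INTERNAL_FILES = ("oak_data.json", "oak_health_test_suite_v1.json", "oak_policies.py")
# Assumption: the internal files only ever existed under this directory.
BENCH_DIR = "benchmarks/oak_health_insurance"


def find_references(root, names=INTERNAL_FILES):
    """Return sorted (relative_path, line_number, name) for each textual
    mention of an internal file name in the working tree under `root`."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip directories and git's own metadata.
        if not path.is_file() or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in names:
                if name in line:
                    hits.append((rel.as_posix(), lineno, name))
    return sorted(hits)


def filter_repo_command(names=INTERNAL_FILES, bench_dir=BENCH_DIR):
    """Build (but do not run) a git filter-repo invocation that drops
    the listed paths from every commit."""
    cmd = ["git", "filter-repo", "--invert-paths"]
    for name in names:
        cmd += ["--path", f"{bench_dir}/{name}"]
    return cmd


if __name__ == "__main__":
    for rel_path, lineno, name in find_references("."):
        print(f"{rel_path}:{lineno}: mentions {name}")
    print(" ".join(filter_repo_command()))
```

The scan only covers the current working tree, so it helps find loaders and configs that still point at the internal files. It does not prove that history is clean. The generated command is printed rather than executed because the rewrite is destructive; git filter-repo expects to run in a fresh clone, and the force-push and re-clone coordination still has to happen by hand.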
Alternatives Considered
N/A
Priority
High
Additional Context
Relevant files under benchmarks/oak_health_insurance/: oak_data.json, oak_health_test_suite_v1.json, oak_policies.py, eval_bench_sdk.py.