Getting started

Install LabTrust-Gym, run your first benchmarks, and optionally fork and extend it for your organization.

I want to...

| I want to... | First step |
| --- | --- |
| Run benchmarks only | `pip install labtrust-gym[env,plots]`, then `labtrust quick-eval` |
| Add my coordination method (or task) | Extension development and `entry_points`; see `examples/extension_example` for a minimal plugin |
| Fork and customize policy | Forker guide and `labtrust forker-quickstart` |
| Use labtrust-gym as a library | Extension development, `--profile`, and `extension_packages` in a lab profile |
| Run the full security suite | `labtrust run-security-suite`; the full suite needs the `[env]` extra (`.[env]`); pass `--skip-system-level` when that extra is not installed |
| Run the PCS QC-release demo (proof-carrying science) | PCS quickstart: `scripts/setup_pcs_dev.ps1`, then `labtrust run-demo qc-release` |
| Connect the LabTrust Portal (Lovable) to live data | Set `VITE_DATA_BASE_URL` in the portal to this repo's deployed viewer-data URL; see Portal context: Portal live data connection |
| Export a UI bundle (tables + coordination charts) for the portal | `labtrust ui-export --run <dir> --out <zip>`; when the run has coordination pack output, the zip also includes SOTA leaderboards and `coordination/graphs/` HTML charts. See Frontend handoff. |
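
If you script these first steps (for example in CI), a small wrapper can keep the command lines in one place. The sketch below is illustrative only: it assembles the `labtrust` commands exactly as they are written in the table above and runs them only if the executable is on `PATH`. The helper names (`quick_eval_command`, `ui_export_command`, `security_suite_command`, `run_if_available`) are hypothetical and are not part of LabTrust-Gym; the sketch also assumes the commands take no further required arguments.

```python
import shutil
import subprocess
from typing import List, Optional


def quick_eval_command() -> List[str]:
    """Command line for `labtrust quick-eval` (as listed in the table)."""
    return ["labtrust", "quick-eval"]


def ui_export_command(run_dir: str, out_zip: str) -> List[str]:
    """Command line for `labtrust ui-export --run <dir> --out <zip>`."""
    return ["labtrust", "ui-export", "--run", run_dir, "--out", out_zip]


def security_suite_command(env_extra_installed: bool) -> List[str]:
    """Command line for `labtrust run-security-suite`.

    Adds `--skip-system-level` when the [env] extra is not installed,
    following the note in the table.
    """
    cmd = ["labtrust", "run-security-suite"]
    if not env_extra_installed:
        cmd.append("--skip-system-level")
    return cmd


def run_if_available(cmd: List[str]) -> Optional[int]:
    """Run `cmd` if its executable is on PATH.

    Returns the process exit code, or None when the executable is missing
    (for example, when labtrust-gym has not been installed yet).
    """
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_if_available(quick_eval_command())` returns `None` on a machine where `labtrust` is not installed, which makes it easy to tell a missing install apart from a failing run (a non-zero exit code).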

New to the project

| Document | Description |
| --- | --- |
| Installation | Pip install, extras, environment variables, development setup. |
| Build your own agent | Implement an agent and run it with `eval-agent` (5–10 min). |
| Example agents | Reference agents and run commands. |
| Example experiments | Reproducible experiments (trust vs performance). |

Forkers and operators

| Document | Description |
| --- | --- |
| Forker guide | Fork, customize policy, run the full pipeline, add partner overlays and coordination methods. |
| Demo readiness | Prerequisites, Windows notes, and risk-register usage for the three presentation demos. |
| Recommended Windows setup | Path, shell, file-lock mitigation, and locale for Windows-only users. |
| Troubleshooting | Common issues and fixes. |

Next steps