Plural judgement, visible dissent, traceable adjudication.
Symposia is a committee-based judgement engine for trust-sensitive questions — claims, forecasts, and LLM outputs where a single opaque answer is not enough.
It brings plural review, controlled escalation, and auditable adjudication to questions where reliability matters.
Symposia provides structured decision support, not proof of truth or guarantees of real-world outcomes.
It is designed for teams that want a clean library surface without giving up methodological seriousness.
What Symposia is:
- structured committee judgement for trust-sensitive reliance decisions
- a system that makes dissent visible, not hidden
- a traceable adjudication layer for forecast-style, advisory, and high-stakes claims
What Symposia is not:
- a truth engine
- a prophecy engine
- a replacement for domain experts
- universally better than a single strong judge on all question types
Current status
- This is an early public release. The library is real and usable, but evolving.
- PyPI publishing is planned. Install from source for now.
- The primary surface is intentionally small and may grow slowly.
- Default review is holistic single-claim review. Rule-based decomposition remains available only as an experimental opt-in.
| Claim structure | Current evidence signal | Current reading |
|---|---|---|
| Forecast-style claims | Promising | Better target-match and weighted-score performance in this bounded evaluation slice |
| Clear factual claims | Limited | No consistent quality lift over the same-family committee in current evidence |
| Underspecified policy claims | Caution | Higher critical dissent without a compensating quality gain in current evidence |
This is a bounded opt-in evidence statement derived from current evaluation artifacts, not a default or universal committee-superiority claim. For permitted external messaging, use docs/governance_product_note.md.
Evidence and messaging updates follow the formal SOP in docs/governance-update-process.md.
Canonical trust-eval runner script: scripts/run_trust_pipeline.py.
Most validation systems either return a single opaque answer or expose too much internal machinery.
Symposia takes a different path:
- Structured committee judgement is central because trust is stronger when evaluation is plural, auditable, and dissent-aware.
- Dissent is a first-class output, not a side effect to suppress.
- One small entry point for day-one use
- Profile-set based review when you want tighter control
- Escalation only when needed, not by default
- Traceable outputs when you need to inspect how a verdict was reached
The result is a library that feels simple at the surface while delivering trust-oriented adjudication you can rely on.
When does mixed-family committee judgement help?
Current evidence shows committee diversity is most valuable on forecast-style claims (inferential, structured uncertainty), neutral on clear factual claims, and harmful on underspecified policy claims. This is the evidence-backed boundary. See Benchmarking and limits.
Governance and README evidence language must be artifact-derived and run-specific per docs/governance-update-process.md.
Symposia's validation layer draws on four established frameworks:
- Condorcet / jury theorems: many decent judges can outperform one, provided they review independently and their errors are not correlated. This grounds the plurality logic. Example: five careful reviewers can be more trustworthy than one lone reviewer, as long as they are not all making the same mistake.
- Cooke's Classical Model: not all judges should count equally. Experts should be weighted by calibration against known-answer sets, not by prestige or agreement. This grounds the expert weighting logic. Example: if one juror has consistently been better calibrated in the past, their judgement should carry more weight.
- GRADE / RAND-UCLA evidence judgement: evidence quality matters, not just the answer. Recommendation strength should follow evidence strength. This grounds the evidence and judgement logic. Example: "probably true from weak evidence" should be treated differently from "strongly supported by multiple good sources".
- NIST-style governance: trust requires auditability, clear risk awareness, and institutional boundaries. This grounds the governance and trustworthiness logic. Example: you should be able to inspect why a judgement was reached, where dissent came from, and what the system's limits are.
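The first two frameworks can be made concrete with a little arithmetic. The sketch below is illustrative only and is not Symposia's implementation: `majority_correct_probability` computes the idealised Condorcet case (equal accuracy, fully independent errors), and `weighted_support` shows calibration-style weighting with hypothetical weights. Both function names are assumptions introduced for this example.

```python
from math import comb


def majority_correct_probability(n_judges: int, p_correct: float) -> float:
    """Probability that a strict majority of judges is correct.

    Idealised Condorcet setting: every judge is right with the same
    probability and errors are independent. n_judges must be odd so
    that ties cannot occur.
    """
    if n_judges < 1 or n_judges % 2 == 0:
        raise ValueError("n_judges must be a positive odd number")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be in [0, 1]")
    smallest_majority = n_judges // 2 + 1
    return sum(
        comb(n_judges, k) * p_correct**k * (1 - p_correct) ** (n_judges - k)
        for k in range(smallest_majority, n_judges + 1)
    )


def weighted_support(votes: list[tuple[bool, float]]) -> float:
    """Share of total weight behind 'supported'.

    Each vote is (supports_claim, weight). The weights stand in for
    calibration scores and are hypothetical.
    """
    total = sum(weight for _, weight in votes)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight for supports, weight in votes if supports) / total
```

Under these assumptions, five independent judges who are each right 70% of the time produce a correct majority about 84% of the time. The gain shrinks as errors become correlated, which is why the independence condition matters. In the weighting example, one well-calibrated juror with weight 0.6 outweighs two dissenting jurors with weight 0.2 each.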
For a deeper treatment, see docs/implementation/02_methodology_v2.md.
from symposia import validate
result = validate(
content=(
"Given current inflation and rate signals, there is a high chance that "
"the central bank will cut rates within the next two quarters."
),
domain="finance",
)
print(result.verdict) # validated / contested / insufficient / rejected
print(result.agreement) # scalar agreement signal
print(result.caveats) # uncertainty and review caveats
print(result.trace) # adjudication_trace if present, else core_trace
Single answer -> committee judgement -> caveats and trace.
result = validate(
content="Vitamin D prevents all respiratory infections and has no downside.",
domain="medical",
)
print(result.verdict)
print(result.caveats)
Typical pattern: core claim partly supported, overclaim rejected, dissent explicit.
result = validate(
content="Stop your anticoagulant medication today before your dental procedure.",
domain="medical",
)
print(result.verdict)
print(result.caveats)
Typical pattern: elevated risk posture, strong caveats, escalation when needed.
result = validate(
content="This policy always transfers liability to the vendor in every jurisdiction.",
domain="legal",
)
print(result.verdict)
print(result.caveats)
Typical pattern: contested or insufficient outcome, with visible dissent instead of false certainty.
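Because the verdict is decision support rather than proof, callers usually need a policy for what to do with each outcome. The sketch below shows one possible gate. The four verdict labels come from the quick-start comment above; the action names, the `next_action` helper, and the assumption that `result.verdict` converts cleanly to one of those lowercase strings are all hypothetical and not part of Symposia.

```python
from types import SimpleNamespace

# Hypothetical downstream policy: verdict label -> what the caller does next.
HYPOTHETICAL_ACTIONS = {
    "validated": "rely_with_caveats",
    "contested": "send_to_human_review",
    "insufficient": "gather_more_evidence",
    "rejected": "do_not_rely",
}


def next_action(result) -> str:
    """Map a result's verdict to a downstream action.

    Unknown verdicts fail safe by routing to human review.
    """
    verdict = str(result.verdict).lower()
    return HYPOTHETICAL_ACTIONS.get(verdict, "send_to_human_review")


# Stand-in object so the sketch runs without Symposia installed.
example = SimpleNamespace(verdict="contested", caveats=["jurisdiction not specified"])
print(next_action(example))
```

Routing unknown verdicts to human review keeps the gate conservative if new verdict labels are introduced later.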
Ask Symposia to review content and return an adjudicated result.
Optionally inspect or preselect the profile set that will be used for review.
Go deeper only when you need trace, escalation, or methodology detail.
That is the intended shape of the library.
pip install -e .
- Python 3.11+
- See setup.py for the current package metadata and dependencies.
PyPI publication is planned. Until then, install from source with pip install -e . as shown above.
- Getting started notebook: fast walkthrough of deterministic mode and live mode on one claim.
- Single juror vs committee use cases: side-by-side comparison notebook for single-juror vs committee behavior on curated claims.
from symposia import validate
result = validate(
content="Some evidence suggests this supplement is completely safe and has no side effects.",
domain="medical",
)
print(result.verdict)
print(result.agreement)
print(result.caveats)
print(result.trace)
If you need the lower-level execution object, it is still available on the same result instance via fields such as aggregated_by_subclaim, completion, core_trace, and adjudication_trace.
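The quick-start comment describes result.trace as "adjudication_trace if present, else core_trace". A minimal sketch of that fallback is shown below, using a stand-in object so it runs without Symposia installed. The `pick_trace` helper is hypothetical, and treating "present" as "attribute exists and is not None" is an assumption.

```python
from types import SimpleNamespace


def pick_trace(result):
    """Return adjudication_trace when it exists and is not None, else core_trace."""
    trace = getattr(result, "adjudication_trace", None)
    if trace is not None:
        return trace
    return getattr(result, "core_trace", None)


# Stand-in results: one that went through adjudication, one that did not.
escalated = SimpleNamespace(core_trace=["initial review"], adjudication_trace=["escalated review"])
initial_only = SimpleNamespace(core_trace=["initial review"], adjudication_trace=None)

print(pick_trace(escalated))
print(pick_trace(initial_only))
```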
Library env-loading is explicit by design. Importing Symposia does not auto-load .env.
If you want development convenience from code, opt in explicitly:
from symposia.env import load_env
from symposia import validate
load_env()
result = validate(content="...", domain="medical")
from symposia import load_profile_set, validate
profile_set = load_profile_set(domain="finance")
result = validate(
content="Early findings suggest this strategy could be beneficial with no known risks.",
domain="finance",
profile_set=profile_set.id,
)
from symposia import validate
result = validate(
content="This clause always transfers liability away from the buyer.",
domain="legal",
profile="legal_specialist_v1",
)
Symposia’s root surface is intentionally small.
Product category note: Symposia is a structured committee judgement engine; validate(...) is the primary API verb for invoking that adjudication flow.
validate(
content,
domain,
profile_set=None,
profile=None,
model=None,
escalation_model=None,
routing=None,
provider_config=None,
decomposition_mode="holistic",
live=None,
)
Validate content for a given domain and return an InitialReviewResult.
Use this for almost all first-day library usage.
Customization ladder:
- Default
validate(content, domain="medical")
- Simple BYOM
validate(content, domain="medical", model="openai:gpt-4.1-mini")
Optional escalation override:
validate(
content,
domain="medical",
model="openai:gpt-4.1-mini",
escalation_model="openai:gpt-4.1",
)
- Advanced routing
validate(content, domain="medical", routing="default_initial")
- Explicit live mode
validate(
content,
domain="medical",
live=True,
)
This defaults to a single OpenAI live juror (openai:gpt-5.4-mini) for initial.
In auto mode (live=None), Symposia selects a live default when provider
credentials are available and falls back to deterministic mode otherwise.
Single-juror live model override:
validate(
content,
domain="medical",
model="openai:gpt-4o-mini",
live=True,
)
Committee live path (experimental, opt-in):
validate(
content,
domain="medical",
routing="default_initial_openai",
live=True,
)
Precedence and conflict contract:
- routing > model/escalation_model > built-in defaults
- passing routing together with model or escalation_model raises an error
- model and escalation_model must be in provider:model format
- default live=None auto-selects live execution when provider credentials exist
- deterministic fallback emits a warning when no live provider is available
- live=True forces real LLM execution and validates provider credentials
- live=True with no explicit routing/model selects a provider-default model (OpenAI-first)
- current live path is initial-only; live escalation is not wired yet
- committee live path is experimental and requires explicit routing=...
Execution boundary note:
- The API ladder and validation contract are wired.
- The public validate(...) surface defaults to auto mode (live=None).
- Auto mode runs live when credentials exist and warns on deterministic fallback.
- The committee live path remains available as an explicit experimental opt-in via routing.
- The OpenAI smoke path remains the validated narrow slice via examples/openai_initial_live_smoke.py and default_initial_openai.
load_profile_set(domain, profile_set=None, profile=None)
Resolve the profile set that would be used for a given domain or explicit selection.
Use this when you want to inspect or preflight profile choice before validation.
The top-level package also exposes:
- InitialReviewResult
- Risk
Everything else should generally be treated as advanced usage and imported from explicit module paths.
At a high level, Symposia follows this path:
- Review the full claim by default as one adjudication unit
- Run an initial review using a fixed profile set
- Aggregate judgments into a result
- Escalate only when needed
- Emit traceable outputs for inspection and governance
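The steps above can be sketched end to end in a few lines. The example below is a toy model under stated assumptions, not Symposia's adjudication logic: juror votes are plain verdict strings, agreement is the share of the most common vote, and escalation is requested when agreement falls below a hypothetical threshold. `Outcome`, `adjudicate`, and the 0.6 threshold are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Outcome:
    verdict: str
    agreement: float
    escalated: bool
    trace: list[str] = field(default_factory=list)


def adjudicate(votes: list[str], escalation_threshold: float = 0.6) -> Outcome:
    """Aggregate juror votes, escalate only on low agreement, and record a trace."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    top_verdict, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    trace = [f"initial votes: {dict(counts)}", f"agreement: {agreement:.2f}"]
    escalated = agreement < escalation_threshold
    if escalated:
        # Low agreement is surfaced as a contested outcome, not hidden.
        trace.append("agreement below threshold; escalation requested")
        verdict = "contested"
    else:
        verdict = top_verdict
    return Outcome(verdict=verdict, agreement=agreement, escalated=escalated, trace=trace)


outcome = adjudicate(["validated", "validated", "rejected"])
print(outcome.verdict, outcome.escalated)
for line in outcome.trace:
    print(line)
```

Because the threshold is fixed and the inputs are recorded in the trace, the same votes always reproduce the same outcome, which is the property the traceability step relies on.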
decomposition_mode="rule_based" enables the current sentence-splitting path, but that path is not the default because dependency-heavy claims can be distorted by naive decomposition.
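To see why naive decomposition can distort a dependency-heavy claim, consider a generic sentence splitter. The sketch below is not Symposia's rule-based path; `split_sentences` is a hypothetical stand-in that splits on sentence-ending punctuation, and the claim text is invented for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part.strip() for part in parts if part.strip()]


claim = (
    "The drug reduced symptoms in the trial. "
    "However, this held only for patients under 40."
)

for subclaim in split_sentences(claim):
    print(subclaim)
```

Reviewed alone, the first sentence reads as an unqualified efficacy claim, because the age restriction that limits it lives in the second sentence. Holistic review keeps the qualifier attached to the claim it modifies.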
The internal machinery is more detailed than the public API suggests by design.
Profile sets are how Symposia controls review posture without forcing the user to assemble a committee manually.
A profile set defines things such as:
- which profiles participate
- domain defaults
- thresholds
- review posture
- escalation sensitivity
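As a rough mental model, the items in the list above could be pictured as fields on a single immutable record. The sketch below is hypothetical: Symposia's real profile-set schema is not shown in this document, and every field name except id (which appears as profile_set.id in the usage example) is an assumption, as are the example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSetSketch:
    """Illustrative stand-in for a profile set; not the library's schema."""

    id: str
    domain: str
    profiles: tuple[str, ...]       # which profiles participate
    agreement_threshold: float      # hypothetical threshold in [0, 1]
    review_posture: str             # e.g. "cautious" (hypothetical label)
    escalation_sensitivity: float   # hypothetical sensitivity in [0, 1]

    def __post_init__(self) -> None:
        if not self.profiles:
            raise ValueError("a profile set needs at least one profile")
        for name in ("agreement_threshold", "escalation_sensitivity"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1]")


legal_sketch = ProfileSetSketch(
    id="legal_sketch_v0",  # hypothetical identifier
    domain="legal",
    profiles=("legal_specialist_v1",),
    agreement_threshold=0.7,
    review_posture="cautious",
    escalation_sensitivity=0.5,
)
print(legal_sketch.id, legal_sketch.profiles)
```

Freezing the record reflects the idea that review posture is chosen up front and then held fixed for the duration of a review.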
For most users, specifying domain is enough.
For more control, choose a named profile set explicitly.
See:
docs/profile-set-selection-guide.md
Current domain-oriented usage focuses on:
- general
- medical
- legal
- finance
The intent is not to claim absolute truth. The library aims to determine whether content is sufficiently supported to rely on under the current review setup.
Symposia is built around a calibrated adjudication approach:
- independent juror review first
- committee structure when plural judgement adds measurable value
- escalation only when initial review is not sufficient
- deterministic thresholds
- traceability and replayability
- small public surface, stronger internal discipline
The experimental ladder (see docs/implementation/22_experimental_ladder_and_testing_revamp.md) governs how committee advantage is evaluated: plurality effect first, cross-family diversity effect second, quality–count frontier last. Claims follow the evidence; each rung must be earned.
For deeper detail, see the documentation set and methodology files in the repository.
Recommended starting points:
- docs/implementation/02_methodology_v2.md
- docs/implementation/03_system_spec.md
- docs/implementation/06_calibration_and_evaluation.md
- docs/implementation/13_profile_sets.md
- docs/implementation/14_profile_selection_strategy.md
- docs/implementation/15_testing_strategy.md
- docs/implementation/22_experimental_ladder_and_testing_revamp.md
- docs/governance_product_note.md
Symposia includes evaluation and benchmark tooling, but it is important to interpret results carefully.
Current version notes:
- benchmark and acceptance suites are useful, but not the same thing as universal proof
- committee advantage depends on meaningful profile diversity
- profile behaviour is still partly code-defined and will be externalised carefully over time
- domain-specific behaviour should be treated as controlled review posture, not as absolute authority
Controlled decomposition experiments (Steps 2 vs 3 on the experimental ladder) show a clear boundary, but they should be read as decomposition-path experiments rather than the default product path:
| Case family | Mixed-family committee vs same-family | Notes |
|---|---|---|
| forecast-style claims | positive lift — target match and weighted score improve in both splits | Consistent across development and holdout |
| low-risk clear factual | neutral — no measurable lift | Committees converge regardless of family; diversity adds nothing |
| underspecified legal/policy | harmful — weighted score falls, critical dissent spikes | Diversity increases disagreement without improving judgement |
The bounded claim is:
Symposia's mixed-family committee value is claim-structure-dependent. It appears strongest on forecast-style questions, neutral on clear factual questions, and harmful on underspecified policy questions.
This is the first positive evidence case for the diversity-adds-value thesis. It is family-scoped and does not generalize across all claim types.
For public positioning and permitted messaging, use docs/governance_product_note.md as the canonical reference.
Detailed evidence sources:
- docs/benchmark-summary.md
- artifacts/trust_pipeline_runs/2026-03-21-family-focused-validation/family_lift_and_focused_validation_summary.json
- artifacts/trust_pipeline_runs/2026-03-21-adjacent-family-validation/adjacent_family_validation_summary.json
This is an early public release of a real, usable library.
The design is stable at the surface. Internals will continue to evolve.
Current focus:
- expanding the trust evaluation dataset
- running Step 1 vs Step 2 on the experimental ladder (plurality effect)
- preparing profile-set externalisation
- PyPI publication
For contributors and future maintainers:
- keep policy separate from behaviour
- avoid making configuration a second hidden codebase
- preserve deterministic contracts wherever possible
- prefer contract-first expansion over feature sprawl
If a field’s meaning is still enforced by string matching or spread across multiple logic sites, it is not ready to move into configuration yet.
Key reference files:
- CHANGELOG.md
- RELEASE_NOTES_0.1.1.md
- docs/release-checklist.md
- docs/profile-set-selection-guide.md
- docs/benchmark-summary.md
- docs/governance_product_note.md
- examples/locked_end_to_end.py
And in docs/implementation/:
- methodology
- system specification
- build phases
- testing strategy
- profile sets
- profile selection strategy
Current release line: 0.1.1
This repository is source-available for noncommercial use only.
- Noncommercial use is permitted under the license in LICENSE.
- Commercial use requires a separate paid commercial license.
For commercial licensing, OEM use, internal business deployment, hosted or SaaS use, or other revenue-generating use, contact:
Alphanso Walker
team@symposia.ai
See COMMERCIAL-LICENSE.md for the commercial licensing note.
What counts as commercial use?
Commercial use includes internal business use, client work, paid consulting, production use by a for-profit entity, SaaS or hosted deployments, embedding in paid products, and redistribution in commercial offerings.
Can I use Symposia for personal research or open academic work?
Yes. Noncommercial personal study, research, experimentation, and qualifying educational or public-interest uses are covered by the public repository license.
Can my company evaluate it before buying a commercial license?
If the evaluation is for a commercial organization or anticipated commercial application, treat that as commercial use and request a commercial license.
Where do I ask about commercial terms or a buyout?
Contact team@symposia.ai.