Fix #90: render identity-threat framing in persona/reasoning context #104

Closed
DeveshParagiri wants to merge 2 commits into main from codex/issue-90-identity-threat-framing

@DeveshParagiri
Collaborator

Summary

  • Extend ReasoningContext with identity_threat_summary
  • Add deterministic identity-threat detection in engine context building using scenario text plus agent identity attributes (political, religious, race/ethnicity, gender/sexual identity, parental role, citizenship)
  • Inject a dedicated Identity Relevance prompt section so agents can explicitly reason when an issue feels identity-relevant
  • Add tests covering context construction and prompt inclusion
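
The first and third bullets can be pictured with a minimal sketch. This is an illustration under assumptions, not the PR's code: the `ReasoningContext` below is a hypothetical stand-in with a single extra field (`scenario_text`), and `render_prompt` and the section layout are invented for the example. Only the `identity_threat_summary` field name and the "Identity Relevance" section title come from the summary above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReasoningContext:
    # Hypothetical stand-in: the real class has fields not shown on this page.
    scenario_text: str
    # Field named in the PR summary; None means no identity relevance was detected.
    identity_threat_summary: Optional[str] = None


def render_prompt(context: ReasoningContext) -> str:
    """Build a prompt, adding the Identity Relevance section only when set."""
    sections = [f"Scenario\n{context.scenario_text}"]
    if context.identity_threat_summary:
        sections.append(f"Identity Relevance\n{context.identity_threat_summary}")
    return "\n\n".join(sections)
```

Keeping the field optional means agents with no identity-relevant attributes get the same prompt as before the change.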

Testing

pytest -q tests/test_engine.py::TestTokenAccumulation::test_build_reasoning_context_adds_identity_threat_summary tests/test_reasoning_prompts.py::TestPhaseAPromptFeatures::test_identity_relevance_included

Closes #90


Recreated from closed PR #97 (base branch was deleted)


⚠️ CHANGES REQUESTED - DO NOT MERGE

This PR uses hardcoded scenario keywords for identity detection:

if political_value and scenario_mentions(
    ("liberal", "conservative", "book ban", "school board", ...)
):

Problems:

  1. Not configurable - keywords can't be added or removed without code changes
  2. Scenario-specific leakage - "book ban" and "school board" come from the test scenario
  3. False positives - men/man also matches management, manual, etc.
  4. Not extensible - new identity dimensions require code changes
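
Problem 3 can be reproduced in a few lines. The sketch below assumes `scenario_mentions` does plain substring matching, which the excerpt above does not show; `naive_mentions` and `word_boundary_mentions` are hypothetical helpers written for this illustration. Word-boundary matching only addresses the false positives, not problems 1, 2, or 4.

```python
import re


def naive_mentions(text: str, keywords: tuple[str, ...]) -> bool:
    """Substring match: assumed to mirror the behaviour flagged in the review."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def word_boundary_mentions(text: str, keywords: tuple[str, ...]) -> bool:
    """Match keywords only as whole words or phrases, case-insensitively."""
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


scenario = "Senior management circulated a manual for the new rollout."
keywords = ("men", "man")

print(naive_mentions(scenario, keywords))          # True: "man" is inside "management"
print(word_boundary_mentions(scenario, keywords))  # False: no standalone "man" or "men"
```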

Suggested fix: Add an identity_dimensions field to the scenario spec:

identity_dimensions:
  - dimension: political_orientation
    reason: "The policy is framed along partisan lines"
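
One way the engine could consume such a field is sketched below, assuming the scenario spec has already been parsed into a dict and that agent attributes are keyed by the same dimension names. `IdentityDimension`, `parse_identity_dimensions`, and `build_identity_threat_summary` are hypothetical names for this illustration, not functions from the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentityDimension:
    dimension: str  # agent attribute name, e.g. "political_orientation"
    reason: str     # why the scenario makes this dimension salient


def parse_identity_dimensions(spec: dict) -> list[IdentityDimension]:
    """Read the optional identity_dimensions list from a parsed scenario spec."""
    return [
        IdentityDimension(dimension=entry["dimension"], reason=entry.get("reason", ""))
        for entry in spec.get("identity_dimensions", [])
    ]


def build_identity_threat_summary(spec: dict, agent_attributes: dict) -> Optional[str]:
    """Summarize the declared dimensions that this agent actually has a value for."""
    lines = []
    for dim in parse_identity_dimensions(spec):
        value = agent_attributes.get(dim.dimension)
        if value is None:
            continue  # agent has no value on this dimension, so it is not relevant
        lines.append(f"- {dim.dimension} ({value}): {dim.reason}")
    return "\n".join(lines) if lines else None
```

With this shape, adding or removing a dimension is a scenario-file change rather than a code change, which speaks to problems 1 and 4 in the review.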

See full review: #97 (review)

@DeveshParagiri
Collaborator Author

Superseded by #106, which implements a proper LLM-driven approach to identity_dimensions detection.

@DeveshParagiri deleted the codex/issue-90-identity-threat-framing branch on February 23, 2026 01:54
Linked issue: Identity-threat framing not rendered in persona/reasoning prompts (Tenet 10)