Fix #90: render identity-threat framing in persona/reasoning context (#97)
Conversation
DeveshParagiri
left a comment
Code Review
Verdict: ✅ Ready to merge
Summary
Implements deterministic identity-threat framing per architecture doc Tenet 10 and §Fix 3. Scenarios that threaten group identity now get an "## Identity Relevance" section in prompts.
Identity Dimensions Detected
- Political orientation (liberal, conservative, republican, democrat)
- Religious affiliation (church, mosque, temple, faith)
- Race/ethnicity (racial, ethnic, minority, immigration)
- Gender/sexual identity (LGBT, transgender, pronouns)
- Parent/family role (children, school, curriculum, parental rights)
- Citizenship (immigration, border, deportation)
Edge Cases Handled
| Case | Behavior |
|---|---|
| No identity relevance in scenario | Returns None, no section |
| Agent with no identity attributes | Returns None, no section |
| Sentinel values ("unknown", "none") | Skipped |
| Future timeline events | Excluded from corpus |
Design Note
Keyword-based detection is simple but appropriate for a deterministic, zero-API-cost approach. It may need refinement if false positives become problematic in practice.
No changes required.
DeveshParagiri
left a comment
Code Review
Verdict: ❌ Needs changes - hardcoded keywords are not general-purpose
Problem
The identity-threat detection uses hardcoded keyword lists:
```python
if political_value and scenario_mentions(
    (
        "liberal", "conservative", "left", "right", "republican", "democrat",
        "politic", "ideolog", "culture war", "censorship", "book ban", "school board", " ban ",
    )
):
```

Issues:
- Not configurable - keywords can't be added or removed without code changes
- Scenario-specific leakage - `book ban` and `school board` are clearly from the test scenario
- False positives - `men`/`man` will match `management`, `manual`, `humanity`, etc.
- Not extensible - new identity dimensions require code changes
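The false-positive issue above comes from bare substring matching. A minimal sketch of the problem and a word-boundary variant that avoids it (`scenario_mentions_substring` and `scenario_mentions_word` are illustrative stand-ins, not the PR's actual helper):

```python
import re

# Bare substring matching: "man" fires inside "management" and "manual".
def scenario_mentions_substring(text: str, keywords: tuple[str, ...]) -> bool:
    lower = text.lower()
    return any(k in lower for k in keywords)

# Word-boundary matching: "man" only fires as a standalone word.
def scenario_mentions_word(text: str, keywords: tuple[str, ...]) -> bool:
    lower = text.lower()
    return any(re.search(rf"\b{re.escape(k)}\b", lower) for k in keywords)

text = "The management updated the manual."
print(scenario_mentions_substring(text, ("man",)))  # → True (false positive)
print(scenario_mentions_word(text, ("man",)))       # → False
```

Word boundaries reduce the over-firing, but they still don't address configurability or scenario-specific leakage, which is why the declarative fix below is preferable.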
Suggested Fix
Add an `identity_dimensions` field to the scenario spec that lets authors declare which identity aspects are threatened:
```yaml
# scenario.v1.yaml
meta:
  name: "Library Book Removal"
  identity_dimensions:
    - dimension: political_orientation
      reason: "The policy is framed along partisan lines"
    - dimension: parental_status
      reason: "Parents are the primary stakeholders in school content decisions"
    - dimension: religious_affiliation
      reason: "Some removals are driven by religious concerns about content"
```

Then in `_render_identity_threat_context()`:
```python
def _render_identity_threat_context(self, agent: dict[str, Any], timestep: int) -> str | None:
    if not self.scenario.identity_dimensions:
        return None
    relevant = []
    for dim in self.scenario.identity_dimensions:
        agent_value = self._identity_value(agent, _IDENTITY_ATTR_KEYS.get(dim.dimension, ()))
        if agent_value:
            relevant.append(f"{dim.dimension} ({agent_value}): {dim.reason}")
    if not relevant:
        return None
    return (
        "This development can feel identity-relevant, not just practical. "
        f"Parts of who I am that may feel implicated: {'; '.join(relevant)}. "
        "If it feels personal, acknowledge that in both your internal reaction and what you choose to say publicly."
    )
```

This approach:
- Puts scenario authors in control
- Zero false positives (explicit declaration)
- Extensible without code changes
- Documents why each dimension is relevant (useful for prompt quality)
The `_IDENTITY_ATTR_KEYS` mapping would be a simple mapping from dimension name to agent attribute keys - that's the only hardcoded part, and it's stable.
Force-pushed from 26d1df3 to 1e680b2
Summary
Extends `ReasoningContext` with `identity_threat_summary` and adds an "Identity Relevance" prompt section so agents can explicitly reason when an issue feels identity-relevant.
Testing
Closes #90