Write tests for #4 and #5 per research test plan #15

@mingles-agent

From research #11

Overview

Implement the layered test suite specified in the research findings for issues #4 (reputation-adjusted selection weight) and #5 (intra-epoch circuit breaker + probe recovery). The core implementation is already merged; this task closes the remaining coverage gaps: missing unit tests for edge cases and wiring, plus new Testermint integration test classes.

Reference: #11 (full test specifications in comments)

Implementation Checklist

Phase 1: Missing unit tests — epochgroup/ (~0.5 days)

  • TestCalculateSelectionWeight_ExactFloorEquality in epoch_group_test.go — reputation=1 → adjusted=floor, verify no double-floor
  • TestCalculateSelectionWeight_3NodeDistribution in epoch_group_test.go — 3 nodes (rep 100/50/0), run 200k draws, assert ratios within ±2% tolerance
  • TestBuildSelectionWeightsMap_NilEntry — nil ValidationWeight entry in slice is skipped gracefully

Phase 2: Missing unit tests — keeper/circuit_breaker_test.go (~0.5 days)

  • TestHealthFilter_ExactMissThreshold — 3 hits / 1 miss = exactly 25% must NOT trigger exclusion (strict >, not >=)
  • TestHealthFilter_InferenceCountAndMissedRequestsSumToTotal — verify total = InferenceCount + MissedRequests used in filter (not just InferenceCount)
  • TestHealthFilter_NilParticipantDefaultInclude — participant not found in store → default include (safety fallback)
  • TestHealthFilter_ProbeStateIgnoresMissRate — node in PROBE state included even with 100% miss rate in stats
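The four filter rules above can be summarised in a small decision function. The sketch below is an illustration under stated assumptions, not the keeper's actual code: includeInSelection and participantStats are hypothetical names, the 25% threshold is taken from the "exactly 25%" case, and the minimum sample count of 4 is a guess based on the 3 hits / 1 miss example.

```go
package main

import "fmt"

// participantStats mirrors only the two counters named in the checklist.
type participantStats struct {
	InferenceCount uint64
	MissedRequests uint64
}

const (
	missThresholdPct = 25 // assumed value of DefaultCBMissThresholdPct
	minSamples       = 4  // assumed value of DefaultCBMinSamples
)

// includeInSelection returns true when the node should stay in the
// candidate list for executor selection.
func includeInSelection(stats *participantStats, inProbe bool) bool {
	if stats == nil {
		return true // participant not found: default include (safety fallback)
	}
	if inProbe {
		return true // PROBE state ignores the miss rate
	}
	total := stats.InferenceCount + stats.MissedRequests
	if total < minSamples {
		return true // not enough samples to judge
	}
	// Strict ">" for exclusion: a miss rate of exactly 25% stays included.
	return stats.MissedRequests*100 <= missThresholdPct*total
}

func main() {
	exact := &participantStats{InferenceCount: 3, MissedRequests: 1}
	over := &participantStats{InferenceCount: 3, MissedRequests: 2}
	fmt.Println(includeInSelection(exact, false)) // true
	fmt.Println(includeInSelection(over, false))  // false
}
```

Comparing MissedRequests*100 against threshold*total avoids floating-point division, which makes the exact-equality case in TestHealthFilter_ExactMissThreshold unambiguous.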

Phase 3: Feedback hook wiring tests (~1 day)

  • TestFinishInference_ProbeCBTransitionsToHealthy in msg_server_finish_inference_test.go — set executor to PROBE state, submit MsgFinishInference, assert CB entry deleted (→ defaults healthy)
  • TestExpiredInference_MissedRequestsIncrement in endblock_test.go or similar — call inference expiry handler with a HEALTHY executor, verify MissedRequests is incremented AND RecordCBResult(false) is a no-op (healthy stays healthy)
  • TestExpiredInference_ProbeCBTransitionsToExcluded — expire an inference where executor is in PROBE state, verify CB transitions to EXCLUDED with doubled cooldown
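The three transitions these wiring tests assert can be modelled as below. This is a simplified, map-backed stand-in for the keeper store: cbStore, cbEntry and the recordCBResult signature are hypothetical, and the 500-block cap is the value quoted for the backoff scenario in Phase 5.

```go
package main

import "fmt"

type cbState string

const (
	stateExcluded cbState = "EXCLUDED"
	stateProbe    cbState = "PROBE"
)

// cbEntry holds a subset of the circuit-breaker fields listed in Phase 4.
type cbEntry struct {
	State           cbState
	ExcludedAtBlock int64
	CooldownBlocks  int64
}

const maxCooldownBlocks int64 = 500 // assumed value of DefaultCBMaxCooldownBlocks

// cbStore maps participant address to CB entry; a missing entry means HEALTHY.
type cbStore map[string]cbEntry

// recordCBResult applies an inference outcome to the executor's CB entry.
func (s cbStore) recordCBResult(addr string, success bool, block int64) {
	entry, ok := s[addr]
	if !ok || entry.State != stateProbe {
		return // healthy (no entry) or not probing: no-op in this sketch
	}
	if success {
		delete(s, addr) // probe succeeded: entry deleted, node defaults to healthy
		return
	}
	// Probe failed: back to EXCLUDED with a doubled, capped cooldown.
	entry.State = stateExcluded
	entry.ExcludedAtBlock = block
	entry.CooldownBlocks *= 2
	if entry.CooldownBlocks > maxCooldownBlocks {
		entry.CooldownBlocks = maxCooldownBlocks
	}
	s[addr] = entry
}

func main() {
	store := cbStore{"executor1": {State: stateProbe, CooldownBlocks: 50}}
	store.recordCBResult("executor1", false, 120)
	fmt.Printf("%+v\n", store["executor1"])
}
```

In the real tests the same assertions would be made against the keeper after MsgFinishInference or the expiry handler runs, rather than by calling the transition function directly.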

Phase 4: Testermint client — add CB state query (~0.5 days)

  • Add getCBState(address: String): CircuitBreakerEntry to ApplicationCLI.kt, backed by the inferenced query inference circuit-breaker-state {address} CLI command
  • Add data class CircuitBreakerEntry(state, excludedAtBlock, cooldownBlocks, probeAttempts) to data/ package
  • Add integration assertion helper assertCBState(pair, address, expectedState) to TestUtils.kt

Phase 5: Testermint integration tests — CircuitBreakerTests.kt (new file, ~1.5 days)

  • fast exclusion - node stops receiving traffic after min_samples misses — mock join node to return invalid JSON, run inferences until excluded (4+ misses), verify subsequent requests go to genesis only
  • all-degraded fallback - network keeps running when all nodes excluded — mock ALL nodes to fail, accumulate exclusions for all, verify GetRandomExecutor still returns a result (fallback to full list), verify no crash
  • probe recovery - excluded node recovers after cooldown — exclude a node, wait 50+ blocks (DefaultCBInitialCooldownBlocks), verify exactly one probe inference is sent to that node, restore mock, verify normal traffic resumes
  • exponential backoff - cooldown doubles on repeated probe failures — fail 3 consecutive probes, verify cooldown sequence: 50 → 100 → 200 blocks (capped at 500)
  • epoch boundary clears CB state — exclude a node, trigger epoch transition, verify CB state is cleared (node receives traffic again)
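For the exponential backoff scenario, the expected cooldown values can be derived rather than hard-coded. The sketch below assumes pure doubling from the initial 50 blocks with a hard cap at 500; nextCooldown and cooldownSequence are hypothetical helpers, and the constants stand in for DefaultCBInitialCooldownBlocks and DefaultCBMaxCooldownBlocks.

```go
package main

import "fmt"

const (
	initialCooldownBlocks int64 = 50  // DefaultCBInitialCooldownBlocks per the checklist
	maxCooldownBlocks     int64 = 500 // cap quoted in the backoff scenario
)

// nextCooldown doubles the current cooldown and clamps it to the cap.
func nextCooldown(current int64) int64 {
	next := current * 2
	if next > maxCooldownBlocks {
		return maxCooldownBlocks
	}
	return next
}

// cooldownSequence returns the initial cooldown followed by the cooldown
// in force after each of `failures` consecutive failed probes.
func cooldownSequence(failures int) []int64 {
	seq := []int64{initialCooldownBlocks}
	for i := 0; i < failures; i++ {
		seq = append(seq, nextCooldown(seq[len(seq)-1]))
	}
	return seq
}

func main() {
	fmt.Println(cooldownSequence(4)) // [50 100 200 400 500]
}
```

Under these assumptions, 50 → 100 → 200 are the cooldowns preceding the first three probes; a third failed probe would then set the cooldown to 400, and the cap is first reached after the fourth failure. It is worth confirming against the implementation which of these values the integration test is meant to observe.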

Phase 6: Testermint integration tests — ReputationWeightTests.kt (new file, ~1 day)

  • high reputation node receives more executor traffic: set up two nodes with a known reputation difference via EpochsCompleted (multi-epoch run), run 200 requests, assert the observed traffic ratio is within ±15% of the theoretical ratio
  • zero reputation node receives floor traffic — fresh participant (rep=0) vs established node (rep=100), run 1000 requests, assert new node gets at least 0.5% of traffic

Dependencies & Blockers

  • CB params (DefaultCBMissThresholdPct, DefaultCBMinSamples, DefaultCBInitialCooldownBlocks, DefaultCBMaxCooldownBlocks) are currently Go constants, not proto params. Phase 5 scenarios C/D (presumably probe recovery and exponential backoff, going by list order) require waits of 50 blocks or more unless these constants are promoted to the ValidationParams proto. Consider filing a follow-up to promote them, or adjust the test design to use node.waitForMinimumBlock(currentBlock + 55).
  • Testermint CLI command for CB state query needs to match whatever endpoint the chain exposes — verify inferenced query inference circuit-breaker-state {address} exists or find the actual route.
  • Phase 6 reputation tests depend on being able to run multiple epochs in a test or manipulate EpochsCompleted — check if genesis spec can pre-load reputation via epochsCompleted field.

Estimated Total: 4–5 days
