feat: add context-aware adaptive risk scoring by varshini-nandula · Pull Request #157 · Devnil434/Eagle

varshini-nandula · 2026-06-17T18:39:43Z

Related Issue

Closes #152

Overview

This PR introduces a Context-Aware Adaptive Risk Scoring Engine that replaces the existing static severity calculation approach with a configurable, policy-driven scoring mechanism.

The implementation enhances alert prioritization by considering contextual signals such as restricted-zone presence, repeated approaches, loitering behavior, after-hours activity, and reasoning confidence while preserving all existing workflows, APIs, and frontend functionality.

What Changed

1. Added YAML-Based Risk Policy Configuration

New File

configs/risk_policy.yaml

Introduced configurable risk weights for contextual signals:

Restricted zone presence
Repeated approach behavior
Loitering / dwell time
After-hours activity
Reasoning confidence

This allows risk-scoring behavior to be adjusted without modifying application code.

2. Added Adaptive Risk Scoring Engine

New File

services/reasoning/risk_scoring.py

Implemented a reusable AdaptiveRiskScorer that:

Loads risk weights from YAML
Normalizes contextual signals
Calculates weighted risk scores
Produces risk levels (Low / Medium / High)
Returns explainable contributing factors

The scorer is designed to be modular and extensible for future risk signals.
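The five steps above can be sketched roughly as follows. This is a minimal illustration, not the code in services/reasoning/risk_scoring.py: the policy is an in-memory dict standing in for the parsed YAML, the weights and thresholds are made-up numbers, and the signal names `restricted_zone` and `after_hours` as well as the result keys are assumptions. The section and key names (`risk_scoring`, `risk_levels`, `normalization`, `low_max`, `medium_max`, `loitering_max_seconds`, `repeated_approach_max_count`) follow the review discussion further down this page.

```python
from typing import Any

# Hypothetical policy shaped like the parsed YAML; every number is illustrative.
POLICY: dict[str, Any] = {
    "risk_scoring": {
        "restricted_zone": {"weight": 0.30},
        "repeated_approach": {"weight": 0.20},
        "loitering": {"weight": 0.20},
        "after_hours": {"weight": 0.20},
        "reasoning_confidence": {"weight": 0.10},
    },
    "risk_levels": {"low_max": 33, "medium_max": 66},
    "normalization": {
        "loitering_max_seconds": 60.0,
        "repeated_approach_max_count": 5,
    },
}


def clamp01(value: float) -> float:
    """Clamp a value into the closed interval [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def normalize(signals: dict[str, Any], policy: dict[str, Any]) -> dict[str, float]:
    """Map raw signals to [0, 1]; boolean signals become 0.0 or 1.0."""
    limits = policy["normalization"]
    return {
        "restricted_zone": 1.0 if signals.get("restricted_zone") else 0.0,
        "repeated_approach": clamp01(
            signals.get("repeated_approach", 0) / limits["repeated_approach_max_count"]
        ),
        "loitering": clamp01(
            signals.get("loitering", 0.0) / limits["loitering_max_seconds"]
        ),
        "after_hours": 1.0 if signals.get("after_hours") else 0.0,
        "reasoning_confidence": clamp01(signals.get("reasoning_confidence", 0.0)),
    }


def score(signals: dict[str, Any], policy: dict[str, Any] = POLICY) -> dict[str, Any]:
    """Weighted sum of normalized signals, scaled to 0-100 and classified."""
    normalized = normalize(signals, policy)
    weights = {name: cfg["weight"] for name, cfg in policy["risk_scoring"].items()}
    total = sum(weights[name] * normalized[name] for name in weights)
    risk_score = round(100.0 * total, 2)

    levels = policy["risk_levels"]
    if risk_score <= levels["low_max"]:
        level = "Low"
    elif risk_score <= levels["medium_max"]:
        level = "Medium"
    else:
        level = "High"

    # Contributing factors: every signal that added something to the score.
    factors = [name for name in weights if normalized[name] > 0.0]
    return {"risk_score": risk_score, "risk_level": level, "factors": factors}
```

With these illustrative weights, a track inside a restricted zone that has loitered for half of the maximum dwell time scores 40 and is classified as Medium. Changing only the numbers in the policy changes the outcome, which is the point of keeping them outside the code.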

3. Integrated Scoring into Existing Reasoning Pipeline

Modified File

services/reasoning/pipeline.py

Updated the existing severity calculation workflow to use the new adaptive scoring engine internally.

Key points:

No existing API contracts were changed
No frontend changes were required
Existing response structures remain intact
Existing severity-based alert prioritization continues to function

The adaptive scorer now powers severity calculation while preserving the current system behavior.
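One way to swap the scoring logic without touching response structures is to inject the scorer and only fill in the existing severity field. The sketch below illustrates that pattern under assumptions: `PipelineSketch`, `StubScorer`, the `severity` key, and the scorer's return shape are hypothetical stand-ins, not the actual `ReasoningPipeline` code.

```python
from typing import Any, Optional


class StubScorer:
    """Hypothetical stand-in for AdaptiveRiskScorer exposing score(signals)."""

    def score(self, signals: dict[str, Any]) -> dict[str, Any]:
        value = 80.0 if signals.get("restricted_zone") else 10.0
        level = "High" if value > 66 else "Low"
        return {"risk_score": value, "risk_level": level, "factors": []}


class PipelineSketch:
    """Illustrative pipeline that accepts an optional, injectable scorer."""

    def __init__(self, risk_scorer: Optional[Any] = None) -> None:
        # Fall back to a default scorer so existing callers need no changes.
        self._risk_scorer = risk_scorer if risk_scorer is not None else StubScorer()

    def attach_severity(self, result: dict[str, Any], signals: dict[str, Any]) -> dict[str, Any]:
        """Return a copy of result with the severity field set by the scorer."""
        scored = self._risk_scorer.score(signals)
        # Existing keys are preserved; only the severity value is computed differently.
        return {**result, "severity": scored["risk_score"]}
```

Because the scorer is a constructor argument, tests can pass a deterministic stub while production code relies on the default.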

4. Added Unit Tests

New File

tests/test_risk_scoring.py

Added tests covering:

YAML policy loading
Signal normalization
Weighted score calculation
Risk level classification
Output structure validation
Deterministic scoring behavior

🔄 Updated Workflow

Previous Workflow

Detection
    ↓
Tracking
    ↓
Temporal Memory
    ↓
VLM
    ↓
LLM Reasoning
    ↓
Static Severity Calculation
    ↓
Alert Generation

Updated Workflow

Detection
    ↓
Tracking
    ↓
Temporal Memory
    ↓
VLM
    ↓
LLM Reasoning
    ↓
Adaptive Risk Scoring Engine
    ↓
Severity Score
    ↓
Alert Generation

The adaptive scorer now evaluates contextual signals before producing the final severity score used by the existing alerting pipeline.

🧪 Testing

All existing tests continue to pass.

Additional tests were added to validate:

Policy loading
Scoring calculations
Risk classification
Pipeline integration behavior

📌 Summary

This PR introduces a configurable, context-aware risk scoring system that improves alert prioritization while maintaining full backward compatibility with Eagle's existing architecture and workflows.

Summary by CodeRabbit

Release Notes

New Features
- Implemented an adaptive risk-scoring system that evaluates multiple contextual signals including restricted zone proximity, repeated approaches, loitering duration, and after-hours activity.
- Risk scores are now normalized on a 0–100 scale and classified into Low, Medium, and High severity levels.
- Added configurable risk-assessment policies to fine-tune scoring weights and classification thresholds.
Tests
- Added comprehensive test coverage for the risk-scoring system, including signal normalization, weighted scoring, and severity classification validation.

coderabbitai · 2026-06-17T18:39:57Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fdf5c580-3bc1-4985-bf98-63a031ba1627

📥 Commits

Reviewing files that changed from the base of the PR and between f0dd265 and b7370c6.

📒 Files selected for processing (3)

services/reasoning/pipeline.py
services/reasoning/risk_scoring.py
tests/test_risk_scoring.py

📝 Walkthrough

Walkthrough

Introduces a YAML-configurable AdaptiveRiskScorer in services/reasoning/risk_scoring.py that computes a normalized 0–100 risk score from contextual signals. ReasoningPipeline is updated to inject this scorer, replacing the removed static _W weights. A YAML policy file and a deterministic pytest suite are added alongside.

Changes

Adaptive Risk Scoring Engine

Layer / File(s)	Summary
YAML policy config and `RiskScoringResult` contract `configs/risk_policy.yaml`, `services/reasoning/risk_scoring.py`	Defines signal weights, classification thresholds (`Low`/`Medium`/`High`), and normalization limits in YAML; declares the `RiskScoringResult` TypedDict and module-level signal-label mapping constants.
`AdaptiveRiskScorer` implementation `services/reasoning/risk_scoring.py`	Implements `__init__` (YAML loading with `FileNotFoundError`/`ValueError` validation and section checks), `score` (normalization → weighted aggregation → 0–100 scaling → classification → factor extraction), and all helper methods.
`ReasoningPipeline` wiring and `_attach_severity` rewrite `services/reasoning/pipeline.py`	Adds `AdaptiveRiskScorer` import, removes static `_W` weight dict, injects optional `risk_scorer` into the constructor, and rewrites `_attach_severity` to derive contextual signals (restricted zone, repeated approach, dwell/loitering, after-hours, confidence) and delegate to `self._risk_scorer.score`.
Test suite `tests/test_risk_scoring.py`	Deterministic pytest suite with fixtures for temporary YAML policy; covers YAML loading, normalization ratios, clamping, weighted scores for inactive/active signal scenarios, classification boundary verification, and result structure invariants.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Devnil434/Eagle#52: Introduces a RiskAnalyzer that computes a capped 0–100 risk score from zone/object/person factors — directly overlaps with the weighted multi-signal risk scoring logic introduced by AdaptiveRiskScorer.

Poem

🐇 A rabbit once weighed every sign,
From loitering dwell to the after-hours shrine,
With YAML in paw and weights summed to one,
The score climbs to High when the signals have won,
Low, Medium, High — now the risk engine's fine! 🎯

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: add context-aware adaptive risk scoring' clearly and concisely summarizes the main change—introducing an adaptive risk scoring system that is context-aware.
Linked Issues check	✅ Passed	All primary objectives from issue `#152` are met: YAML-based configurable policy [152], contextual signal evaluation (restricted zone, loitering, repeated approach, after-hours, reasoning confidence) [152], normalized risk scoring with Low/Medium/High classification [152], integration into reasoning pipeline [152], and comprehensive test coverage [152].
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#152` objectives: risk policy configuration, adaptive scorer implementation, pipeline integration, and tests. No unrelated modifications or scope creep detected.
Docstring Coverage	✅ Passed	Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%.

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/test_risk_scoring.py (1)

65-174: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add regression tests for policy validation and negative-value clamping.

Current tests miss two critical contracts: rejecting invalid policy weights/thresholds and flooring negative continuous inputs to 0.0. Adding these will prevent silent scoring regressions.

Suggested test cases

+def test_yaml_invalid_weight_sum_raises(tmp_path):
+    bad = tmp_path / "risk_policy.yaml"
+    bad.write_text(
+        _VALID_POLICY.replace("weight: 0.10", "weight: 0.50"),
+        encoding="utf-8",
+    )
+    with pytest.raises(ValueError):
+        AdaptiveRiskScorer(policy_path=bad)
+
+
+def test_negative_continuous_signals_floor_to_zero(scorer):
+    normalized = scorer._normalize_signals({
+        "repeated_approach": -3,
+        "loitering": -15.0,
+    })
+    assert normalized["repeated_approach"] == 0.0
+    assert normalized["loitering"] == 0.0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_risk_scoring.py` around lines 65 - 174, Add two new test functions
to cover missing regression scenarios for the AdaptiveRiskScorer class. First,
create a test that verifies the scorer rejects or handles invalid policy weights
and thresholds that do not sum to 1.0 or fall outside acceptable ranges. Second,
add a test that passes negative values for continuous signal inputs (like
negative loitering or repeated_approach values) to the _normalize_signals or
score methods and verifies they are clamped to 0.0 rather than producing
incorrect results. These tests will ensure the implementation properly validates
configuration and handles edge cases in signal normalization.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/reasoning/pipeline.py`:
- Around line 239-241: The is_after_hours calculation at line 239-241 uses
datetime.datetime.now().hour which depends on the server's current time, causing
the same event to be scored differently based on when the pipeline runs. Instead
of using datetime.datetime.now().hour, extract the event timestamp, apply the
configured timezone to it, and then check if the event's hour falls within the
configured after-hours window (currently hardcoded as 20 or less than 6).
Replace the server time dependency with event-based time logic that uses both
the event timestamp and timezone configuration.
- Line 236: The approach_count variable is currently counting the string
"repeated_approach" in seq.action_summary, which is fragile and format-dependent
and can lead to incorrect counts. Instead, iterate through seq.events directly
and count the number of events that contain ActionHint.REPEATED_APPROACH to
extract this signal in a stable, structured way that is independent of text
formatting.

In `@services/reasoning/risk_scoring.py`:
- Around line 166-167: The condition at lines 166-167 that checks if value > 0.0
is too permissive for the reasoning_confidence factor, causing low confidence
values like 0.1 to incorrectly receive the "High reasoning confidence" label.
Replace the simple value > 0.0 check with a more appropriate threshold (such as
checking if the value exceeds a meaningful confidence cutoff like 0.5 or 0.7) to
ensure only actually high confidence values produce that factor label.
Alternatively, implement conditional logic to apply different factor labels
based on confidence tiers if low and medium confidence levels should be
distinguished.
- Around line 139-140: The normalization expressions for "repeated_approach" and
"loitering" only clamp to the upper bound of 1.0 using min(), but do not clamp
to the lower bound of 0.0. This allows negative normalized values to be
returned, which can suppress total risk calculation. Wrap both the
"repeated_approach" and "loitering" normalization expressions with an additional
max() function call to ensure the result is clamped between 0.0 and 1.0,
preventing negative values from affecting the risk scoring.
- Around line 195-201: The _load_policy method currently only validates that
required sections exist in the policy data but does not enforce constraints on
the actual values within those sections (weights, thresholds, normalization
values). Add comprehensive schema validation after the section presence checks
to verify that weights and thresholds are valid numeric values within acceptable
ranges, and that normalization parameters meet expected constraints. This
ensures invalid configurations are caught at load-time rather than causing
silent failures during risk score calculations.
- Line 29: The risk_scoring.py module imports the yaml module, but PyYAML is not
declared as a dependency in services/reasoning/requirements.txt. This will cause
a ModuleNotFoundError when the service is deployed independently. Add the line
pyyaml>=6.0 to the services/reasoning/requirements.txt file to ensure the
required dependency is installed.
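Two of the inline findings above (counting approaches from structured events, and labeling confidence by tier) can be sketched as follows. This is an illustration under assumptions: `ActionHint.REPEATED_APPROACH` is named in the review, but the `Event` dataclass, its `hints` attribute, the second enum member, the "Moderate reasoning confidence" label, and the default cutoffs (taken from the 0.5 and 0.7 examples in the comment) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, Optional


class ActionHint(Enum):
    """Hypothetical subset of the pipeline's action hints."""
    REPEATED_APPROACH = "repeated_approach"
    LOITERING = "loitering"


@dataclass
class Event:
    """Hypothetical event carrying a set of structured action hints."""
    hints: set[ActionHint] = field(default_factory=set)


def count_repeated_approaches(events: Iterable[Event]) -> int:
    """Count events flagged as repeated approaches, independent of summary text."""
    return sum(1 for event in events if ActionHint.REPEATED_APPROACH in event.hints)


def confidence_label(
    confidence: float,
    high_cutoff: float = 0.7,
    medium_cutoff: float = 0.5,
) -> Optional[str]:
    """Return a factor label only when confidence clears a tier cutoff."""
    if confidence >= high_cutoff:
        return "High reasoning confidence"
    if confidence >= medium_cutoff:
        return "Moderate reasoning confidence"
    return None  # Low confidence contributes no factor label.
```

Counting enum members rather than substrings keeps the signal stable if the wording of the action summary changes.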

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 23b60538-b348-4b80-a680-838d2c067487

📥 Commits

Reviewing files that changed from the base of the PR and between 3429ec9 and f0dd265.

📒 Files selected for processing (4)

configs/risk_policy.yaml
services/reasoning/pipeline.py
services/reasoning/risk_scoring.py
tests/test_risk_scoring.py

coderabbitai · 2026-06-17T18:44:38Z

+        import datetime
+        current_hour = datetime.datetime.now().hour
+        is_after_hours = current_hour >= 20 or current_hour < 6


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

After-hours currently depends on server clock, not event time.

Lines 239–241 make severity depend on when the pipeline runs, so replaying the same track at a different hour can produce a different score. Compute after-hours from the event timestamp and a configured timezone and window instead.
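A minimal sketch of event-time evaluation, assuming the event carries a timezone-aware timestamp. The function name, its parameters, and the default 20:00 to 06:00 window (taken from the hard-coded values in the diff above) are illustrative and not part of the PR.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def is_after_hours(
    event_time: datetime,
    local_tz: tzinfo = timezone.utc,
    start_hour: int = 20,
    end_hour: int = 6,
) -> bool:
    """Decide after-hours status from the event's own timestamp.

    The window is [start_hour, end_hour) in local time and may wrap midnight.
    """
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    local_hour = event_time.astimezone(local_tz).hour
    if start_hour <= end_hour:
        # Window inside a single day, e.g. 01:00 to 05:00 (empty if equal).
        return start_hour <= local_hour < end_hour
    # Window wraps past midnight, e.g. 20:00 to 06:00.
    return local_hour >= start_hour or local_hour < end_hour
```

The result depends only on the inputs, so replaying a stored track gives the same answer at any wall-clock time. A named zone such as `zoneinfo.ZoneInfo("Asia/Kolkata")` can be passed as `local_tz` where daylight-saving rules matter.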

coderabbitai · 2026-06-17T18:44:38Z

+from pathlib import Path
+from typing import Any, TypedDict
+
+import yaml


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Candidate dependency manifests =="
fd -HI 'pyproject.toml|requirements.*|Pipfile|setup.py|poetry.lock'

echo
echo "== Entries mentioning yaml/pyyaml =="
rg -n --hidden \
  --iglob 'pyproject.toml' \
  --iglob 'requirements*.txt' \
  --iglob 'Pipfile' \
  --iglob 'setup.py' \
  --iglob 'poetry.lock' \
  '(?i)\bpyyaml\b|\byaml\b'

Repository: Devnil434/Eagle

Length of output: 14533

🏁 Script executed:

cat services/reasoning/requirements.txt

Repository: Devnil434/Eagle

Length of output: 148

Add PyYAML to services/reasoning/requirements.txt.

The module imports yaml but PyYAML is not listed in the service's dependency manifest. If services/reasoning/ is deployed independently, startup will fail with ModuleNotFoundError. Add pyyaml>=6.0 to services/reasoning/requirements.txt.

coderabbitai · 2026-06-17T18:44:38Z

+        for section in ("risk_scoring", "risk_levels", "normalization"):
+            if section not in data:
+                raise ValueError(
+                    f"Risk policy missing required section: '{section}'"
+                )
+
+        return data


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce full policy schema constraints at load-time.

_load_policy currently checks section presence only. Invalid weights/thresholds/normalization values pass startup and produce silently wrong risk scores.

Proposed hard validation in loader

         for section in ("risk_scoring", "risk_levels", "normalization"):
             if section not in data:
                 raise ValueError(
                     f"Risk policy missing required section: '{section}'"
                 )
+
+        weights = []
+        for signal, cfg in data["risk_scoring"].items():
+            weight = cfg.get("weight")
+            if not isinstance(weight, (int, float)) or weight < 0:
+                raise ValueError(f"Invalid weight for '{signal}': {weight}")
+            weights.append(float(weight))
+        if abs(sum(weights) - 1.0) > 1e-6:
+            raise ValueError("Risk policy weights must sum to 1.0")
+
+        low_max = data["risk_levels"].get("low_max")
+        medium_max = data["risk_levels"].get("medium_max")
+        if not (isinstance(low_max, (int, float)) and isinstance(medium_max, (int, float))):
+            raise ValueError("risk_levels.low_max and medium_max must be numeric")
+        if not (0 <= low_max <= medium_max <= 100):
+            raise ValueError("Risk thresholds must satisfy 0 <= low_max <= medium_max <= 100")
+
+        loitering_max = data["normalization"].get("loitering_max_seconds")
+        approach_max = data["normalization"].get("repeated_approach_max_count")
+        if not (isinstance(loitering_max, (int, float)) and loitering_max > 0):
+            raise ValueError("normalization.loitering_max_seconds must be > 0")
+        if not (isinstance(approach_max, (int, float)) and approach_max > 0):
+            raise ValueError("normalization.repeated_approach_max_count must be > 0")

         return data

feat: add context-aware adaptive risk scoring

f0dd265

coderabbitai Bot reviewed Jun 17, 2026

fix: improve adaptive risk scoring validation

b7370c6

varshini-nandula wants to merge 2 commits into Devnil434:main from varshini-nandula:feat/context-aware-risk-scoring
