refactor: Cypher pushdown for entity filters + bounded queue in GraphMemoryUpdater by arn0ld87 · Pull Request #8 · arn0ld87/agora

arn0ld87 · 2026-04-23T14:55:12Z

Summary

Implements the two concretely scoped refactorings from the architecture audit:

Cypher pushdown for EntityReader: instead of loading get_all_nodes + get_all_edges and filtering in RAM, filtering and adjacency lookup now run as a Cypher query in Neo4j. New storage method GraphStorage.get_filtered_entities_with_edges. The FilteredEntities contract stays identical; the four existing call sites (simulation_entities, simulation_prepare, simulation_history, simulation_manager) need no changes.
Bounded queue + backpressure in GraphMemoryUpdater: the unbounded Queue() is replaced by Queue(maxsize=GRAPH_MEMORY_QUEUE_MAX) with put(timeout=GRAPH_MEMORY_PUT_TIMEOUT). On overflow the event is dropped (new dropped_count in the stats) instead of triggering an OOM. max_queue_size=0 keeps the old unbounded semantics as an opt-in.
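The bounded-queue behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the class name BoundedActivityBuffer and its add/stats methods are hypothetical, and only queue.Queue(maxsize=...), put(timeout=...), dropped_count and the max_queue_size=0 opt-in come from the description.

```python
import queue
import threading


class BoundedActivityBuffer:
    """Hypothetical stand-in for the queue handling inside GraphMemoryUpdater."""

    def __init__(self, max_queue_size: int = 10_000, put_timeout: float = 2.0):
        # queue.Queue treats maxsize <= 0 as "no limit", which matches the
        # max_queue_size=0 opt-in for the old unbounded semantics.
        self._queue: queue.Queue = queue.Queue(maxsize=max_queue_size)
        self._put_timeout = put_timeout
        self._lock = threading.Lock()
        self.dropped_count = 0

    def add(self, activity: object) -> bool:
        """Enqueue an activity; return False if it had to be dropped."""
        try:
            # Blocks for up to put_timeout seconds while the queue is full
            # (backpressure on the producer), then raises queue.Full.
            self._queue.put(activity, timeout=self._put_timeout)
            return True
        except queue.Full:
            with self._lock:
                self.dropped_count += 1
            return False

    def stats(self) -> dict:
        return {"queued": self._queue.qsize(), "dropped_count": self.dropped_count}


# Nothing consumes the queue here, so the third event times out and is dropped.
buffer = BoundedActivityBuffer(max_queue_size=2, put_timeout=0.05)
results = [buffer.add(f"event-{i}") for i in range(3)]
print(results, buffer.stats())
```

The timeout is the trade-off knob: a longer value gives a slow consumer more time to catch up before events are lost, at the cost of stalling the producer for that long.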

The larger architecture proposals from the audit (event-bus IPC via Redis, Temporal Graph Evolution, Dynamic Ontology Mutation, Multi-Agent Consensus Metrics) are filed as separate GitHub issues; see the issue list below.

Concrete changes

backend/app/storage/graph_storage.py: new abstract method get_filtered_entities_with_edges
backend/app/storage/neo4j_storage.py: Cypher implementation with two variants (with/without edge enrichment) and defensive post-processing (direction logic, dedup, type whitelist passed as a parameter rather than interpolated)
backend/app/services/entity_reader.py: now delegates to the storage method
backend/app/services/graph_memory_updater.py: bounded queue, dropped_count, config wiring
backend/app/config.py: GRAPH_MEMORY_QUEUE_MAX (10000), GRAPH_MEMORY_PUT_TIMEOUT (2.0)
package.json: lint:backend scope extended to the refactored files and new tests
CHANGELOG.md: entry under v0.4.1
New tests (14 cases): test_entity_reader.py, test_graph_memory_updater.py, test_neo4j_filtered_entities.py
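To illustrate the neo4j_storage.py item, here is a rough sketch of a parameterised query plus the post-processing step. The graph schema (an Entity label, graph_id/uuid/name properties), the query text, and the helper names build_params and assemble_entities are all assumptions made for this example; the PR's actual query is not shown in this thread.

```python
# Hypothetical schema: nodes carry the label Entity plus a type label and have
# graph_id / uuid / name properties. None of this is confirmed by the PR.
FILTERED_ENTITIES_QUERY = """
MATCH (n:Entity {graph_id: $graph_id})
WHERE any(label IN labels(n) WHERE label IN $entity_types)
OPTIONAL MATCH (n)-[r]-(:Entity {graph_id: $graph_id})
RETURN n.uuid AS uuid, n.name AS name, labels(n) AS labels,
       r.uuid AS edge_uuid, type(r) AS edge_type,
       startNode(r).uuid AS source_uuid, endNode(r).uuid AS target_uuid
"""


def build_params(graph_id, entity_types):
    # The whitelist travels as a query parameter, never via string formatting.
    return {"graph_id": graph_id, "entity_types": sorted(entity_types)}


def assemble_entities(rows):
    """Group flat result rows into one dict per entity with deduplicated edges."""
    entities = {}
    seen_edges = set()
    for row in rows:
        entity = entities.setdefault(
            row["uuid"],
            {"uuid": row["uuid"], "name": row["name"],
             "labels": row["labels"], "edges": []},
        )
        edge_uuid = row.get("edge_uuid")
        if edge_uuid is None:
            continue  # OPTIONAL MATCH found no edge: keep the entity, no edges
        key = (row["uuid"], edge_uuid)
        if key in seen_edges:
            continue  # defensive dedup of repeated rows
        seen_edges.add(key)
        # Direction is relative to the entity the row belongs to.
        outgoing = row["source_uuid"] == row["uuid"]
        entity["edges"].append({
            "uuid": edge_uuid,
            "type": row["edge_type"],
            "direction": "outgoing" if outgoing else "incoming",
            "other_uuid": row["target_uuid"] if outgoing else row["source_uuid"],
        })
    return list(entities.values())


def make_row(uuid, name, edge=None):
    edge_uuid, edge_type, src, dst = edge or (None, None, None, None)
    return {"uuid": uuid, "name": name, "labels": ["Entity", "Person"],
            "edge_uuid": edge_uuid, "edge_type": edge_type,
            "source_uuid": src, "target_uuid": dst}


knows = ("e1", "KNOWS", "a", "b")
rows = [make_row("a", "Alice", knows), make_row("a", "Alice", knows),
        make_row("b", "Bob", knows), make_row("c", "Carol")]
entities = assemble_entities(rows)
params = build_params("g-1", {"Person", "Organization"})
```

Passing the whitelist as $entity_types keeps the query text constant, so type names never become part of the Cypher string.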

Test plan

npm run test:backend: 102/102 green
npm run lint:backend (scoped rollout, incl. new files): clean
npm run lint:frontend: clean
npm run build:frontend: clean
uv run python -m compileall app scripts: clean

Pending before merge (requires a running Neo4j): live probe of the new Cypher query against a real, fully built graph, including measurement of the memory/latency improvement on a graph with >1k entities.

https://claude.ai/code/session_01CiS1Gg8J8YBkRy3S2QJvPi

Generated by Claude Code

Addresses two concrete bottlenecks from the architecture audit:

1. EntityReader.filter_defined_entities used to pull every node and every edge of a graph into Python memory and filter there. For graphs with more than a few thousand entities this caused large memory spikes and latency. The filtering, type whitelist check, and adjacency lookup now run inside Neo4j via a new GraphStorage.get_filtered_entities_with_edges method. EntityReader only assembles EntityNode dataclasses from the returned rows.

2. GraphMemoryUpdater held an unbounded queue of agent activities. When Neo4j ingestion was slower than OASIS event generation the queue grew without limit and could OOM the backend. The queue is now bounded via GRAPH_MEMORY_QUEUE_MAX (default 10000) with a put-side timeout (GRAPH_MEMORY_PUT_TIMEOUT) that applies backpressure and, on overflow, drops events with a dropped_count stat instead of crashing.

Public contracts are preserved: FilteredEntities keeps the same shape and callers in simulation_entities / simulation_prepare / simulation_history / simulation_manager do not need changes. GraphMemoryUpdater's constructor keeps backwards compatibility via default parameters, and max_queue_size=0 still gives the old unbounded behaviour as an opt-in.

Tests: 14 new tests covering EntityReader delegation, Neo4jStorage dict assembly (direction logic, dedup, type whitelist as parameter, edgeless path), and the bounded queue (drop on overflow, DO_NOTHING skip path, unbounded opt-in, stats surface).

Quality gate: 102/102 backend tests green, scoped ruff clean, frontend lint + build clean.

https://claude.ai/code/session_01CiS1Gg8J8YBkRy3S2QJvPi
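The delegation described in this commit message might look roughly like the sketch below. Only the names GraphStorage, get_filtered_entities_with_edges, EntityReader, filter_defined_entities, EntityNode and FilteredEntities come from the text; the method signature, the dataclass fields, and the FakeStorage test double are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class GraphStorage(ABC):
    @abstractmethod
    def get_filtered_entities_with_edges(self, graph_id, entity_types,
                                         enrich_with_edges=True):
        """Return one dict per matching entity (signature is an assumption)."""


@dataclass
class EntityNode:
    # Field names are illustrative, not taken from the repository.
    uuid: str
    name: str
    labels: list
    related_edges: list = field(default_factory=list)


@dataclass
class FilteredEntities:
    entities: list
    filtered_count: int


class EntityReader:
    def __init__(self, storage: GraphStorage):
        self._storage = storage

    def filter_defined_entities(self, graph_id, entity_types,
                                enrich_with_edges=True):
        # All filtering happens in the storage layer; the reader only maps
        # the returned rows onto dataclasses.
        rows = self._storage.get_filtered_entities_with_edges(
            graph_id, entity_types, enrich_with_edges
        )
        entities = [
            EntityNode(uuid=r["uuid"], name=r["name"], labels=r["labels"],
                       related_edges=r.get("edges", []))
            for r in rows
        ]
        return FilteredEntities(entities=entities, filtered_count=len(entities))


class FakeStorage(GraphStorage):
    """In-memory test double that records how it was called."""

    def __init__(self):
        self.calls = []

    def get_filtered_entities_with_edges(self, graph_id, entity_types,
                                         enrich_with_edges=True):
        self.calls.append((graph_id, tuple(entity_types), enrich_with_edges))
        return [{"uuid": "a", "name": "Alice",
                 "labels": ["Entity", "Person"], "edges": []}]


storage = FakeStorage()
result = EntityReader(storage).filter_defined_entities("g-1", ["Person"])
```

Because the reader depends only on the abstract method, a delegation test can run against a fake like this without a live Neo4j instance.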

gemini-code-assist

Code Review

This pull request introduces a Cypher-pushdown optimization for entity filtering and adjacency fetching, delegating these operations to Neo4j to reduce memory consumption in the backend. It also implements a bounded queue with backpressure in the GraphMemoryUpdater to protect against OOM crashes during high-load simulations. The updates include new configuration options, storage interface extensions, and a suite of unit tests. Reviewer feedback suggests adopting f-strings for logging to maintain stylistic consistency with the rest of the project.

gemini-code-assist · 2026-04-23T15:00:48Z

+        logger.info(
+            "GraphMemoryUpdater initialized: graph_id=%s, batch_size=%s, "
+            "queue_max=%s, put_timeout=%ss",
+            graph_id,
+            self.BATCH_SIZE,
+            self._max_queue_size or "unbounded",
+            self._put_timeout,
+        )


To improve consistency of the code style, I would suggest using f-strings for logging here, as is already done in other parts of this file (the stop() method) and elsewhere in the project.

Using %-formatting is not wrong for logging, but mixing different styles can hurt readability and maintainability.

logger.info(
    f"GraphMemoryUpdater initialized: graph_id={graph_id}, batch_size={self.BATCH_SIZE}, "
    f"queue_max={self._max_queue_size or 'unbounded'}, put_timeout={self._put_timeout}s"
)

gemini-code-assist · 2026-04-23T15:00:48Z

+            logger.error(
+                "Activity queue full for graph %s (maxsize=%s) — dropping "
+                "event: %s by %s",
+                self.graph_id,
+                self._max_queue_size,
+                activity.action_type,
+                activity.agent_name,
+            )


Here too, %-formatting is used for logging while f-strings are used elsewhere. To keep the code consistent, I recommend switching this log call to f-strings as well.

logger.error(
    f"Activity queue full for graph {self.graph_id} (maxsize={self._max_queue_size}) — dropping "
    f"event: {activity.action_type} by {activity.agent_name}"
)
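As background for weighing the two styles, the following self-contained sketch compares them using only the standard logging module. The logger name and the Expensive helper class are made up for the demonstration and have nothing to do with the project code.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("graph_memory_updater_demo")  # hypothetical name
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)

graph_id, maxsize = "g-1", 10000

# Both styles render the same message when the record is emitted.
logger.info("queue full for graph %s (maxsize=%s)", graph_id, maxsize)
logger.info(f"queue full for graph {graph_id} (maxsize={maxsize})")
lines = stream.getvalue().splitlines()


class Expensive:
    """Counts how often it is converted to a string."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive"


# DEBUG is below the logger's level, so neither call emits a record.
logger.debug("value=%s", Expensive())  # arguments are never formatted
logger.debug(f"value={Expensive()}")   # f-string is built before the call
print(lines, Expensive.calls)
```

The output is identical for emitted records; the difference is that %-style arguments are only formatted when a record is actually handled, whereas an f-string is always evaluated before the logging call.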

arn0ld87 marked this pull request as ready for review April 23, 2026 14:59

arn0ld87 merged commit 9f0050e into main Apr 23, 2026
5 checks passed

gemini-code-assist Bot reviewed Apr 23, 2026

arn0ld87 merged 1 commit into main from claude/architecture-audit-refactor-9Vn6W
