Implement FinAcumen core architecture with FT and FM#56
Conversation
- Implemented `lookup_ohlc`, `NumericalReasoningEngine`, and `AnswerConsolidationGate` in `src/core/tools.py` - Created `FinancialMemory` in `src/core/memory.py` with sqlite and TfidfVectorizer - Created `AnnotatorAgent` and `SolverAgent` in `src/agents/` - Implemented `src/finacumen_main.py` asynchronous orchestrator - Added specific tests and verified linter via `ruff` Co-authored-by: laurentvv <8775515+laurentvv@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
- Updated `finacumen_main.py` with `argparse` for CLI support and JSON state caching. - Integrated `get_latest_finacumen_signal` in `main.py` to seamlessly pass cached data into the hybrid decision model. - Added `finacumen_decision` parameter to `EnhancedDecisionEngine`. - Updated `config_weights.py` to assign a baseline weight for FinAcumen. - Updated `README.md` to document the FinAcumen components, execution command, and architectural role. Co-authored-by: laurentvv <8775515+laurentvv@users.noreply.github.com>
| error = None | ||
| try: | ||
| # exec modifie le namespace en place | ||
| exec(code, self.namespace) |
There was a problem hiding this comment.
CRITICAL: Arbitrary code execution via LLM-controlled exec
NumericalReasoningEngine.execute() runs Python supplied by the LLM with a persistent namespace that still exposes Python builtins. An LLM can mutate namespace state or bypass the intended lookup_ohlc boundary, so this should be replaced with a sandboxed evaluator, AST whitelist, or subprocess isolation before production use.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| "model": self.model_name, | ||
| "prompt": current_prompt, | ||
| "stream": False, | ||
| "format": "json", # Relaxed json structure since it can be python_code OR final answer |
There was a problem hiding this comment.
CRITICAL: Loose Ollama JSON format violates schema invariant
This new production LLM call uses "format": "json" instead of a strict schema object with additionalProperties: false, while the project invariant requires every production call site to use schema-strict format: SCHEMA_*. The same call also strips thinking tokens at line 75, which conflicts with the documented dual-layer defence and can reintroduce Gemma thought-channel JSON debris.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
|
|
||
| raw_output = response.json().get("response", "").strip() | ||
| # Remove thinking tokens | ||
| clean_output = _strip_thinking_prefix(raw_output) |
There was a problem hiding this comment.
CRITICAL: Annotator LLM call bypasses schema and strips thinking output
This direct Ollama call has no strict format schema and removes thinking-prefix debris with _strip_thinking_prefix. The project invariant requires production LLM call sites to keep schema enforcement and the never add a 'thought' key prompt defence intact; this free-text call should either use a strict schema or be removed from the production path.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| output_dir.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| # Fichier d'état lu par main.py | ||
| state_file = output_dir / f"finacumen_{args.ticker}.json" |
There was a problem hiding this comment.
WARNING: Unsanitized ticker in cache filename
args.ticker is embedded directly into the output path. A ticker containing path separators, for example ../x, can write outside data_cache/finacumen; validate or sanitize the ticker before opening the cache file.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| import json | ||
| from datetime import datetime | ||
|
|
||
| state_file = Path("data_cache/finacumen") / f"finacumen_{ticker}.json" |
There was a problem hiding this comment.
WARNING: Unsanitized ticker in FinAcumen cache path
This helper also composes the cache filename from the raw ticker. If it is ever enabled with external ticker input, ../ could read JSON files outside data_cache/finacumen; validate or sanitize the ticker before opening the file.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| "oil_bench": oil_bench_decision, | ||
| "grebenkov": grebenkov_decision, | ||
| "hmm_model": hmm_decision, | ||
| "finacumen": finacumen_decision, |
There was a problem hiding this comment.
WARNING: Missing FinAcumen decisions become HOLD votes
When finacumen_decision is None, _convert_legacy_inputs() still creates a ModelDecision(signal="HOLD", confidence=0.0). If the FinAcumen cache or agent is unavailable, the ensemble silently gains a zero-confidence HOLD vote; failed or missing model outputs should be skipped or marked with failed=True and filtered.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| "tensortrade": 0.05, | ||
| "grebenkov": 0.05, # From enhanced_decision_engine.py | ||
| "hmm_model": 0.05, | ||
| "finacumen": 0.15, |
There was a problem hiding this comment.
WARNING: New base weight breaks existing alignment test
The new "finacumen": 0.15 entry is not reflected in tests/test_weight_alignment.py, whose expected model set and total still match the pre-PR weights. Update that test or the weight config before merge so the repository tests remain green.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| for entry in memories_data: | ||
| xml_output.append(" <Entry>") | ||
| xml_output.append(f" <Question>{entry['question']}</Question>") | ||
|
|
There was a problem hiding this comment.
WARNING: XML output is not escaped
Memory text is interpolated directly into XML. Questions, findings, or cautions containing &, <, or > will produce malformed XML and may confuse downstream prompts; escape text with xml.sax.saxutils.escape() before formatting.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
| with sqlite3.connect(self.db_path) as conn: | ||
| # We use an explicit parameter list for the query to prevent SQL injection issues | ||
| placeholders = ",".join("?" for _ in top_ids) | ||
| cursor = conn.execute( |
There was a problem hiding this comment.
SUGGESTION: Redundant SQL query
The first SELECT ... WHERE id IN (...) result is never used; the same query is executed again to build rows_dict. Remove the first query or reuse its rows to avoid unnecessary database work.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
|
|
||
| # 2. Construction du contexte | ||
| current_context = f"Analyze {symbol}. Details: {json.dumps(context_details)}" | ||
|
|
There was a problem hiding this comment.
WARNING: Context serialization can crash the pipeline
json.dumps(context_details) assumes the market context is JSON-serializable. Once real market data is integrated from pandas/numpy, this can raise TypeError and prevent the failure-state cache from being written. Use default=str or normalize values before serialization.
Reply with @kilocode-bot fix it to have Kilo Code address this issue.
Code Review SummaryStatus: 5 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)CRITICAL
WARNING
SUGGESTION
Files Reviewed (6 files)
Previous Review Summary (commit 5e9c1cc)Current summary above is authoritative. Previous snapshots are kept for context only. Previous review (commit 5e9c1cc)Status: 11 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)CRITICAL
WARNING
SUGGESTION
Files Reviewed (12 files)
Reviewed by laguna-m.1-20260312:free · Input: 557.4K · Output: 7.6K · Cached: 191.1K |
- Hardened `NumericalReasoningEngine` to use strict `__builtins__` for safe `exec` execution. - Enforced strict project schemas `SCHEMA_FINACUMEN_SOLVER` and `SCHEMA_FINACUMEN_ANNOTATOR` for Ollama JSON formatting. - Applied `re` regex sanitation to `ticker` inputs for file pathing to prevent path traversal vulnerability. - Omitted FinAcumen missing decision from `EnhancedDecisionEngine` instead of inserting silent HOLD bias. - Synced `test_weight_alignment.py` with the updated 0.15 weight configuration. - Used `xml.sax.saxutils.escape` to prevent XML malformation in Financial Memory exports. - Optimized Financial Memory retrieval DB query and fixed `TypeError` in JSON context serialization via `default=str`. Co-authored-by: laurentvv <8775515+laurentvv@users.noreply.github.com>
This PR implements the architecture outlined in the "FinAcumen" research paper within the Trading-AI project, giving the compact local LLM a persistent namespace via Financial Tools (FT) and a Financial Memory (FM).
Implementation Details:
Financial Tools (
src/core/tools.py):lookup_ohlc: Data retrieval interface directly tied to yfinance capabilities.NumericalReasoningEngine: Python state execution namespace allowing cascading mathematical operations without losing variables.AnswerConsolidationGate: Validation layer ensuring trajectory logic, correct signal formatting, and appropriate unit definitions.Financial Memory (
src/core/memory.py):sklearn.TfidfVectorizerto meet environment restrictions.question, gold_answer, findings, cautions.<Memory_Block>using a strict cosine similarity threshold (tau = 0.65).Agents (
src/agents/):AnnotatorAgent: Extracts applicable rules (MUST/DO NOT) mapped to current market contexts.SolverAgent: Implements the ReAct 16-step reasoning loop encapsulating<Think_Steps>and querying theNumericalReasoningEngine.Orchestrator (
src/finacumen_main.py):PR created automatically by Jules for task 9873678374876231965 started by @laurentvv