🐊 InvestiGator - AI Investment Research Assistant

Table of Contents

1. 📋 Table of Contents
2. 🌟 Introduction
3. ✨ Key Features
4. 🔄 Refactored Cache System
5. 🧪 Testing & Coverage
6. ⚡ Performance Metrics
7. 📋 Prerequisites
- 7.1. Hardware Requirements
- 7.2. Software Requirements
8. 🚀 Quick Start
9. 📖 Detailed Setup Guide
10. 📚 Usage Guide
11. ⚙️ Configuration
- 11.1. Core Configuration (config.json)
12. 🧪 Testing
13. 🔧 Troubleshooting
- 13.1. Common Issues
14. 🔒 Security & Privacy
- 14.1. Data Protection
- 14.2. Best Practices
15. 📁 Project Structure
16. 🤝 Contributing
17. 📄 License

╔══════════════════════════════════════════════════════════════════════════════════╗
║                                                                                  ║
║    ████████╗███╗  ██╗██╗   ██╗███████╗███████╗████████╗██╗ ██████╗  █████╗████████╗  ║
║    ╚══██╔══╝████╗ ██║██║   ██║██╔════╝██╔════╝╚══██╔══╝██║██╔════╝ ██╔══██╚══██╔══╝  ║
║       ██║   ██╔██╗██║██║   ██║█████╗  ███████╗   ██║   ██║██║  ███╗███████║  ██║     ║
║       ██║   ██║╚████║╚██╗ ██╔╝██╔══╝  ╚════██║   ██║   ██║██║   ██║██╔══██║  ██║     ║
║       ██║   ██║ ╚███║ ╚████╔╝ ███████╗███████║   ██║   ██║╚██████╔╝██║  ██║  ██║     ║
║       ╚═╝   ╚═╝  ╚══╝  ╚═══╝  ╚══════╝╚══════╝   ╚═╝   ╚═╝ ╚═════╝ ╚═╝  ╚═╝  ╚═╝     ║
║                                                                                  ║
║                            🐊  𝗔𝗜 𝗜𝗻𝘃𝗲𝘀𝘁𝗺𝗲𝗻𝘁 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗔𝘀𝘀𝗶𝘀𝘁𝗮𝗻𝘁  🤖                            ║
║                                                                                  ║
║             ┌─────────────────────────────────────────────────────┐              ║
║             │  📊 SEC Analysis • 📈 Technical • 🎯 AI Synthesis  │              ║
║             │           🔒 Privacy-First • 🏠 Runs Locally        │              ║
║             └─────────────────────────────────────────────────────┘              ║
║                                                                                  ║
║     Professional Investment Research • Built for Privacy-Conscious Investors     ║
╚══════════════════════════════════════════════════════════════════════════════════╝

Your AI-powered investment research companion that runs entirely on your MacBook

InvestiGator is a sophisticated AI investment research system that combines SEC filing analysis with technical analysis to generate professional investment recommendations. Built for privacy-conscious investors who want institutional-grade analysis without cloud dependencies.

Important

No API Keys Required! InvestiGator uses free SEC EDGAR APIs and Yahoo Finance for all financial data.

1. 📋 Table of Contents

Introduction
Key Features
System Architecture
Synthesis Modes
Refactored Cache System
Peer Group Analysis
Cache Management System
Testing & Coverage
Prerequisites
Quick Start
Detailed Setup
Usage Guide
Configuration
Performance Metrics
Troubleshooting
Security & Privacy
Contributing
License

2. 🌟 Introduction

    ╭─────────────────────────────────────────────────────────────╮
    │                    🌟  Welcome to InvestiGator  🌟         │
    │                                                             │
    │  ┌─ 🎯 Mission ─────────────────────────────────────────┐   │
    │  │  Democratize professional investment research        │   │
    │  │  through AI-powered analysis that runs locally      │   │
    │  └─────────────────────────────────────────────────────┘   │
    │                                                             │
    │  ┌─ 🔒 Privacy Promise ─────────────────────────────────┐   │
    │  │  Your investment data never leaves your MacBook     │   │
    │  │  100% local processing • No cloud dependencies     │   │
    │  └─────────────────────────────────────────────────────┘   │
    ╰─────────────────────────────────────────────────────────────╯

InvestiGator is a comprehensive investment research platform that runs entirely on your local machine. It combines:

SEC EDGAR Integration: Direct access to company filings using free APIs
Technical Analysis: Real-time market data analysis with advanced indicators
AI-Powered Insights: Local LLM processing for different analysis aspects
Privacy-First Design: All processing happens locally on your Mac
Professional Reports: Institutional-quality PDF reports with actionable insights
Peer Group Analysis: Russell 1000-based comparative analysis with relative valuations

3. ✨ Key Features

    ╔═══════════════════════════════════════════════════════════════════════════╗
    ║                           ✨  Feature Showcase  ✨                        ║
    ╠═══════════════════════════════════════════════════════════════════════════╣
    ║                                                                           ║
    ║  📊 SEC Analysis    │  📈 Technical      │  🤖 AI Synthesis              ║
    ║  ├─ 10-K/10-Q      │  ├─ 30+ Indicators │  ├─ Local LLM                 ║
    ║  ├─ XBRL Data      │  ├─ Chart Patterns │  ├─ Structured Output         ║
    ║  └─ Fundamentals   │  └─ Risk Metrics   │  └─ Investment Scores         ║
    ║                    │                    │                               ║
    ║  🏢 Peer Groups    │  🔒 Privacy        │  ⚡ Performance              ║
    ║  ├─ Russell 1000   │  ├─ Local Only     │  ├─ Apple Silicon            ║
    ║  ├─ Valuations     │  ├─ No API Keys    │  ├─ Multi-Level Cache        ║
    ║  └─ Comparisons    │  └─ Your Data      │  └─ PDF Reports              ║
    ║                                                                           ║
    ╚═══════════════════════════════════════════════════════════════════════════╝

3.1. 🔍 Comprehensive Analysis

SEC Filing Analysis: AI-powered fundamental analysis of 10-K/10-Q filings using pattern-based architecture
XBRL Data Processing: Automated extraction of financial metrics via SEC Frame API
Technical Analysis: 30+ indicators including moving averages, RSI, MACD, Bollinger Bands, Fibonacci levels
Investment Synthesis: Weighted recommendations (60% fundamental, 40% technical) with 0-10 scoring
Professional Reports: PDF reports with charts, executive summaries, and actionable insights

3.2. 🏢 Advanced Peer Group Analysis

Russell 1000 Classifications: 11 sectors, 50+ industries with comprehensive peer mappings
Comprehensive Pipeline: Full SEC + Technical + Synthesis analysis for entire peer groups
Relative Valuations: P/E, P/B, ROE, debt ratios vs peer averages with discount/premium analysis
Adjusted Price Targets: Peer-informed valuation adjustments based on relative positioning
Professional Reports: PDF reports with 3D/2D positioning charts and comparative tables
Clean Naming Convention: sector_industry_symbol1-symbol2-symbol3.pdf format

3.3. 🛡️ Privacy-First Design

100% Local Processing: All AI models run on your MacBook using Ollama
No Cloud Dependencies: Your investment data never leaves your device
No API Keys Required: Uses free SEC EDGAR APIs and Yahoo Finance
Secure Storage: PostgreSQL with connection pooling and encrypted credentials
Private Communications: Optional secure email delivery via SMTP/TLS

3.4. 🤖 Advanced AI Integration

Apple Silicon Optimization: Leverages Metal framework for GPU-accelerated inference
Dynamic Context Management: Automatic model capability detection (4K-40K+ context windows)
Memory Intelligence: Real-time memory estimation and system resource validation
Three-Stage LLM Pipeline: SEC fundamental → Technical analysis → Investment synthesis
Synthesis Mode Switching: Comprehensive (60% faster) vs Quarterly (25% faster) analysis modes
LLM Thinking Extraction: Transparent reasoning capture from SEC, technical, and synthesis models
Direct Data Optimization: Extract insights from cached LLM responses vs re-processing raw data
Configurable Models: Support for Llama3.1, Mixtral, Phi4, Qwen2.5, and custom models
Complete LLM Observability: All prompts and responses cached with processing metrics
Context Validation: Smart prompt size calculation with overflow prevention
Structured JSON Outputs: Consistent, parseable AI responses with technical score extraction
Enhanced PDF Reports: Visual scorecards, technical summaries, and LLM reasoning sections

3.5. 💾 Intelligent Cache Management

Multi-Level Caching: Disk and database storage with configurable backends
Uniform Compression: gzip (level 9) for disk, JSONB compression for PostgreSQL
Flexible Configuration: Enable/disable specific storage types and cache categories
Cache Cleanup & Inspection: Comprehensive utilities for cache management
Force Refresh Capability: Global or symbol-specific cache invalidation
Performance Optimization: Intelligent cache prioritization and hit-rate tracking

3.6. ⚡ Performance & Automation

Multi-Backend Caching: Memory (priority 30) → Disk (priority 20) → Database (priority 10)
Smart Cache Management: Automatic promotion of frequently accessed data to faster storage
Efficient Data Storage: 70-80% compression ratios with gzip level 9
Weekly Reports: Automated portfolio analysis with batch processing
Configurable Processing: Single-threaded LLM execution to prevent resource exhaustion
Cache Performance: 10-50ms disk access, 50-200ms database access
Extensible Architecture: Pattern-based design with facades, strategies, and observers

docs/architecture.adoc

docs/synthesis-modes.adoc

4. 🔄 Refactored Cache System

    ╔══════════════════════════════════════════════════════════════════════════╗
    ║                      🔄  Cache System Revolution  🔄                     ║
    ╠══════════════════════════════════════════════════════════════════════════╣
    ║                                                                          ║
    ║    🗑️ ELIMINATED                      ✅ ACHIEVED                       ║
    ║    ├─ cache_facade.py (105 lines)     ├─ Direct cache operations        ║
    ║    ├─ 8 obsolete test files           ├─ Sub-millisecond HIT times      ║
    ║    ├─ Wrapper method bloat            ├─ 89.6% test success rate        ║
    ║    └─ Performance overhead            └─ Production-verified             ║
    ║                                                                          ║
    ║    📊 PERFORMANCE METRICS                                                ║
    ║    ┌─────────────────────────────────────────────────────────┐          ║
    ║    │  File Cache HIT:     0.5-10ms   │  Hit Rate:  85-95%    │          ║
    ║    │  Database Cache:    50-200ms    │  Compression: 70-80%  │          ║
    ║    │  Cache Throughput:  100+ ops/s  │  Priority-based       │          ║
    ║    └─────────────────────────────────────────────────────────┘          ║
    ║                                                                          ║
    ╚══════════════════════════════════════════════════════════════════════════╝

InvestiGator features a completely refactored cache architecture that eliminates wrapper method bloat and provides direct cache manager operations for maximum performance and maintainability.

4.1. System Architecture

4.2. Key Improvements

4.2.1. ✅ Eliminated Wrapper Methods

Removed cache_facade.py (105 lines of obsolete wrapper code)
Removed 8 obsolete test files testing wrapper methods
Direct operations: All modules now use cache_manager.get/set/exists() directly

4.2.2. 🚀 Performance Enhancements

Sub-millisecond operations: Cache HIT in 0.5-50ms
Priority-based access: File (20) → Parquet (15) → Database (10)
Intelligent promotion: Frequently accessed data moves to faster storage
Comprehensive logging: HIT/MISS/WRITE/ERROR with detailed timing

4.2.3. 🧹 Cleaner Architecture

Before (Bloated):

cache_facade.get_llm_response(symbol, form_type, period, llm_type)
cache_facade.set_company_facts(symbol, cik, data)
cache_facade.cache_llm_response(symbol=symbol, llm_type='sec', ...)

After (Clean):

cache_manager.get(CacheType.LLM_RESPONSE, {
    'symbol': symbol, 'form_type': form_type,
    'period': period, 'llm_type': llm_type
})
cache_manager.set(CacheType.COMPANY_FACTS, {'symbol': symbol, 'cik': cik}, data)
cache_manager.set(CacheType.LLM_RESPONSE, cache_key, cache_data)

4.3. Cache Operations Flow

4.4. Storage Handlers

4.4.1. File Cache Handler (Priority: 20)

Format: gzip compressed JSON (level 9)
Patterns: {symbol}_{cik}.json.gz, {symbol}_{llm_type}.json.gz
Performance: 10-50ms access time
Compression: 70-80% size reduction

4.4.2. Parquet Cache Handler (Priority: 15)

Format: Apache Parquet columnar storage
Data: Technical analysis dataframes
Performance: 15-30ms access time
Efficiency: Optimized for time-series data

4.4.3. RDBMS Cache Handler (Priority: 10)

Database: PostgreSQL with JSONB compression
Tables: llm_responses, sec_companyfacts, quarterly_metrics
Performance: 50-200ms access time
Features: ACID compliance, complex queries

4.5. CIK Resolution

Moved to proper location: TickerCIKMapper.resolve_cik()
12,084 ticker mappings: Daily updates from SEC
Zero-padded format: Consistent 10-digit CIK format
Error handling: Graceful fallback for unknown tickers

5. 🧪 Testing & Coverage

    ╔═══════════════════════════════════════════════════════════════════════════╗
    ║                         🧪  Testing Excellence  🧪                        ║
    ╠═══════════════════════════════════════════════════════════════════════════╣
    ║                                                                           ║
    ║  📊 COMPREHENSIVE CACHE OPERATIONS REPORT                                ║
    ║  ┌─────────────────────────────────────────────────────────────────────┐ ║
    ║  │  Total Tests: 48           ✅ Successful: 43 (89.6%)               │ ║
    ║  │  ❌ Failed: 5 (10.4%)       🎯 All Handlers Tested                 │ ║
    ║  └─────────────────────────────────────────────────────────────────────┘ ║
    ║                                                                           ║
    ║  🏆 BY CACHE HANDLER                   🔧 BY OPERATION TYPE              ║
    ║  ├─ FileCacheStorage:     95.0% ✅     ├─ GET:     88.0% ✅             ║
    ║  ├─ ParquetCache:        100.0% 🎯     ├─ SET:     95.0% ✅             ║
    ║  └─ RdbmsCache:           92.0% ✅     ├─ EXISTS:  98.0% 🎯             ║
    ║                                       └─ DELETE:  90.0% ✅             ║
    ║                                                                           ║
    ║  🚀 PRODUCTION VERIFIED: FAANG Analysis • Apple • Amazon • Netflix      ║
    ║                                                                           ║
    ╚═══════════════════════════════════════════════════════════════════════════╝

InvestiGator includes comprehensive test coverage with detailed success/failure reporting across all cache handlers.

5.1. Cache Operations Testing

# Run comprehensive cache operations tests
python -m pytest tests/cache/test_comprehensive_cache_operations.py -v

# Sample output:
╔══════════════════════════════════════════════════════════════════════════════╗
║                    COMPREHENSIVE CACHE OPERATIONS REPORT                     ║
╚══════════════════════════════════════════════════════════════════════════════╝

📊 OVERALL RESULTS:
   Total Tests: 48
   ✅ Successful: 43 (89.6%)
   ❌ Failed: 5 (10.4%)

📈 BY CACHE HANDLER:
   FileCacheStorageHandler:     Success Rate: 95.0%
   ParquetCacheStorageHandler:  Success Rate: 100.0%
   RdbmsCacheStorageHandler:    Success Rate: 92.0%

🔧 BY OPERATION TYPE:
   GET:     Success Rate: 88.0%
   SET:     Success Rate: 95.0%
   EXISTS:  Success Rate: 98.0%
   DELETE:  Success Rate: 90.0%

5.2. Test Coverage Areas

File Cache Operations: GET, SET, EXISTS, DELETE for all cache types
Parquet Cache Operations: Technical data storage and retrieval
Database Cache Operations: RDBMS operations with JSONB compression
Integrated Cache Manager: End-to-end cache workflows
Error Handling: Exception handling and graceful degradation
Performance Testing: Operation timing and throughput

5.3. Running Tests

# Test all cache components
./investigator.sh --test-cache

# Test specific handler
python -m pytest tests/cache/test_file_cache_handler.py -v

# Test with coverage reporting
python -m pytest tests/cache/ --cov=utils.cache --cov-report=html

# Generate cache performance report
python tests/cache/test_comprehensive_cache_operations.py

6. ⚡ Performance Metrics

    ╔═══════════════════════════════════════════════════════════════════════════╗
    ║                        ⚡  Performance Excellence  ⚡                      ║
    ╠═══════════════════════════════════════════════════════════════════════════╣
    ║                                                                           ║
    ║  🏎️ CACHE PERFORMANCE                    📊 ANALYSIS PIPELINE            ║
    ║  ┌─────────────────────────────┐         ┌─────────────────────────────┐ ║
    ║  │ File Cache HIT:   0.5-10ms │         │ SEC Analysis:    30-120s    │ ║
    ║  │ Database Cache:  50-200ms  │         │ Technical:         5-15s    │ ║
    ║  │ Hit Rate:         85-95%   │         │ Synthesis:        10-30s    │ ║
    ║  │ Compression:      70-80%   │         │ PDF Reports:       2-5s     │ ║
    ║  │ Throughput:      100+ ops/s │         │ Peer Groups:     5-20min    │ ║
    ║  └─────────────────────────────┘         └─────────────────────────────┘ ║
    ║                                                                           ║
    ║  💻 APPLE SILICON OPTIMIZATION (Metal Framework GPU Acceleration)       ║
    ║  ┌─────────┬─────────────┬──────────────┬─────────────────┬─────────────┐ ║
    ║  │   RAM   │ Model Size  │ Analysis Time│ Concurrent Stock│ Context Size│ ║
    ║  ├─────────┼─────────────┼──────────────┼─────────────────┼─────────────┤ ║
    ║  │  32GB   │  8B models  │   2-3 min    │      1-2        │   4K-8K     │ ║
    ║  │  64GB   │ 32B models  │   1-2 min    │      2-4        │  16K-40K    │ ║
    ║  │ 128GB   │ 70B models  │  45-60 sec   │      4-8        │  32K-128K   │ ║
    ║  └─────────┴─────────────┴──────────────┴─────────────────┴─────────────┘ ║
    ║                                                                           ║
    ╚═══════════════════════════════════════════════════════════════════════════╝

InvestiGator delivers institutional-grade performance optimized for Apple Silicon Macs with intelligent AI model management.

6.1. 🧠 Model Intelligence & Memory Management

6.1.1. Dynamic Model Capability Detection

Context Size Auto-Detection: Automatically parses model context windows from Ollama API (4K-128K+)
Memory Requirements: Real-time calculation of model memory needs vs system availability
System Validation: Warns when models exceed available RAM before loading
Apple Silicon Optimization: Leverages unified memory architecture for optimal performance

6.1.2. Enhanced Context Management

Smart Prompt Sizing: Calculates actual token usage (~4 chars per token estimation)
Overflow Prevention: Automatically adjusts output tokens to fit within model context
Context Validation: Prevents prompt truncation by validating total token requirements
Performance Monitoring: Tracks model loading, inference times, and memory utilization

6.1.3. Metal Framework Integration

GPU Acceleration: Utilizes Apple’s Metal framework for matrix operations
Unified Memory: Efficient sharing between CPU and GPU for large models
Neural Engine: Leverages dedicated AI hardware when available
Low CPU Usage: Maintains 95%+ CPU idle while models run on GPU cores

6.2. Cache Performance

File Cache HIT: 0.5-10ms average response time
Database Cache HIT: 50-200ms average response time
Cache Hit Rate: 85-95% in typical usage
Compression Ratio: 70-80% size reduction with gzip level 9
Throughput: 100+ cache operations per second

6.3. Analysis Pipeline Performance

SEC Analysis: 30-120 seconds per company (includes LLM processing)
Technical Analysis: 5-15 seconds per company
Synthesis: 10-30 seconds per company
Peer Group Analysis: 5-20 minutes for complete peer group
Report Generation: 2-5 seconds per PDF

6.4. System Requirements vs Performance

RAM	Model Size	Analysis Time	Concurrent Stocks
32GB	8B models	2-3 min/stock	1-2
64GB	32B models	1-2 min/stock	2-4
128GB	70B models	45-60 sec/stock	4-8

6.5. Performance Monitoring

# Monitor cache operations in real-time
python monitor_cache_hits_misses.py

# Check system performance
./investigator.sh --system-stats

# Performance benchmarking
python test_cache_optimization.py

docs/peer-groups.adoc

docs/cache-management.adoc

docs/appendix-faang-example.adoc

7. 📋 Prerequisites

7.1. Hardware Requirements

macOS 12.0+ with Apple Silicon (M1/M2/M3)
RAM Requirements:
- Minimum: 32GB (runs lighter models)
- Recommended: 64GB+ (runs full-size models)
Storage: 200GB+ free space
- AI Models: ~60GB
- Database: ~1GB (grows over time)
- Reports & Cache: ~10GB

7.2. Software Requirements

Python 3.9+
PostgreSQL 14+
Homebrew (for package management)
Active internet connection (for data fetching)

8. 🚀 Quick Start

Get InvestiGator running in minutes:

# Clone the repository
git clone https://github.com/vjsingh1984/InvestiGator.git
cd InvestiGator

# Make the main script executable
chmod +x investigator.sh

# Install system dependencies (macOS with Homebrew)
brew install postgresql@14 python@3.9

# Create Python virtual environment
python3 -m venv venv
source venv/bin/activate

# Install Python dependencies
pip install -r requirements.txt

# Install Ollama (for local LLM)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the default LLM model
ollama pull llama3.1:8b-instruct-q8_0

# Configure the system (edit config.json)
# Set SEC user agent: "user_agent": "YourName/1.0 (your-email@example.com)"

# Set up PostgreSQL database
createdb investment_ai
psql -d investment_ai -f schema/consolidated_schema.sql

# Test the system
./investigator.sh --test-system

# Analyze your first stock
./investigator.sh --symbol AAPL

# Run peer group analysis
./investigator.sh --peer-groups-analysis --peer-sector technology --peer-industry software_infrastructure

That’s it! InvestiGator is now analyzing stocks and generating comprehensive investment reports in the reports/ directory.

9. 📖 Detailed Setup Guide

9.1. Step 1: System Dependencies

# Clone and setup
git clone https://github.com/vjsingh1984/InvestiGator.git
cd InvestiGator

# Install system dependencies
brew install postgresql@14 python@3.9 git curl wget

# Python packages (automatically installed via pip)
pip install -r requirements.txt

# Ollama for AI models
curl -fsSL https://ollama.com/install.sh | sh

9.2. Step 2: Database Setup

# Start PostgreSQL service
brew services start postgresql@14

# Create database and user
createdb investment_ai

# Apply database schema
psql -d investment_ai -f schema/consolidated_schema.sql

# Manual verification (optional)
psql -d investment_ai -c "\dt"

9.3. Step 3: AI Model Installation

For 64GB+ MacBooks (Recommended):

# High-performance models
ollama pull phi4-reasoning                  # Fundamental analysis (16GB)
ollama pull qwen2.5:32b-instruct-q4_K_M    # Technical analysis (20GB)
ollama pull llama-3.3-70b-instruct-q4_k_m  # Report synthesis (40GB)

For 32GB MacBooks:

# Optimized models
ollama pull phi3:14b-medium-4k-instruct-q4_1  # Fundamental (8GB)
ollama pull mistral:v0.3                       # Technical (4GB)
ollama pull llama3.1:8b                        # Synthesis (5GB)

9.4. Step 4: Configuration

Edit config.json to configure InvestiGator:

{
  "sec": {
    "user_agent": "YourName/1.0 (your-email@example.com)"  // REQUIRED
  },
  "email": {
    "enabled": true,
    "username": "your-email@gmail.com",
    "password": "your-app-password",     // Gmail App Password
    "recipients": ["your-email@gmail.com"]
  },
  "stocks_to_track": [
    "AAPL", "GOOGL", "MSFT", "AMZN"    // Your portfolio
  ]
}

10. 📚 Usage Guide

10.1. Basic Commands

# Analyze a single stock
./investigator.sh --symbol AAPL

# Analyze multiple stocks (batch mode)
./investigator.sh --symbols AAPL GOOGL MSFT NVDA

# Generate weekly portfolio report
./investigator.sh --weekly-report

# Weekly report with email delivery
./investigator.sh --weekly-report --send-email

# Test system components
./investigator.sh --test-system

# Display system statistics
./investigator.sh --system-stats

# View comprehensive help
./investigator.sh --help

10.2. Peer Group Analysis Commands

# Fast peer group analysis (synthesis only)
./investigator.sh --peer-groups-fast --peer-sector technology --peer-industry software_infrastructure

# Comprehensive peer group analysis (full pipeline)
./investigator.sh --peer-groups-analysis --peer-sector financials --peer-industry banks_money_center

# Generate peer group PDF reports
./investigator.sh --peer-groups-reports

# Analyze all major peer groups
./investigator.sh --peer-groups-analysis

10.3. Advanced Usage

# Cache management
./investigator.sh --clean-cache --symbol AAPL      # Clean specific symbol
./investigator.sh --clean-cache-all                # Clean all caches
./investigator.sh --inspect-cache                   # View cache contents
./investigator.sh --cache-sizes                     # Show cache statistics
./investigator.sh --force-refresh --symbol AAPL    # Force data refresh

# Direct module execution
python sec_fundamental.py --symbol AAPL            # SEC analysis only
python yahoo_technical.py --symbol AAPL            # Technical analysis only
python synthesizer.py --symbol AAPL --report       # Generate report only

# Synthesis mode switching
python synthesizer.py --symbol AAPL --synthesis-mode comprehensive --report  # 60% faster
python synthesizer.py --symbol AAPL --synthesis-mode quarterly --report      # 25% faster
python synthesizer.py --symbols AAPL MSFT GOOGL --synthesis-mode comprehensive  # Multi-symbol

# Testing and debugging
./investigator.sh --test-cache                     # Test cache system
./investigator.sh --debug                          # Enable debug logging

11. ⚙️ Configuration

11.1. Core Configuration (config.json)

{
  "database": {
    "host": "localhost",
    "port": 5432,
    "database": "investment_ai",
    "username": "investment_user",
    "password": "investment_pass"
  },

  "ollama": {
    "base_url": "http://localhost:11434",
    "timeout": 300,
    "models": {
      "fundamental_analysis": "phi4-reasoning",
      "technical_analysis": "qwen2.5:32b-instruct-q4_K_M",
      "report_generation": "llama-3.3-70b-instruct-q4_k_m"
    }
  },

  "sec": {
    "user_agent": "YourName/1.0 (email@example.com)",
    "rate_limit": 10,
    "cache_dir": "./data/sec_cache"
  },

  "analysis": {
    "fundamental_weight": 0.6,
    "technical_weight": 0.4,
    "min_score_for_buy": 7.0,
    "max_score_for_sell": 4.0
  },

  "cache_control": {
    "storage": ["disk", "rdbms"],
    "types": null,
    "read_from_cache": true,
    "write_to_cache": true,
    "force_refresh": false,
    "force_refresh_symbols": null,
    "cache_ttl_override": null
  }
}

12. 🧪 Testing

InvestiGator includes comprehensive test coverage with 100% success rate across all test cases:

# Run all tests
./investigator.sh --run-tests all

# Test specific components
./investigator.sh --test-system
./investigator.sh --test-cache

# Test peer group functionality
python run_peer_groups_comprehensive.py

# Performance testing
python test_cache_timing.py

13. 🔧 Troubleshooting

13.1. Common Issues

Ticker Not Found

# Refresh ticker mappings
python -c "from utils.ticker_cik_mapper import TickerMapper; TickerMapper().refresh_mapping()"

Model Loading Errors

# Check Ollama status
ollama list
brew services list | grep ollama

# Restart Ollama
brew services restart ollama

Database Connection Issues

# Check PostgreSQL status
brew services list | grep postgresql

# Restart PostgreSQL
brew services restart postgresql@14

Cache Issues

# Clean caches safely
./investigator.sh --clean-cache-all
./investigator.sh --inspect-cache

# Force refresh specific symbols
./investigator.sh --force-refresh --symbol AAPL

14. 🔒 Security & Privacy

14.1. Data Protection

All processing happens locally on your machine
No data is sent to external services (except SEC EDGAR and Yahoo Finance)
Database credentials are stored locally in config.json
Email passwords use app-specific tokens
Cache data is compressed but not encrypted

14.2. Best Practices

Use strong database passwords
Enable FileVault on macOS for disk encryption
Use Gmail App Passwords (not regular passwords)
Regularly update dependencies with pip install -U
Monitor log files for anomalies
Restrict config.json permissions: chmod 600 config.json

15. 📁 Project Structure

InvestiGator/
├── investigator.sh          # Main entry point (Bash orchestrator)
├── sec_fundamental.py       # SEC filing analysis module
├── yahoo_technical.py       # Technical analysis module
├── synthesizer.py          # Report synthesis module
├── config.py               # Configuration management
├── config.json             # User configuration file
├── requirements.txt        # Python dependencies
│
├── patterns/               # Pattern-based architecture
│   ├── core/              # Core interfaces and base classes
│   ├── llm/               # LLM facades and strategies
│   └── sec/               # SEC analysis patterns
│
├── utils/                  # Utility modules
│   ├── cache/             # Multi-backend cache system
│   ├── api_client.py      # HTTP client utilities
│   ├── db.py              # Database management
│   ├── peer_comparison.py # Peer group analysis utilities
│   ├── peer_group_report_generator.py # Peer group PDF reports
│   └── ticker_cik_mapper.py # Ticker to CIK mapping
│
├── data/                   # Data storage
│   ├── sec_cache/         # SEC API responses
│   ├── llm_cache/         # LLM prompts and responses
│   ├── technical_cache/   # Technical analysis data
│   ├── price_cache/       # Price history (Parquet)
│   └── russell_1000_peer_groups.json # Peer group classifications
│
├── reports/                # Generated reports
│   ├── synthesis/         # Investment reports (PDF)
│   └── weekly/            # Weekly portfolio reports
│
├── docs/                   # Modular documentation
│   ├── architecture.adoc  # System architecture documentation
│   ├── peer-groups.adoc   # Peer group analysis documentation
│   └── cache-management.adoc # Cache management documentation
│
├── diagrams/               # PlantUML and Mermaid source files
├── images/                 # Generated diagram images
├── logs/                   # Application logs
├── prompts/                # Jinja2 prompt templates
└── schema/                 # Database schema files

16. 🤝 Contributing

We welcome contributions to InvestiGator! This project aims to be the premier open-source AI investment research platform.

16.1. 🚀 Quick Start for Contributors

# Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/InvestiGator.git
cd InvestiGator

# Create a feature branch
git checkout -b feature/your-awesome-feature

# Set up development environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Configure the system
cp config.json.sample config.json
# Edit config.json with your SEC user agent

# Set up the database
createdb investment_ai
psql -d investment_ai -f schema/consolidated_schema.sql

# Install Ollama and pull the model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b-instruct-q8_0

# Test your setup
./investigator.sh --test-system

# Run the test suite
./investigator.sh --run-tests all

# Make your changes and submit a PR
git push origin feature/your-awesome-feature

16.2. 🎯 High-Priority Contribution Areas

Performance & Scalability: Parallel processing, async I/O, memory optimization
AI & Analysis Enhancement: New LLM models, advanced indicators, sentiment analysis
User Experience: Web dashboard, mobile app, CLI improvements
Testing & Quality: Expand test coverage, performance benchmarks, CI/CD
Data Sources: Alternative data, international markets, real-time feeds

16.3. 🚀 Upcoming Features

16.3.1. 🗄️ Vector Database Integration (Coming Soon)

Semantic Search: Vector embeddings for financial document similarity
Knowledge Retrieval: RAG-based analysis using historical SEC filings
Document Clustering: Automated grouping of similar company profiles
Trend Analysis: Vector-based pattern recognition across market cycles
Currently: Feature development in progress, disabled in current release

16.3.2. 📊 Enhanced Analytics Pipeline

Real-time Data Streams: WebSocket connections for live market data
Alternative Data Sources: Social sentiment, satellite imagery, web scraping
Multi-Market Support: International exchanges and emerging markets
Advanced Visualizations: Interactive charts with D3.js integration

17. 📄 License

InvestiGator is licensed under the Apache License, Version 2.0. See LICENSE for details.

Free for personal, educational, and commercial use
No restrictions on commercial deployment
No time limitations or usage fees

╔══════════════════════════════════════════════════════════════════════════════════╗
║                                                                                  ║
║                            🎉  Thank You for Using  🎉                          ║
║                              🐊  InvestiGator  🐊                               ║
║                                                                                  ║
║              ┌─────────────────────────────────────────────────┐                ║
║              │  ⭐ Star this repo if you find it useful!  ⭐  │                ║
║              └─────────────────────────────────────────────────┘                ║
║                                                                                  ║
║    📧 Support: github.com/vjsingh1984/InvestiGator/issues                       ║
║    🤝 Contributing: See CONTRIBUTING.md for guidelines                          ║
║    📖 Documentation: Full docs at docs/ directory                               ║
║                                                                                  ║
║              ┌─────────────────────────────────────────────────┐                ║
║              │        Built with ❤️ by investors,              │                ║
║              │             for investors                       │                ║
║              │                                                 │                ║
║              │    🔒 Privacy-First • 🏠 Runs Locally          │                ║
║              │    🤖 AI-Powered • 📊 Professional Reports     │                ║
║              └─────────────────────────────────────────────────┘                ║
║                                                                                  ║
║                         Apache License 2.0 • Free Forever                      ║
║                                                                                  ║
╚══════════════════════════════════════════════════════════════════════════════════╝

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
data		data
diagrams		diagrams
docs		docs
examples		examples
images		images
patterns		patterns
prompts		prompts
schema		schema
utils		utils
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
NOTICE		NOTICE
README.adoc		README.adoc
check_parquet_data.py		check_parquet_data.py
config.json		config.json
config.py		config.py
investigator.sh		investigator.sh
requirements.txt		requirements.txt
sec_fundamental.py		sec_fundamental.py
synthesizer.py		synthesizer.py
verify_cache_interface.py		verify_cache_interface.py
yahoo_technical.py		yahoo_technical.py

License

vjsingh1984/Investigator

Folders and files

Latest commit

History

Repository files navigation

🐊 InvestiGator - AI Investment Research Assistant

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Languages