An intelligent log analysis tool that uses LangChain and a dual-database system (MySQL + ChromaDB) to provide conversational insights into Windows Event Logs.
- Conversational Interface: Interact with your logs through an intuitive chatbot powered by LangGraph.
- Automated Log Ingestion: A PowerShell script automatically extracts new Windows Application error logs.
- Incremental Processing: Remembers the last log processed to prevent duplicate entries.
- Dual Database Storage:
- MySQL: Stores structured log data for precise, query-based lookups.
- ChromaDB: Stores vector embeddings of log messages for powerful semantic search.
- Intelligent AI Analysis: Leverages LLMs to analyze error frequency, explain complex error messages, and suggest solutions.
- Live System Probing: Can run validated PowerShell commands to fetch real-time system information.
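The "validated PowerShell commands" idea can be sketched as a whitelist check before execution. This is a minimal sketch; the allowed command names below are illustrative assumptions, not the project's actual list (which lives in probeSystem.py):

```python
# Sketch of command whitelisting for live system probing.
import shlex
import subprocess

# Assumed whitelist for illustration only.
ALLOWED_COMMANDS = {"Get-Process", "Get-Service", "Get-ComputerInfo"}

def probe_system(command: str) -> str:
    """Run a PowerShell command only if its first token is whitelisted."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"Command not whitelisted: {command!r}")
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout
```

Validating against a fixed set of command names (rather than sanitizing arbitrary input) keeps the attack surface small when an LLM is the one composing the command.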
The application operates in a clear, multi-stage process to turn raw event logs into actionable insights.
- Log Ingestion (process_logs.py):
  - A background script executes a PowerShell command to fetch new Windows Application error logs since the last run, identified by a bookmark (last_recordedId.txt).
  - The raw logs are parsed from JSON into a structured format.
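The bookmark step can be sketched as follows. This is a minimal illustration, assuming last_recordedId.txt holds a single integer RecordId; the project's actual read/write logic lives in process_logs.py and its setup tools.

```python
# Sketch of the incremental-processing bookmark (assumed format:
# a single integer RecordId on one line).
from pathlib import Path

BOOKMARK = Path("TextFiles/last_recordedId.txt")

def read_bookmark(path: Path = BOOKMARK) -> int:
    """Return the last processed RecordId, or 0 if no bookmark exists yet."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0

def write_bookmark(record_id: int, path: Path = BOOKMARK) -> None:
    """Advance the bookmark after a successful ingestion run."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(str(record_id))
```

On the first run read_bookmark returns 0, so the PowerShell extractor fetches everything; subsequent runs only fetch records newer than the stored RecordId.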
- Data Persistence (Dual DB):
  - SQL (MySQL): Structured data like RecordId, EventID, Source, and TimeCreated is inserted into the application_errors table. This is ideal for exact lookups (e.g., "Get error with RecordId 12345").
  - Vector (ChromaDB): A descriptive sentence for each log is generated and stored as a vector embedding. This enables semantic search (e.g., "Find errors related to network failures").
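The descriptive sentence embedded into ChromaDB could be built along these lines. The field names mirror the MySQL columns above, but the exact wording is an assumption, not the project's actual template:

```python
# Sketch: turn one parsed log entry into the descriptive sentence
# that gets embedded into the ChromaDB vector store.
def log_to_sentence(log: dict) -> str:
    return (
        f"Event {log['EventID']} from source {log['Source']} "
        f"at {log['TimeCreated']} (RecordId {log['RecordId']}): "
        f"{log.get('Message', 'no message')}"
    )
```

Storing a natural-language sentence rather than raw fields is what lets similarity search match queries like "network failures" against logs that never use those exact words.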
- User Interaction (Flask App):
  - The user interacts with a web-based chatbot served by a Flask application.
- AI Orchestration (LangGraph):
  - User queries are routed by a central LangGraph agent, which decides the best tool for the job.
- Tool Execution:
  - query_sql_database: Used for specific, filtered queries against the MySQL database.
  - query_chroma: Used for broad, semantic, or similarity-based searches against the ChromaDB vector store.
  - probe_system: Used to execute safe, whitelisted PowerShell commands to get live system data.
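The routing decision above is made by the LLM inside the LangGraph agent; as a plain-Python illustration of which query shapes map to which tool, a keyword-based stand-in might look like this (the keywords are illustrative assumptions, not the agent's actual logic):

```python
# Illustration only: the real agent lets the LLM choose the tool.
def route_query(query: str) -> str:
    q = query.lower()
    if "recordid" in q or "eventid" in q:
        return "query_sql_database"   # exact, filtered lookup
    if any(w in q for w in ("cpu", "memory", "disk", "running")):
        return "probe_system"         # live system data
    return "query_chroma"             # semantic / similarity search
```

For example, "Get error with RecordId 12345" goes to the SQL tool, while "Find errors related to network failures" falls through to semantic search.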
- Response Generation:
  - The results from the tools are synthesized by the LLM into a coherent, human-readable answer.
.
├── ProgramFiles/
│ ├── powershell/
│ │ └── extract_logs.ps1 # PowerShell script to extract logs from Windows Event Viewer.
│ └── python/
│ ├── dependency/
│ │ ├── __init__.py
│ │ ├── Agents/
│ │ │ ├── chatbot.py # Main LangGraph agent orchestrator.
│ │ │ └── ... # Other specialized agents.
│ │ │
│ │ ├── AdditionalTools/
│ │ │ ├── tools/
│ │ │ │ ├── queryDBase.py # Tool for querying the structured MySQL database.
│ │ │ │ ├── queryChroma.py # Tool for semantic search in the ChromaDB vector store.
│ │ │ │ ├── frequencyTool.py # Tool for finding frequency using timestamps.
│ │ │ │ ├── result.py # Tool for analysing an event and providing a solution.
│ │ │ │ └── probeSystem.py # Tool for executing safe system commands.
│ │ │ │
│ │ │ ├── chatbotTools.py # Collection of tools for the agents.
│ │ │ ├── literals.py # Prompts for agents.
│ │ │ └── sqlConnection.py # Reusable class for MySQL database connections.
│ │ │
│ │ └── initialSetups/
│ │ ├── setupTools/ # Tools used by process_logs.py
│ │ │ ├── ...
│ │ ├── createDatabase.py # One-time script to create the MySQL database and table.
│ │ ├── initialise.py # Creates databases and populates them with data.
│ │ ├── runInitialiser.py # Script to run initialise.py once a day when main.py is run.
│ │ └── process_logs.py # Script to batch-ingest logs into both MySQL and ChromaDB.
│ │
│ └── main.py # Flask app entry point to run the web interface.
│
├── static/ # CSS and JS for the web interface.
├── templates/
│ └── chat.html # HTML template for the chatbot UI.
├── TextFiles/
│ └── last_recordedId.txt # Stores the last processed RecordId to avoid duplicates.
├── .env # Environment variables (API keys, DB credentials).
├── chromaDB/ # Directory for the persistent ChromaDB vector store.
└── README.md # You're here!
- Python 3.9+
- PowerShell (for running on Windows)
- An active MySQL server instance
- An Azure OpenAI resource
Create a .env file in the project root and populate it with your credentials:
AZURE_OPENAI_API_KEY="your-api-key"
APP_BOOKMARK="your-Appbookmark"
SYS_BOOKMARK="your-Sysbookmark"
AZURE_DEPLOYMENT_NAME="your-deployment-name"
AZURE_RESOURCE_NAME="your-resource-name"
AZURE_API_VERSION="2024-02-15-preview"
MYSQL_USER="your-mysql-username"
MYSQL_PASSWORD="your-mysql-password"
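The app presumably loads these via python-dotenv; a minimal fail-fast check that all of them are present (variable list mirrors the .env above) might look like:

```python
# Sketch: verify required configuration before starting the app.
import os

REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY", "APP_BOOKMARK", "SYS_BOOKMARK",
    "AZURE_DEPLOYMENT_NAME", "AZURE_RESOURCE_NAME", "AZURE_API_VERSION",
    "MYSQL_USER", "MYSQL_PASSWORD",
]

def check_env(env=os.environ):
    """Raise early if any required variable is missing or empty."""
    missing = [v for v in REQUIRED_VARS if not env.get(v)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Failing at startup with a clear message beats a cryptic connection error deep inside a tool call.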
Install the required Python packages from your requirements.txt file.
pip install -r requirements.txt

Your requirements.txt should include:
Flask
langchain
langgraph
langchain-openai
python-dotenv
mysql-connector-python
chromadb
markdown2
typing_extensions
Start the Flask server to launch the chatbot interface. It automatically creates the databases using the files in initialSetups.
python ProgramFiles/python/main.py

Navigate to http://127.0.0.1:5000 in your web browser to start chatting with your logs.
- Implement user authentication and authorization.
- Add UI controls for filtering logs by date, severity, or source.
- Support other log types (e.g., System, Security).
- Archive old logs to optimize performance.
This project is intended for internal use and educational purposes. If you plan to deploy it publicly, ensure you comply with the usage policies of Microsoft Azure and OpenAI.