LogAI is a hybrid log classification system that combines traditional techniques with advanced NLP models (Sentence Transformers and LLaMA 3.3) to automatically cluster and categorize system logs.
- Accepts raw logs via CSV input
- Converts log messages into embeddings using Sentence Transformers
- Clusters similar logs using DBSCAN
- Applies Regex-based classification for known log patterns
- Uses:
  - Sentence Transformers + Logistic Regression for medium-sized unknown clusters
  - LLaMA 3.3 via GroqCloud API for rare/unclassified logs
- Exposes a FastAPI `/classify` endpoint to automate the entire process
- Outputs the results into `output.csv`
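
The embedding and clustering steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not LogAI's actual code: the toy 2-D vectors stand in for real Sentence Transformer embeddings, and the `eps`, `min_samples`, and cosine-metric settings are values picked for the example rather than the project's configuration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy 2-D vectors standing in for sentence embeddings (assumption: in the
# real pipeline these would be produced by a Sentence Transformers model).
embeddings = np.array([
    [1.00, 0.00],
    [0.99, 0.05],
    [0.98, 0.02],
    [0.00, 1.00],
    [0.05, 0.99],
    [0.02, 0.98],
    [-1.00, -1.00],  # points away from both groups
])


def cluster_embeddings(vectors, eps=0.05, min_samples=2):
    """Group vectors with DBSCAN; a label of -1 marks a noise point."""
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="cosine")
    return model.fit_predict(vectors).tolist()


labels = cluster_embeddings(embeddings)
```

With these toy vectors, the first three points share one cluster label, the next three share another, and the last point is labeled `-1` (noise). Noise points and very small clusters are the natural candidates for the LLM fallback described above.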
- Python
- Sentence Transformers
- DBSCAN (from scikit-learn)
- Regex (Python's `re` module)
- LLaMA 3.3 (via GroqCloud API)
- FastAPI
- Uvicorn
- Clone the repo

      git clone https://github.com/rprahadeep/LogAI.git
      cd LogAI

- Install dependencies

      pip install -r requirements.txt

- Configure environment variables

  Create a `.env` file in the root directory:

      GROQ_API_KEY=your_groq_api_key
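
A quick way to check that the key is visible to Python before starting the server is sketched below. The helper name `get_groq_api_key` is hypothetical, and the sketch only reads the process environment; it assumes the `.env` file has already been loaded by whatever mechanism the project uses (for example a loader such as python-dotenv, which the source does not confirm).

```python
import os


def get_groq_api_key(env=None):
    """Return GROQ_API_KEY from the given mapping (default: os.environ).

    Raises RuntimeError with a readable message if the key is missing or blank.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it to .env or export it in your shell"
        )
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error surfacing later from the GroqCloud API call.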
Run the server:

    uvicorn server:app --reload

- Description: Upload a CSV file containing a `logs` column
- Response: Saves `output.csv` locally with an added `category` column
Sample Request (using curl):

    curl -X POST "http://127.0.0.1:8000/classify" \
      -F "file=@logs.csv"

The server will generate an `output.csv` file with the original logs and their predicted categories.
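
Once `output.csv` exists, a short script can summarize how the logs were categorized. The sketch below relies only on the `logs` and `category` column names mentioned above; the sample rows and category names are made up for illustration, and `count_categories` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def count_categories(csv_file):
    """Count rows per value of the `category` column in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["category"] for row in reader)


# Inline sample with invented rows, shaped like the described output.
sample = io.StringIO(
    "logs,category\n"
    "User 42 logged in,User Action\n"
    "User 17 logged out,User Action\n"
    "Backup completed successfully,System Notification\n"
)
counts = count_categories(sample)
```

To run it against a real result, open the file instead of the inline sample, for example `with open("output.csv", newline="") as f: count_categories(f)`.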
- Embeddings: Convert logs into vector space using Sentence Transformers
- Clustering: Use DBSCAN to identify log groupings
- Classification:
  - Regex for known patterns
  - Sentence Transformer + Logistic Regression for common unknowns
  - LLaMA 3.3 for rare or uncategorized logs
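
The tiered classification above could be routed along these lines. This is a sketch under assumptions, not the project's implementation: the regex patterns, category names, the `MIN_CLUSTER_SIZE` threshold, and the function names are all hypothetical, and the Logistic Regression and LLaMA calls are represented only by route names rather than real model calls.

```python
import re
from collections import Counter

# Hypothetical patterns and categories; the real rules are not shown in the source.
REGEX_RULES = [
    (re.compile(r"user \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"backup (started|completed)", re.IGNORECASE), "System Notification"),
]

MIN_CLUSTER_SIZE = 5  # assumed cutoff between "common" and "rare" clusters


def classify_with_regex(message):
    """Return the category of the first matching rule, or None if none match."""
    for pattern, category in REGEX_RULES:
        if pattern.search(message):
            return category
    return None


def route(messages, cluster_labels, min_cluster_size=MIN_CLUSTER_SIZE):
    """Pick a classification tier for each message.

    cluster_labels are DBSCAN-style labels, where -1 means noise.
    """
    sizes = Counter(cluster_labels)
    routes = []
    for message, label in zip(messages, cluster_labels):
        if classify_with_regex(message) is not None:
            routes.append("regex")  # known pattern
        elif label != -1 and sizes[label] >= min_cluster_size:
            routes.append("embedding_classifier")  # common unknown
        else:
            routes.append("llm")  # rare or unclustered
    return routes
```

For example, `route(["User alice logged in", "disk error A", "disk error B", "weird thing"], [0, 1, 1, -1], min_cluster_size=2)` sends the first message to the regex tier, the two clustered disk errors to the embedding classifier, and the noise point to the LLM. Trying the cheapest tier first keeps LLM calls limited to the logs that nothing else can handle.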