Skip to content

GUI Dashboard to launch GitHub workflows with a RAG chatbot, MLOps, and documentation generation

License

Notifications You must be signed in to change notification settings

dewitt4/github-process-manager

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

19 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ€– GitHub Process Manager - AI-Powered Workflow and Documentation Assistant

A lightweight, local AI-powered assistant that combines Retrieval-Augmented Generation (RAG) with the Gemini API and GitHub repository integration. Upload reference documents, connect to your GitHub repositories, and get intelligent responses for process documentation, SOX compliance, MLOps workflows, DevOps pipelines, and more.

✨ Features

  • 🧠 RAG-Powered Responses: Upload documents (.txt, .pdf, .docx) to create a knowledge base
  • πŸ€– Gemini AI Integration: Leverages Google's Gemini Pro for intelligent responses
  • πŸ”— GitHub Repository Connection: Access PRs, issues, workflow runs, and repository files
  • ⚑ GitHub Actions Control: Manually trigger workflows directly from the interface
  • οΏ½ Word Document Generation: Create professionally formatted process documentation
  • πŸ’» Browser-Based UI: Clean, professional interface with light blue and white theme
  • πŸͺΆ Lightweight & Local: Runs entirely on your machine with minimal resource usage
  • πŸ“Š ChromaDB Vector Storage: Efficient document embedding and retrieval
  • πŸ”’ Secure Configuration: Environment-based secrets management
  • 🎯 Multi-Template Support: SOX audits, MLOps workflows, DevOps pipelines, and generic documentation

🎯 Use Cases

SOX Compliance & Auditing

  • Document internal controls and procedures
  • Generate 5-section SOX control analysis reports
  • Track testing procedures and results
  • Create audit-ready Word documents

MLOps Workflows

  • Document machine learning pipelines
  • Track model training and validation
  • Generate deployment documentation
  • Monitor ML workflow processes

DevOps Pipelines

  • Document CI/CD pipelines
  • Track build and deployment processes
  • Generate pipeline documentation
  • Monitor infrastructure changes

General Process Documentation

  • Create structured process documentation
  • Generate professional Word reports
  • Track project workflows
  • Document best practices and procedures

πŸ“‹ Prerequisites

  • Python 3.8+
  • Gemini API Key (Get one here)
  • GitHub Personal Access Token (optional, for GitHub features)
  • Git (for cloning the repository)

πŸš€ Quick Start

1. Clone the Repository

git clone <your-repo-url>
cd github-process-manager

2. Create Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# macOS/Linux
python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Configure Environment Variables

Copy the template and edit with your credentials:

# Windows
copy .env.template .env

# macOS/Linux
cp .env.template .env

Edit .env file:

# Required: Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: GitHub Integration
GITHUB_TOKEN=your_github_personal_access_token_here
GITHUB_REPO_URL=https://github.com/username/repository

# Flask Configuration
FLASK_SECRET_KEY=your_secret_key_here
FLASK_DEBUG=True

Getting Your API Keys:

  • Gemini API Key: Visit Google AI Studio
  • GitHub Token: Go to GitHub Settings β†’ Developer Settings β†’ Personal Access Tokens β†’ Generate new token
    • Required scopes: repo, workflow (for triggering actions)

5. Run the Application

python app.py

The application will be available at: http://localhost:5000

οΏ½ Docker Deployment (Recommended)

For a consistent, isolated environment, use Docker:

Quick Start with Docker Compose

# 1. Configure environment
cp .env.template .env
# Edit .env with your API keys

# 2. Start the application
docker-compose up -d

# 3. View logs
docker-compose logs -f app

# 4. Access at http://localhost:5000

Development with VS Code Dev Container

  1. Install Remote - Containers extension
  2. Open project in VS Code
  3. Press F1 β†’ "Remote-Containers: Reopen in Container"
  4. Environment is automatically configured with all dependencies

Docker Commands

# Stop the application
docker-compose down

# Rebuild after changes
docker-compose up -d --build

# Production mode
docker-compose -f docker-compose.prod.yml up -d

# View container shell
docker-compose exec app /bin/bash

For detailed Docker setup, see README.docker.md

οΏ½πŸ“– Usage Guide

Upload Reference Documents

  1. Navigate to the main Chat page
  2. Click "Choose File" in the upload section
  3. Select a document (.txt, .pdf, or .docx)
  4. Click "Upload" to process the document
  5. The document will be chunked, embedded, and stored in ChromaDB

Connect to GitHub Repository

  1. Go to the Settings page
  2. Enter your GitHub repository URL (e.g., https://github.com/username/repo)
  3. Click "Connect Repository"
  4. Once connected, the chatbot can access PRs, issues, and workflows

Chat with the AI

  1. Type your question in the chat input
  2. The chatbot will:
    • Retrieve relevant document chunks from your uploaded files
    • Fetch related GitHub repository data (if connected)
    • Generate a response using Gemini AI with all context
  3. Responses cite sources from documents and GitHub data

Trigger GitHub Actions

  1. Go to Settings β†’ GitHub Actions
  2. Click "Load Workflows"
  3. Click "Trigger" on any workflow to manually start it

Customize AI Behavior

The application supports customizable system prompts to tailor AI responses to your needs:

Using Pre-defined Templates

  1. Go to Settings β†’ AI System Prompt Configuration
  2. Select a template from the dropdown:
    • Default - Balanced assistant for general queries
    • Technical Expert - Deep technical explanations with code examples
    • Security Auditor - Security-focused analysis and compliance
    • Developer Assistant - Code-heavy responses with best practices
    • Data Analyst - Structured analysis with metrics and insights
    • Technical Educator - Clear explanations for learning purposes
  3. Click "Update Prompt" to apply (changes last for your session)
  4. See the preview to verify the selected template

Creating Custom Prompts

  1. Go to Settings β†’ AI System Prompt Configuration
  2. Select "Custom Prompt" from the dropdown
  3. Write your own system instruction in the text editor
  4. Click "Update Prompt" to apply
  5. Example custom prompt:
    You are a helpful assistant specializing in cloud infrastructure.
    Focus on AWS best practices, security, and cost optimization.
    Provide actionable recommendations with specific service names.
    

Permanent Configuration (via .env)

For persistent customization across server restarts:

  1. Edit your .env file
  2. Set one of these variables:
    # Use a pre-defined template
    SYSTEM_PROMPT_TEMPLATE=technical_expert
    
    # Or set a custom prompt
    CUSTOM_SYSTEM_PROMPT="Your custom system instruction here"
  3. Restart the application

Available Templates: default, technical_expert, security_auditor, developer_assistant, data_analyst, technical_educator

Note: Session-based changes (via UI) take priority over .env settings until the server restarts.

Customize Document Templates

The application supports configurable Word document templates with custom branding:

Available Document Templates

  1. SOX Audit - 5-section compliance reports (Control Objective, Risks, Testing, Results, Conclusion)
  2. MLOps Workflow - ML pipeline documentation (Model Overview, Data Pipeline, Training, Validation, Deployment)
  3. DevOps Pipeline - CI/CD documentation (Pipeline Overview, Build Steps, Quality Gates, Deployment, Monitoring)
  4. Generic - General purpose documentation (Overview, Components, Procedures, Results, Recommendations)

Customize Branding

Edit your .env file to personalize generated documents:

# Project name for document headers
PROJECT_NAME=GitHub Process Manager

# Optional: Add company name to headers
COMPANY_NAME=Your Company Name

# Brand color (hex format #RRGGBB)
BRAND_COLOR=#4A90E2

# Optional: Add logo to document headers (.png, .jpg, .jpeg)
DOCUMENT_LOGO_PATH=/path/to/your/logo.png

# Default template type
DEFAULT_TEMPLATE_TYPE=generic

Create Custom Templates

Modify document_templates.json to add new templates:

{
  "templates": {
    "your_template": {
      "name": "Your Template Name",
      "report_title": "Your Report Title",
      "sections": [
        {"number": 1, "title": "Section 1", "key": "Section 1"},
        {"number": 2, "title": "Section 2", "key": "Section 2"}
      ],
      "keywords": ["keyword1", "keyword2"]
    }
  }
}

Template Features:

  • Custom section structures (3-7 sections recommended)
  • Keyword-based auto-detection
  • Configurable headers and colors
  • Optional logo support
  • Professional formatting (Calibri, proper spacing, page numbers)

Using for MLOps

The application includes specialized MLOps templates and workflows for managing machine learning operations.

MLOps Documentation Templates

Located in templates/mlops/, these guides provide comprehensive MLOps best practices:

  1. mlops_guide.md - Complete MLOps lifecycle guide covering:

    • Model development and version control
    • Experiment tracking (MLflow, Weights & Biases)
    • Training best practices and reproducibility
    • Model validation strategies
    • Deployment strategies (Blue-Green, Canary, Shadow)
    • Monitoring and drift detection
    • Model retraining triggers
  2. model_validation_template.md - Structured validation report template:

    • Model overview and business context
    • Validation methodology (unit, integration, performance, regression)
    • Performance metrics and comparison with baseline
    • Bias and fairness analysis
    • Failure pattern analysis
    • Deployment recommendations
  3. deployment_checklist.md - Comprehensive pre-deployment checklist:

    • Model readiness verification
    • Security and compliance checks
    • Monitoring and observability setup
    • Testing requirements (functional, performance, integration)
    • Deployment strategy selection
    • Rollback procedures
  4. monitoring_guide.md - Production monitoring strategies:

    • Performance metrics tracking
    • Data drift detection methods
    • Infrastructure monitoring
    • Alert configuration
    • Incident response procedures

Using MLOps Templates as RAG Documents

  1. Navigate to the Chat page
  2. Upload MLOps template files from templates/mlops/
  3. Ask questions about ML workflows:
    • "What metrics should I track for a classification model?"
    • "How do I implement canary deployment for my model?"
    • "What are the best practices for detecting data drift?"
    • "Create a validation checklist for my model deployment"

MLOps GitHub Actions Workflows

Located in .github/workflows/mlops/, trigger workflows for automated documentation:

Model Validation Report (mlops-model-validation.yml):

  • Navigate to Settings β†’ GitHub Actions
  • Select "MLOps Model Validation Report"
  • Provide inputs:
    • Model Name: Your model identifier
    • Model Version: Semantic version (e.g., 1.2.0)
    • Validation Type: unit, integration, performance, or regression
    • Metrics JSON: {"accuracy": 0.95, "f1": 0.93, "precision": 0.94}
  • Click Trigger to generate a validation report document
  • Download from GitHub Actions artifacts

Deployment Documentation (mlops-deployment-doc.yml):

  • Select "MLOps Deployment Documentation"
  • Provide inputs:
    • Model Name: Model to deploy
    • Model Version: Version number
    • Deployment Target: staging, production, canary, or development
    • Deployment Strategy: blue-green, canary, rolling, or shadow
  • Generated document includes deployment plan and rollback procedures

Example MLOps Queries

Try these queries with MLOps templates uploaded:

Model Training:

"Document the training process for a fraud detection model with 95% accuracy"

Deployment Planning:

"Create a deployment checklist for deploying a recommendation model to production"

Monitoring Setup:

"What alerts should I configure for monitoring a prediction model in production?"

Validation Reporting:

"Generate a validation report for model version 2.1.0 with accuracy 94.2%, precision 93.8%, recall 94.5%"

Integration with ML Tools

The MLOps templates include guidance for integrating with popular ML platforms:

  • MLflow: Experiment tracking, model registry, deployment
  • Weights & Biases: Real-time metrics visualization
  • TensorBoard: TensorFlow/PyTorch metrics
  • Kubeflow: Kubernetes-native ML workflows
  • AWS SageMaker, Google Vertex AI, Azure ML: Cloud ML platforms

Export metrics from these tools and use the GitHub Actions workflows to generate documentation with your actual performance data.

MLOps Workflow Best Practices

  1. Version Everything: Code, data, models, configurations
  2. Track All Experiments: Log hyperparameters, metrics, and artifacts
  3. Validate Before Deploying: Run all tests (unit, integration, performance)
  4. Monitor Continuously: Set up drift detection and performance alerts
  5. Document Thoroughly: Use templates for consistency
  6. Plan Rollbacks: Always have a tested rollback strategy

πŸ—οΈ Project Structure

github-process-manager/
β”œβ”€β”€ app.py                  # Main Flask application
β”œβ”€β”€ config.py               # Configuration management
β”œβ”€β”€ logger.py               # Logging setup
β”œβ”€β”€ rag_engine.py           # RAG document processing
β”œβ”€β”€ gemini_client.py        # Gemini API integration
β”œβ”€β”€ github_client.py        # GitHub API integration
β”œβ”€β”€ word_generator.py       # Word document generation
β”œβ”€β”€ requirements.txt        # Python dependencies
β”œβ”€β”€ .env.template           # Environment variable template
β”œβ”€β”€ .gitignore             # Git ignore rules
β”œβ”€β”€ document_templates.json # Document template configuration
β”œβ”€β”€ templates/
β”‚   β”œβ”€β”€ base.html          # Base template
β”‚   β”œβ”€β”€ index.html         # Chat interface
β”‚   └── settings.html      # Settings page
β”œβ”€β”€ static/
β”‚   └── css/
β”‚       └── style.css      # Application styling
β”œβ”€β”€ .github/
β”‚   └── workflows/
β”‚       β”œβ”€β”€ process-analysis-doc.yml  # Generic process workflow
β”‚       └── sox-analysis-doc.yml      # SOX-specific workflow (legacy)
β”œβ”€β”€ chroma_db/             # ChromaDB storage (auto-created)
β”œβ”€β”€ uploads/               # Temporary upload folder (auto-created)
β”œβ”€β”€ generated_reports/     # Generated Word documents (auto-created)
└── README.md              # This file

πŸ”§ Configuration Options

Edit config.py or set environment variables:

Variable Description Default
GEMINI_API_KEY Google Gemini API key Required
GEMINI_TEMPERATURE AI response randomness (0.0-1.0) 0.7
GEMINI_MAX_TOKENS Maximum response length 2048
SYSTEM_PROMPT_TEMPLATE Pre-defined prompt template default
CUSTOM_SYSTEM_PROMPT Custom system instruction None
COMPANY_NAME Company name for documents None
BRAND_COLOR Document brand color (hex) #4A90E2
DOCUMENT_LOGO_PATH Path to logo for documents None
DEFAULT_TEMPLATE_TYPE Default document template generic
DOCUMENT_TEMPLATES_PATH Template config file path document_templates.json
GITHUB_REPO_URL GitHub repository URL Optional
FLASK_SECRET_KEY Flask session secret Auto-generated
CHROMA_DB_PATH ChromaDB storage location ./chroma_db
CHUNK_SIZE Characters per document chunk 800
CHUNK_OVERLAP Overlap between chunks 200
TOP_K_RESULTS RAG chunks to retrieve 3
MLOPS_FEATURES_ENABLED Enable MLOps features false
MLOPS_TEMPLATES_DIR MLOps templates directory templates/mlops
MLOPS_WORKFLOWS_DIR MLOps workflows directory .github/workflows/mlops

πŸ› οΈ API Endpoints

Chat

  • POST /api/chat - Send query and get AI response

Document Management

  • POST /api/upload - Upload document for RAG
  • GET /api/rag/stats - Get RAG database statistics
  • POST /api/rag/clear - Clear all documents

GitHub Integration

  • POST /api/github/connect - Connect to repository
  • GET /api/github/info - Get repository info
  • GET /api/github/workflows - List workflows
  • POST /api/github/workflow/trigger - Trigger workflow
  • GET /api/github/pulls - Get pull requests
  • GET /api/github/issues - Get issues

AI Prompt Management

  • GET /api/prompts/templates - Get available prompt templates
  • GET /api/prompts/current - Get current active prompt
  • POST /api/prompts/update - Update system prompt (session-based)
  • POST /api/prompts/reset - Reset to default prompt

MLOps (Optional - requires MLOPS_FEATURES_ENABLED=true)

  • GET /api/mlops/status - Check MLOps feature availability and configuration
  • POST /api/mlops/parse-metrics - Parse and format ML metrics JSON
  • POST /api/mlops/validate-metrics - Validate ML metrics against schema
  • GET /api/mlops/templates - List available MLOps documentation templates

System

  • GET /health - Health check endpoint

❗ Troubleshooting

"Configuration validation failed: GEMINI_API_KEY is not set"

  • Make sure you've created a .env file from .env.template
  • Add your Gemini API key to the .env file
  • Restart the application

Documents not being processed

  • Check file format (.txt, .pdf, .docx only)
  • Ensure file size is under 16MB
  • Check app.log for detailed error messages

GitHub connection failing

  • Verify your GitHub token has correct permissions (repo, workflow)
  • Check that the repository URL is correct
  • Ensure the token hasn't expired

ChromaDB errors

  • Delete the chroma_db/ folder and restart the application
  • This will clear all uploaded documents

πŸ“ Features in Detail

RAG (Retrieval-Augmented Generation)

  • Automatically chunks documents into manageable pieces
  • Generates embeddings using Gemini Embedding API
  • Stores vectors in ChromaDB for fast similarity search
  • Retrieves top-K most relevant chunks for each query

Gemini Integration

  • Uses Gemini Pro for natural language understanding
  • Combines RAG context with GitHub data in prompts
  • Configurable temperature and token limits
  • Robust error handling and retries

GitHub Features

  • Read repository metadata
  • List and search pull requests and issues
  • Access workflow run history
  • Trigger workflows with custom inputs
  • Retrieve repository files and structure

🀝 Contributing

This is a personal project, but suggestions and improvements are welcome!

πŸ“„ License

This project is provided as-is for educational and personal use.

πŸ™ Acknowledgments

  • Google Gemini API for AI capabilities
  • ChromaDB for vector storage
  • PyGithub for GitHub integration
  • Flask for the web framework

πŸ“§ Support

For issues or questions, please check the logs in app.log or review the troubleshooting section above.


Built with ❀️ using Python, Flask, and ChromaDB