A research project that combines process mining with agentic AI to discover, model, and automate software development workflows from GitHub repository event logs.
This repository implements the approach described in:
"Using Process Mining to Generate AI Agents from Software Engineering Process Records"
Submitted to: BPM 2026 (double-blind review)
This project demonstrates a complete pipeline from raw GitHub data to an operational multi-agent system:
- Extract GitHub repository events into OCEL (Object-Centric Event Log) format
- Discover process models, roles, and behavioral patterns using process mining
- Generate role-specific BPMN models and DECLARE constraints
- Build a LangGraph-based multi-agent application grounded in discovered patterns
GitHub Events → OCEL → Role Mining → Process Discovery → Agentic App
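As a rough illustration of the first pipeline step, the sketch below turns a few simplified GitHub-style events into an OCEL 2.0-style JSON structure. The input field names (`action`, `timestamp`, `issue`, `commit`, `user`), the sample data, and the `to_ocel` helper are assumptions made for this example; the repository's own extraction code may differ.

```python
import json

# Hypothetical object types for this sketch; the project also tracks tasks.
OBJECT_TYPES = ("issue", "commit", "user")

def to_ocel(raw_events):
    """Convert simplified GitHub-style events into an OCEL 2.0-style dict."""
    objects = {}  # object id -> object record, de-duplicated across events
    events = []
    for index, raw in enumerate(raw_events):
        relationships = []
        for obj_type in OBJECT_TYPES:
            obj_id = raw.get(obj_type)
            if obj_id is None:
                continue
            full_id = f"{obj_type}:{obj_id}"
            objects.setdefault(
                full_id,
                {"id": full_id, "type": obj_type, "attributes": [], "relationships": []},
            )
            # One event may reference many objects (many-to-many).
            relationships.append({"objectId": full_id, "qualifier": obj_type})
        events.append({
            "id": f"e{index}",
            "type": raw["action"],
            "time": raw["timestamp"],
            "attributes": [],
            "relationships": relationships,
        })
    object_type_names = sorted({o["type"] for o in objects.values()})
    event_type_names = sorted({e["type"] for e in events})
    return {
        "objectTypes": [{"name": n, "attributes": []} for n in object_type_names],
        "eventTypes": [{"name": n, "attributes": []} for n in event_type_names],
        "objects": list(objects.values()),
        "events": events,
    }

# Made-up sample events, for illustration only.
sample = [
    {"action": "issue_opened", "timestamp": "2024-01-01T10:00:00Z",
     "issue": 42, "user": "alice"},
    {"action": "commit_pushed", "timestamp": "2024-01-02T09:30:00Z",
     "commit": "abc123", "issue": 42, "user": "bob"},
]
ocel = to_ocel(sample)
print(json.dumps(ocel["events"][1], indent=2))
```

The second event relates to a commit, an issue, and a user at once, which is the kind of many-to-many relationship a flat, single-case event log cannot express directly.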
- Event Logs: OCEL 2.0 format capturing issues, commits, users, and tasks
- Process Mining: Role-based analysis using pm4py and custom algorithms
- Discovered Artifacts: BPMN models, DECLARE constraints, process descriptions
- Agentic Application: LangGraph-based multi-agent system for GitHub automation
pmaa/
├── 01-09_*.ipynb               # Sequential analysis notebooks
├── enrich_ocel_with_tasks.py   # Task extraction from commit messages
├── commitizen*.json            # OCEL event logs (raw, enriched, annotated)
├── role_logs/                  # Per-role OCEL splits + visualizations
├── bpmn_models/                # Discovered BPMN diagrams per role
├── declare_models/             # Mined DECLARE constraint models
└── langraph-agentic-app/       # Multi-agent application
    ├── src/                    # Python source code
    ├── docs/                   # Architecture documentation
    └── README.md               # App-specific documentation
- Python 3.10+
- Jupyter Notebook
- Git
- Clone the repository
  git clone https://github.com/liorlimonad/pmaa.git
  cd pmaa
- Set up Python environment
  python -m venv .venv
  source .venv/bin/activate # On Windows: .venv\Scripts\activate
  pip install -r requirements.txt # If available
- Install Jupyter dependencies
  pip install jupyter pm4py pandas numpy matplotlib
Run the notebooks in sequence:
- 01_enrich-ocel-event-log.ipynb - Parse and enrich OCEL from GitHub data
- 02_analyze-with-pm4py.ipynb - Basic process mining analysis
- 03_resource-role-annotation.ipynb - Annotate resources with roles
- 04_split-logs-by-role.ipynb - Split event log by discovered roles
- 05_role-process-mining.ipynb - Per-role process discovery
- 06_declarative-process-mining.ipynb - DECLARE constraint mining
- 07_process_markdowns.ipynb - Generate process descriptions
- 08_bpmn_discovery.ipynb - Discover BPMN models
- 09_role_declare_model_discovery.ipynb - Role-specific DECLARE models
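To make the role-splitting step (notebook 04) more concrete, here is a minimal sketch of partitioning events by the role assigned to the acting user. The event shape, the `user_roles` mapping, and the fallback to "Contributor" are assumptions for illustration; the notebook may use a different representation and assignment rule.

```python
from collections import defaultdict

def split_events_by_role(events, user_roles, default_role="Contributor"):
    """Group events by the role of the user who performed them.

    `user_roles` maps a user id to a role name. Users without an annotation
    fall back to `default_role` (an assumption made for this sketch).
    """
    by_role = defaultdict(list)
    for event in events:
        role = user_roles.get(event["user"], default_role)
        by_role[role].append(event)
    return dict(by_role)

# Hypothetical annotations and events, for illustration only.
user_roles = {"alice": "Issue Reporter", "dependabot": "Bot"}
events = [
    {"id": "e0", "type": "issue_opened", "user": "alice"},
    {"id": "e1", "type": "label_added", "user": "dependabot"},
    {"id": "e2", "type": "commit_pushed", "user": "carol"},
    {"id": "e3", "type": "issue_commented", "user": "alice"},
]
role_logs = split_events_by_role(events, user_roles)
for role, role_events in role_logs.items():
    print(role, [e["id"] for e in role_events])
```

Each resulting sub-log can then be mined separately, which is what allows per-role models to differ from one another.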
The langraph-agentic-app/ directory contains the multi-agent system, a research prototype with the following features:
- Role-based agents: Issue Reporter, Bot, Feature Developer, Quality Engineer, etc.
- GitHub integration: Real REST API calls with dry-run mode
- Policy guards: Risk scoring and DECLARE constraint checking
- Human-in-the-loop: Approval gates for high-risk operations
- Structured state: TypedDict-based state management
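The interplay of structured state, risk scoring, and approval gates listed above could look roughly like the following. This is a sketch in plain Python rather than the app's actual code: the `AgentState` fields, the `RISK_WEIGHTS` values, and the `APPROVAL_THRESHOLD` are hypothetical, and in the real application functions like these would presumably be wired into a LangGraph graph as nodes.

```python
from typing import Callable, List, TypedDict

class AgentState(TypedDict):
    """Hypothetical shared state passed between agent steps."""
    role: str
    proposed_action: str
    risk_score: float
    approved: bool
    history: List[str]

# Assumed risk weights per action; unknown actions get the maximum risk.
RISK_WEIGHTS = {
    "comment_on_issue": 0.1,
    "add_label": 0.2,
    "close_issue": 0.6,
    "merge_pull_request": 0.9,
}
APPROVAL_THRESHOLD = 0.5  # assumed cut-off for requiring a human decision

def score_risk(state: AgentState) -> AgentState:
    """Policy guard: attach a risk score to the proposed action."""
    score = RISK_WEIGHTS.get(state["proposed_action"], 1.0)
    return {**state, "risk_score": score,
            "history": state["history"] + [f"risk={score}"]}

def approval_gate(state: AgentState,
                  ask_human: Callable[[AgentState], bool]) -> AgentState:
    """Human-in-the-loop gate: low-risk actions pass, others need approval."""
    if state["risk_score"] < APPROVAL_THRESHOLD:
        approved = True
    else:
        approved = ask_human(state)
    return {**state, "approved": approved,
            "history": state["history"] + [f"approved={approved}"]}

initial: AgentState = {
    "role": "Maintainer",
    "proposed_action": "merge_pull_request",
    "risk_score": 0.0,
    "approved": False,
    "history": [],
}
# A stand-in reviewer that rejects everything; a real gate would prompt a person.
final = approval_gate(score_risk(initial), ask_human=lambda s: False)
print(final["approved"], final["history"])
```

Returning a new state dict from each step, instead of mutating the input, keeps each step easy to test in isolation.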
cd langraph-agentic-app
python -m venv .venv
source .venv/bin/activate
pip install -e .
# Configure environment
export GITHUB_TOKEN=your_token
export GITHUB_REPO=owner/repo
export GITHUB_ISSUE_ID=123
export GITHUB_DRY_RUN=true
# Run the app
python -m langraph_agentic_app.app
See langraph-agentic-app/README.md for detailed documentation.
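The environment variables above suggest how dry-run mode might gate real API calls. The sketch below is one possible reading, not the app's implementation: `load_config` and `post_comment` are hypothetical helpers, and the rule that any value other than `false`, `0`, or `no` keeps dry-run enabled is an assumption. The endpoint is GitHub's REST route for creating an issue comment.

```python
import os

def load_config(env=os.environ):
    """Read the GitHub settings; dry-run stays on unless explicitly disabled."""
    raw_flag = env.get("GITHUB_DRY_RUN", "true").strip().lower()
    return {
        "token": env.get("GITHUB_TOKEN", ""),
        "repo": env.get("GITHUB_REPO", ""),
        "issue_id": int(env.get("GITHUB_ISSUE_ID", "0")),
        "dry_run": raw_flag not in ("false", "0", "no"),
    }

def post_comment(config, body):
    """Post an issue comment, or only describe the request in dry-run mode."""
    url = (f"https://api.github.com/repos/{config['repo']}"
           f"/issues/{config['issue_id']}/comments")
    request = {"method": "POST", "url": url, "json": {"body": body}}
    if config["dry_run"]:
        # Nothing is sent; the caller can log or inspect the planned request.
        return {"sent": False, "request": request}
    import requests  # imported lazily so dry-run needs no third-party package
    response = requests.post(
        url,
        json=request["json"],
        headers={
            "Authorization": f"Bearer {config['token']}",
            "Accept": "application/vnd.github+json",
        },
        timeout=10,
    )
    response.raise_for_status()
    return {"sent": True, "request": request, "status": response.status_code}

# Explicit mapping instead of the real environment, for a reproducible example.
config = load_config({
    "GITHUB_REPO": "owner/repo",
    "GITHUB_ISSUE_ID": "123",
    "GITHUB_DRY_RUN": "true",
})
result = post_comment(config, "Triage complete.")
print(result["sent"], result["request"]["url"])
```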
The process mining analysis identified 8 distinct roles in software development:
- Issue Reporter - Creates and clarifies issues
- Bot - Automated workflow maintenance
- Feature Developer - Implements new features
- Maintainer - Reviews and merges contributions
- Quality Engineer - Testing and validation
- Technical Writer - Documentation
- DevOps Engineer - CI/CD and infrastructure
- Contributor - General contributions
Each role has:
- Dedicated OCEL event log
- OCDFG visualization
- Process description (markdown)
- BPMN model
- DECLARE constraints (where applicable)
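To give a feel for what a DECLARE constraint expresses, the sketch below checks two standard DECLARE templates, response and precedence, against a single trace of activity names. The activity names and traces are invented for this example, and which templates the project actually mines is not stated here.

```python
def holds_response(trace, a, b):
    """response(a, b): every occurrence of `a` is eventually followed by `b`."""
    pending = False
    for activity in trace:
        if activity == a:
            pending = True   # an `a` is waiting for a later `b`
        elif activity == b:
            pending = False  # all earlier `a`s are now answered
    return not pending

def holds_precedence(trace, a, b):
    """precedence(a, b): `b` may occur only after `a` has occurred."""
    seen_a = False
    for activity in trace:
        if activity == a:
            seen_a = True
        elif activity == b and not seen_a:
            return False
    return True

# Hypothetical traces for one issue.
good = ["open_issue", "commit_fix", "close_issue"]
bad = ["close_issue", "open_issue", "commit_fix"]

print(holds_response(good, "open_issue", "close_issue"))
print(holds_precedence(bad, "open_issue", "close_issue"))
```

A policy guard could run checks like these on the planned action sequence and block any step that would violate a mined constraint.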
- Object-centric approach: Captures complex many-to-many relationships between issues, commits, users, and tasks
- Role specialization: Different roles exhibit distinct behavioral patterns
- Conventional commits: Task semantics extracted from commit message patterns
- Declarative constraints: Temporal and logical rules governing valid workflows
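The conventional-commits point can be illustrated with a small parser. The regular expression follows the common `type(scope)!: description` header shape, while the `TASK_BY_TYPE` mapping and the `extract_task` helper are assumptions for this sketch and need not match what `enrich_ocel_with_tasks.py` does.

```python
import re

# Header shape of a conventional commit: type, optional (scope), optional "!".
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)

# Hypothetical mapping from commit type to a task label.
TASK_BY_TYPE = {
    "feat": "implement feature",
    "fix": "fix bug",
    "docs": "update documentation",
    "test": "add or update tests",
    "ci": "maintain CI",
    "refactor": "refactor code",
}

def extract_task(message):
    """Return task information for a commit message, or None if it does not match."""
    lines = message.splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        return None
    commit_type = match.group("type")
    return {
        "type": commit_type,
        "scope": match.group("scope"),
        "breaking": match.group("breaking") is not None,
        "description": match.group("description"),
        "task": TASK_BY_TYPE.get(commit_type, "other"),
    }

print(extract_task("feat(cli): add bump command"))
print(extract_task("Merge branch 'main'"))
```

Commits that do not follow the convention return `None`, so a caller can decide whether to skip them or assign a generic task.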
- Process Mining: pm4py, OCEL 2.0
- Visualization: matplotlib, graphviz
- Agentic Framework: LangGraph, LangChain
- Data Processing: pandas, numpy
- API Integration: requests (GitHub REST API)
- Built on the Commitizen project event log
- Uses pm4py for process mining
- Powered by LangGraph for agent orchestration
Note: This project is a research prototype. The agentic application runs in dry-run mode by default to prevent unintended GitHub operations.