# Contributing to GitFlow Analytics

Thank you for your interest in contributing to GitFlow Analytics! This document provides guidelines for contributing to the project.

## 🚀 Quick Start for Contributors

### Development Setup

1. **Fork and clone the repository**:

   ```bash
   git clone https://github.com/your-username/gitflow-analytics.git
   cd gitflow-analytics
   ```

2. **Set up the development environment**:

   ```bash
   # Create and activate virtual environment
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate

   # Install development dependencies
   pip install -e ".[dev]"

   # Install spaCy model for ML features
   python -m spacy download en_core_web_sm
   ```

3. **Verify the installation**:

   ```bash
   # Run tests
   pytest tests/

   # Check code quality
   ruff check src/
   mypy src/
   black --check src/
   ```

### Development Commands

```bash
# Code formatting
black src/ tests/

# Linting
ruff check src/ tests/
ruff check --fix src/ tests/  # Auto-fix issues

# Type checking
mypy src/

# Run tests with coverage
pytest --cov=gitflow_analytics --cov-report=html

# Test specific module
pytest tests/test_analyzer.py -v

# Install local development version
pip install -e ".[dev]"
```

## 📝 Code Style and Standards

### Code Quality Requirements

All contributions must pass the following checks:

1. **Formatting**: Code must be formatted with `black`
2. **Linting**: Code must pass `ruff` checks
3. **Type hints**: New code should include type hints and pass `mypy`
4. **Tests**: New features must include tests with >80% coverage
5. **Documentation**: Public APIs must have docstrings

### Pre-commit Setup (Recommended)

Install pre-commit hooks to automatically check code quality:

```bash
# Install pre-commit
pip install pre-commit

# Set up pre-commit hooks
pre-commit install

# Run manually (optional)
pre-commit run --all-files
```
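
If the repository does not already ship a `.pre-commit-config.yaml`, a minimal configuration along these lines wires up the same tools used in the quality checks (the `rev` pins below are illustrative; update them to current releases):

```yaml
# .pre-commit-config.yaml (illustrative sketch; pin revs to current releases)
repos:
  - repo: https://github.com/psf/black
    rev: 24.8.0
    hooks:
      - id: black
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff
        args: [--fix]
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.2
    hooks:
      - id: mypy
```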

### Code Style Guidelines

- **Line length**: Maximum 88 characters (the `black` default)
- **Import organization**: Use isort-compatible import ordering
- **Naming conventions**:
  - Classes: `PascalCase`
  - Functions/variables: `snake_case`
  - Constants: `UPPER_CASE`
  - Private members: `_leading_underscore`
- **Docstrings**: Use Google-style docstrings for all public functions

### Example Function Style

```python
def analyze_repository(
    repo_path: Path,
    weeks: int = 12,
    enable_ml: bool = True
) -> AnalysisResult:
    """Analyze a Git repository for productivity insights.

    Args:
        repo_path: Path to the Git repository to analyze
        weeks: Number of weeks to analyze (default: 12)
        enable_ml: Enable ML-enhanced commit categorization

    Returns:
        AnalysisResult containing comprehensive metrics and insights

    Raises:
        RepositoryError: If the repository is invalid or inaccessible
        ConfigurationError: If analysis configuration is invalid
    """
    # Implementation here
    pass
```

## 🧪 Testing Guidelines

### Test Structure

Tests are organized in the `tests/` directory, mirroring the `src/` structure:

```text
tests/
├── test_analyzer.py           # Core analysis tests
├── test_config.py             # Configuration tests
├── test_identity.py           # Identity resolution tests
├── qualitative/               # ML system tests
│   ├── test_classifiers.py
│   └── test_nlp_engine.py
└── fixtures/                  # Test data and fixtures
```

### Writing Tests

1. **Unit tests**: Test individual functions and classes
2. **Integration tests**: Test component interactions
3. **End-to-end tests**: Test complete workflows

Example test structure:

```python
import pytest
from pathlib import Path
from gitflow_analytics.core.analyzer import GitAnalyzer


class TestGitAnalyzer:
    """Test suite for GitAnalyzer class."""

    @pytest.fixture
    def sample_repo(self) -> Path:
        """Provide a sample repository for testing."""
        return Path("tests/fixtures/sample-repo")

    def test_analyze_commits(self, sample_repo):
        """Test basic commit analysis functionality."""
        analyzer = GitAnalyzer(sample_repo)
        result = analyzer.analyze_commits(weeks=4)

        assert result.total_commits > 0
        assert len(result.developers) > 0
        assert result.date_range.weeks == 4

    def test_invalid_repository(self):
        """Test handling of invalid repository paths."""
        with pytest.raises(RepositoryError):
            GitAnalyzer(Path("nonexistent/path"))
```

### Test Data and Fixtures

- Use the `tests/fixtures/` directory for sample repositories and data
- Create minimal, focused test repositories
- Use `pytest` fixtures for reusable test setup
- Mock external APIs (GitHub, JIRA) in tests (see the sketch below)
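
To illustrate the last point, external services can be stubbed out with `unittest.mock` so tests never hit the network. The patch target and the `get_pull_requests` method below are placeholders for illustration; point them at whatever `integrations/github_client.py` actually exposes:

```python
from unittest.mock import MagicMock, patch


def test_analysis_with_mocked_github():
    # Build a fake client so the test never makes a real API call.
    fake_client = MagicMock()
    fake_client.get_pull_requests.return_value = [
        {"number": 42, "title": "feat: add Linear support", "merged": True}
    ]

    # Patch the point where the code under test constructs its GitHub client.
    # "GitHubClient" is a placeholder name; use the class actually defined in
    # gitflow_analytics/integrations/github_client.py.
    with patch(
        "gitflow_analytics.integrations.github_client.GitHubClient",
        return_value=fake_client,
    ):
        ...  # exercise the code under test and assert on its output
```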

### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=gitflow_analytics --cov-report=html

# Run specific test file
pytest tests/test_analyzer.py

# Run with verbose output
pytest -v

# Run only failed tests from last run
pytest --lf

# Run tests matching pattern
pytest -k "test_identity"
```

## 🏗️ Architecture and Design

### Project Structure

```text
src/gitflow_analytics/
├── __init__.py              # Package initialization
├── cli.py                   # Command-line interface
├── config.py                # Configuration handling
├── core/                    # Core analysis logic
│   ├── analyzer.py          # Main Git analysis engine
│   ├── cache.py             # Caching system
│   ├── identity.py          # Developer identity resolution
│   └── branch_mapper.py     # Branch to project mapping
├── extractors/              # Data extraction components
│   ├── tickets.py           # Ticket reference extraction (rule-based)
│   ├── ml_tickets.py        # ML-enhanced ticket extraction
│   └── story_points.py      # Story point extraction
├── integrations/            # External service integrations
│   └── github_client.py     # GitHub API integration
├── qualitative/             # ML and qualitative analysis
│   ├── classifiers/         # ML classification models
│   ├── core/                # Core ML infrastructure
│   └── utils/               # ML utilities
├── reports/                 # Report generation
│   ├── csv_writer.py        # CSV report generation
│   └── narrative_writer.py  # Markdown narrative reports
└── models/                  # Data models and schemas
    └── database.py          # SQLAlchemy models
```

### Design Principles

1. **Modular architecture**: Each component has a single responsibility
2. **Extensibility**: Easy to add new ticket platforms, report formats, and ML models
3. **Performance**: Intelligent caching and batch processing for large repositories
4. **Graceful degradation**: ML features fall back to rule-based approaches
5. **Configuration-driven**: Behavior is controlled through YAML configuration (see the sketch below)
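
As a sketch of the configuration-driven principle, an analysis run can be described entirely in YAML. Only the `ml_categorization` keys below appear elsewhere in this guide (in the example bug report); the other keys are hypothetical placeholders, so consult the README for the authoritative schema:

```yaml
# config.yaml (illustrative sketch, not the authoritative schema)
repositories:
  - path: ~/projects/my-repo

analysis:
  ml_categorization:
    enabled: true        # rule-based categorization is used as the fallback
    min_confidence: 0.7  # minimum classifier confidence before a label is applied

output:
  formats: [csv, markdown]
```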

### Adding New Features

#### Adding a New Ticket Platform

1. **Update regex patterns** in `extractors/tickets.py`:

   ```python
   # Add new platform pattern
   PLATFORM_PATTERNS = {
       'existing_platform': r'PROJ-\d+',
       'new_platform': r'NEW-\d+',  # Add here
   }
   ```

2. **Add platform detection logic**:

   ```python
   def detect_platform(ticket_ref: str) -> str:
       if re.match(r'NEW-\d+', ticket_ref):
           return 'new_platform'
       # ... existing logic
   ```

3. **Update the configuration schema** if needed

4. **Add tests** for the new platform (see the sketch below)

5. **Update documentation**
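
For step 4, a small parametrized test covering the new pattern is usually enough. The `extract_ticket_references` name below is a placeholder; call whichever public function `extractors/tickets.py` actually exposes:

```python
import pytest

# Placeholder import: substitute the real extraction function from
# gitflow_analytics/extractors/tickets.py.
from gitflow_analytics.extractors.tickets import extract_ticket_references


@pytest.mark.parametrize(
    "message, expected",
    [
        ("fix: handle timeouts NEW-123", ["NEW-123"]),
        ("chore: no ticket reference here", []),
    ],
)
def test_new_platform_extraction(message, expected):
    assert extract_ticket_references(message) == expected
```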

#### Adding a New Report Format

1. **Create a report writer** in `reports/` (a fuller sketch appears below):

   ```python
   # reports/json_writer.py
   class JSONReportWriter:
       def generate_report(self, data: AnalysisResult) -> None:
           ...  # Implementation
   ```

2. **Register it in the CLI** (`cli.py`):

   ```python
   # Add to format options
   if 'json' in output_formats:
       json_writer = JSONReportWriter()
       json_writer.generate_report(analysis_result)
   ```

3. **Update the configuration schema**

4. **Add tests**

5. **Update documentation**
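
A fuller body for the writer from step 1 could look like the sketch below. The `to_dict()` helper and the constructor signature are assumptions; mirror the conventions of the existing `csv_writer.py` rather than copying this verbatim:

```python
# reports/json_writer.py (illustrative sketch only)
import json
from pathlib import Path


class JSONReportWriter:
    def __init__(self, output_path: Path) -> None:
        self.output_path = output_path

    def generate_report(self, data: "AnalysisResult") -> None:
        # to_dict() is an assumed helper; serialize AnalysisResult however it
        # actually exposes its fields.
        payload = data.to_dict()
        self.output_path.write_text(json.dumps(payload, indent=2, default=str))
```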

#### Extending ML Categorization

1. **Add new categories** in `qualitative/classifiers/change_type.py`:

   ```python
   CHANGE_PATTERNS = {
       'existing_category': ['pattern1', 'pattern2'],
       'new_category': ['new_pattern1', 'new_pattern2'],  # Add here
   }
   ```

2. **Update semantic analysis** if needed

3. **Train/validate** on sample data

4. **Add tests** with expected categorizations (see the sketch below)

5. **Update documentation**
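
For step 4, a table-driven test keeps the expected categorizations easy to extend. The `ChangeTypeClassifier` class, its `classify()` method, and the category labels below are assumptions for illustration; adapt the test to the API that `change_type.py` actually provides:

```python
import pytest

# Placeholder import and API: adapt to the real classifier in
# gitflow_analytics/qualitative/classifiers/change_type.py.
from gitflow_analytics.qualitative.classifiers.change_type import ChangeTypeClassifier


@pytest.mark.parametrize(
    "message, expected_category",
    [
        ("fix: handle empty commit messages", "bug_fix"),
        ("docs: update the configuration guide", "documentation"),
    ],
)
def test_change_type_categorization(message, expected_category):
    classifier = ChangeTypeClassifier()
    assert classifier.classify(message) == expected_category
```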

## 🔄 Contribution Workflow

### Step 1: Planning

1. **Check existing issues** for similar features/bugs
2. **Create an issue** to discuss the change (for significant features)
3. **Get feedback** from maintainers before starting work

### Step 2: Development

1. **Create a feature branch**:

   ```bash
   git checkout -b feature/awesome-new-feature
   # or
   git checkout -b fix/important-bug-fix
   ```

2. **Make atomic commits** with clear messages:

   ```bash
   git commit -m "feat: add support for Linear ticket platform"
   git commit -m "fix: resolve identity resolution bug with similar names"
   git commit -m "docs: update installation instructions for ML features"
   ```

3. **Follow conventional commits** for automatic versioning (see the example below):

   - `feat:` - New features (minor version bump)
   - `fix:` - Bug fixes (patch version bump)
   - `docs:` - Documentation changes
   - `test:` - Adding or updating tests
   - `refactor:` - Code refactoring
   - `perf:` - Performance improvements
   - `chore:` - Maintenance tasks
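
A complete commit message with a body and footer looks like the example below. Per the Conventional Commits specification, a `BREAKING CHANGE:` footer (or a `!` after the type, e.g. `feat!:`) signals a major version bump, while plain `feat:` and `fix:` map to the minor and patch bumps listed above; how scopes are handled depends on the project's release automation.

```text
feat(reports): add JSON report output

Adds a JSONReportWriter and registers it with the CLI format options.

BREAKING CHANGE: describe here any behavior existing users must adapt to
```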

### Step 3: Quality Checks

Run all quality checks before submitting:

```bash
# Format code
black src/ tests/

# Check linting
ruff check src/ tests/

# Type checking
mypy src/

# Run tests
pytest --cov=gitflow_analytics

# Test installation
pip install -e ".[dev]"
```

### Step 4: Pull Request

1. **Create a pull request** with:

   - A clear title following the conventional commit format
   - A detailed description of the changes
   - References to related issues
   - Screenshots/examples if applicable

2. **PR template**:

   ```markdown
   ## Summary
   Brief description of changes

   ## Type of Change
   - [ ] Bug fix (non-breaking change that fixes an issue)
   - [ ] New feature (non-breaking change that adds functionality)
   - [ ] Breaking change (fix/feature that would cause existing functionality to not work as expected)
   - [ ] Documentation update

   ## Testing
   - [ ] Unit tests pass
   - [ ] Integration tests pass
   - [ ] Added tests for new functionality
   - [ ] Manual testing completed

   ## Checklist
   - [ ] Code follows project style guidelines
   - [ ] Self-review completed
   - [ ] Documentation updated
   - [ ] No new warnings introduced
   ```

3. **Respond to feedback** promptly and make the requested changes

## 🐛 Reporting Bugs

### Bug Report Template

When reporting bugs, please include:

1. **Environment**:
   - Python version
   - GitFlow Analytics version
   - Operating system
   - Git version

2. **Configuration** (redacted):
   - YAML configuration (remove sensitive tokens)
   - Command-line arguments used

3. **Expected vs. actual behavior**:
   - What you expected to happen
   - What actually happened
   - Error messages or logs

4. **Reproduction steps**:
   - Minimal steps to reproduce the issue
   - Sample repository if possible (or a description)

5. **Additional context**:
   - Repository size/characteristics
   - Any workarounds found

### Example Bug Report

**Bug**: ML categorization fails with spaCy model error

**Environment**:
- Python 3.11.5
- GitFlow Analytics v1.0.9
- macOS 14.0
- Git 2.42.0

**Configuration**:
```yaml
analysis:
  ml_categorization:
    enabled: true
    min_confidence: 0.7
```

**Expected**: Commits categorized using the ML model

**Actual**: Error: `OSError: [E050] Can't find model 'en_core_web_sm'`

**Steps**:
1. Install gitflow-analytics
2. Run without installing the spaCy model
3. Error occurs during ML categorization

**Solution**: Better error message suggesting spaCy model installation


## 🌟 Feature Requests

### Feature Request Template

1. **Use Case**: What problem does this solve?
2. **Proposed Solution**: How should it work?
3. **Alternatives**: Other approaches considered
4. **Impact**: Who benefits from this feature?
5. **Implementation**: Any implementation suggestions

## 📚 Documentation Contributions

### Types of Documentation

1. **User Documentation**: README, usage guides, configuration examples
2. **Developer Documentation**: CLAUDE.md, API documentation, architecture guides
3. **API Documentation**: Docstrings, type hints
4. **Examples**: Sample configurations, use cases

### Documentation Standards

- Use clear, concise language
- Include practical examples
- Keep configuration examples generic (no company-specific references)
- Test all code examples
- Use proper markdown formatting
- Include screenshots/diagrams where helpful

### Documentation Workflow

1. **Identify documentation needs** (outdated guides, missing examples)
2. **Create or update content** following project standards
3. **Test examples** to ensure they work
4. **Submit pull request** with documentation changes

## 🏷️ Release Process

GitFlow Analytics uses automated semantic versioning:

1. **Conventional Commits** determine version bumps
2. **GitHub Actions** handles automated releases
3. **PyPI Publishing** happens automatically on version tags

### For Maintainers

The release process is fully automated:
1. Merge pull requests to main
2. GitHub Actions analyzes commits
3. Version is bumped automatically
4. Releases are published to PyPI
5. GitHub releases are created with changelogs

## ❓ Getting Help

- **GitHub Issues**: Bug reports and feature requests
- **GitHub Discussions**: Questions, ideas, and community support
- **Documentation**: Check README and CLAUDE.md first
- **Code Examples**: Review sample configurations and tests

## 🎉 Recognition

Contributors are recognized through:
- The **GitHub Contributors** section
- **Release Notes** mentioning significant contributions
- **Community Appreciation** in discussions and issues

Thank you for contributing to GitFlow Analytics! 🚀