A system for collecting and analyzing software sustainability metrics for scientific open-source software.
This framework collects metrics from multiple sources and integrates with the CORSA Sustainability Dashboard.
- Multi-Source Data Collection: GitHub, Semantic Scholar, OpenAlex, Zenodo
- Orchestrated Workflows: Configurable collection pipelines
- CASS Framework: Four dimensions - Impact, Community, Viability, Quality
- Dashboard Integration: Generates JSON data for the CORSA dashboard
- Automated Collection: GitHub Actions workflows
- Extensible Framework: Modular collector design for incremental implementation
- Python 3.11+
- Git
# Clone repository
git clone https://github.com/brtnfld/metrics
cd metrics
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

# Configure your workflow
cp config/api_credentials.yaml.example config/api_credentials.yaml
# Edit config/api_credentials.yaml with your API keys
# Run orchestrator
python orchestrator.py config/orchestrator.yaml

# Set environment variables (optional but recommended)
export GITHUB_TOKEN="your_github_token"
export SEMANTIC_SCHOLAR_KEY="your_api_key"
export OPENALEX_EMAIL="your-email@example.com"
# Generate metrics
python scripts/generate_corsa_citations.py \
  --catalog config/software_catalog.yaml \
  --output output/citationMetrics.json

The framework follows the CASS (Consortium for the Advancement of Scientific Software) sustainability model with four main dimensions:
| Dimension | Status | Description |
|---|---|---|
| Impact | ✅ Implemented | Software citation, adoption, and field research impact |
| Community | ✅ Partially Implemented | CoC/governance, licensing, maintenance, engagement, community health |
| Viability | ✅ Implemented | Long-term sustainability, security, and licensing |
| Quality | ✅ Partially Implemented | CI/CD practices, accessibility, reproducibility, OpenSSF badge/scorecard |
Each dimension contains multiple sub-categories and metrics that contribute to an overall sustainability score.
metrics/
├── collectors/ # CASS dimension collectors
│ ├── impact/
│ │ ├── citation.py # Citation metrics (✅ implemented)
│ │ └── dimension.py # Impact dimension
│ ├── sustainability/
│ │ ├── chaoss_governance.py # 4.2.1 CoC, governance & contributor guidelines
│ │ ├── licensing.py # 4.2.2 Open-source licensing & FAIR compliance
│ │ ├── active_maintenance.py # 4.2.3 Active maintenance
│ │ ├── engagement.py # 4.2.4 Community engagement
│ │ ├── community_health.py # 4.2.10 Project longevity & community health
│ │ ├── openssf_badge.py # OpenSSF best practices badge
│ │ ├── openssf_scorecard.py # OpenSSF scorecard
│ │ └── dimension.py # Sustainability dimension aggregator
│ ├── quality/
│ │ ├── development_practices/
│ │ │ └── ci_cd.py # 4.3.2 CI/CD development practices
│ │ ├── reproducibility.py # 4.3.3 Reproducibility
│ │ ├── accessibility.py # 4.3.5 Accessibility (portable build systems)
│ │ └── dimension.py # Quality dimension aggregator
│ └── catalog_sync.py # Catalog synchronization
│
├── integrations/ # API integrations
│ ├── base.py # Base API client
│ ├── github_api.py # GitHub API
│ ├── semantic_scholar.py # Semantic Scholar API
│ ├── openalex.py # OpenAlex API
│ └── zenodo.py # Zenodo API
│
├── scripts/
│ └── generate_corsa_citations.py # CORSA integration
│
├── config/
│ ├── orchestrator.yaml # Workflow configuration
│ ├── software_catalog.yaml # Software catalog
│ └── api_credentials.yaml.example
│
├── .github/workflows/
│ └── collect-and-sync.yml # Automated collection
│
└── orchestrator.py # Main orchestrator
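The modular layout above lends itself to a small common interface per collector. The following is a minimal sketch of that idea, not the project's actual classes: `Collector`, `LicenseCollector`, `SecurityCollector`, `run_collectors`, and the shape of the `software` dictionary are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class Collector(ABC):
    """Hypothetical base class: each collector gathers one group of metrics."""
    name: str

    @abstractmethod
    def collect(self, software: dict) -> dict:
        """Return a dictionary of metrics for one catalog entry."""


class LicenseCollector(Collector):
    name = "licensing"

    def collect(self, software):
        # Assumes the catalog entry may carry a "license" field.
        license_id = software.get("license")
        return {"has_license": license_id is not None, "license": license_id}


class SecurityCollector(Collector):
    name = "security"

    def collect(self, software):
        # Placeholder for a collector that has not been written yet.
        raise NotImplementedError


def run_collectors(collectors, software):
    """Run every collector, recording unfinished ones instead of failing."""
    results = {}
    for collector in collectors:
        try:
            results[collector.name] = collector.collect(software)
        except NotImplementedError:
            results[collector.name] = {"status": "not_implemented"}
    return results


entry = {"name": "example-package", "license": "MIT"}
print(run_collectors([LicenseCollector(), SecurityCollector()], entry))
```

Treating `NotImplementedError` as a reportable status is one way to let new collectors be added incrementally without breaking a full run.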
Environment variables (all optional, but setting them gives better rate limits):
| Variable | Description |
|---|---|
| GITHUB_TOKEN | GitHub personal access token |
| SEMANTIC_SCHOLAR_KEY | Semantic Scholar API key |
| OPENALEX_EMAIL | Email for OpenAlex polite pool |
| ZENODO_TOKEN | Zenodo access token |
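To show how these variables typically map onto API requests, here is a minimal sketch that reads them and builds per-service headers and query parameters. The function name `build_request_settings` and the returned dictionary layout are assumptions for illustration, and the project's own integration clients may wire credentials differently. The header and parameter names follow each service's public API conventions: a bearer token for GitHub and Zenodo, an `x-api-key` header for Semantic Scholar, and a `mailto` query parameter for the OpenAlex polite pool.

```python
import os


def build_request_settings(env=None):
    """Map optional credentials to request headers/params (illustrative only)."""
    env = os.environ if env is None else env
    services = ("github", "semantic_scholar", "openalex", "zenodo")
    settings = {name: {"headers": {}, "params": {}} for name in services}

    if token := env.get("GITHUB_TOKEN"):
        settings["github"]["headers"]["Authorization"] = f"Bearer {token}"
    if key := env.get("SEMANTIC_SCHOLAR_KEY"):
        settings["semantic_scholar"]["headers"]["x-api-key"] = key
    if email := env.get("OPENALEX_EMAIL"):
        settings["openalex"]["params"]["mailto"] = email
    if token := env.get("ZENODO_TOKEN"):
        settings["zenodo"]["headers"]["Authorization"] = f"Bearer {token}"
    return settings


# Report which services have credentials without printing any secret values.
for service, cfg in build_request_settings().items():
    configured = bool(cfg["headers"] or cfg["params"])
    print(f"{service}: {'configured' if configured else 'anonymous'}")
```

Because every variable is optional, unset ones simply leave the corresponding service in anonymous mode with its default rate limits.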
See CONFIGURATION.md for detailed setup.
- CONFIGURATION.md - Configuration details
- ORCHESTRATOR_GUIDE.md - Orchestrator usage
- QUICK_START.md - Getting started guide
- QUICK_REFERENCE.md - Command reference
- CASS-Sustainability-Metrics-Report.pdf - CASS framework specification
- Semantic Scholar: Academic citations
- OpenAlex: Citation database
- Zenodo: DOI resolution and downloads
- GitHub: Repository metadata and dependents
MIT License - See LICENSE for details.
- CORSA Dashboard: info@corsa.center
- Issues: GitHub Issues