An efficient and faithful Python implementation of the Bounded Multi-Source Shortest Path (BMSSP) algorithm, along with baselines using pure-Python Dijkstra and SciPy-accelerated Dijkstra.
This implementation follows the algorithmic structure in the pseudocode (Algorithm 3: BMSSP) and preserves its key guarantees, including never writing distances ≥ B, not splitting tie groups in the controller structure D, and correctly maintaining boundary keys.
Paper reference: "A New Algorithm for the Single-Source Shortest Path Problem" (Ruobing Chen et al.).
BMSSP solves SSSP by recursively partitioning the problem. Compared to classic Dijkstra (≈ O(m + n log n)), BMSSP leverages a divide-and-conquer strategy with theoretical complexity O(m log^{2/3} n) and a specialized controller structure D to orchestrate bounded subproblems efficiently.
- `bmssp_py/common.py`: `Graph` and `Edge` classes, plus an `init_distances()` helper. Exposes `Graph.to_csr()` for SciPy integration.
- `bmssp_py/dijkstra.py`: Pure-Python Dijkstra (multi-source, optional distance boundary).
- `bmssp_py/dijkstra_scipy.py`: SciPy-accelerated Dijkstra using `scipy.sparse.csgraph.dijkstra` with optimized multi-source support.
- `bmssp_py/bmssp.py`: Full BMSSP implementation with recursion, base case, pivot selection, the `_DataStructureD` controller, and the `choose_parameters()` function.
- `tests/`: Unit tests and benchmark tests (pytest-benchmark).
- `example.py`: Minimal example to run BMSSP.
- `scripts/quick_bench.py`: Fast micro-benchmarks with beautiful terminal output and performance analysis.
- `scripts/performance_analysis.py`: Detailed stability analysis and performance-variation study.
- `scripts/advanced_benchmark.py`: Comprehensive performance analysis with sophisticated graph generators, memory profiling, and crossover-point detection.
- `scripts/enterprise_benchmark.py`: Enterprise-level benchmarking with parallel execution, statistical analysis, and hardware-utilization monitoring.
- Initialize and Insert (lines 6–7): `_DataStructureD.initialize(M, B)` and insertion of pivots with their current estimates.
- Pull (line 8): `_DataStructureD.pull()` drains the entire tie group for the smallest key even when `M = 1`, ensuring tie groups are not split. It returns `B_i` as the next strictly larger key (or `B` if none), exactly as required.
- BaseCase (line 2): `_base_case` runs a bounded Dijkstra-like search and never writes distances ≥ B. The base-case singleton/split behavior mirrors the pseudocode.
- Relaxations (lines 13–19): After recursion, edges out of `U_i` are relaxed with the ≤ condition, but distances ≥ B are never written; insertions and the `K` set respect `[B'_i, B_i)`.
- BatchPrepend (line 21): `_DataStructureD.batch_prepend` merges the smallest distance per vertex and inserts them as a batch.
- Return (line 22): Returns `B' = min{B'_i, B}` and augments `U` with all `x ∈ W` whose current estimate is < `B'`.
We also provide regression tests to ensure tie groups are fully drained by `_DataStructureD.pull()`.
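The tie-group guarantee can be illustrated with a minimal stand-alone queue. This is a sketch for exposition, not the repo's `_DataStructureD` (which additionally supports `batch_prepend` and block partitioning by `M`): it shows how `pull()` can drain the entire tie group at the minimum key and report the next strictly larger key, or `B` if none remains.

```python
import heapq

class TieGroupQueue:
    """Illustrative sketch: a heap plus a best-key-per-vertex map,
    where pull() never splits a tie group."""

    def __init__(self, B):
        self.B = B          # global upper bound; keys >= B are never stored
        self.heap = []      # (key, vertex) entries, possibly stale
        self.best = {}      # smallest key currently held per vertex

    def insert(self, v, key):
        if key >= self.B:
            return          # never store keys >= B
        if key < self.best.get(v, float("inf")):
            self.best[v] = key
            heapq.heappush(self.heap, (key, v))

    def _sweep_stale(self):
        # drop heap entries that no longer match the vertex's best key
        while self.heap and self.best.get(self.heap[0][1]) != self.heap[0][0]:
            heapq.heappop(self.heap)

    def pull(self):
        """Return (S, B') where S is the full tie group at the minimum key
        and B' is the next strictly larger key (or B if none)."""
        self._sweep_stale()
        if not self.heap:
            return set(), self.B
        k_min = self.heap[0][0]
        group = set()
        while self.heap and self.heap[0][0] == k_min:
            key, v = heapq.heappop(self.heap)
            if self.best.get(v) == key:
                group.add(v)
                del self.best[v]
        self._sweep_stale()
        next_key = self.heap[0][0] if self.heap else self.B
        return group, next_key
```

For example, inserting vertices 0 and 1 at key 3.0 and vertex 2 at key 5.0 makes the first `pull()` return the whole group `{0, 1}` together with the boundary 5.0, rather than splitting the tie.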
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Requirements: Python 3.9–3.12, pytest, pytest-benchmark, numpy, scipy>=1.3.0, psutil, matplotlib, networkx.
Note: SciPy 1.3.0+ is required for the `min_only` parameter in `scipy.sparse.csgraph.dijkstra`, which enables optimized multi-source shortest-path computation.
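A small example of what `min_only=True` does, on a toy path graph of my own (the graph is illustrative, not from the repo): instead of returning one distance row per source, SciPy runs a single search and reports each vertex's distance to its nearest source.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# Undirected path 0 -1- 1 -2- 2 -1- 3, stored symmetrically in CSR form
row = [0, 1, 1, 2, 2, 3]
col = [1, 0, 2, 1, 3, 2]
w = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]
graph = csr_matrix((w, (row, col)), shape=(4, 4))

# min_only=True collapses all sources into one search: dist[v] is the
# distance from the *nearest* of the given sources, as a 1-D array,
# rather than a (num_sources x n) matrix.
dist = dijkstra(graph, directed=False, indices=[0, 3], min_only=True)
print(dist)  # distance of each vertex to the nearest of sources {0, 3}
```

This single-search behavior is what makes the SciPy baseline's multi-source runs fast compared with looping `dijkstra` once per source.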
from bmssp_py.common import Graph
from bmssp_py.bmssp import BMSSPAlgorithm, choose_parameters
g = Graph(5)
g.add_undirected(0, 1, 1.0)
g.add_undirected(1, 2, 2.0)
g.add_undirected(2, 3, 1.0)
g.add_undirected(3, 4, 3.0)
# Use adaptive parameters
k, t, l = choose_parameters(g.n, mode="performance")
algo = BMSSPAlgorithm(g, l=l, B=1000.0, sources=[0], k_override=k, t_override=t)
dist = algo.solve()
print(f"Parameters: k={k}, t={t}, l={l}")
print(f"Metrics: {algo.metrics}")

Run the example:

PYTHONPATH=. python example.py

BMSSP uses three key parameters derived from the graph size n:
- k = ⌊log^(1/3) n⌋: controls pivot selection and base-case size
- t = ⌊log^(2/3) n⌋: determines the recursion branching factor
- l = ⌈log n / t⌉: number of recursion levels
The `choose_parameters(n, mode)` function provides three tuning modes:

- performance (default): reduces k and increases t for small graphs to minimize overhead
- completeness: increases k for better exploration; may use more recursion levels
- balanced: uses the theoretical parameters k = ⌊log^(1/3) n⌋, t = ⌊log^(2/3) n⌋
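The balanced formulas above can be computed directly. This is a sketch under stated assumptions (base-2 logarithm, a floor of 1 for tiny graphs), not the repo's `choose_parameters()`, which additionally applies the per-mode adjustments described above:

```python
import math

def theoretical_parameters(n):
    """Balanced-mode sketch: k = floor(log^(1/3) n), t = floor(log^(2/3) n),
    l = ceil(log(n) / t). Assumes log base 2 and clamps everything to >= 1."""
    log_n = max(math.log2(n), 1.0)  # guard n <= 2
    k = max(1, math.floor(log_n ** (1 / 3)))
    t = max(1, math.floor(log_n ** (2 / 3)))
    l = max(1, math.ceil(log_n / t))
    return k, t, l

# For n = 1,000,000: log2(n) ~ 19.93, so k = 2, t = 7, l = 3
print(theoretical_parameters(1_000_000))
```

Note how slowly k and t grow: even at a million vertices the recursion is only three levels deep, which is why small graphs are dominated by constant overhead rather than asymptotics.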
Both `BMSSPAlgorithm` and `DijkstraAlgorithm` track execution metrics:

- BMSSP: `edge_relaxations`, `basecase_pq_pops`, `controller_inserts`, `vertices_discovered`
- Dijkstra: `edge_relaxations`, `pq_pops`, `vertices_processed`
These provide hardware-independent performance measures for algorithm comparison.
PYTHONPATH=. pytest -q

Highlights:

- tests/test_bmssp.py: BMSSP correctness (linear, cycle, disconnected, fractional weights, boundary behavior).
- tests/test_dijkstra.py: Dijkstra correctness (single/multi-source, bounded, cycle, disconnected).
- tests/test_data_structure_d.py: Ensures `_DataStructureD.pull()` drains tie groups and keeps the smallest values on batch prepend.
Fast micro-benchmarks with execution metrics:
# Default performance mode, 3 repetitions
PYTHONPATH=. python scripts/quick_bench.py --reps 3
# Completeness mode, 5 repetitions
PYTHONPATH=. python scripts/quick_bench.py --mode completeness --reps 5

- Beautiful Terminal Output: Color-coded results with formatted tables
- Performance Analysis: Automatic winner detection and improvement calculations
- Algorithm Metrics: Detailed execution statistics for each algorithm
- Recommendations: Smart suggestions based on results
Tests linear (n=100), grid (10×10), and random (n=200, m=400) graphs, showing:
- Runtime statistics (mean, median, stdev, min, max)
- Performance improvements vs baseline
- BMSSP parameters used (k, t, l)
- Execution metrics for both algorithms
Comprehensive analysis of performance variations and stability:
# Run detailed stability analysis
PYTHONPATH=. python scripts/performance_analysis.py

Performance variations between runs are normal and expected due to:
System Factors:
- CPU frequency scaling and thermal throttling
- Background processes and system load
- Memory allocation patterns and garbage collection
- Cache warming and CPU branch prediction
Python-Specific Factors:
- Python's garbage collector timing
- Memory allocation and deallocation patterns
- Import time and module loading
Algorithm-Specific Factors:
- Random number generation (for graph generation)
- Hash table collisions and resizing
- Memory access patterns and cache misses
- Heap operations and priority queue behavior
- Coefficient of Variation (CV): Measures consistency (lower = more stable)
- CV < 5%: Excellent stability
- CV 5-10%: Good stability
- CV 10-20%: Fair stability
- CV > 20%: Poor stability
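The CV thresholds above are easy to apply to your own timing samples. A minimal sketch (the timing values are made up for illustration):

```python
import statistics

def coefficient_of_variation(samples):
    """CV = stdev / mean, as a percentage; lower means more stable timings."""
    mean = statistics.mean(samples)
    return 100.0 * statistics.stdev(samples) / mean

# Hypothetical benchmark timings in seconds from five repetitions
runs = [0.101, 0.098, 0.105, 0.097, 0.102]
cv = coefficient_of_variation(runs)
print(f"CV = {cv:.1f}%  ({'stable' if cv < 10.0 else 'unstable'})")
```

Here the five runs vary by a few milliseconds around ~0.1 s, giving a CV of roughly 3%, i.e. "excellent stability" on the scale above.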
- Use multiple runs (10+ recommended for reliable results)
- Report mean/median with standard deviation
- Consider CV < 10% as stable
- Run on dedicated hardware for best results
- Single measurements are not reliable for performance comparison
Comprehensive performance analysis with sophisticated graph generators, memory profiling, and crossover point detection:
# Basic run with default settings
PYTHONPATH=. python scripts/advanced_benchmark.py
# Custom configuration
PYTHONPATH=. python scripts/advanced_benchmark.py \
--sizes 32 64 128 256 512 \
--types sparse_random scale_free small_world \
--trials 5 \
--mode balanced

- Sophisticated Graph Generators: Scale-free, small-world, grid, sparse random graphs
- Memory Profiling: Peak memory usage tracking with `psutil`
- Crossover Point Detection: Automatic identification of when BMSSP becomes faster
- Theoretical vs Empirical Analysis: Comparison of complexity predictions vs actual performance
- Comprehensive Reporting: Detailed markdown reports and visualization plots
- Multiple Baselines: Comparison against both pure-Python and SciPy-accelerated Dijkstra
- ADVANCED_PERFORMANCE_REPORT.md: Detailed analysis with crossover points and performance tables
- advanced_performance_analysis.png: Four-panel visualization showing runtime comparison, improvement percentages, memory usage, and theoretical vs empirical analysis
High-performance benchmarking designed for maximum hardware utilization with parallel execution, statistical analysis, and enterprise-level reporting:
# Basic enterprise run
PYTHONPATH=. python scripts/enterprise_benchmark.py
# High-performance configuration for powerful machines
PYTHONPATH=. python scripts/enterprise_benchmark.py \
--sizes 64 128 256 512 1024 2048 4096 \
--types sparse_random scale_free small_world hierarchical power_law \
--trials 10 \
--mode balanced \
--workers 16

- Parallel Execution: Multi-process benchmarking using all available CPU cores
- Advanced Graph Topologies: Hierarchical, power-law, and community-structured graphs
- Statistical Robustness: Confidence intervals, coefficient of variation, and statistical significance testing
- Hardware Monitoring: Real-time CPU and memory utilization tracking
- Enterprise Reporting: Comprehensive markdown reports with statistical analysis
- Adaptive Configuration: Automatic worker scaling based on hardware capabilities
- Progress Tracking: Real-time progress monitoring with ETA calculations
- results/ENTERPRISE_PERFORMANCE_REPORT.md: Comprehensive statistical analysis
- results/enterprise_performance_analysis.png: Four-panel visualization with confidence intervals
- Detailed performance metrics including percentiles, stability analysis, and hardware utilization
This repo includes four families of benchmarks:
- Quick Benchmarks: Fast micro-benchmarks with beautiful output for development and testing
- Performance Analysis: Detailed stability analysis and variation study
- Advanced Benchmarks: Comprehensive analysis with sophisticated graph generators
- Enterprise Benchmarks: High-performance parallel benchmarking for maximum hardware utilization
Pure-Python comparison (BMSSP vs Dijkstra):
- Linear graphs (small/medium/large)
- Grid graphs (small/medium)
- Random graphs (sparse/dense)
- Multi-source and bounded runs
SciPy-accelerated baseline (Dijkstra via SciPy) vs BMSSP on linear graphs:
- Demonstrates a fast compiled baseline for Dijkstra using `scipy.sparse.csgraph.dijkstra`.
- Optimized multi-source computation using the `min_only=True` parameter (SciPy 1.3.0+).
Run only the SciPy benchmarks (fast):
PYTHONPATH=. pytest -q -k scipy --benchmark-only

Run the whole benchmark suite (short runs by default):

PYTHONPATH=. pytest -q -k benchmark --benchmark-only

Benchmark config (short by default) is set in pytest.ini:
--benchmark-max-time=0.2
--benchmark-min-rounds=5
Tips for larger runs:
- Increase time/rounds on the command line, e.g. `--benchmark-max-time=2 --benchmark-min-rounds=10`.
- Select specific groups, e.g. `-k comparison_random_sparse` or `-k comparison_grid_medium`.
- Throughput and latency of complete runs for each algorithm on identically constructed graphs and sources.
- Impact of recursion depth `l` and boundary `B` on BMSSP performance.
- Effect of topology (linear vs grid vs random sparse/dense) and number of sources.
- With SciPy, a compiled Dijkstra baseline to gauge the overheads of Python vs an optimized implementation.
- On small/simple graphs, Dijkstra (especially SciPy) generally outperforms BMSSP due to lower constant overhead.
- As graph sizes grow and in certain sparsity/topology regimes, BMSSP's bounded recursion can reduce the number of effective relaxations. In pure Python, the crossover point can require larger instances than in low-level languages.
- Performance variations are normal - use multiple runs and statistical analysis for reliable comparisons.
- CPU: 2+ cores, 2.0+ GHz
- RAM: 4 GB
- Storage: 1 GB free space
- OS: Linux/macOS/Windows with Python 3.9+
- CPU: 4+ cores, 2.5+ GHz (Intel i5/AMD Ryzen 5 or better)
- RAM: 8 GB
- Storage: 2 GB free space
- OS: Linux/macOS with Python 3.9+
- CPU: 8+ cores, 3.0+ GHz (Intel i7/AMD Ryzen 7 or better)
- RAM: 16+ GB
- Storage: 5+ GB free space (SSD recommended)
- OS: Linux/macOS with Python 3.9+
- CPU: 16+ cores, 3.5+ GHz (Intel i9/AMD Ryzen 9, Xeon, or EPYC)
- RAM: 32+ GB (64+ GB for very large graphs)
- Storage: 10+ GB free space (NVMe SSD recommended)
- OS: Linux/macOS with Python 3.9+
- Parallel Processing: Support for multiprocessing with high core count
- CPU: 4+ cores recommended for parallel processing
- RAM: 8+ GB for large graph generation and memory profiling
- Graphics: Not required (plots are generated as PNG files)
- Network: Not required (all processing is local)
- CPU: 8+ cores for optimal parallel execution
- RAM: 16+ GB for large-scale graph generation and statistical analysis
- Storage: Fast SSD for I/O-intensive operations
- Process Isolation: Support for multiprocessing with memory isolation
- Small graphs (n ≤ 1,000): Any modern laptop
- Medium graphs (n ≤ 10,000): Recommended specs or better
- Large graphs (n ≤ 100,000): High-performance specs required
- Very large graphs (n > 100,000): Enterprise specs recommended
- Massive graphs (n > 1,000,000): Server-grade hardware required
- Quick benchmarks: 10-30 seconds
- Performance analysis: 1-3 minutes
- Standard benchmarks: 1-5 minutes
- Advanced benchmarks: 5-30 minutes (depending on graph sizes and trials)
- Enterprise benchmarks: 10-60 minutes (depending on hardware and configuration)
- Large-scale testing: 1-6 hours
- CPU: Enterprise benchmarks automatically scale to use all available cores
- Memory: Efficient memory management with garbage collection optimization
- Storage: Results are saved incrementally to prevent data loss
- Network: All processing is local for maximum performance
- `_DataStructureD` uses a single global heap and a map of best keys per vertex. `pull()` drains a full tie group for the current minimum key before advancing `B_i`. This strictly follows the pseudocode and prevents the tie-splitting bugs common in multi-heap designs.
- BMSSP never writes distances ≥ B (base case and relaxations). Batch inserts keep only the smallest value per vertex.
- Pivot selection runs `k` rounds of ≤-relaxations within bound `B`, constructs the equality forest, and selects pivots with subtree size ≥ `k`.
- SciPy integration uses `min_only=True` for multi-source shortest paths when available (SciPy 1.3.0+), providing significant performance improvements over manual iteration.
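The k rounds of bounded ≤-relaxations used by pivot selection can be sketched as follows. This is a hypothetical helper for illustration, not the repo's code; the `(u, v, w)` edge triples and the `dist` dict are assumptions of the sketch, which also omits the equality-forest construction:

```python
def bounded_relax_rounds(edges, dist, B, k):
    """Sketch: run up to k rounds of <=-relaxations, never writing a
    distance >= B. `edges` is a list of (u, v, w) triples; `dist` maps
    vertex -> current estimate (missing vertices are treated as infinity).
    Returns the set of vertices whose estimate was set by a <=-relaxation."""
    touched = set()
    for _ in range(k):
        improved = False
        for u, v, w in edges:
            nd = dist.get(u, float("inf")) + w
            # <= condition, but estimates >= B are never written
            if nd < B and nd <= dist.get(v, float("inf")):
                if nd < dist.get(v, float("inf")):
                    improved = True
                dist[v] = nd
                touched.add(v)
        if not improved:
            break  # a fixed point within the bound was reached early
    return touched

# On a path 0 -> 1 -> 2 -> 3 with unit weights and B = 2.5, vertex 3
# (distance 3.0) is never written because its estimate exceeds B.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
dist = {0: 0.0}
touched = bounded_relax_rounds(edges, dist, B=2.5, k=3)
```

In the real algorithm the vertices touched within these rounds feed the equality forest, from which vertices rooting subtrees of size ≥ k are selected as pivots.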
If you use this repository in academic work, please cite the BMSSP paper (Chen et al.) and reference this implementation.