A Secure Federated and Agentic AI Framework for Multilingual Disability-Inclusive Employment in AI Cities
Syed, Toqeer Ali · Siddiqui, Muhammad Shoaib · Ali Akarma. Frontiers in Artificial Intelligence, 2026
FedAgent-Chain is a unified framework that enables distributed institutions (public employment agencies, universities, rehabilitation centres, employers, and assistive-technology providers) to collaboratively train inclusive employment models without centralising sensitive disability data.
The system integrates five technology pillars into a single trustworthy pipeline:
| Pillar | Purpose |
|---|---|
| Federated Learning | Privacy-preserving distributed model training across 4 heterogeneous regional nodes |
| Fairness-Aware Aggregation | A λ-penalty mechanism in FedAvg designed to reduce disparity across disability categories |
| Permissioned Blockchain | Immutable audit trail for model updates and consent management |
| Agentic AI Services | 5 specialised agents for matching, upskilling, accommodation, multilingual support, and governance |
| Differential Privacy | Gradient clipping and calibrated noise injection for formal privacy guarantees |
The framework achieves competitive federated performance while providing trustworthy orchestration, governance-aware decision support, and blockchain-backed auditability, capabilities that standard federated baselines lack.
- Key Contributions
- System Architecture
- Installation
- Quick Start: Reproduce in 3 Steps
- Empirical Results
- Figure Showcase
- Fairness & Heterogeneity Analysis
- Systems Performance
- Qualitative Agentic AI Demonstrations
- Reproducibility & Scientific Hardening
- Repository Structure
- Advanced Workflows
- Limitations & Ethical Considerations
- Paper Writing Resources
- Citation
- License
- Unified trustworthy architecture combining federated learning, permissioned blockchain, and agentic AI for disability-inclusive employment across multilingual, multi-institutional, cross-country settings.
- Fairness-Aware FedAvg: A formalised fairness penalty (λ) integrated into the federated optimisation objective, with an empirical Pareto frontier characterising the accuracy–fairness tradeoff.
- Permissioned blockchain audit layer: Consent traceability, cryptographic model-update hashing, and smart-contract-based access control, without storing raw disability data on-chain.
- Five specialised agentic AI services: Employment matching, adaptive upskilling, workplace accommodation, multilingual communication, and human-in-the-loop governance.
- Reproducible prototype simulation: A four-node cross-country simulation (Saudi Arabia, United States, China, Europe) with synthetic disability-employment data, validated across 5 independent seeds (42, 123, 2024, 777, 999).
- Comprehensive systems profiling: Full runtime breakdown, communication cost analysis, and scalability discussion with analytical complexity bounds.
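The audit-layer contribution above (hashing model updates and keeping raw data off-chain) can be illustrated with a minimal hash-chain sketch. This is not the repository's blockchain module: the record fields and the function names `hash_update`, `append_block`, and `verify_chain` are assumptions made for illustration, and only Python's standard `hashlib` and `json` are used.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first record (assumption)

def hash_update(serialized_update: bytes) -> str:
    """SHA-256 digest of a serialized model update; the update itself stays off-chain."""
    return hashlib.sha256(serialized_update).hexdigest()

def _digest(record: dict) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes identically.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_block(chain: list, node_id: str, round_idx: int, update_hash: str) -> dict:
    """Append an audit record that commits to the previous record's hash."""
    record = {
        "node_id": node_id,
        "round": round_idx,
        "update_hash": update_hash,
        "prev_hash": chain[-1]["block_hash"] if chain else GENESIS,
    }
    record["block_hash"] = _digest(record)
    chain.append(record)
    return record

def verify_chain(chain: list) -> bool:
    """Recompute every hash and check that each record links to its predecessor."""
    prev = GENESIS
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "block_hash"}
        if rec["prev_hash"] != prev or _digest(body) != rec["block_hash"]:
            return False
        prev = rec["block_hash"]
    return True

# Illustrative usage with made-up payloads.
chain = []
append_block(chain, "saudi_arabia", 1, hash_update(b"serialized-weights-A"))
append_block(chain, "europe", 1, hash_update(b"serialized-weights-B"))
```

Because each record stores only a digest, tampering with any logged update hash is detectable by `verify_chain`, while the model weights and any disability data never enter the ledger.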
┌──────────────────────────────────────────────────────────────┐
│ Layer 1: Users & Stakeholders                                │
│ Persons with Disabilities · Employers · Vocational Advisors  │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────────────┐
│ Layer 2: Data Ingestion                                      │
│ Synthetic Dataset · O*NET · ESCO · Regional Disability Data  │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────────────┐
│ Layer 3: Institutional Nodes (4 Countries)                   │
│ Saudi Arabia | United States | China | Europe                │
│ Local Training · Data Preprocessing · Consent Management     │
└─────────────────────────┬────────────────────────────────────┘
                          │ DP-Protected Model Updates
┌─────────────────────────▼────────────────────────────────────┐
│ Layer 4: Security & Privacy                                  │
│ Differential Privacy (ε,δ) · LayerNorm Stabilisation         │
│ Gradient Clipping (C=1.0) · Noise Multiplier (σ=0.1)         │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────────────┐
│ Layer 5: Federated Aggregation                               │
│ Standard FedAvg | Fairness-Aware FedAvg (λ-penalty)          │
│ Weight formula: ω_i = 1 + λ · min-group-F1_i                 │
└─────────┬──────────────────────────────┬─────────────────────┘
          │                              │ Audit Hashes
┌─────────▼───────────────┐  ┌───────────▼─────────────────────┐
│ Layer 7: Agentic AI     │  │ Layer 6: Blockchain             │
│ Employment Matching     │  │ Permissioned Ledger             │
│ Upskilling Agent        │  │ SHA-256 Hash Chain              │
│ Accommodation Agent     │  │ Smart Contracts                 │
│ Multilingual Agent      │  │ Consent Logger                  │
│ Governance Agent        │  │ Audit Trail                     │
└─────────────────────────┘  └─────────────────────────────────┘
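The security layer in the diagram lists gradient clipping with C=1.0 and a noise multiplier of 0.1. A minimal sketch of that clip-then-noise step follows, under two assumptions the source does not confirm: clipping is applied to a whole update vector by its L2 norm, and the Gaussian noise standard deviation is the noise multiplier times the clipping bound (the usual DP-SGD convention). The function name `clip_and_noise` is hypothetical.

```python
import math
import random

def clip_and_noise(update, clip_norm=1.0, noise_multiplier=0.1, rng=None):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise.

    Assumption: noise std = noise_multiplier * clip_norm (DP-SGD convention).
    """
    rng = rng or random.Random()
    l2 = math.sqrt(sum(x * x for x in update))
    # Scale down only if the update exceeds the clipping bound.
    scale = min(1.0, clip_norm / l2) if l2 > 0 else 1.0
    std = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, std) for x in update]

# Illustrative usage: a toy 2-D update with norm 5 is scaled to norm 1, then perturbed.
noisy = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.1,
                       rng=random.Random(42))
```

Clipping bounds each node's influence on the aggregate, which is what lets the added noise translate into an (ε,δ) guarantee; the guarantee itself depends on an accounting method that this sketch does not cover.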
Agent Descriptions:
| Agent | Role |
|---|---|
| Employment Matching | Scores user-job suitability using skill overlap, accommodation coverage, and language compatibility |
| Upskilling | Identifies skill gaps and recommends targeted training courses |
| Accommodation | Recommends workplace adaptations (e.g., screen readers, ergonomic setups) based on disability profiles |
| Multilingual | Provides cross-lingual communication plans between user and employer language environments |
| Governance | Flags high-risk recommendations for mandatory human review; enforces policy compliance |
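To make the Employment Matching row concrete, here is a minimal scoring sketch that combines the three signals named in the table: skill overlap, accommodation coverage, and language compatibility. The weights (0.5, 0.3, 0.2), the function name `match_score`, and the way each component is computed are illustrative assumptions, not the repository's actual scoring rule.

```python
def match_score(user_skills, job_skills, needed_accommodations,
                offered_accommodations, user_languages, job_language,
                weights=(0.5, 0.3, 0.2)):
    """Weighted suitability score in [0, 1]; the weights are hypothetical."""
    user_skills, job_skills = set(user_skills), set(job_skills)
    needed, offered = set(needed_accommodations), set(offered_accommodations)

    # Fraction of the job's required skills that the user already has.
    skill = len(user_skills & job_skills) / len(job_skills) if job_skills else 1.0
    # Fraction of the user's required accommodations that the employer offers.
    accommodation = len(needed & offered) / len(needed) if needed else 1.0
    # 1.0 if the user speaks the job's working language, else 0.0.
    language = 1.0 if job_language in set(user_languages) else 0.0

    w_skill, w_acc, w_lang = weights
    return w_skill * skill + w_acc * accommodation + w_lang * language

# Illustrative usage with made-up profiles.
score = match_score(
    user_skills={"sql", "python"},
    job_skills={"sql", "python", "statistics"},
    needed_accommodations={"screen_reader"},
    offered_accommodations={"screen_reader", "braille_display"},
    user_languages={"ar", "en"},
    job_language="en",
)
```

In this example the user covers two of three required skills and both other components are fully satisfied, so the missing skill is what an upskilling step would target.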
# Clone and enter the repository
git clone https://github.com/aliakarma/fedagent-chain.git
cd fedagent-chain
# Create isolated environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -e .
# Generate synthetic disability-employment data
python scripts/generate_synthetic_data.py \
--config configs/experiment/fedagent_chain_full.yaml --seed 42
# Run FedAgent-Chain simulation (repeat for seeds 123, 2024, 777, 999)
python scripts/run_federated_simulation.py --seed 42
# Run baselines (repeat for all seeds)
python scripts/run_baselines.py --seed 42
# Aggregate all 5 seeds into publication tables
python scripts/aggregate_multi_seed_results.py --seeds 42 123 2024 777 999
# Run full evaluation pipeline (generates confusion matrices)
python scripts/run_evaluation.py --seed 42
# Generate ablation comparison table
python scripts/generate_ablation_table.py
# Generate all publication figures
python scripts/generate_figures.py
python scripts/generate_lambda_tradeoff_plot.py
python scripts/generate_system_overhead_plots.py
# Run automated integrity check
python scripts/verify_submission_readiness.py
All outputs are saved to experiments/results/.
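The commands above ask for the simulation and the baselines to be repeated for each of the five seeds. A small driver like the following could script that repetition. It is a sketch: the script paths and the `--seed` flag are taken from the commands above, while the helper names `per_seed_commands` and `run_all` and the dry-run default are assumptions.

```python
import shlex
import subprocess

SEEDS = [42, 123, 2024, 777, 999]

def per_seed_commands(seed):
    """Commands the quick start asks to repeat for every seed."""
    return [
        ["python", "scripts/run_federated_simulation.py", "--seed", str(seed)],
        ["python", "scripts/run_baselines.py", "--seed", str(seed)],
    ]

def run_all(seeds=SEEDS, dry_run=True):
    """Print each command (dry run) or execute it, stopping on the first failure."""
    executed = []
    for seed in seeds:
        for cmd in per_seed_commands(seed):
            executed.append(cmd)
            if dry_run:
                print(shlex.join(cmd))
            else:
                subprocess.run(cmd, check=True)
    return executed

commands = run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would execute the ten runs sequentially before the aggregation step.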
All metrics are computed from trained model checkpoints evaluated on held-out test sets (stratified 80/20 split per node). Results below are aggregated over 5 independent seeds (42, 123, 2024, 777, 999).
| Method | F1 Mean | F1 Std | 95% CI |
|---|---|---|---|
| FedAgent-Chain | 0.7207 | 0.0565 | [0.6506, 0.7909] |
| Standard FedAvg | 0.7116 | 0.0718 | [0.6225, 0.8007] |
| Local Baseline | 0.5380 | 0.2753 | [0.1962, 0.8799] |
| Centralized | 0.7115 | 0.0238 | [0.6820, 0.7411] |
Source: experiments/results/statistics/table_2_multi_seed_summary.csv
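The 95% intervals in the table are consistent with a Student's t interval over the five seeds. The sketch below recomputes them from the reported means and standard deviations, assuming the standard deviation is the sample standard deviation and using the two-sided critical value for 4 degrees of freedom (about 2.776). Small differences in the last digit come from the rounding of the reported inputs.

```python
import math

N_SEEDS = 5
T_CRIT_DF4 = 2.776  # two-sided 95% critical value, Student's t with 4 d.o.f.

def ci95(mean, std):
    """t-based 95% confidence interval for a mean over N_SEEDS runs."""
    half_width = T_CRIT_DF4 * std / math.sqrt(N_SEEDS)
    return mean - half_width, mean + half_width

# Means and standard deviations as reported in the table above.
reported = {
    "FedAgent-Chain": (0.7207, 0.0565),
    "Standard FedAvg": (0.7116, 0.0718),
    "Local Baseline": (0.5380, 0.2753),
    "Centralized": (0.7115, 0.0238),
}
intervals = {name: ci95(m, s) for name, (m, s) in reported.items()}
```

The wide interval for the local baseline follows directly from its large seed-to-seed standard deviation.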
| Comparison | ΔF1 | t | p | Cohen's d | Sig. |
|---|---|---|---|---|---|
| FedAgent-Chain vs Standard FedAvg | +0.0091 | 0.44 | 0.679 | 0.20 | No |
| FedAgent-Chain vs Local Baseline | +0.1827 | 1.31 | 0.261 | 0.59 | No |
| FedAgent-Chain vs Centralized | +0.0092 | 0.61 | 0.575 | 0.27 | No |
Interpretation: FedAgent-Chain performs competitively with all baselines. The lack of statistical significance is expected with only n=5 seeds. The key contribution is not raw metric superiority but the integration of fairness, governance, and auditability within a federated paradigm.
Source: experiments/results/statistics/statistical_tests.csv
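The reported t, p, and Cohen's d values can be cross-checked against each other. The numbers are consistent with a paired t-test over the five seeds (4 degrees of freedom) and an effect size computed on the paired differences, so that d ≈ t / √n. That reading is an inference from the table, not something the source states. The sketch uses the closed-form CDF of Student's t with 4 degrees of freedom, so it needs only the standard library.

```python
import math

N_SEEDS = 5

def t_cdf_df4(t):
    """Closed-form CDF of Student's t distribution with 4 degrees of freedom."""
    u = t / math.sqrt(1.0 + t * t / 4.0)
    return 0.5 + 0.375 * u * (1.0 - u * u / 12.0)

def two_sided_p(t):
    """Two-sided p-value for a t statistic with 4 degrees of freedom."""
    return 2.0 * (1.0 - t_cdf_df4(abs(t)))

def paired_cohens_d(t, n=N_SEEDS):
    """Effect size implied by a paired t statistic: d = t / sqrt(n)."""
    return t / math.sqrt(n)

# (t, reported p, reported d) as listed in the table above.
rows = {
    "vs Standard FedAvg": (0.44, 0.679, 0.20),
    "vs Local Baseline": (1.31, 0.261, 0.59),
    "vs Centralized": (0.61, 0.575, 0.27),
}
recomputed = {k: (two_sided_p(t), paired_cohens_d(t)) for k, (t, _, _) in rows.items()}
```

The recomputed values agree with the table to within the rounding of the reported t statistics.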
| Agent | Metric | Score |
|---|---|---|
| Employment Matching | Mean Confidence | 0.6937 |
| Upskilling | Skill Gap Coverage | 1.0000 |
| Accommodation | Accommodation Coverage | 0.7308 |
| Multilingual | Language Adequacy | 0.9690 |
| Governance | High-Risk Detection Rate | 0.7333 |
| Governance | False Positive Rate | 0.0595 |
Source: experiments/results/seeds/seed_42/table_5_agent_results.csv
| Variant | F1 (Mean) | D_fair (Mean) | Runtime |
|---|---|---|---|
| Full System (λ=0.5) | 0.7207 | 0.1653 | 86.6s |
| Standard FedAvg (λ=0) | 0.7116 | 0.1610 | 86.6s |
Interpretation: The λ-penalty acts as a group-regularizer. It yields a modest gain in mean F1 (+0.0091, not statistically significant at n=5) with lower seed-to-seed variance (std 0.0565 vs. 0.0718) across heterogeneous nodes, while aggregate fairness disparity stays comparable (D_fair 0.1653 vs. 0.1610).
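The architecture describes the fairness-aware aggregator through a per-node weight of the form 1 + λ · min-group-F1_i. The sketch below shows one way such weights could enter a FedAvg-style average. The normalisation step, the omission of sample-size weighting (all four nodes hold 12,500 samples in this prototype), and the per-node F1 values are assumptions made for illustration.

```python
def fairness_weights(min_group_f1, lam):
    """Normalised node weights from w_i = 1 + lam * min-group-F1_i."""
    raw = [1.0 + lam * f1 for f1 in min_group_f1]
    total = sum(raw)
    return [w / total for w in raw]

def aggregate(node_params, weights):
    """Weighted average of per-node parameter vectors (plain Python lists)."""
    dim = len(node_params[0])
    return [sum(w * params[j] for w, params in zip(weights, node_params))
            for j in range(dim)]

# Hypothetical worst-group F1 per node (illustrative values, not measured results).
min_f1 = [0.60, 0.70, 0.65, 0.40]
params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

uniform = fairness_weights(min_f1, lam=0.0)   # lam = 0 recovers equal weighting
weighted = fairness_weights(min_f1, lam=0.5)
global_params = aggregate(params, weighted)
```

With λ=0 every node receives weight 0.25, which matches the Standard FedAvg row of the ablation when node sizes are equal; increasing λ shifts weight toward nodes with a higher worst-group F1.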
The following publication-quality figures are available in paper_figures/:
| Figure | Description | File |
|---|---|---|
| Convergence (CI) | FL training dynamics with 95% confidence bands across 5 seeds | paper_figures/fl_convergence.pdf |
| Node F1 Scores | Per-region performance comparison across all methods | paper_figures/node_f1_scores.pdf |
| Fairness Disparity | D_fair evolution over federated rounds | paper_figures/fairness_disparity.pdf |
| λ Tradeoff (Pareto) | Accuracy–fairness Pareto frontier across 8 λ values | paper_figures/lambda_tradeoff_ci.pdf |
| Runtime Breakdown | Stacked bar chart of local training vs. aggregation overhead | paper_figures/runtime_breakdown.pdf |
| Communication Costs | Cumulative transmission volume over 20 rounds | paper_figures/communication_costs.pdf |
| Confusion Matrix (FedAgent-Chain) | Classification error analysis for the full system | paper_figures/confusion_matrix_fedagent_chain.pdf |
| Confusion Matrix (Standard FedAvg) | Classification error analysis for baseline FedAvg | paper_figures/confusion_matrix_standard_fedavg.pdf |
| Protected Attribute | FedAgent-Chain | Std FedAvg | Local Baseline | Centralized |
|---|---|---|---|---|
| Disability Category | 0.0517 ± 0.0147 | 0.0428 ± 0.0095 | 0.0764 ± 0.0546 | 0.0444 ± 0.0193 |
| Language Group | 0.4154 ± 0.0493 | 0.4115 ± 0.0532 | 0.3248 ± 0.1430 | 0.4366 ± 0.0843 |
| Work Mode | 0.0145 ± 0.0083 | 0.0169 ± 0.0233 | 0.0243 ± 0.0273 | 0.0111 ± 0.0071 |
| Regional Node | 0.1795 ± 0.0160 | 0.1729 ± 0.0205 | 0.1357 ± 0.0790 | 0.1764 ± 0.0253 |
Source: experiments/results/statistics/table_3_multi_seed_summary.csv
| Node | Total Samples | Suitable (1) | Unsuitable (0) | Balance (%) |
|---|---|---|---|---|
| Saudi Arabia | 12,500 | 6,753 | 5,747 | 54.0% |
| United States | 12,500 | 7,265 | 5,235 | 58.1% |
| China | 12,500 | 7,601 | 4,899 | 60.8% |
| Europe | 12,500 | 4,759 | 7,741 | 38.1% |
Finding: The Europe node exhibits a distributional skew (38.1% positive rate vs. about 58% on average across the other three nodes). This heterogeneity is a deliberate design choice to stress-test the fairness-aware aggregator under realistic cross-institutional data imbalance. The global model's lower performance on Europe reflects this distributional shift rather than a system deficiency.
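The class-balance figures behind this finding can be recomputed directly from the per-node counts in the table. The sketch below does only that arithmetic; the variable names are illustrative.

```python
# (suitable, unsuitable) counts per node, copied from the table above.
NODE_COUNTS = {
    "Saudi Arabia": (6753, 5747),
    "United States": (7265, 5235),
    "China": (7601, 4899),
    "Europe": (4759, 7741),
}

def positive_rate(suitable, unsuitable):
    """Share of samples labelled suitable (1)."""
    return suitable / (suitable + unsuitable)

rates = {node: positive_rate(*counts) for node, counts in NODE_COUNTS.items()}

# Average positive rate over the three nodes other than Europe.
others = [rate for node, rate in rates.items() if node != "Europe"]
mean_other = sum(others) / len(others)
gap = mean_other - rates["Europe"]

for node, rate in rates.items():
    print(f"{node}: {100 * rate:.1f}% positive")
```

The gap between Europe and the other nodes is roughly 20 percentage points, which is the label shift the aggregator is stress-tested against.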
| Metric | Value | Description |
|---|---|---|
| Avg Local Training Time | 16.01s | Per-node computation time (5 epochs) |
| Avg Aggregation Time | 0.0005s | Server-side coordination overhead |
| Avg Blockchain Logging Time | 0.0007s | Hash submission latency |
| Model Size | 513 KB | Payload size per communication round |
The FedAgent-Chain architecture exhibits linear communication scalability:
- Communication: Total volume scales as O(R · K · |W|), where R is the number of rounds, K the number of nodes, and |W| the model size. With a ~500 KB model, a 100-node deployment would transmit about 50 MB per round in each direction (~100 MB counting both the uploads and the global-model broadcast), well within modern institutional bandwidth.
- Computation: Local training is parallelised across nodes. Server-side aggregation is O(K Β· |W|), which is negligible for K < 1000.
- Blockchain: The audit trail grows linearly with R Β· K. In production, a Merkle-tree-based accumulator could further compress these logs.
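The communication bound above reduces to simple arithmetic. The sketch below works it through for the reported 513 KB payload, assuming decimal megabytes and assuming that a round consists of one upload per node plus one download of the global model per node. The source gives only the aggregate figure, so the two-direction accounting is an interpretation.

```python
MODEL_KB = 513  # payload size per communication round, from the systems table

def per_round_mb(num_nodes, model_kb=MODEL_KB, both_directions=True):
    """Traffic per round in MB (1 MB = 1000 KB)."""
    one_way = num_nodes * model_kb / 1000.0
    return 2 * one_way if both_directions else one_way

def total_mb(rounds, num_nodes, model_kb=MODEL_KB, both_directions=True):
    """Total traffic over all rounds: O(R * K * |W|)."""
    return rounds * per_round_mb(num_nodes, model_kb, both_directions)

upload_100 = per_round_mb(100, both_directions=False)  # uploads only, 100 nodes
both_100 = per_round_mb(100)                           # uploads plus broadcast
prototype_total = total_mb(rounds=20, num_nodes=4)     # the 4-node, 20-round setup
```

Under these assumptions the 100-node case gives about 51 MB of uploads per round and about 103 MB in both directions, and the four-node prototype moves roughly 82 MB over 20 rounds.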
FedAgent-Chain uses a multi-agent orchestration layer to provide holistic employment support. Three representative scenarios demonstrate the system in action:
- Profile: Visually impaired user, primary language Arabic, seeking a Data Analyst role.
- Agent Action: The Multilingual Agent provides a cross-lingual communication plan, while the Accommodation Agent recommends screen-reader and Braille display integrations.
- Outcome: Approved (Confidence: 0.78).

- Profile: Mobility-impaired candidate seeking remote work in Finance.
- Agent Action: The Upskilling Agent identifies a skill gap and recommends targeted training courses.
- Outcome: Approved (Confidence: 0.60).

- Profile: High-risk candidate (multiple disabilities) for a manual labor role with low accessibility score (0.2).
- Agent Action: The Governance Agent detects a mismatch between physical requirements and candidate needs.
- Outcome: Flagged for Human Review (Risk Score: 1.0).
Full scenario reports: experiments/results/demos/
FedAgent-Chain is designed for full transparency and reproducibility:
- Multi-Seed Validation: n=5 independent random seeds (42, 123, 2024, 777, 999).
- Deterministic Seeding: All local training and data generation use fixed PyTorch and NumPy seeds.
- Hardware Agnostic: Results are verifiable on standard CPU-based workstations (8 GB+ RAM).
- Experiment Manifest: All hyperparameters, seeds, and runtime metadata are recorded in experiments/manifest.yaml.
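The deterministic-seeding point above can be sketched as a single helper that seeds Python, NumPy, and PyTorch together. The helper name `seed_everything` is an assumption rather than the repository's utility. NumPy and PyTorch are imported lazily so the snippet also runs where they are not installed.

```python
import random

def seed_everything(seed: int) -> None:
    """Seed the Python, NumPy, and PyTorch random number generators."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

# Illustrative usage: the same seed reproduces the same draw.
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
```

Seeding fixes the random streams but does not by itself guarantee bit-identical results across hardware or library versions.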
| Document | Purpose |
|---|---|
| docs/reproducibility.md | Step-by-step verification checklist for reviewers |
| docs/scientific_hardening.md | Threats to validity and ethical considerations |
| experiments/manifest.yaml | Machine-readable experiment provenance |
| CITATION.cff | Standardised citation metadata |
fedagent-chain/
├── configs/                                # Hydra experiment configurations
│   └── experiment/                         # Per-experiment YAML configs
├── src/                                    # Core framework source code
│   ├── federated/                          # FedAvg, Fairness-Aware aggregator, DP
│   ├── models/                             # Neural network (MLP + LayerNorm)
│   ├── agents/                             # 5 specialised agentic services
│   ├── blockchain/                         # Permissioned chain, smart contracts
│   ├── data/                               # Dataset loading, schema, preprocessing
│   ├── evaluation/                         # Metrics, fairness computation, audit
│   ├── visualization/                      # Plotting utilities
│   └── utils/                              # Helpers, logging, seeding
├── scripts/                                # Entry-point scripts
│   ├── run_federated_simulation.py         # Main FL training loop
│   ├── run_evaluation.py                   # Evaluation + confusion matrices
│   ├── run_baselines.py                    # Local & centralised baselines
│   ├── aggregate_multi_seed_results.py
│   ├── generate_figures.py                 # Publication plots
│   ├── generate_ablation_table.py          # λ-ablation comparison
│   ├── generate_system_overhead_plots.py
│   ├── generate_lambda_tradeoff_plot.py
│   ├── generate_agent_demonstrations.py
│   └── verify_submission_readiness.py
├── experiments/                            # Output directory
│   ├── results/                            # CSV tables, plots, statistics
│   │   ├── seeds/                          # Raw per-seed metrics
│   │   ├── plots/                          # Publication PDF figures
│   │   ├── statistics/                     # Aggregated t-tests and CIs
│   │   └── demos/                          # Qualitative agent case studies
│   ├── runs/                               # Per-run checkpoints and metrics
│   └── manifest.yaml                       # Experiment provenance
├── paper_figures/                          # Consolidated publication PDFs
├── docs/                                   # Extended documentation
│   ├── paper_results_inventory.md          # Master artifact catalog
│   ├── reproducibility.md                  # Verification checklist
│   └── scientific_hardening.md             # Threats & ethics
├── tests/                                  # Test suite (unit/integration/regression)
├── CITATION.cff                            # Citation metadata
├── LICENSE                                 # Apache 2.0
└── README.md                               # This file
# Regenerate systems overhead plots and CSVs
python scripts/generate_system_overhead_plots.py
Outputs are saved to experiments/results/plots/runtime_breakdown.pdf and experiments/results/plots/communication_costs.pdf.
# Run fairness-accuracy tradeoff sweep
python scripts/run_lambda_sweep.py
# Generate Pareto plot with confidence intervals
python scripts/generate_lambda_tradeoff_plot.py
# Unit tests
pytest tests/unit/ -v
# Integration tests
pytest tests/integration/ -v -m integration --timeout=120
# Regression tests (anchored to paper results)
pytest tests/regression/ -v -m regression --timeout=300
Experiments are managed via Hydra. Override any parameter at runtime:
python scripts/run_federated_simulation.py \
--config configs/experiment/fedagent_chain_full.yaml \
federated.n_rounds=20 \
privacy.noise_multiplier=0.1 \
fairness.lambda_fairness=0.5 \
--seed 123
- Synthetic Data: All results are based on synthetically generated disability-employment data calibrated against WHO and World Bank statistics. Performance on real-world clinical or institutional records may differ.
- Moderate-Scale Evaluation: The current prototype evaluates K=4 regional nodes with n=5 random seeds. Larger-scale deployments may introduce additional heterogeneity and communication challenges.
- Fairness Tradeoffs: The λ-penalty targets disability category as the primary sensitive attribute. Intersectional fairness (e.g., combining disability with age or gender) remains a subject for future work.
- Statistical Power: With n=5 seeds, pairwise comparisons lack sufficient statistical power for formal significance claims at α=0.05. We report effect sizes (Cohen's d) alongside p-values for transparency.
- No real disability data is collected, stored, or processed in this prototype.
- Human oversight is mandatory: The Governance Agent provides recommendations; final employment decisions must always be made by qualified human advisors.
- The system must never automatically reject a person with a disability's employment application without human review.
- Privacy-Utility Tradeoff: Stronger DP noise multipliers improve privacy but can degrade matching accuracy. We recommend calibrating σ based on local regulation (e.g., GDPR, NDMO).
For an extended discussion, see Scientific Hardening & Ethics (docs/scientific_hardening.md).
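The human-oversight requirements above imply a routing rule in which no outcome is an automatic rejection. The sketch below expresses that constraint in code. The thresholds, the class and function names, and the two-outcome design are illustrative assumptions, not the Governance Agent's actual policy.

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    # Deliberately no REJECT member: the system may only recommend or escalate.
    RECOMMEND = "recommend_to_advisor"
    HUMAN_REVIEW = "flag_for_human_review"

@dataclass
class Recommendation:
    confidence: float  # matching confidence in [0, 1]
    risk_score: float  # governance risk in [0, 1]

def route(rec: Recommendation, min_confidence: float = 0.5,
          max_risk: float = 0.7) -> Outcome:
    """Escalate low-confidence or high-risk cases; thresholds are hypothetical."""
    if rec.risk_score >= max_risk or rec.confidence < min_confidence:
        return Outcome.HUMAN_REVIEW
    return Outcome.RECOMMEND

# Illustrative usage with made-up scores.
routine = route(Recommendation(confidence=0.78, risk_score=0.1))
risky = route(Recommendation(confidence=0.78, risk_score=1.0))
uncertain = route(Recommendation(confidence=0.2, risk_score=0.1))
```

Even a `RECOMMEND` outcome is only advice to a qualified human advisor, who makes the final employment decision.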
Researchers and authors can use the following artifacts to build the main manuscript:
| Resource | Description |
|---|---|
| docs/paper_results_inventory.md | Master catalog of every figure, table, and statistical result |
| paper_figures/ | 8 publication-quality PDF figures |
| experiments/results/demos/ | 3 qualitative agent case study reports |
| experiments/results/statistics/ | Aggregated CSV tables and statistical tests |
@article{syed2026fedagentchain,
title = {FedAgent-Chain: A Secure Federated and Agentic AI Framework
for Multilingual Disability-Inclusive Employment in AI Cities},
author = {Syed, Toqeer Ali and Siddiqui, Muhammad Shoaib and Ali Akarma},
journal = {Frontiers in Artificial Intelligence},
year = {2026},
}
This project is licensed under the Apache License 2.0.
For questions, issues, or collaboration inquiries, please open a GitHub Issue.