Virtual-GENESIS

A harness-first cognitive agent layer on top of LLM APIs.
Testing two central hypotheses:

🚀 For complete setup, OpenRouter configuration (gpt-oss-120b:free), evolutionary discovery, and running on serious benchmarks (SWE-bench, gpqa, etc.):
See the full guide: SETUP_AND_RUN_GUIDE.md (includes exact commands for real paper-level experiments after cloning).

Concept Formation outperforms retrieval-only adaptation.
Cognitive Economy (intelligent tiered routing + resource allocation) outperforms stronger-model-only scaling.

⚠️ Primary reference: The internal regime lock and current status documents (Arabic). All other descriptions are subordinate to them.

Key Results (Live Runs)

Primary thesis slice (prototype_v3b_curriculum — 72 tasks):

Condition	Success Rate	Avg Cost (simulated)	Notes
Baseline (retrieval only)	79.2%	—	—
+ Concept Formation	98.6%	0.000736	+24.5% absolute improvement
+ Cognitive Economy	98.6%	0.0023 (vs 0.010 premium-always)	4.3× cost reduction at same performance
Combined (Canonical)	98.6%	0.000736	Best efficiency

Governance Ablation (Cycle 5.1) — Does Layer B buy peak performance or robustness?

use_theory_leverage: No gain in success (still 98.61%), but 2.94× cost.
use_anomaly_leverage: Reaches 100%, but at 13.59× cost (forces premium tier on everything).

Conclusion from live data: Governance buys robustness, not cheap peak performance. This is why Layer B remains gated (OFF by default) on the canonical path.

Full details: results/ablation_summary_2026-06-01.md

Architecture (3 Layers)

Layer A — Core Epistemic Engine (Locked 🔒)

The value core. Only touched with strong empirical justification.

Task framing (6 families: comparison, synthesis, procedure, analysis, extraction, planning)
Memory OS (lifecycle, clustering, productive forgetting, consolidation)
Concept Formation Engine (proposer from success/failure contrasts + family-specific selectivity)
Economy Control (tier router with expected value calculation, anomaly/theory overrides)
Verification Runtime (contract-based: required properties + forbidden shortcuts)
Minimal Pipeline runner

Layer B — Governance Expansions (Gated, OFF by default 🧪)

Experimental mechanisms that add robustness at a cost:

Anomaly leverage
Theory leverage
Productive forgetting
Identity governance
Paradigm fork
Self-benchmarking
Contradiction detection

Layer C — Interface & Infrastructure

REST API + OpenRouter adapter
SQLite persistence
Evaluation runners and perturbation curricula (research-only)

Canonical path (the approved, locked path for the main thesis):

python -m virtual_genesis.eval.runners.run_local_eval_v3b_curriculum

Governance flags are available for ablation only:

VIRTUAL_SIA_USE_THEORY_LEVERAGE=true  python -m virtual_genesis.eval.runners.run_local_eval_v3b_curriculum
VIRTUAL_SIA_USE_ANOMALY_LEVERAGE=true python -m virtual_genesis.eval.runners.run_local_eval_v3b_curriculum

Quickstart

Requirements: Python 3.10+

git clone https://github.com/faresrafat3/GENESIS.git
cd GENESIS
pip install -e ".[dev]"
pytest -q          # 424 tests should pass

Run the canonical evaluation (no API key needed — fully local simulation):

python -m virtual_genesis.eval.runners.run_local_eval_v3b_curriculum

Real LLM experiments (research-only):

export OPENROUTER_API_KEY=your_key
python -m virtual_genesis.eval.runners.run_adversarial_llm_eval

For the full self-improving orchestrator (advanced):

python -m genesis.orchestrator --task <bundled_task> --max_gen 3

See genesis/tasks/ for the self-evolution missions (including the critical cognitive integration bridge task).

Current Status (June 2026)

Strongly implemented: Task ingress, Blackboard, Memory OS, Concept Engine (with proposer + selectivity), Economy-aware routing, evaluation framework, 424 tests, live ablations.
Scaffolding / Template-based: Reasoning and verification (currently keyword + template driven — explicitly acknowledged as a limitation).
Critical pending work: Full integration of the orchestrator with the Virtual-GENESIS cognitive pipeline as a reasoning substrate for real LLMs (see genesis/tasks/genesis_cognitive_integration/ — this is Phase 1 of the strategic plan).
Research output: Detailed Arabic research memos, Master Architecture, and a full Research Paper Draft (GENESIS_Research_Paper_Draft_AR.md).

This is an advanced research/execution prototype, not a finished production system. Layer A is partially validated (strongest on H2 & H6). Layers B and C are maturing.

Documentation Map

Purpose	Document
Highest structural lock	GENESIS_Internal_Regime_Lock_AR.md (or Virtual_SIA_...)
Current regime & status	GENESIS_Current_Regime_Status_AR.md
Theoretical architecture	GENESIS_Master_Architecture_AR.md
Research program (H1–H9)	GENESIS_Research_Program_AR.md
Full paper draft	GENESIS_Research_Paper_Draft_AR.md
Strategic development plan	STRATEGIC_DEVELOPMENT_PLAN_2026_06.md
Live ablation evidence	results/ablation_summary_2026-06-01.md
75+ "Legitimate Thefts" sources	GENESIS_Legitimate_Thefts_MASTER_INDEX_AR.md

All primary documentation is in Arabic. English summaries are being added.

Installation & Development

pip install -e ".[dev]"
pytest -q

The core has zero external runtime dependencies (stdlib only: urllib, sqlite3, etc.). pytest is dev-only.

Roadmap Highlights (from Strategic Plan)

Phase 1 (Critical): Bridge the orchestrator with the cognitive pipeline so real LLM calls are guided by concepts, memory, tier decisions, and theories — moving beyond pure templates.

Phase 2+: Concept gating with regression budgets, graph-structured memory, stronger self-benchmarking, self-healing orchestration.

See the full plan in STRATEGIC_DEVELOPMENT_PLAN_2026_06.md.

Contributing

This is currently a solo research project with heavy internal documentation. Contributions are welcome, especially around:

Strengthening reasoning & verification beyond keywords/templates
Real LLM integration experiments
Expanding the perturbation curriculum
English documentation & examples

Please read the regime lock and current status documents before proposing changes to Layer A.

License

Proprietary for now (research phase). License may change as the project matures.

Virtual-GENESIS — Building intelligence through externalized cognitive mechanisms rather than model scaling alone.

For the deepest context, start with the Research Paper Draft and the Master Architecture document.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Virtual-GENESIS

Key Results (Live Runs)

Architecture (3 Layers)

Layer A — Core Epistemic Engine (Locked 🔒)

Layer B — Governance Expansions (Gated, OFF by default 🧪)

Layer C — Interface & Infrastructure

Quickstart

Current Status (June 2026)

Documentation Map

Installation & Development

Roadmap Highlights (from Strategic Plan)

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 145 Commits
.agents/tasks		.agents/tasks
.github/workflows		.github/workflows
PAPER		PAPER
genesis		genesis
results		results
runs/run_53		runs/run_53
scripts		scripts
tasks/gpqa_subset_20		tasks/gpqa_subset_20
tests		tests
tools		tools
virtual_genesis		virtual_genesis
.env.example		.env.example
.gitignore		.gitignore
API_GENESIS_Design_Arabic.md		API_GENESIS_Design_Arabic.md
GENESIS_Adversarial_Validation_Memo_AR.md		GENESIS_Adversarial_Validation_Memo_AR.md
GENESIS_Agent_Identity_Theory_AR.md		GENESIS_Agent_Identity_Theory_AR.md
GENESIS_Anomaly_Crisis_Paradigm_Theory_AR.md		GENESIS_Anomaly_Crisis_Paradigm_Theory_AR.md
GENESIS_Anomaly_Leverage_Implementation_Memo_AR.md		GENESIS_Anomaly_Leverage_Implementation_Memo_AR.md
GENESIS_Broader_Domain_Cycle_Memo_AR.md		GENESIS_Broader_Domain_Cycle_Memo_AR.md
GENESIS_Build_Checklist_AR.md		GENESIS_Build_Checklist_AR.md
GENESIS_Cognitive_Economy_Ledger_And_Tier_Router_Spec_AR.md		GENESIS_Cognitive_Economy_Ledger_And_Tier_Router_Spec_AR.md
GENESIS_Cognitive_Economy_Theory_AR.md		GENESIS_Cognitive_Economy_Theory_AR.md
GENESIS_Concept_Engine_Deep_Analysis_AR.md		GENESIS_Concept_Engine_Deep_Analysis_AR.md
GENESIS_Concept_Engine_TaskCase_Refinement_Memo_AR.md		GENESIS_Concept_Engine_TaskCase_Refinement_Memo_AR.md
GENESIS_Concept_Formation_Engine_Spec_AR.md		GENESIS_Concept_Formation_Engine_Spec_AR.md
GENESIS_Concept_Formation_Theory_AR.md		GENESIS_Concept_Formation_Theory_AR.md
GENESIS_Concept_Selectivity_Spec_AR.md		GENESIS_Concept_Selectivity_Spec_AR.md
GENESIS_Concept_Selectivity_Update_Memo_AR.md		GENESIS_Concept_Selectivity_Update_Memo_AR.md
GENESIS_Contract_Perturbation_Memo_AR.md		GENESIS_Contract_Perturbation_Memo_AR.md
GENESIS_Contradiction_Analytics_Update_Memo_AR.md		GENESIS_Contradiction_Analytics_Update_Memo_AR.md
GENESIS_Contradiction_Theory_AR.md		GENESIS_Contradiction_Theory_AR.md
GENESIS_Core_Ontology_AR.md		GENESIS_Core_Ontology_AR.md
GENESIS_Current_Evidence_Package_AR.md		GENESIS_Current_Evidence_Package_AR.md
GENESIS_Current_Reference_Index_AR.md		GENESIS_Current_Reference_Index_AR.md
GENESIS_Current_Regime_Memo_AR.md		GENESIS_Current_Regime_Memo_AR.md
GENESIS_Current_Regime_Status_AR.md		GENESIS_Current_Regime_Status_AR.md
GENESIS_Curriculum_Analytics_Update_Memo_AR.md		GENESIS_Curriculum_Analytics_Update_Memo_AR.md
GENESIS_Data_Schema_Plan_AR.md		GENESIS_Data_Schema_Plan_AR.md
GENESIS_Decision_Memo_AR.md		GENESIS_Decision_Memo_AR.md
GENESIS_DeepMind_Aletheia_Theft_AR.md		GENESIS_DeepMind_Aletheia_Theft_AR.md
GENESIS_DeepMind_AlphaEvolve_FunSearch_Theft_AR.md		GENESIS_DeepMind_AlphaEvolve_FunSearch_Theft_AR.md
GENESIS_DeepMind_CoScientist_Theft_AR.md		GENESIS_DeepMind_CoScientist_Theft_AR.md
GENESIS_Deep_Continuation_AR.md		GENESIS_Deep_Continuation_AR.md
GENESIS_Deep_Foundations_AR.md		GENESIS_Deep_Foundations_AR.md
GENESIS_Diagnosis_run_53_GPQA_Gap_AR.md		GENESIS_Diagnosis_run_53_GPQA_Gap_AR.md
GENESIS_Eval_System_Analysis_AR.md		GENESIS_Eval_System_Analysis_AR.md
GENESIS_Evaluation_Perturbation_Curriculum_Spec_AR.md		GENESIS_Evaluation_Perturbation_Curriculum_Spec_AR.md
GENESIS_Evaluation_Pressure_Cycle_Memo_AR.md		GENESIS_Evaluation_Pressure_Cycle_Memo_AR.md
GENESIS_Evaluation_Redesign_Memo_AR.md		GENESIS_Evaluation_Redesign_Memo_AR.md
GENESIS_Evaluation_Regime_Status_Memo_AR.md		GENESIS_Evaluation_Regime_Status_Memo_AR.md
GENESIS_Evaluation_Slice_Strategy_AR.md		GENESIS_Evaluation_Slice_Strategy_AR.md
GENESIS_Expanded_Evidence_Memo_AR.md		GENESIS_Expanded_Evidence_Memo_AR.md
GENESIS_Family_Selectivity_Ablation_Memo_AR.md		GENESIS_Family_Selectivity_Ablation_Memo_AR.md
GENESIS_Family_Selectivity_Ablation_Results_Memo_AR.md		GENESIS_Family_Selectivity_Ablation_Results_Memo_AR.md
GENESIS_Family_Specific_Selectivity_Plan_AR.md		GENESIS_Family_Specific_Selectivity_Plan_AR.md
GENESIS_First_Evidence_Memo_AR.md		GENESIS_First_Evidence_Memo_AR.md
GENESIS_First_Implementation_Order_AR.md		GENESIS_First_Implementation_Order_AR.md
GENESIS_Free_LLM_Providers_2026_AR.md		GENESIS_Free_LLM_Providers_2026_AR.md
GENESIS_Full_Documentation_AR.md		GENESIS_Full_Documentation_AR.md
GENESIS_Governance_Aware_Current_Snapshot_AR.md		GENESIS_Governance_Aware_Current_Snapshot_AR.md
GENESIS_Identity_Governance_Implementation_Memo_AR.md		GENESIS_Identity_Governance_Implementation_Memo_AR.md
GENESIS_Implementation_Preplan_AR.md		GENESIS_Implementation_Preplan_AR.md
GENESIS_Internal_Regime_Lock_AR.md		GENESIS_Internal_Regime_Lock_AR.md
GENESIS_Legitimate_Thefts_Cycle2_AR.md		GENESIS_Legitimate_Thefts_Cycle2_AR.md
GENESIS_Legitimate_Thefts_Cycle3_AR.md		GENESIS_Legitimate_Thefts_Cycle3_AR.md
GENESIS_Legitimate_Thefts_Cycle4_AR.md		GENESIS_Legitimate_Thefts_Cycle4_AR.md
GENESIS_Legitimate_Thefts_MASTER_INDEX_AR.md		GENESIS_Legitimate_Thefts_MASTER_INDEX_AR.md
GENESIS_Legitimate_Thefts_Production_AR.md		GENESIS_Legitimate_Thefts_Production_AR.md
GENESIS_Legitimate_Thefts_RealWorld_AR.md		GENESIS_Legitimate_Thefts_RealWorld_AR.md
GENESIS_Legitimate_Thefts_Wave3_AR.md		GENESIS_Legitimate_Thefts_Wave3_AR.md
GENESIS_Legitimate_Thefts_Wave3b_AR.md		GENESIS_Legitimate_Thefts_Wave3b_AR.md
GENESIS_Level_Wise_Governance_Analytics_Memo_AR.md		GENESIS_Level_Wise_Governance_Analytics_Memo_AR.md
GENESIS_Local_Theory_Building_AR.md		GENESIS_Local_Theory_Building_AR.md
GENESIS_Master_Architecture_AR.md		GENESIS_Master_Architecture_AR.md
GENESIS_Memory_OS_Spec_AR.md		GENESIS_Memory_OS_Spec_AR.md
GENESIS_Merge_Strategy_AR.md		GENESIS_Merge_Strategy_AR.md
GENESIS_Meta_Theory_AR.md		GENESIS_Meta_Theory_AR.md
GENESIS_Milestone_Execution_Plan_AR.md		GENESIS_Milestone_Execution_Plan_AR.md
GENESIS_Minimal_Anomaly_Candidate_Memo_AR.md		GENESIS_Minimal_Anomaly_Candidate_Memo_AR.md
GENESIS_Minimal_Contradiction_Runtime_Memo_AR.md		GENESIS_Minimal_Contradiction_Runtime_Memo_AR.md
GENESIS_Minimal_Evaluation_Protocol_AR.md		GENESIS_Minimal_Evaluation_Protocol_AR.md
GENESIS_Minimal_Local_Theory_Builder_Memo_AR.md		GENESIS_Minimal_Local_Theory_Builder_Memo_AR.md
GENESIS_Module_API_Contracts_AR.md		GENESIS_Module_API_Contracts_AR.md
GENESIS_Multi_Model_Infrastructure_AR.md		GENESIS_Multi_Model_Infrastructure_AR.md
GENESIS_Nemotron_3_Ultra_Memo_AR.md		GENESIS_Nemotron_3_Ultra_Memo_AR.md
GENESIS_Next_Cycle_Decision_Anomaly_vs_Theory_AR.md		GENESIS_Next_Cycle_Decision_Anomaly_vs_Theory_AR.md
GENESIS_Next_Cycle_Options_AR.md		GENESIS_Next_Cycle_Options_AR.md
GENESIS_Orchestrator_Scaffolding_Fix_AR.md		GENESIS_Orchestrator_Scaffolding_Fix_AR.md
GENESIS_Paradigm_Fork_Implementation_Memo_AR.md		GENESIS_Paradigm_Fork_Implementation_Memo_AR.md
GENESIS_Persistence_Implementation_Memo_AR.md		GENESIS_Persistence_Implementation_Memo_AR.md
GENESIS_Perturbation_Operator_Expansion_Results_Memo_AR.md		GENESIS_Perturbation_Operator_Expansion_Results_Memo_AR.md
GENESIS_Perturbation_Operator_Refinement_Memo_AR.md		GENESIS_Perturbation_Operator_Refinement_Memo_AR.md
GENESIS_Productive_Forgetting_Implementation_Memo_AR.md		GENESIS_Productive_Forgetting_Implementation_Memo_AR.md
GENESIS_Productive_Forgetting_Theory_AR.md		GENESIS_Productive_Forgetting_Theory_AR.md
GENESIS_Prototype_Slice_Plan_AR.md		GENESIS_Prototype_Slice_Plan_AR.md
GENESIS_Prototype_Status_Memo_AR.md		GENESIS_Prototype_Status_Memo_AR.md
GENESIS_Prototype_V2_Evidence_Memo_AR.md		GENESIS_Prototype_V2_Evidence_Memo_AR.md
GENESIS_Prototype_V3B_Curriculum_Evidence_Memo_AR.md		GENESIS_Prototype_V3B_Curriculum_Evidence_Memo_AR.md
GENESIS_Prototype_V3B_Evidence_Memo_AR.md		GENESIS_Prototype_V3B_Evidence_Memo_AR.md
GENESIS_Prototype_V3_Evidence_Memo_AR.md		GENESIS_Prototype_V3_Evidence_Memo_AR.md

Folders and files

Latest commit

History

Repository files navigation

Virtual-GENESIS

Key Results (Live Runs)

Architecture (3 Layers)

Layer A — Core Epistemic Engine (Locked 🔒)

Layer B — Governance Expansions (Gated, OFF by default 🧪)

Layer C — Interface & Infrastructure

Quickstart

Current Status (June 2026)

Documentation Map

Installation & Development

Roadmap Highlights (from Strategic Plan)

Contributing

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages