Convert source code to logical representation for LLM analysis.
Code2Logic analyzes codebases and generates compact, LLM-friendly representations with semantic understanding. Perfect for feeding project context to AI assistants, building code documentation, or analyzing code structure.
- 🌳 Multi-language support - Python, JavaScript, TypeScript, Java, Go, Rust, and more
- 🎯 Tree-sitter AST parsing - 99% accuracy with graceful fallback
- 📊 NetworkX dependency graphs - PageRank, hub detection, cycle analysis
- 🔍 Rapidfuzz similarity - Find duplicate and similar functions
- 🧠 NLP intent extraction - Human-readable function descriptions
- 📦 Zero dependencies - Core works without any external libs
pip install code2logic # Core, no external dependencies
pip install code2logic[full] # All optional features
pip install code2logic[treesitter] # High-accuracy AST parsing
pip install code2logic[graph] # Dependency analysis
pip install code2logic[similarity] # Similar function detection
pip install code2logic[nlp] # Enhanced intents
code2logic ./ -f yaml --compact --function-logic --with-schema -o project.yaml
code2logic ./ -f toon --ultra-compact --function-logic --with-schema -o project.toon
# Standard Markdown output
code2logic /path/to/project
# If the `code2logic` entrypoint is not available (e.g. running from source without install):
python -m code2logic /path/to/project
# Compact YAML (14% smaller, meta.legend transparency)
code2logic /path/to/project -f yaml --compact -o analysis-compact.yaml
# Ultra-compact TOON (71% smaller, single-letter keys)
code2logic /path/to/project -f toon --ultra-compact -o analysis-ultra.toon
# Generate schema alongside output
code2logic /path/to/project -f yaml --compact --with-schema
# With detailed analysis
code2logic /path/to/project -d detailed
from code2logic import analyze_project, MarkdownGenerator
# Analyze a project
project = analyze_project("/path/to/project")
# Generate output
generator = MarkdownGenerator()
output = generator.generate(project, detail_level='standard')
print(output)
# Access analysis results
print(f"Files: {project.total_files}")
print(f"Lines: {project.total_lines}")
print(f"Languages: {project.languages}")
# Get hub modules (most important)
hubs = [p for p, n in project.dependency_metrics.items() if n.is_hub]
print(f"Key modules: {hubs}")# Core analysis
from code2logic import ProjectInfo, ProjectAnalyzer, analyze_project
# Format generators
from code2logic import (
YAMLGenerator,
JSONGenerator,
TOONGenerator,
LogicMLGenerator,
GherkinGenerator,
)
# LLM clients
from code2logic import get_client, BaseLLMClient
# Development tools
from code2logic import run_benchmark, CodeReviewer
Human-readable documentation with:
- Project structure tree with hub markers (★)
- Dependency graphs with PageRank scores
- Classes with methods and intents
- Functions with signatures and descriptions
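For example, this documentation can be rendered from Python and saved alongside your repo; a minimal sketch (the 'standard' level is used in the quick-start above, and 'detailed' is assumed to mirror the CLI's `-d detailed`):
from pathlib import Path
from code2logic import analyze_project, MarkdownGenerator
project = analyze_project("/path/to/project")
# 'standard' is shown in the quick-start; 'detailed' is an assumption based on the CLI flag
markdown = MarkdownGenerator().generate(project, detail_level='standard')
Path("PROJECT_OVERVIEW.md").write_text(markdown)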
Ultra-compact format optimized for LLM context:
# myproject | 102f 31875L | typescript:79/python:23
ENTRY: index.ts main.py
HUBS: evolution-manager llm-orchestrator
[core/evolution]
evolution-manager.ts (3719L) C:EvolutionManager | F:createEvolutionManager
task-queue.ts (139L) C:TaskQueue,Task
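The same ultra-compact output is available from Python through the `TOONGenerator` import listed above; a minimal sketch, assuming it follows the same `generate(project)` shape as the other generators in this README:
from code2logic import analyze_project, TOONGenerator
project = analyze_project("/path/to/project")
toon = TOONGenerator().generate(project)  # assumed to mirror the other generators' API
print(toon.splitlines()[0])  # e.g. "# myproject | 102f 31875L | typescript:79/python:23"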
Machine-readable format for:
- RAG (Retrieval-Augmented Generation)
- Database storage
- Further analysis
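A minimal sketch of producing the JSON form and persisting it for downstream tooling (the output filename is just an example):
import json
from code2logic import analyze_project, JSONGenerator
project = analyze_project("/path/to/project")
data = json.loads(JSONGenerator().generate(project))
# Persist for RAG pipelines, database import, or further analysis
with open("project-analysis.json", "w") as fh:
    json.dump(data, fh, indent=2)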
Check which features are available:
code2logic --status
Library Status:
tree_sitter: ✓
networkx: ✓
rapidfuzz: ✓
nltk: ✗
spacy: ✗
Manage LLM providers, models, API keys, and routing priorities:
code2logic llm status
code2logic llm set-provider auto
code2logic llm set-model openrouter nvidia/nemotron-3-nano-30b-a3b:free
code2logic llm key set openrouter <OPENROUTER_API_KEY>
code2logic llm priority set-provider openrouter 10
code2logic llm priority set-mode provider-first
code2logic llm priority set-llm-model nvidia/nemotron-3-nano-30b-a3b:free 5
code2logic llm priority set-llm-family nvidia/ 5
code2logic llm config list
Notes:
- `code2logic llm set-provider auto` enables automatic fallback selection: providers are tried in priority order.
- API keys should be stored in `.env` (or environment variables), not in `litellm_config.yaml`.
- These commands write configuration files:
  - `.env` in the current working directory
  - `litellm_config.yaml` in the current working directory
  - `~/.code2logic/llm_config.json` in your home directory
You can choose how automatic fallback ordering is computed:
- `provider-first` - providers are ordered by provider priority (defaults + overrides)
- `model-first` - providers are ordered by priority rules for the provider's configured model (exact/prefix)
- `mixed` - providers are ordered by the best (lowest) priority from either provider priority or model rules
Configure the mode:
code2logic llm priority set-mode provider-first
code2logic llm priority set-mode model-first
code2logic llm priority set-mode mixed
Model priority rules are stored in `~/.code2logic/llm_config.json`.
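From Python, the configured provider chain is reachable through the `get_client` import listed above; a minimal sketch, assuming a no-argument call resolves the highest-priority configured client (the exact signature is not documented here):
from code2logic import BaseLLMClient, get_client
client = get_client()  # assumption: picks the provider selected/prioritised above
print(isinstance(client, BaseLLMClient), type(client).__name__)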
from code2logic import get_library_status
status = get_library_status()
# {'tree_sitter': True, 'networkx': True, ...}
- PageRank - Identifies most important modules
- Hub detection - Central modules marked with ★
- Cycle detection - Find circular dependencies
- Clustering - Group related modules
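These metrics are exposed per module via `project.dependency_metrics`, as used in the quick-start example above; a minimal sketch (only `is_hub` appears earlier in this README, so the `pagerank` attribute name is an assumption):
from code2logic import analyze_project
project = analyze_project("/path/to/project")
for path, node in project.dependency_metrics.items():
    rank = getattr(node, "pagerank", None)  # attribute name assumed; adjust to the real field
    marker = "★" if node.is_hub else " "
    print(f"{marker} {path} pagerank={rank}")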
Functions get human-readable descriptions:
methods:
async findById(id:string) -> Promise<User> # retrieves user by id
async createUser(data:UserDTO) -> Promise<User> # creates user
validateEmail(email:string) -> boolean # validates email
Find duplicate and similar functions:
Similar Functions:
core/auth.ts::validateToken:
- python/auth.py::validate_token (92%)
- services/jwt.ts::verifyToken (85%)
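The scores come from Rapidfuzz string similarity; a standalone sketch of the same idea (plain Rapidfuzz usage, not code2logic's internal implementation):
from rapidfuzz import fuzz
ts_body = "function validateToken(token) { return jwt.verify(token, SECRET); }"
py_body = "def validate_token(token): return jwt.decode(token, SECRET)"
# Token-based ratio is tolerant of renamed identifiers and reordering
print(f"similarity: {fuzz.token_set_ratio(ts_body, py_body):.0f}%")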
code2logic/
├── analyzer.py          # Main orchestrator
├── parsers.py # Tree-sitter + fallback parser
├── dependency.py # NetworkX dependency analysis
├── similarity.py        # Rapidfuzz similarity detection
├── intent.py # NLP intent generation
├── generators.py # Output generators (MD/Compact/JSON)
├── models.py # Data structures
└── cli.py # Command-line interface
from code2logic import analyze_project, CompactGenerator
project = analyze_project("./my-project")
context = CompactGenerator().generate(project)
# Use in your LLM prompt
prompt = f"""
Analyze this codebase and suggest improvements:
{context}
"""import json
from code2logic import analyze_project, JSONGenerator
project = analyze_project("./my-project")
data = json.loads(JSONGenerator().generate(project))
# Index in vector DB
for module in data['modules']:
for func in module['functions']:
embed_and_store(
text=f"{func['name']}: {func['intent']}",
metadata={'path': module['path'], 'type': 'function'}
)
git clone https://github.com/wronai/code2logic
cd code2logic
poetry install --with dev -E full
poetry run pre-commit install
# Alternatively, you can use Makefile targets (prefer Poetry if available)
make install-full
make test
make test-cov
# Or directly:
poetry run pytest
poetry run pytest --cov=code2logic --cov-report=html
make typecheck
# Or directly:
poetry run mypy code2logic
make lint
make format
# Or directly:
poetry run ruff check code2logic
poetry run black code2logic
| Codebase Size | Files | Lines | Time | Output Size |
|---|---|---|---|---|
| Small | 10 | 1K | <1s | ~5KB |
| Medium | 100 | 30K | ~2s | ~50KB |
| Large | 500 | 150K | ~10s | ~200KB |
Compact format is ~10-15x smaller than Markdown.
Code2Logic can reproduce code from specifications using LLMs. Benchmark results:
| Format | Score | Token Efficiency | Spec Tokens | Runs OK |
|---|---|---|---|---|
| YAML | 71.1% | 42.1 | 366 | 66.7% |
| Markdown | 65.6% | 48.7 | 385 | 100% |
| JSON | 61.9% | 23.7 | 605 | 66.7% |
| Gherkin | 51.3% | 19.1 | 411 | 66.7% |
- YAML is best for score - 71.1% reproduction accuracy
- Markdown is best for token efficiency - 48.7 score/1000 tokens
- YAML uses 39.6% fewer tokens than JSON with 9.2% higher score
- Markdown has 100% runs OK - generated code always executes
# Token-aware benchmark
python examples/11_token_benchmark.py --folder tests/samples/ --no-llm
# Async multi-format benchmark
python examples/09_async_benchmark.py --folder tests/samples/ --no-llm
# Function-level reproduction
python examples/10_function_reproduction.py --file tests/samples/sample_functions.py --no-llm
python examples/15_unified_benchmark.py --folder tests/samples/ --no-llm
# Terminal markdown rendering demo
python examples/16_terminal_demo.py --folder tests/samples/
Contributions welcome! Please read our Contributing Guide.
Apache 2.0 License - see LICENSE for details.
Generate test scaffolds from Code2Logic output:
# Show what can be generated
python -m logic2test out/code2logic/project.c2l.yaml --summary
# Generate unit tests
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/
# Generate all test types (unit, integration, property)
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
from logic2test import TestGenerator
generator = TestGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate_unit_tests('tests/')
print(f"Generated {result.tests_generated} tests")Generate source code from Code2Logic output:
# Show what can be generated
python -m logic2code out/code2logic/project.c2l.yaml --summary
# Generate Python code
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/
# Generate stubs only
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
from logic2code import CodeGenerator
generator = CodeGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate('output/')
print(f"Generated {result.files_generated} files")# 1. Analyze existing codebase
code2logic src/ -f yaml -o out/code2logic/project.c2l.yaml
# 2. Generate tests for the codebase
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
# 3. Generate code scaffolds (for refactoring)
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
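The same workflow can be driven from Python; a minimal sketch combining the APIs shown above (it assumes the YAML emitted by `YAMLGenerator` is the same `.c2l.yaml` format that logic2test and logic2code consume):
from pathlib import Path
from code2logic import analyze_project, YAMLGenerator
from logic2test import TestGenerator
from logic2code import CodeGenerator

# 1. Analyze the codebase and persist the logic file
spec = Path("out/code2logic/project.c2l.yaml")
spec.parent.mkdir(parents=True, exist_ok=True)
spec.write_text(YAMLGenerator().generate(analyze_project("src/")))

# 2. Generate tests from the logic file
tests = TestGenerator(str(spec)).generate_unit_tests("out/logic2test/tests/")
print(f"Generated {tests.tests_generated} tests")

# 3. Generate code scaffolds for refactoring
code = CodeGenerator(str(spec)).generate("out/logic2code/generated_code/")
print(f"Generated {code.files_generated} files")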
- 00 - Docs Index - Documentation home (start here)
- 01 - Getting Started - Install and first steps
- 02 - Configuration - API keys, environment setup
- 03 - CLI Reference - Command-line usage
- 04 - Python API - Programmatic usage
- 05 - Output Formats - Format comparison and usage
- 06 - Format Specifications - Detailed format specs
- 07 - TOON Format - Token-Oriented Object Notation
- 08 - LLM Integration - OpenRouter/Ollama/LiteLLM
- 09 - LLM Comparison - Provider/model comparison
- 10 - Benchmarking - Benchmark methodology and results
- 11 - Repeatability - Repeatability testing
- 12 - Examples - Usage workflows and examples
- 13 - Architecture - System design and components
- 14 - Format Analysis - Deeper format evaluation
- 15 - Logic2Test - Test generation from logic files
- 16 - Logic2Code - Code generation from logic files
- 17 - LOLM - LLM provider management
- 18 - Reproduction Testing - Format validation and code regeneration
- 19 - Monorepo Workflow - Managing all packages from repo root
- examples/ - All runnable examples
- examples/run_examples.sh - Example runner script (multi-command workflows)
- examples/code2logic/ - Minimal project + docker example for code2logic
- examples/logic2test/ - Minimal project + docker example for logic2test
- examples/logic2code/ - Minimal project + docker example for logic2code

