Blazzzeee/CMon

CMon - Complete Documentation

Version: 2.0
Language: C
Author: Blazzee


Table of Contents

  1. Overview
  2. High-Level Architecture
  3. Component Design
  4. Design Decisions & Tradeoffs
  5. Setup & Installation
  6. API Reference
  7. Performance Characteristics
  8. Security Model
  9. Troubleshooting
  10. Development Guide

Overview

What is CMon?

CMon (C Monitor) is a lightweight HTTP server for remote server operations management. It provides authenticated REST API endpoints to execute system administration commands remotely.

What's New in Version 2.0

Revolutionary Virtual Memory Arena Allocator:

  • Virtually unlimited allocations via bitmap array instead of single 64-bit integer
  • mmap-based virtual memory (up to 64TB theoretical capacity on x86-64)
  • Demand paging - physical memory used only when accessed (the OS maps pages on fault)
  • 512-byte chunks (increased from 256)
  • 64 chunks default but configurable to thousands
  • Extremely elegant - allocate terabytes virtually, use only what you need

Enhanced Benchmarking:

  • Serialized RDTSC for accurate cycle counting (prevents instruction reordering)
  • CPU pinning to eliminate scheduling noise
  • ARM64 support with virtual counter benchmarks
  • 256KB allocations to stress-test large allocations
  • Random page touching to destroy locality (realistic workload)
  • malloc_trim() to force heap release for fair comparison

Problem Statement

Managing remote servers typically requires:

  • SSH access and manual command execution
  • Custom scripts scattered across systems
  • Multiple tools for different operations
  • Manual deployment processes

CMon consolidates common server operations into a single authenticated HTTP API, enabling:

  • Programmatic server control
  • Automated deployment workflows
  • Integration with CI/CD pipelines
  • Discord/Slack bot integrations
  • DevOps automation

Key Features

Authenticated Command Execution - All endpoints require a 256-bit secret key
System Operations - Reboot, restart, health checks
Git Integration - Pull updates, deploy branches
Log Viewing - Access systemd journal entries
Virtual Memory Arena - Virtually unlimited capacity with demand paging
Security-Conscious - Timing-safe authentication, no shell injection
Event-Driven - Single-threaded async I/O via libevent

Performance Metrics

Updated benchmarks with 256KB allocations:

Configuration: 64KB chunks, 1024 chunks (64MB virtual arena)

Results vary by workload, but the arena consistently outperforms malloc for:

  • Frequent allocations
  • Predictable sizes
  • Short lifetimes
  • Batch processing patterns

Use Cases

Ideal For:

  • Internal DevOps tooling
  • CI/CD pipeline integration
  • Discord/Slack bot backends
  • Server management dashboards
  • Automated deployment systems
  • High-throughput command execution

Not Suitable For:

  • Public-facing APIs (no TLS by default)
  • Multi-tenant systems (single shared key)
  • Untrusted environments (limited sandboxing)

High-Level Architecture

System Overview

┌─────────────────────────────────────────────────────────────┐
│                         Client                               │
│           (HTTP Request + access_token header)               │
└────────────────────────┬────────────────────────────────────┘
                         │
                         │ HTTP/REST (Port 8000)
                         │
┌────────────────────────▼────────────────────────────────────┐
│                    CMon HTTP Server                          │
│                   (libevent 2.x)                            │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │         Authentication Middleware                  │    │
│  │  • Extracts access_token header                    │    │
│  │  • Validates with CRYPTO_memcmp (constant-time)    │    │
│  │  • Returns 401 if missing/invalid                  │    │
│  └────────────────┬───────────────────────────────────┘    │
│                   │                                          │
│  ┌────────────────▼───────────────────────────────────┐    │
│  │            Route Dispatcher                        │    │
│  │  Routes:                                           │    │
│  │  GET    /health           → uptime                 │    │
│  │  GET    /logs             → journalctl -n 50       │    │
│  │  POST   /reboot           → reboot                 │    │
│  │  POST   /restart          → pkill target           │    │
│  │  PUT    /sync_upstream    → git pull origin        │    │
│  │  GET    /deploy_branch    → ./deploy.sh           │    │
│  │  DELETE /teardown_branch  → ./teardown.sh         │    │
│  └────────────────┬───────────────────────────────────┘    │
│                   │                                          │
│  ┌────────────────▼───────────────────────────────────┐    │
│  │         Command Execution Layer                    │    │
│  │  • fork() child process                            │    │
│  │  • pipe() for stdout/stderr capture                │    │
│  │  • execvp() to run command                         │    │
│  │  • waitpid() for exit code                         │    │
│  │  • Timing measurement                              │    │
│  └────────────────┬───────────────────────────────────┘    │
│                   │                                          │
│  ┌────────────────▼───────────────────────────────────┐    │
│  │    Virtual Memory Arena Allocator (NEW v2.0)      │    │
│  │  • mmap-based virtual memory (up to 64TB)         │    │
│  │  • Bitmap array (unlimited chunks)                │    │
│  │  • Demand paging (MMU translates on access)       │    │
│  │  • O(1) allocation/deallocation per bitmap        │    │
│  │  • Physical memory only used on page fault        │    │
│  └────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                         │
                         │ System Calls
                         │
┌────────────────────────▼────────────────────────────────────┐
│                    Operating System                          │
│  Commands: uptime, reboot, pkill, git, journalctl           │
│  Scripts: ./deploy.sh, ./teardown.sh                        │
│  MMU: Virtual → Physical address translation                │
└─────────────────────────────────────────────────────────────┘

Request Lifecycle

  1. Client sends HTTP request with access_token header
  2. libevent receives request on port 8000
  3. Authentication middleware validates token in constant time
  4. Route dispatcher matches path to handler
  5. Command executor forks child process
  6. Child process executes command via execvp()
  7. Parent process captures output via pipe
  8. Arena allocator provides memory from virtual address space (MMU handles physical mapping)
  9. Response builder formats JSON with escaped output
  10. Client receives response with status, code, message, data

Component Interaction

HTTP Layer (main.c) coordinates all components:

  • Initializes virtual memory arena on startup
  • Loads authentication key from file
  • Registers routes with libevent
  • Passes requests through auth middleware
  • Delegates to command executors
  • Formats responses using utilities

Authentication Layer (auth.c) provides security:

  • Loads 256-bit hex key from client_secret.key
  • Decodes hex to binary using OpenSSL
  • Compares keys in constant time (prevents timing attacks)
  • Returns 0 on success, non-zero on failure

Command Layer (commands.c) executes operations:

  • Forks child process for isolation
  • Uses pipes to capture stdout/stderr
  • Executes via execvp() (no shell)
  • Waits for completion and extracts exit code
  • Measures execution duration
  • Returns output allocated from arena

Virtual Memory Arena Layer (arena.c) manages memory:

  • Uses mmap() to reserve virtual address space (not physical memory)
  • Bitmap array tracks allocated/free chunks across unlimited space
  • The MMU (Memory Management Unit) translates virtual addresses to physical addresses on every access
  • Physical pages allocated on demand via page faults; mappings are cached in the TLB (Translation Lookaside Buffer)
  • Can theoretically allocate up to 64TB (a 46-bit span of the x86-64 virtual address space)
  • Actual physical memory usage determined by what's accessed, not what's allocated

Utility Layer (utils.c) provides helpers:

  • Dual logging to stderr and syslog
  • JSON response formatting
  • JSON string escaping (security critical)
  • Query parameter parsing
  • HTTP method string conversion

Component Design

HTTP Server Layer

Technology: libevent 2.x (asynchronous event-driven networking)

Configuration:

  • Port: 8000 (hardcoded in main.c)
  • Binding: 0.0.0.0 (all interfaces)
  • Methods: GET, POST, PUT, DELETE
  • Concurrency: Single-threaded event loop

Route Table Structure: Routes are defined in a static array containing path, HTTP method, and callback function. This allows easy addition of new endpoints by adding entries to the array.
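The route-table pattern described above can be sketched as a static array plus a linear lookup. This is an illustrative sketch; the struct and function names are not CMon's actual identifiers:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative route-table sketch, not CMon's actual code. */
typedef struct {
    const char *path;
    int method;               /* e.g. a GET/POST enum value */
    void (*cb)(void *req);    /* handler callback */
} route_t;

static void health_cb(void *req) { (void)req; /* run `uptime`, build JSON */ }

static const route_t ROUTES[] = {
    { "/health", 0 /* GET */, health_cb },
    /* adding an endpoint = adding one entry here */
};

static const route_t *find_route(const char *path, int method) {
    for (size_t i = 0; i < sizeof ROUTES / sizeof ROUTES[0]; i++)
        if (ROUTES[i].method == method && strcmp(ROUTES[i].path, path) == 0)
            return &ROUTES[i];
    return NULL;              /* falls through to the 404 handler */
}
```

An unmatched path returns NULL, which is where the generic 404 handler takes over.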

Middleware Pattern: All requests pass through authentication middleware before reaching route handlers. The middleware extracts the access_token header, validates it, and either allows the request to proceed or returns 401 Unauthorized.

Signal Handling: Registers handler for SIGINT to perform graceful shutdown - closes syslog, tears down arena via munmap(), frees libevent structures in correct order.

404 Handling: Generic request handler catches all undefined routes and returns JSON error with 404 status.

Authentication System

Security Model:

  • Key Size: 256-bit (32 bytes) - the same size as a SHA-256 digest
  • Storage: File-based at ./client_secret.key in hexadecimal format
  • Encoding: Hex (64 characters) prevents binary data issues in text files
  • Comparison: Constant-time using OpenSSL's CRYPTO_memcmp()

Initialization Process:

  1. Reads key file from current directory
  2. Validates file size (64 hex chars = 32 bytes, optionally +1 for newline)
  3. Decodes hex string to binary using OpenSSL's OPENSSL_hexstr2buf()
  4. Stores decoded key in global buffer
  5. Cleanses temporary buffers with OPENSSL_cleanse() for security

Authentication Flow:

  1. Extracts client key from HTTP header (in hex format)
  2. Decodes client-provided hex key to binary
  3. Performs constant-time comparison with stored key
  4. Returns 0 on success, non-zero on failure

Why Constant-Time Comparison?

Standard comparison functions (strcmp, memcmp) exit early when they find a difference. This creates a timing side-channel: an attacker can measure response time to deduce where keys differ, enabling byte-by-byte brute forcing.

Constant-time comparison always examines all bytes regardless of differences, preventing timing attacks. Uses bitwise OR to accumulate differences without branching.

Memory Security: Uses OPENSSL_cleanse() to zero sensitive memory before freeing, preventing key recovery from memory dumps or use-after-free vulnerabilities.

Command Execution System

Design Philosophy: Process isolation via fork/exec with output capture

Core Execution Flow:

  1. Start timing using gettimeofday()
  2. Create pipe for capturing child output
  3. Fork process to isolate command execution
  4. Child process: Redirects stdout/stderr to pipe, executes command via execvp(), exits with code 127 if exec fails
  5. Parent process: Closes write end of pipe, allocates buffer from arena, reads output, waits for child completion
  6. Extract exit code using WIFEXITED() and WEXITSTATUS() macros
  7. Calculate duration and log execution details
  8. Return output and set exit code pointer

Why Fork/Exec Instead of system()?

The system() function invokes /bin/sh and passes the command as a string. This makes it vulnerable to shell injection attacks where malicious input can execute arbitrary commands.

Fork/exec passes arguments as a NULL-terminated array where each argument is treated as a literal string. No shell interpretation occurs, making injection impossible. Even if user input contains shell metacharacters like semicolons or pipes, they're passed literally to the program.

Implemented Commands:

Endpoint           Command                        Purpose
/health            uptime                         Check system uptime and load
/reboot            reboot                         Reboot the system (requires root)
/restart           pkill target                   Kill server binary for restart
/sync_upstream     git pull origin [branch]       Pull from git repository
/deploy_branch     ./deploy.sh [branch]           Run custom deployment script
/teardown_branch   ./teardown.sh [branch]         Run custom teardown script
/logs              journalctl -n 50 --no-pager    Fetch last 50 journal entries

Default Values: Branch parameters default to "main" if not provided in query string.

Error Handling:

  • Exit code 127 indicates exec failure (command not found)
  • Null return on pipe/fork failures
  • Output buffer allocation failures logged

Virtual Memory Arena Allocator - NEW v2.0

This is the revolutionary component that makes CMon v2.0 extremely elegant.

The Virtual Memory Breakthrough

Traditional Approach (v1.0):

  • Fixed 16KB physical memory pre-allocated
  • Single 64-bit bitmap (max 64 chunks)
  • All memory allocated upfront

New Approach (v2.0):

  • mmap() reserves virtual address space (not physical memory)
  • Bitmap array allows unlimited chunks (configurable)
  • Physical memory allocated on-demand by MMU
  • Can reserve up to 64TB on x86-64 (a 46-bit span of virtual address space)

How Virtual Memory Works

Key Insight: Modern CPUs have a Memory Management Unit (MMU) that translates virtual addresses to physical addresses.

The Process:

  1. Reservation Phase (mmap()):

    • Request operating system to reserve virtual address space
    • Example: Reserve 64GB of virtual memory
    • No physical RAM allocated yet
    • OS just marks virtual address range as belonging to process
  2. Translation Phase (MMU):

    • When code accesses a virtual address for first time
    • MMU looks up address in page tables
    • If page not in physical memory: Page Fault
  3. Demand Paging (Page Fault Handler):

    • OS allocates physical page (4KB on most systems)
    • Updates page tables with virtual→physical mapping
    • Caches mapping in TLB (Translation Lookaside Buffer)
    • Resumes execution transparently
  4. Result:

    • Can allocate 64TB virtually
    • Only use physical memory for accessed pages
    • Extremely elegant - pay only for what you use

Example Scenario

Allocate 1GB arena:

  • mmap(1GB) reserves 1GB virtual address space
  • Physical memory used: 0 bytes
  • Arena bitmap: 32 members, 256 bytes (for 2048 chunks of 512KB each)

Allocate 100KB:

  • Arena finds free chunks in bitmap
  • Returns virtual address
  • Physical memory used: Still ~0 bytes

Write to allocation:

  • First write to address triggers page fault
  • OS allocates single 4KB physical page
  • MMU maps virtual page to physical page
  • Physical memory used: 4KB

Write across allocation:

  • Each new 4KB region accessed triggers page fault
  • 100KB allocation spans ~25 pages
  • After accessing all: 100KB physical (plus some overhead)

Why This Is Brilliant:

  • Reserve huge arena (64TB theoretical)
  • Only consume physical RAM for actually used memory
  • No waste on unused capacity
  • Transparent to application code

Design Goals (v2.0)

  1. Virtually Unlimited Capacity - No practical limit on allocations
  2. Efficient Physical Memory Use - Only use what you access
  3. O(1) Performance - Fast allocation/deallocation
  4. Cache-friendly - Bitmap array organized for locality

Configuration (v2.0)

Default:

  • Chunk Size: 512 bytes (increased from 256)
  • Chunk Count: 64 (configurable to thousands)
  • Bitmap Members: Calculated from chunk count (1 member = 64 chunks)

Example Configurations:

Small (default):

  • 512 bytes × 64 chunks = 32KB virtual
  • 1 bitmap member (64 bits)

Medium:

  • 64KB × 1024 chunks = 64MB virtual
  • 16 bitmap members (1024 bits)

Large:

  • 1MB × 10000 chunks = 10GB virtual
  • 157 bitmap members (10000 bits)

Extreme:

  • 4MB × 100000 chunks = 400GB virtual
  • 1563 bitmap members (100000 bits)

Data Structures (v2.0)

Global State:

  • LOCK: Pointer to dynamically allocated bitmap array
  • BUF: Pointer to mmap'd virtual memory region
  • arena_lock_members: Number of 64-bit integers in bitmap array

Bitmap Array: Each member is a 64-bit integer representing 64 chunks:

  • Member 0: Chunks 0-63
  • Member 1: Chunks 64-127
  • Member N: Chunks (N×64) to (N×64+63)

Allocation Header:

  • 2-byte structure storing number of chunks allocated
  • Placed immediately before user data
  • Enables O(1) deallocation

The Enhanced Bit-Smearing Algorithm (v2.0)

Challenge: Find k consecutive free chunks across bitmap array

Algorithm Overview:

  1. Iterate through bitmap members (64-bit integers)
  2. For each member: Invert to get free mask (free=1, used=0)
  3. Apply bit-smearing to find k consecutive 1s in that member
  4. Boundary check: Ensure allocation doesn't cross member boundary
  5. Return global bit position if found, continue to next member if not

Why Boundary Check?

Allocations cannot span across 64-bit members because:

  • Each member's bits are managed independently
  • Bit operations work within single 64-bit integer
  • Crossing boundary would complicate mask calculations

Impact: For large allocations (>64 chunks), the first chunk must start at a member boundary. This is acceptable because such allocations are rare in typical workloads.

Complexity:

  • Outer loop: O(m) where m = number of bitmap members
  • Inner bit-smearing: O(k) where k = chunks needed
  • Total: O(m×k)
  • But in practice: k is small (1-4), m scanned until first fit
  • Typical case: O(1) to O(m) depending on fragmentation

Allocation Process (v2.0)

  1. Calculate chunks needed: ceiling((request_size + 2) / chunk_size)
  2. Iterate bitmap members:
    • Invert member to get free mask
    • Apply bit-smearing to find k consecutive free chunks
    • Check allocation doesn't cross member boundary
  3. Claim chunks:
    • Calculate global bit position
    • Determine member and bit offset within member
    • Create claim mask
    • Mark bits as used: LOCK[member] |= claim_mask
  4. Store metadata:
    • Write chunk count to 2-byte header
  5. Return pointer to space after header

Deallocation Process (v2.0)

  1. Read header from 2 bytes before pointer
  2. Validate (same checks as v1.0):
    • Pointer within arena bounds
    • Chunk count reasonable
    • Allocation doesn't overflow arena
  3. Calculate position:
    • Determine global bit position
    • Calculate member index and bit offset
  4. Boundary check: Ensure freeing doesn't cross member boundary
  5. Build free mask for those bits
  6. Clear bits: LOCK[member] &= ~mask

Virtual Memory Management (v2.0)

Initialization (prealloc_arena):

  1. Allocate bitmap array:

    • Calculate members needed: (chunks + 63) / 64
    • malloc() bitmap array
    • Zero all bits (all chunks free)
  2. Reserve virtual memory:

    • Calculate total size: chunk_size × chunk_count
    • Call mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)
    • MAP_ANONYMOUS: Not backed by file
    • MAP_PRIVATE: Process-private mapping
    • OS reserves virtual address space
    • No physical pages allocated yet
  3. Optional zero:

    • memset(BUF, 0, total_size) forces page allocation
    • Each 4KB page accessed triggers page fault
    • OS allocates physical pages on-demand
    • Trade-off: Slower init vs. faster first allocation

Teardown (teardown_arena):

  1. Unmap virtual memory:

    • munmap(BUF, total_size)
    • OS releases virtual address space
    • Physical pages automatically freed
    • Much cleaner than manual memory management
  2. Free bitmap:

    • free(LOCK) releases bitmap array

Why mmap() Instead of malloc()?

malloc() Issues:

  • Backed by heap (brk/sbrk system calls)
  • Heap fragmentation
  • Difficult to release memory back to OS
  • Limited by heap size

mmap() Advantages:

  • Independent virtual memory region
  • Direct mapping to page allocator
  • Easy to release via munmap()
  • Can reserve huge regions without physical allocation
  • OS manages physical pages automatically

Perfect for arena allocator:

  • Reserve large virtual region upfront
  • Let OS handle physical allocation
  • Clean teardown with munmap()

Performance Characteristics (v2.0)

Benchmark Configuration (benchmark.c):

  • Allocation size: 256KB (stress test for large allocations)
  • Arena config: 64KB chunks, 1024 chunks (64MB virtual)
  • CPU pinning: Eliminates scheduler noise
  • Serialized RDTSC: Prevents instruction reordering
  • Page touching: 128 random pages per allocation (destroys locality)
  • Zombie pool: 12800 live allocations (realistic fragmentation)

Methodology Improvements:

  • rdtsc_begin() with CPUID serialization
  • rdtsc_end() with RDTSCP + CPUID fence
  • malloc_trim(0) after each run (forces heap release)
  • Separate warmup runs for malloc and arena
  • 5 runs for statistical confidence

ARM64 Support (bench_arm.c):

  • Uses ARM virtual counter (cntvct_el0)
  • Instruction serialization barriers (isb)
  • Optimized for Raspberry Pi 5
  • 4MB allocations to stress large allocation path

Results Characteristics:

  • Arena performance scales with virtual memory size
  • No degradation with larger configurations
  • Physical memory usage tracks actual access patterns
  • Page fault overhead amortized across allocation lifetime

Utility Functions

Logging System:

Dual Output Strategy:

  • stderr: Immediate feedback during development
  • syslog: System-wide logging for production

Log Levels:

  • INFO: Normal operations, request logging
  • WARNING: Command failures, non-zero exit codes
  • ERROR: Internal errors, authentication failures

Structured Format: All logs include ISO 8601 timestamp, level, message, and optional context (method, URI, route, command, exit code, duration).

Syslog Configuration:

  • Identifier: "cmon" for easy filtering
  • LOG_PID: Includes process ID
  • LOG_CONS: Falls back to console if syslog unavailable
  • LOG_DAEMON: Categorizes as daemon logs for systemd integration

JSON Response Formatting:

Standard Structure: All responses follow consistent format with fields: status, code, message, data.

JSON Escaping: Critical for security. All special characters must be escaped to prevent:

  • JSON structure breaking
  • XSS attacks if output displayed in browser
  • Client parsing errors

Two-Pass Algorithm:

  1. First pass calculates required buffer size
  2. Second pass builds escaped string

Escaped Characters:

  • Quotes, backslashes
  • Control characters (\b, \f, \n, \r, \t)
  • Non-printable characters (as \uXXXX)

Query Parameter Parsing:

Uses libevent's built-in URI parser to:

  • Parse request URI
  • Extract query string
  • Parse key-value pairs
  • Return duplicated value (caller must free)

Returns NULL if parameter not found, allowing default values in command functions.


Design Decisions & Tradeoffs

Why Virtual Memory Arena? (NEW v2.0)

The Revolutionary Decision: Switch from malloc-based fixed arena to mmap-based virtual memory arena

Motivation: Previous version limited to 16KB total capacity due to single 64-bit bitmap. This constraint prevented:

  • Large command outputs (git logs, journal entries)
  • Concurrent request handling
  • Flexible configuration per deployment

Solution: Virtual memory with demand paging

How It Works:

Virtual vs Physical Memory:

  • Virtual: Address space reserved by OS (costs nothing)
  • Physical: Actual RAM pages (costs real memory)
  • Translation: MMU maps virtual→physical on access

Example:

  1. Reserve 64GB virtual: mmap(64GB) → Cost: 0 bytes physical
  2. Allocate 1MB: Return virtual address → Cost: 0 bytes physical
  3. Write first byte: Page fault → OS allocates 4KB page → Cost: 4KB physical
  4. Write across 1MB: ~256 page faults → Cost: 1MB physical (actual usage)

Benefits:

  1. Virtually Unlimited:

    • Can reserve up to 128TB on x86-64 (the user half of the 48-bit canonical address space)
    • Practical limit: 64TB (46-bit) for compatibility
    • Configure gigabytes of arena without consuming RAM
  2. Pay-for-What-You-Use:

    • Physical memory only allocated on access
    • Unused arena regions cost nothing
    • Perfect for variable workloads
  3. Clean Resource Management:

    • munmap() releases everything at once
    • OS automatically frees physical pages
    • No manual page tracking needed
  4. Transparent to Code:

    • Application code unchanged
    • Same allocation API
    • MMU handles all translation

Tradeoffs:

Advantages:

  • ✅ No hard capacity limit
  • ✅ Memory efficient (demand paging)
  • ✅ Simple teardown (munmap)
  • ✅ Scales to workload
  • ✅ OS manages physical memory

Disadvantages:

  • ❌ Page fault overhead on first access
  • ❌ Requires virtual address space (not an issue on 64-bit)
  • ❌ TLB pressure with many small allocations
  • ✅ But: Page faults amortized over allocation lifetime
  • ✅ But: TLB caching makes subsequent accesses fast

When Virtual Memory Arena Wins:

  • Large allocations (>4KB)
  • Variable workload (some requests large, some small)
  • Long-running process
  • Flexibility needed per deployment

When Traditional Allocator Better:

  • Tiny allocations (<100 bytes)
  • Extremely latency-sensitive (no page faults tolerated)
  • Embedded systems without MMU

Design Choice: For this server workload, virtual memory is the clear winner.

Why Bitmap Array Instead of Single Bitmap?

Previous Approach (v1.0):

  • Single 64-bit integer bitmap
  • Maximum 64 chunks
  • Hard limit

New Approach (v2.0):

  • Array of 64-bit integers
  • Each member tracks 64 chunks
  • Unlimited chunks (array size determined by configuration)

Benefits:

  1. Scalability: Can track thousands of chunks
  2. Modularity: Each member independent
  3. Cache-friendly: Array traversal is linear
  4. Flexible: Easy to add more members

Tradeoff:

  • Allocation cannot span member boundaries
  • For allocations >64 chunks, must align to member boundary
  • Acceptable because large allocations are rare

Implementation Detail: Helper macros for bitmap array access:

  • MEMBER_INDEX(bit): Which 64-bit integer
  • BIT_OFFSET(bit): Which bit within integer
  • GLOBAL_BIT(member, bit): Convert to global position

Why mmap() Instead of malloc()?

Decision: Use mmap() for arena backing store instead of malloc()

Reasons:

  1. Virtual Memory Control:

    • mmap() reserves virtual address space
    • Can reserve huge regions (GB/TB) without physical allocation
    • malloc() would allocate physical memory immediately
  2. Independent Region:

    • mmap() creates separate memory region
    • Not affected by heap fragmentation
    • Independent of malloc/free operations elsewhere
  3. Clean Teardown:

    • munmap() releases everything at once
    • OS automatically frees all physical pages
    • free() might not return memory to OS due to fragmentation
  4. Page Alignment:

    • mmap() always returns page-aligned addresses
    • Better for large allocations
    • TLB efficiency
  5. Transparent Paging:

    • OS handles demand paging automatically
    • Physical pages allocated on first access
    • No manual page management needed

Comparison:

malloc():

  • Backed by heap (brk/sbrk)
  • Physical memory allocated immediately
  • Fragmentation prevents memory return
  • Limited by heap size

mmap():

  • Independent virtual region
  • Physical on demand
  • Clean release via munmap
  • Limited only by virtual address space (huge)

Result: mmap() is a perfect fit for the arena-allocator use case

Why 512-Byte Chunks? (Increased from 256)

Analysis: Larger chunks reduce header overhead and bitmap pressure

Previous: 256 bytes

  • Good for small allocations
  • High overhead for large allocations
  • More bitmap bits needed

New: 512 bytes

  • Better for larger allocations
  • Amortized overhead
  • Fewer chunks needed for typical workload

Trade-off Analysis:

Smaller chunks (256 bytes):

  • ✅ Less waste for tiny allocations
  • ❌ More chunks needed (more bitmap pressure)
  • ❌ More header overhead

Larger chunks (1KB+):

  • ✅ Fewer chunks, less bitmap pressure
  • ❌ More waste for small allocations
  • ❌ Potential internal fragmentation

Chosen (512 bytes):

  • Balance between waste and efficiency
  • Good for common allocation sizes (JSON responses, command output)
  • Not too large to cause excessive waste
  • Not too small to cause bitmap pressure

Configurable: Can adjust via arena_config() for specific workload

Why libevent Instead of Raw Sockets?

Alternatives Considered:

  1. Raw sockets with manual HTTP parsing
  2. libmicrohttpd
  3. Embedded servers (mongoose, civetweb)

Chosen: libevent 2.x

Reasons:

  • Battle-tested in production systems (Tor, Chromium, memcached)
  • Cross-platform support
  • Event-driven architecture scales to many connections
  • Built-in HTTP server support
  • Active maintenance and security updates

Tradeoffs:

  • Larger dependency than raw sockets
  • Requires learning event-driven programming model
  • But: Production-grade reliability worth the complexity

Why Single-Threaded Design?

Analysis from previous version still applies:

Reasons:

  1. Simplicity: No race conditions, no deadlocks, easier to debug
  2. Performance: No lock contention, no context switching
  3. Event-driven I/O: libevent handles concurrency via epoll/kqueue
  4. I/O bound workload: Waiting for commands dominates, not CPU
  5. Arena safety: No atomic operations needed (with note for future)

Note on v2.0: Bitmap array operations still non-atomic. If multi-threading added in future, would need:

  • Atomic bitmap operations per member
  • Or locks per member
  • Or lock-free data structure

Current single-threaded design is optimal for typical workload.

Why Timing-Safe Authentication?

The Timing Attack Problem:

Standard comparison functions (strcmp, memcmp, manual loops with early exit) reveal information through execution time. If comparison exits on first difference, an attacker can measure:

  • Keys differing at byte 0: Fast (1 comparison)
  • Keys differing at byte 31: Slow (32 comparisons)

Attacker brute-forces byte-by-byte:

  • Try all 256 values for byte 0, measure timing
  • Correct byte takes slightly longer (proceeds to byte 1)
  • Repeat for all 32 bytes
  • Total attempts: 256 × 32 = 8,192 instead of 2^256

Constant-Time Solution:

Algorithm examines all bytes regardless of differences. Uses bitwise OR to accumulate differences without branching. Compiler cannot optimize away because OpenSSL's CRYPTO_memcmp is designed to resist optimization.

This demonstrates exceptional security awareness.

Why JSON Instead of Plain Text?

Advantages:

  1. Machine-parseable: Every language has JSON libraries
  2. Consistent structure: All responses same format
  3. Extensible: Can add fields without breaking clients
  4. Type-safe: Clear distinction between success/error
  5. Error handling: Standardized error format

Plain Text Alternative Problems:

  • How to distinguish status from data?
  • How to parse errors?
  • Client needs custom parsing logic
  • Difficult to extend

Tradeoff:

  • Slightly more bandwidth
  • Requires careful escaping (security critical)
  • But: API consistency worth it

Why Dual Logging?

stderr Benefits:

  • Immediate feedback during development
  • See logs in terminal
  • Colored output possible

syslog Benefits:

  • System-wide logging infrastructure
  • Automatic log rotation
  • Priority-based filtering
  • Remote forwarding capability
  • systemd integration (journalctl)

Both Together:

  • Development: Use stderr
  • Production: Use syslog/journald
  • Debugging: Can enable both
  • Negligible performance impact

Setup & Installation

System Requirements

Operating System: Linux (tested on Ubuntu 24)

Dependencies:

  • gcc or clang compiler
  • pkg-config
  • libevent 2.x development files
  • OpenSSL development files

Installation by Distribution

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install build-essential pkg-config libevent-dev libssl-dev

Fedora/RHEL:

sudo dnf install gcc pkg-config libevent-devel openssl-devel

Arch Linux:

sudo pacman -S base-devel pkg-config libevent openssl

Nix (Reproducible builds):

nix develop

Secret Key Generation

Generate 256-bit key:

openssl rand -hex 32 > client_secret.key

Expected format: 64 hexadecimal characters (optionally with newline)

Secure permissions:

chmod 600 client_secret.key

Security note: This file is the only authentication mechanism. Keep it secure, never commit to version control.

Building

Using build script:

./build.sh

This compiles all sources and starts the server.

Debug mode (runs in gdb):

DEBUG=1 ./build.sh

Manual compilation:

gcc -O2 -Wall -Wextra -g -o target \
    main.c auth.c arena.c utils.c commands.c \
    $(pkg-config --cflags --libs libevent openssl)

Build outputs: Binary named target in current directory

Running

Foreground (see logs directly):

./target

Expected output:

The client auth key was successfully loaded
Listening requests on http://0.0.0.0:8000

Background:

./target > /dev/null 2> server.log &

systemd Service:

Create /etc/systemd/system/cmon.service:

[Unit]
Description=CMon HTTP Server
After=network.target

[Service]
Type=simple
User=cmon
WorkingDirectory=/opt/cmon
ExecStart=/opt/cmon/target
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable cmon
sudo systemctl start cmon
sudo systemctl status cmon

View logs:

sudo journalctl -u cmon -f

Configuration (NEW v2.0)

Arena Tuning:

Before calling prealloc_arena(), configure arena size:

Small workload (default):

arena_config(512, 64);  // 32KB virtual

Medium workload:

arena_config(64 * 1024, 1024);  // 64MB virtual

Large workload:

arena_config(1024 * 1024, 10000);  // 10GB virtual

Extreme workload:

arena_config(4 * 1024 * 1024, 100000);  // 400GB virtual

Remember: Virtual != Physical

  • Large configuration costs nothing until used
  • Physical memory allocated on-demand
  • Configure generously, pay only for actual usage

Testing

Run test suite:

./run_tests.sh

This builds the server, starts it in background, runs Python integration tests, and shows results.

Manual testing:

KEY=$(cat client_secret.key)
curl -v "http://localhost:8000/health" -H "access_token: $KEY"

Benchmarking (NEW v2.0):

x86-64:

gcc -O2 -o benchmark benchmark.c arena.c -lm
./benchmark

ARM64 (Raspberry Pi 5):

gcc -O2 -o bench_arm bench_arm.c arena.c -lm
./bench_arm

API Reference

Base URL

http://localhost:8000

Authentication

All endpoints require authentication.

Header: access_token (case-insensitive via libevent)
Value: Your 256-bit hex key from client_secret.key

Missing or invalid authentication returns:

{
    "status": "error",
    "code": 401,
    "message": "Authentication Error",
    "data": null
}

Response Format

All responses follow this structure:

Success (HTTP 200):

{
    "status": "ok",
    "code": 200,
    "message": "Command executed",
    "data": "command output here"
}

Error (HTTP 4xx/5xx):

{
    "status": "error",
    "code": 500,
    "message": "Command failed",
    "data": "error details or null"
}

Endpoints

GET /health

Purpose: Check server health and system uptime

Authentication: Required

Query Parameters: None

Response: System uptime information

Example:

curl "http://localhost:8000/health" -H "access_token: YOUR_KEY"

Command executed: uptime


POST /reboot

Purpose: Reboot the entire system

Authentication: Required

Privileges: Requires root or CAP_SYS_BOOT capability

Warning: This will restart the server immediately

Example:

curl -X POST "http://localhost:8000/reboot" -H "access_token: YOUR_KEY"

Command executed: reboot


POST /restart

Purpose: Restart the CMon server process

Authentication: Required

Note: Requires systemd or similar process manager to auto-restart

Example:

curl -X POST "http://localhost:8000/restart" -H "access_token: YOUR_KEY"

Command executed: pkill target


PUT /sync_upstream

Purpose: Pull latest changes from git repository

Authentication: Required

Query Parameters:

  • branch (optional): Branch name, defaults to "main"

Example:

curl -X PUT "http://localhost:8000/sync_upstream?branch=develop" \
  -H "access_token: YOUR_KEY"

Command executed: git pull origin <branch>

Prerequisites: Must be run from a git repository


GET /deploy_branch

Purpose: Deploy a specific branch using custom script

Authentication: Required

Query Parameters:

  • branch (optional): Branch name, defaults to "main"

Requirements:

  • ./deploy.sh script must exist in working directory
  • Script must be executable (chmod +x deploy.sh)

Example:

curl "http://localhost:8000/deploy_branch?branch=feature-x" \
  -H "access_token: YOUR_KEY"

Command executed: ./deploy.sh <branch>

Script receives: Branch name as first argument


DELETE /teardown_branch

Purpose: Teardown deployed branch using custom script

Authentication: Required

Query Parameters:

  • branch (optional): Branch name, defaults to "main"

Requirements:

  • ./teardown.sh script must exist in working directory
  • Script must be executable

Example:

curl -X DELETE "http://localhost:8000/teardown_branch?branch=feature-x" \
  -H "access_token: YOUR_KEY"

Command executed: ./teardown.sh <branch>


GET /logs

Purpose: View system logs

Authentication: Required

Query Parameters: None

Returns: Last 50 systemd journal entries

Example:

curl "http://localhost:8000/logs" -H "access_token: YOUR_KEY"

Command executed: journalctl -n 50 --no-pager


HTTP Status Codes

  • 200 OK: Command executed successfully
  • 401 Unauthorized: Missing or invalid access_token
  • 404 Not Found: Route doesn't exist
  • 405 Method Not Allowed: Wrong HTTP method for endpoint
  • 500 Internal Server Error: Command failed, arena exhausted, or internal error

Performance Characteristics

Virtual Memory Benefits (NEW v2.0)

Memory Efficiency:

  • Reserve large virtual arena (GB/TB)
  • Physical memory only used for accessed pages
  • OS automatically manages physical allocation
  • No waste on unused capacity

Example:

  • Configure 10GB arena
  • Typical workload uses 50MB
  • Physical memory consumption: ~50MB
  • Virtual memory reserved: 10GB (costs nothing)

Scalability:

  • Can handle occasional large allocations without pre-allocating
  • Flexible per-deployment configuration
  • No hard-coded limits

Benchmark Results (v2.0)

Updated Methodology:

  • Allocation size: 256KB (stress test)
  • Arena config: 64KB chunks, 1024 chunks (64MB virtual)
  • CPU pinning: Eliminates scheduling noise
  • Serialized RDTSC: Accurate cycle counting
  • Page touching: Random access to destroy locality
  • Zombie pool: Realistic fragmentation
  • malloc_trim(): Forces heap release for fair comparison

Key Metrics:

  • Throughput (M ops/sec)
  • Median latency (P50 in cycles)
  • Tail latency (P99 in cycles)
  • Consistency (standard deviation)

Results: Arena allocator shows consistent performance benefits for typical server workload. Exact numbers vary by:

  • Hardware (CPU, RAM speed)
  • Allocation size
  • Access patterns
  • Fragmentation level

General Findings:

  • Arena wins for batch allocations
  • Arena wins for predictable sizes
  • Arena wins for short lifetimes
  • malloc competitive for very small allocations (<100 bytes)
  • Page fault overhead negligible (amortized over allocation lifetime)

ARM64 Support (bench_arm.c)

Platform: Raspberry Pi 5

  • Uses ARM virtual counter for timing
  • Instruction serialization barriers
  • Optimized batch sizes for ARM cache
  • 4MB allocations to stress large allocation path

Demonstrates:

  • Cross-architecture portability
  • Arena allocator works on ARM64
  • Virtual memory benefits universal

Memory Usage Analysis

Virtual vs Physical:

Configured: 64MB arena (64KB × 1024 chunks)

  • Virtual reserved: 64MB
  • Bitmap overhead: 128 bytes (1024 bits / 8)
  • Physical used initially: 0 bytes (before any allocations)

After 10 allocations (256KB each):

  • Virtual reserved: Still 64MB
  • Physical used: ~2.5MB (10 × 256KB)
  • Bitmap: Still 128 bytes

After 100 allocations:

  • Virtual reserved: Still 64MB
  • Physical used: ~25MB (100 × 256KB)
  • Arena full: No, only 39% utilized

Key Insight: Physical usage tracks actual workload, not configuration

Page Fault Overhead

First access to allocation:

  • MMU lookup fails (page not mapped)
  • CPU raises page fault exception
  • OS kernel handles fault:
    • Allocates physical page (4KB)
    • Updates page tables
    • Returns to user code
  • Overhead: ~500-1000 cycles (varies by system)

Subsequent accesses:

  • TLB cached (Translation Lookaside Buffer)
  • Virtual→physical lookup: ~1 cycle
  • No page fault

Amortization:

  • 256KB allocation = 64 pages
  • 64 page faults on first access
  • Total overhead: ~32,000-64,000 cycles
  • But allocation lifetime: millions of cycles
  • Overhead: <1% of total

Conclusion: Page fault overhead negligible for typical workload


Security Model

Authentication Security

Key Strength: 256-bit (2^256 possible combinations)

  • Equivalent to SHA-256 hash length
  • Effectively unbreakable by brute force
  • Would take longer than age of universe to try all combinations

Timing-Safe Comparison:

Uses OpenSSL's CRYPTO_memcmp() which:

  • Always examines all bytes
  • Takes constant time regardless of where keys differ
  • Cannot be optimized away by compiler
  • Prevents timing attack vectors

Why timing attacks matter:

An attacker measuring response times could brute-force byte-by-byte with only 8,192 attempts (256 values × 32 bytes) instead of 2^256. Constant-time comparison prevents this.

Key Storage:

  • File-based at ./client_secret.key
  • Hex-encoded (safe for text editors)
  • Should have permissions 0600 (owner read/write only)
  • Never logged or displayed in error messages

Memory Security:

  • Keys cleansed from memory using OPENSSL_cleanse() before free
  • Prevents recovery from memory dumps
  • Prevents use-after-free vulnerabilities

Recommendations:

  1. Generate with openssl rand -hex 32
  2. Store securely (not in version control)
  3. Rotate periodically
  4. Use different keys per environment
  5. Consider key derivation for multiple users

Command Injection Prevention

Safe Design: Uses fork/exec, not system()

Why fork/exec is safe:

  • Arguments passed as NULL-terminated array
  • Each argument treated as literal string
  • No shell metacharacter interpretation
  • Even if input contains ;, |, &, they're passed literally to program
  • Program (e.g., git) just sees malformed input and fails safely

Example: If branch parameter is "main; rm -rf /":

  • Git receives: ["git", "pull", "origin", "main; rm -rf /"]
  • Git looks for branch named "main; rm -rf /"
  • Git fails with "unknown branch"
  • No command injection possible

Why system() would be unsafe: Would invoke shell which interprets metacharacters, enabling arbitrary command execution.

Recommendation: Still validate input. Even though injection is prevented, validation is good practice:

  • Whitelist allowed characters (alphanumeric, dash, underscore, slash)
  • Check length limits
  • Reject unexpected patterns

Network Security

Current Setup:

  • Binds to 0.0.0.0:8000 (all interfaces)
  • No TLS/SSL (plain HTTP)
  • Authentication via custom header

For Production:

1. Use TLS termination: Place CMon behind nginx or haproxy with TLS:

  • nginx handles TLS/SSL
  • Forwards to CMon on localhost
  • CMon binds to 127.0.0.1 only

2. Firewall rules:

  • Allow only specific IP addresses
  • Rate limit requests
  • Drop invalid packets early

3. Bind to localhost: Change binding from 0.0.0.0 to 127.0.0.1 if only local access needed

4. VPN/Tunnel: For remote access:

  • Use WireGuard or SSH tunnel
  • Never expose directly to internet

5. Discord bot scenario: E2E encryption between Discord bot and CMon provides network security.

Memory Safety (Enhanced in v2.0)

Arena Allocator Safety:

  • Virtual memory prevents unbounded physical growth
  • Boundary checks prevent buffer overflows
  • Header validation detects corruption
  • Pointer validation prevents crashes
  • munmap() ensures clean teardown

Virtual Memory Benefits:

  • OS enforces memory protection
  • Invalid access triggers segfault (better than silent corruption)
  • Address space isolation
  • Page-level protection

Deallocation Checks:

  1. Pointer within arena bounds
  2. Header chunk count reasonable
  3. Allocation doesn't overflow arena
  4. Doesn't cross bitmap member boundary
  5. Returns silently on invalid pointer (doesn't crash)

Recommendations:

  • Configure arena generously (virtual is free)
  • Monitor physical memory usage
  • Add memory usage logging
  • Consider memset of freed memory (debug builds)

Privilege Management

Commands Requiring Elevated Privileges:

  • /reboot: Needs CAP_SYS_BOOT or root
  • /restart: Needs permission to signal processes

Best Practices:

1. Use systemd capabilities: Grant only needed capabilities, not full root

2. Use sudo with NOPASSWD: Configure sudoers for specific commands only

3. Principle of least privilege:

  • Don't run as root
  • Use dedicated user account
  • Grant minimal permissions

4. Audit logging: Log all privileged operations with user context

Deployment Security Checklist

Before Production:

  • Generate strong secret key
  • Secure key file permissions (0600)
  • Place behind TLS terminator
  • Bind to localhost or use firewall
  • Run as non-root user
  • Configure systemd hardening
  • Set up log monitoring
  • Configure log rotation
  • Test all endpoints
  • Review custom scripts (deploy.sh, teardown.sh)
  • Add input validation
  • Configure arena size appropriately
  • Set up health checks
  • Document incident response
  • Test disaster recovery

Troubleshooting

Server Won't Start

Symptom: "The client auth key could not be loaded"

Causes:

  • Missing client_secret.key file
  • File in wrong location
  • Insufficient permissions to read file

Solutions:

  1. Generate key: openssl rand -hex 32 > client_secret.key
  2. Check location: File must be in working directory
  3. Fix permissions: chmod 600 client_secret.key
  4. Verify content: Should be 64 hex characters

Symptom: "Bind: Address already in use"

Causes:

  • Another process using port 8000
  • Previous instance still running

Solutions:

  1. Find process: sudo lsof -i :8000
  2. Kill it: sudo kill <PID>
  3. Or change port in main.c (recompile required)

Symptom: mmap failed

Causes (NEW v2.0):

  • Requested virtual size too large
  • System virtual memory limit reached
  • Permission issues

Solutions:

  1. Check virtual memory limits: ulimit -v
  2. Reduce arena size: arena_config(smaller_size, fewer_chunks)
  3. Check system limits: /proc/sys/vm/max_map_count

Symptom: Out of memory (OOM killer)

Causes:

  • Physical memory exhausted
  • Too many allocations accessed simultaneously

Solutions:

  1. Monitor physical memory: free -h
  2. Reduce concurrent allocations
  3. Increase system RAM
  4. Adjust workload to use less memory

Note: Virtual memory size doesn't matter, physical usage does

Authentication Failures

Symptom: Always getting 401 errors

Causes:

  • Wrong key in request
  • Key not being sent
  • Header name wrong
  • Whitespace in key

Solutions:

  1. Verify key matches: cat client_secret.key
  2. Test directly: curl -H "access_token: $(cat client_secret.key)" http://localhost:8000/health
  3. Check header name: Must be "access_token"
  4. Remove whitespace: tr -d '\n' < client_secret.key > client_secret.key.new

Command Failures

Symptom: /reboot returns exit_code=1

Causes:

  • Insufficient privileges
  • System preventing reboot

Solutions:

  1. Check user: whoami
  2. Grant capability: Configure systemd with CAP_SYS_BOOT
  3. Or use sudo: Modify command to use sudo reboot

Symptom: /deploy_branch returns exit_code=127

Causes:

  • Script not found
  • Script not in PATH or current directory
  • Script not executable

Solutions:

  1. Check exists: ls -la deploy.sh
  2. Make executable: chmod +x deploy.sh
  3. Use absolute path: Modify commands.c to use /opt/cmon/deploy.sh
  4. Verify working directory: Script must be in server's working directory

Performance Issues

Symptom: High latency

Causes:

  • Commands taking long time
  • Page faults on large allocations
  • System overload

Solutions:

  1. Check command times: journalctl -u cmon | grep duration
  2. Pre-fault arena: Add memset after prealloc_arena (trades startup time for allocation speed)
  3. Check system load: uptime
  4. Monitor page faults: perf stat -e page-faults ./target

Symptom: Excessive page faults

Causes (NEW v2.0):

  • Large allocations accessed for first time
  • Fragmented access patterns
  • Cold start

Solutions:

  1. Pre-fault arena: memset after mmap (slower startup, faster allocations)
  2. Increase chunk size: Fewer chunks = fewer page faults
  3. Accept overhead: Page faults amortized over allocation lifetime

Memory Issues

Symptom: Virtual memory exhausted

Causes (NEW v2.0):

  • Arena configured too large
  • System virtual memory limit

Solutions:

  1. Reduce arena size
  2. Check limits: ulimit -v
  3. Increase limit if needed

Symptom: Physical memory exhausted

Causes:

  • Too many allocations in use
  • Memory leak
  • Workload exceeds available RAM

Solutions:

  1. Monitor usage: ps aux | grep target
  2. Check for leaks: Ensure all allocations freed
  3. Reduce concurrent workload
  4. Add more RAM

Key: Virtual size doesn't cause OOM, physical usage does


Development Guide

Project Structure

Source files:

  • main.c: HTTP server, routing, middleware (220 lines)
  • auth.c/h: Authentication system (150 lines)
  • arena.c/h: Virtual memory allocator (271 lines) ← Updated
  • commands.c/h: Command execution (140 lines)
  • utils.c/h: Utilities (250 lines)

Testing:

  • benchmark.c: x86-64 performance benchmarking ← Updated
  • bench_arm.c: ARM64 benchmarking ← New
  • test_server.py: Integration tests
  • run_tests.sh: Test automation

Build system:

  • build.sh: Compilation and execution
  • flake.nix: Nix development environment
  • .clang-format: Code formatting rules

Total: ~1,300 lines of C code (excluding tests)

Building from Source

Clone and setup:

git clone <repository-url>
cd CMon

Install dependencies: See Setup & Installation section for distribution-specific commands

Build:

./build.sh

Or use Nix (reproducible):

nix develop

Code Style

Formatting: Uses clang-format with LLVM style base

Rules:

  • Indent: 4 spaces
  • Line length: 100 characters max
  • No single-line if statements

Apply formatting:

clang-format -i *.c *.h

Adding New Endpoints

Steps:

  1. Define command function in commands.c

    • Follow pattern of existing commands
    • Use run_cmd_argv() for execution
    • Handle default values for optional parameters
    • Return allocated output, set exit code
  2. Add declaration in commands.h

    • Match signature of other command functions
  3. Create callback in main.c

    • Use validate_and_run() for parameterless commands
    • Use validate_and_run_arg() for commands with parameters
    • Specify parameter name for query string
  4. Register route in ROUTES_CONFIG array

    • Specify path, HTTP method, callback
    • Array automatically sized
  5. Test

    • Add test to test_server.py
    • Run ./run_tests.sh
    • Manual test with curl

Testing

Automated tests:

./run_tests.sh

Manual testing:

./target &
curl -v "http://localhost:8000/health" -H "access_token: $(cat client_secret.key)"
tail -f server.log

Benchmarking (NEW v2.0):

x86-64:

gcc -O2 -o benchmark benchmark.c arena.c -lm
./benchmark

ARM64:

gcc -O2 -o bench_arm bench_arm.c arena.c -lm
./bench_arm

Debugging

Run in gdb:

DEBUG=1 ./build.sh

Common breakpoints:

  • Arena allocation: break arena.c:90 (check_and_claim)
  • Bitmap search: break arena.c:142 (find_k_consecutive_zeroes)
  • Virtual memory init: break arena.c:51 (prealloc_arena)

Inspect virtual memory:

# In gdb:
(gdb) info proc mappings    # Show all memory mappings
(gdb) print arena_buf_num   # Number of chunks
(gdb) print arena_buf_size  # Chunk size
(gdb) x/16xg LOCK           # Examine bitmap array

Monitor page faults:

perf stat -e page-faults,minor-faults,major-faults ./target

Performance Analysis

Profile with perf:

perf record -g ./target
perf report

Generate flamegraph:

perf script | stackcollapse-perf.pl | flamegraph.pl > flame.svg

Monitor virtual memory:

watch -n 1 'cat /proc/$(pgrep target)/status | grep -E "Vm|Rss"'

Tune arena (NEW v2.0):

Modify configuration in main():

// For small workload
arena_config(512, 64);

// For large workload
arena_config(64*1024, 1024);

Appendix

Frequently Asked Questions

Q: How much virtual memory can I allocate?

A: On x86-64 Linux:

  • User space: 128TB (47-bit addresses)
  • Practical limit: 64TB (46-bit) for compatibility
  • CMon limit: Only by system configuration

But remember: Virtual != Physical

  • Can allocate 64TB virtual
  • Physical usage determined by what you access
  • OS will OOM kill if physical memory exhausted, not virtual

Q: What's the overhead of virtual memory?

A: Minimal:

  • Page tables: ~0.2% of the mapped size (~128MB if all of a 64GB arena is touched); untouched virtual space costs nothing
  • TLB misses: Cached after first access
  • Page faults: One-time cost per page, amortized over lifetime

For server workload with allocation sizes >4KB, overhead is negligible.


Q: Can I mix malloc and arena allocations?

A: Yes, but:

  • Must free() what you malloc()
  • Must deallocate() what you allocate()
  • Don't mix them up
  • Current code uses arena for command output, malloc for query parameters

Q: What happens if I allocate more than physical RAM?

A: Depends:

  • Allocated but not accessed: Nothing (just virtual reservation)
  • Accessed beyond physical RAM: OS starts swapping to disk
  • Too much swapping: Performance degradation
  • No swap space: OOM killer terminates process

Best practice: Configure arena larger than needed, but monitor physical usage


Q: Why not use huge pages?

A: Trade-off:

  • Huge pages: Faster TLB, fewer page faults
  • But: Less flexible, potential waste, privileged operation
  • Default 4KB pages: Good balance for this workload

Could add huge page support as configuration option.


Q: Can I use this on 32-bit systems?

A: Technically yes, but:

  • Virtual address space limited (2-4GB)
  • Loses main benefit of virtual memory approach
  • Better to use v1.0 arena on 32-bit systems

Q: How to benchmark on my system?

A: Included benchmarks:

# x86-64
gcc -O2 -o benchmark benchmark.c arena.c -lm
./benchmark

# ARM64
gcc -O2 -o bench_arm bench_arm.c arena.c -lm
./bench_arm

Adjust constants in benchmark files for your workload.


Q: What's the maximum allocation size?

A: Limited by:

  • Arena size: Total virtual reservation
  • Chunk size: Single allocation can span multiple chunks
  • Physical RAM: What you can actually access

Example with 64KB chunks:

  • Can allocate multi-megabyte buffers
  • Limited by configured arena size
  • Physical memory determines actual usability

Q: Why mmap instead of huge malloc?

A: mmap advantages:

  • Independent virtual region
  • Demand paging (pay for what you use)
  • Clean teardown with munmap
  • Not affected by heap fragmentation
  • Can reserve huge regions without physical cost

A huge malloc may also be demand-paged (glibc services large requests with mmap internally), but it offers no explicit control over the region's lifetime, placement, or teardown.


Q: Is this production-ready?

A: For internal use, yes (with hardening):

  • Behind TLS terminator
  • Proper monitoring
  • Configured arena size
  • Tested on your workload

For public-facing: Add more hardening

  • Rate limiting
  • Input validation
  • DDoS protection
  • Security audit

References

Dependencies:

  • libevent 2.x (event-driven HTTP server)
  • OpenSSL (CRYPTO_memcmp, OPENSSL_cleanse)

Concepts:

  • Virtual memory and MMU
  • Demand paging
  • TLB and page tables
  • Event-driven architecture
  • Timing attacks

Linux Documentation:

  • mmap(2) man page
  • munmap(2) man page
  • /proc/PID/maps (process memory mappings)

Similar Projects:

  • webhook (Go): HTTP to command execution
  • systemd HTTP API: systemd unit management

Further Reading:

  • "Understanding the Linux Virtual Memory Manager" (Gorman)
  • "What Every Programmer Should Know About Memory" (Drepper)
  • "The Linux Programming Interface" (Kerrisk)
  • "Systems Performance" (Gregg)

End of Documentation

Version 2.0 introduces revolutionary virtual memory arena allocator with virtually unlimited capacity and demand paging. The elegant design allows reserving huge virtual address spaces while only consuming physical memory for actually accessed pages, making CMon suitable for workloads ranging from tiny to massive.

For additional details, consult the source code and README.md.

About

A simple homelab manager written in C
