
ByteProvider

abbaye edited this page Mar 7, 2026 · 1 revision

ByteProvider System

ByteProvider (in WpfHexEditor.Core) is the virtual-view data layer of the HexEditor. It presents a mutable logical view of a file while keeping the physical file unchanged until Save().


Architecture

flowchart TD
    User["User edits"]
    VM["HexEditorViewModel\nvirtual byte view"]
    BP["ByteProvider"]
    PM["PositionMapper\nO(log n) segment tree"]
    File["Physical file\nunchanged until Save()"]

    subgraph Edits["Edit tracking"]
        Mods["Modifications\nDictionary<long, byte>\nin-place byte changes"]
        Ins["Insertions\nordered segments (LIFO)\nbyte[] + virtual offset"]
        Del["Deletions\nDictionary<long, int>\nvirtual offset + count"]
    end

    subgraph Commands["Undo / Redo stack"]
        ModCmd["ModifyCommand"]
        InsCmd["InsertCommand"]
        DelCmd["DeleteCommand"]
        Batch["BatchCommand\n(group as one undo step)"]
    end

    User --> VM
    VM --> BP
    BP --> PM
    BP --> Edits
    PM --> File
    BP --> Commands

Core Concepts

Virtual View

The user always sees a logical (virtual) byte stream that includes all pending edits. The physical file is never modified until an explicit Save().

Physical file:   [AA BB CC DD EE FF]  (6 bytes, unchanged)
Modifications:   { 2 → 0x99 }        (byte at pos 2 changed)
Insertions:      { after 3: [11 22] } (2 bytes inserted)
Virtual view:    [AA BB 99 DD 11 22 EE FF]  (8 bytes seen by editor)
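The example above can be reproduced with a minimal sketch. It is not ByteProvider's actual implementation: `VirtualViewDemo.Compose` and its parameters are hypothetical names, and it assumes each insertion is keyed by the physical offset it follows.

```csharp
using System;
using System.Collections.Generic;

// Reproduce the example above (all names here are hypothetical).
byte[] physical = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF };
var modifications = new Dictionary<long, byte> { [2] = 0x99 };
var insertions = new Dictionary<long, byte[]> { [3] = new byte[] { 0x11, 0x22 } };

byte[] view = VirtualViewDemo.Compose(physical, modifications, insertions);
Console.WriteLine(BitConverter.ToString(view).Replace('-', ' '));
// prints "AA BB 99 DD 11 22 EE FF"

static class VirtualViewDemo
{
    // Builds the virtual byte stream without touching the physical buffer.
    // Assumption: insertions maps a physical offset to the bytes inserted right after it.
    public static byte[] Compose(
        byte[] physical,
        IReadOnlyDictionary<long, byte> modifications,
        IReadOnlyDictionary<long, byte[]> insertions)
    {
        var result = new List<byte>(physical.Length);
        for (long i = 0; i < physical.Length; i++)
        {
            // A pending modification overrides the physical byte.
            result.Add(modifications.TryGetValue(i, out byte changed) ? changed : physical[i]);

            // Inserted bytes appear immediately after their anchor offset.
            if (insertions.TryGetValue(i, out var inserted))
                result.AddRange(inserted);
        }
        return result.ToArray();
    }
}
```

The sketch materializes the whole view for clarity; a real provider would resolve bytes lazily per requested range rather than copying the file.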

PositionMapper

PositionMapper maintains a segment tree to translate between physical and virtual positions in O(log n):

// Physical file offset → virtual editor offset
long virtualPos = positionMapper.PhysicalToVirtual(physicalOffset);

// Virtual editor offset → physical file offset
long physicalPos = positionMapper.VirtualToPhysical(virtualOffset);

The segment tree is updated incrementally as edits accumulate, rather than being rebuilt from scratch after each edit.
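To illustrate the translation itself, here is a simplified sketch that handles insertions only. It deliberately swaps the segment tree for a sorted array with prefix sums and binary search, which gives the same O(log n) lookup but does not support cheap incremental updates. `PrefixSumMapper` and its members are hypothetical names, not part of WpfHexEditor.Core.

```csharp
using System;
using System.Linq;

// One insertion of 2 bytes after physical offset 3, as in the Virtual View example.
var map = new PrefixSumMapper(new[] { (Anchor: 3L, Length: 2L) });
Console.WriteLine(map.PhysicalToVirtual(4));          // prints 6
Console.WriteLine(map.VirtualToPhysical(6));          // prints 4
Console.WriteLine(map.VirtualToPhysical(4) is null);  // prints True (inserted byte)

sealed class PrefixSumMapper
{
    private readonly long[] _anchors;  // physical offset each run is inserted after
    private readonly long[] _lengths;  // length of each inserted run
    private readonly long[] _prefix;   // _prefix[k] = bytes inserted by the first k runs

    // Assumption: anchors are distinct (at most one run per physical offset).
    public PrefixSumMapper((long Anchor, long Length)[] insertions)
    {
        var sorted = insertions.OrderBy(x => x.Anchor).ToArray();
        _anchors = sorted.Select(x => x.Anchor).ToArray();
        _lengths = sorted.Select(x => x.Length).ToArray();
        _prefix = new long[sorted.Length + 1];
        for (int k = 0; k < sorted.Length; k++)
            _prefix[k + 1] = _prefix[k] + _lengths[k];
    }

    public long PhysicalToVirtual(long physical)
    {
        // Binary search: count runs anchored strictly before this byte.
        int lo = 0, hi = _anchors.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (_anchors[mid] < physical) lo = mid + 1; else hi = mid;
        }
        return physical + _prefix[lo];
    }

    // Returns null when the virtual position falls inside an inserted run.
    public long? VirtualToPhysical(long virtualPos)
    {
        // Binary search: count runs whose first virtual position is <= virtualPos.
        int lo = 0, hi = _anchors.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (RunStart(mid) <= virtualPos) lo = mid + 1; else hi = mid;
        }
        if (lo > 0 && virtualPos < RunStart(lo - 1) + _lengths[lo - 1])
            return null;
        return virtualPos - _prefix[lo];
    }

    // First virtual position occupied by run k.
    private long RunStart(int k) => _anchors[k] + 1 + _prefix[k];
}
```

Deletions are left out to keep the sketch short; supporting them means a deleted physical byte has no virtual position, mirroring the `null` case above.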


ByteProvider API

ByteProvider bp = new ByteProvider(filePath);

// --- Read ---
byte value     = bp.GetByte(virtualOffset);
byte[] block   = bp.GetBytes(virtualOffset, count);
long   length  = bp.Length;         // virtual length (includes insertions)
long   physLen = bp.FileLength;     // physical file length

// --- Modify (in-place byte change) ---
bp.AddByteModified(newValue, virtualOffset);

// --- Insert ---
bp.AddByteInserted(newByte, virtualOffset);
bp.AddBytesInserted(newBytes, virtualOffset);

// --- Delete ---
bp.AddByteDeleted(virtualOffset);
bp.AddBytesDeleted(virtualOffset, count);

// --- Undo / Redo ---
bp.Undo();
bp.Redo();
bool canUndo = bp.CanUndo;
bool canRedo = bp.CanRedo;

// --- Save ---
await bp.SaveAsync();                  // smart save (see below)
await bp.SaveAsAsync(newFilePath);

// --- State ---
bool isDirty    = bp.IsModified;
bool hasInserts = bp.HasInsertedBytes;
bool hasDels    = bp.HasDeletedBytes;

// --- Events ---
bp.DataChanged   += (s, e) => { /* refresh viewport */ };
bp.LengthChanged += (s, e) => { /* update scrollbar */ };
bp.Undone        += (s, e) => { };
bp.Redone        += (s, e) => { };

Smart Save Strategy

The smart save (SaveAsync() in the API above, shown as Save() in the diagram) chooses the fastest safe write path for the pending edits:

flowchart TD
    Save["Save()"]
    Check{"Has insertions\nor deletions?"}
    Fast["Fast path\nIn-place overwrite\nonly modified bytes\n(FileStream seek+write)"]
    Full["Full rebuild\n1. Write to temp file\n2. File.Replace()\n(atomic swap)"]

    Save --> Check
    Check -->|No — modifications only| Fast
    Check -->|Yes| Full
  • Fast path: 10–100× faster than a full rebuild for pure byte-modification workloads (no length change)
  • Full rebuild: the new content is written to a temp file first and swapped in with File.Replace(), so a crash mid-save leaves the original file intact
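The decision in the flowchart can be sketched as follows. This is an illustration under stated assumptions, not the library's save code: `SaveSketch.Save` is a hypothetical helper, and the rebuilt content is passed as a `byte[]` for brevity, whereas a real implementation would stream it to the temp file. Only standard `System.IO` calls are used.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

// Demo on a scratch file (all names are hypothetical).
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 0xAA, 0xBB, 0xCC });

// Fast path: length unchanged, so only offset 1 is rewritten in place.
SaveSketch.Save(path, new Dictionary<long, byte> { [1] = 0x99 }, rebuilt: null);

// Full rebuild: length changed, so the whole virtual view is written out.
SaveSketch.Save(path, new Dictionary<long, byte>(), rebuilt: new byte[] { 0xAA, 0x99, 0x11, 0xCC });

Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path))); // prints "AA-99-11-CC"
File.Delete(path);

static class SaveSketch
{
    // Assumption: rebuilt is null when there are no insertions or deletions.
    public static void Save(string path, IReadOnlyDictionary<long, byte> modifications, byte[]? rebuilt)
    {
        if (rebuilt is null)
        {
            // Fast path: seek to each modified offset and overwrite one byte.
            using var fs = new FileStream(path, FileMode.Open, FileAccess.Write);
            foreach (var (offset, value) in modifications)
            {
                fs.Seek(offset, SeekOrigin.Begin);
                fs.WriteByte(value);
            }
            return;
        }

        // Full rebuild: write next to the target so the swap stays on one volume,
        // then replace the original in a single step (no backup file kept).
        string temp = path + ".tmp";
        File.WriteAllBytes(temp, rebuilt);
        File.Replace(temp, path, destinationBackupFileName: null);
    }
}
```

Writing the temp file in the same directory as the target matters because a replace across volumes cannot be done as a simple rename.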

Undo / Redo — Command Pattern

Every edit is a command pushed onto the undo stack:

Command         Triggered by
ModifyCommand   AddByteModified()
InsertCommand   AddByteInserted() / AddBytesInserted()
DeleteCommand   AddByteDeleted() / AddBytesDeleted()
BatchCommand    Groups multiple commands into one undo step

// Group a paste operation as a single undo step
bp.BeginBatch();
foreach (var (offset, value) in clipboard)
    bp.AddByteModified(value, offset);
bp.EndBatch();     // entire paste undone in one Ctrl+Z
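How a batch collapses into one undo step can be shown with a small command-pattern sketch. The types below (`IEditCommand`, `ModifySketch`, `BatchSketch`, `History`) are hypothetical stand-ins operating on a plain byte array; they are not the classes in WpfHexEditor.Core, and the real commands may track state differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var buffer = new byte[] { 0xAA, 0xBB, 0xCC };
var history = new History();

history.BeginBatch();
history.Execute(new ModifySketch(buffer, 0, 0x11));
history.Execute(new ModifySketch(buffer, 1, 0x22));
history.EndBatch();
Console.WriteLine(BitConverter.ToString(buffer)); // prints "11-22-CC"

history.Undo(); // a single undo reverts both edits
Console.WriteLine(BitConverter.ToString(buffer)); // prints "AA-BB-CC"

interface IEditCommand { void Do(); void Undo(); }

sealed class ModifySketch : IEditCommand
{
    private readonly byte[] _target;
    private readonly int _offset;
    private readonly byte _newValue;
    private byte _oldValue;

    public ModifySketch(byte[] target, int offset, byte newValue)
        => (_target, _offset, _newValue) = (target, offset, newValue);

    // Remember the previous byte so the edit can be reverted.
    public void Do() { _oldValue = _target[_offset]; _target[_offset] = _newValue; }
    public void Undo() => _target[_offset] = _oldValue;
}

sealed class BatchSketch : IEditCommand
{
    private readonly List<IEditCommand> _children = new();
    public void Add(IEditCommand command) => _children.Add(command);
    public void Do() { foreach (var c in _children) c.Do(); }
    // Children are undone in reverse order of execution.
    public void Undo() { for (int i = _children.Count - 1; i >= 0; i--) _children[i].Undo(); }
}

sealed class History
{
    private readonly Stack<IEditCommand> _undo = new();
    private readonly Stack<IEditCommand> _redo = new();
    private BatchSketch? _open;

    public void BeginBatch() => _open = new BatchSketch();
    public void EndBatch() { if (_open != null) { _undo.Push(_open); _open = null; } }

    public void Execute(IEditCommand command)
    {
        command.Do();
        _redo.Clear(); // a new edit invalidates the redo history
        if (_open != null) _open.Add(command); else _undo.Push(command);
    }

    public void Undo() { if (_undo.Count == 0) return; var c = _undo.Pop(); c.Undo(); _redo.Push(c); }
    public void Redo() { if (_redo.Count == 0) return; var c = _redo.Pop(); c.Do(); _undo.Push(c); }
}
```

Because the batch is itself a command, the undo stack needs no special case for grouped edits.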

ByteProvider Variants

Variant                         Backed by       Use case
ByteProvider(string filePath)   Physical file   Normal file editing
ByteProvider(Stream stream)     Stream          In-memory or network streams
ByteProvider(byte[] data)       Byte array      Unit tests, small buffers
ReadOnlyByteProvider            Any             Locked / read-only view

Memory Considerations

  • Modifications dictionary: O(m) where m = number of changed bytes
  • Insertions: O(i) segment records where i = number of inserted segments, plus the inserted bytes themselves
  • PositionMapper segment tree: O(i + d) nodes for i insertions + d deletions
  • File content is never loaded into memory — ByteProvider reads from FileStream on demand using Span<T> / ArrayPool<T> for zero-alloc reads

For very large files (10 GB+), the memory footprint of ByteProvider is proportional only to the number of edits, not the file size.
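The on-demand read pattern mentioned in the list above might look like the following sketch, which reads one window of a file into a pooled buffer instead of allocating a new array per read. `WindowReader.SumWindow` is a hypothetical helper chosen for illustration (it just sums the bytes it reads); `ArrayPool<T>.Shared`, `Span<T>`, and `Stream.Read(Span<byte>)` are standard .NET APIs.

```csharp
using System;
using System.Buffers;
using System.IO;

// Demo on a scratch file (all names are hypothetical).
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF });

using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // 0xCC + 0xDD + 0xEE = 204 + 221 + 238
    Console.WriteLine(WindowReader.SumWindow(fs, offset: 2, count: 3)); // prints 663
}
File.Delete(path);

static class WindowReader
{
    // Reads up to count bytes starting at offset and returns their sum.
    public static int SumWindow(FileStream fs, long offset, int count)
    {
        // Rent may hand back a larger array, so always slice to count.
        byte[] rented = ArrayPool<byte>.Shared.Rent(count);
        try
        {
            fs.Seek(offset, SeekOrigin.Begin);
            Span<byte> window = rented.AsSpan(0, count);

            // Read can return fewer bytes than requested, so loop until done.
            int read = 0;
            while (read < count)
            {
                int n = fs.Read(window.Slice(read));
                if (n == 0) break; // end of file
                read += n;
            }

            int sum = 0;
            foreach (byte b in window.Slice(0, read)) sum += b;
            return sum;
        }
        finally
        {
            // Always return the buffer so later reads can reuse it.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Only the requested window is ever resident in memory, which is why the footprint stays independent of file size.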

