Releases: emremy/ColQL

v0.7.0 — Adaptive Background Indexing Runtime

19 May 13:30
bfd7cb1

Overview

v0.7.0 focuses on runtime maturity, adaptive background indexing, and large sorted rebuild optimization.

This release introduces:

  • adaptive background rebuild scheduling
  • optimized sorted index apply/merge pipeline
  • reduced worker transfer/output memory
  • stabilized runtime observability
  • hardened worker runtime portability
  • improved benchmark and diagnostics coverage

The public query API remains synchronous.


Highlights

Adaptive Background Scheduling

ColQL now includes an internal runtime scheduler for dirty indexes.

Current runtime behavior:

  • equality indexes remain sync-preferred
  • small tables remain sync-preferred
  • large eligible dirty sorted indexes may rebuild in the background
  • terminal queries remain synchronous and scan-correct
  • explain() remains non-executing

This release intentionally does not expose public scheduler configuration yet.
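The policy above can be pictured as a small decision function. This is a minimal sketch only: `decideRebuild`, the `DirtyIndex` shape, the `zeroCopyEligible` flag, and the `LARGE_TABLE_ROWS` threshold are hypothetical names and values chosen for illustration, not ColQL's internal scheduler or a public API.

```typescript
// Hypothetical model of the rebuild policy described above; not ColQL's internal code.
type IndexKind = "equality" | "sorted";
type RebuildDecision = "sync" | "background";

interface DirtyIndex {
  kind: IndexKind;
  rowCount: number;
  // Assumed flag: whether column data can be shared with a worker without copying.
  zeroCopyEligible: boolean;
}

// Assumed threshold, picked only for illustration.
const LARGE_TABLE_ROWS = 1_000_000;

function decideRebuild(
  index: DirtyIndex,
  largeTableRows: number = LARGE_TABLE_ROWS
): RebuildDecision {
  if (index.kind === "equality") return "sync"; // equality indexes stay sync-preferred
  if (index.rowCount < largeTableRows) return "sync"; // small tables stay sync-preferred
  if (!index.zeroCopyEligible) return "sync"; // only zero-copy eligible indexes qualify
  return "background"; // large, eligible, dirty sorted index
}
```

Whatever the decision, the query itself would still run synchronously; the decision only determines where the rebuild work happens.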


Sorted Background Rebuild Optimization

The sorted background apply path was heavily optimized.

Changes include:

  • typed-array heap merge
  • reduced compare counts
  • reduced worker transfer/output buffers
  • deterministic ordering preservation
  • fail-closed validation retained
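To make "typed-array heap merge" and "deterministic ordering" concrete, here is a minimal sketch of a k-way merge of sorted runs of row indices. It assumes each worker returns a run already sorted by (key, row index) and that keys contain no NaN; `mergeSortedRuns` is a hypothetical name, and ColQL's actual apply path may differ.

```typescript
// Merge k runs of row indices, each already sorted by (key, rowIndex), into one run.
// The heap and cursors live in typed arrays, so the merge allocates no per-item objects.
function mergeSortedRuns(keys: Float64Array, runs: Uint32Array[]): Uint32Array {
  const total = runs.reduce((n, r) => n + r.length, 0);
  const out = new Uint32Array(total);
  const cursor = new Uint32Array(runs.length); // next unread position per run
  const heap = new Uint32Array(runs.length); // binary min-heap of run ids
  let size = 0;

  // Orders run heads by key, then by row index, so ties resolve deterministically.
  const less = (a: number, b: number): boolean => {
    const ra = runs[a][cursor[a]];
    const rb = runs[b][cursor[b]];
    const ka = keys[ra];
    const kb = keys[rb];
    return ka < kb || (ka === kb && ra < rb);
  };

  const swap = (i: number, j: number): void => {
    const t = heap[i];
    heap[i] = heap[j];
    heap[j] = t;
  };

  const siftUp = (start: number): void => {
    let i = start;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!less(heap[i], heap[parent])) break;
      swap(i, parent);
      i = parent;
    }
  };

  const siftDown = (start: number): void => {
    let i = start;
    for (;;) {
      const l = 2 * i + 1;
      const r = l + 1;
      let m = i;
      if (l < size && less(heap[l], heap[m])) m = l;
      if (r < size && less(heap[r], heap[m])) m = r;
      if (m === i) break;
      swap(i, m);
      i = m;
    }
  };

  // Seed the heap with every non-empty run.
  for (let r = 0; r < runs.length; r++) {
    if (runs[r].length > 0) {
      heap[size] = r;
      siftUp(size);
      size++;
    }
  }

  let n = 0;
  while (size > 0) {
    const run = heap[0];
    out[n++] = runs[run][cursor[run]++];
    // An exhausted run is replaced by the last heap entry before re-heapifying.
    if (cursor[run] === runs[run].length) heap[0] = heap[--size];
    siftDown(0);
  }
  return out;
}
```

Each output element costs O(log k) comparisons for k runs, which is one plausible way to reduce compare counts relative to re-sorting the concatenated output.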

10M Benchmark Improvements

Before:

  • sorted apply ≈ 7900ms

After:

  • sorted apply ≈ 600ms
  • total real-worker sorted rebuild ≈ 1.1s

Worker transfer/output:

  • ~80MB → ~40MB

Runtime Philosophy

Background indexing is intentionally narrow.

It is designed for:

  • latency isolation during large dirty sorted rebuilds

It is NOT:

  • a universal worker speedup mechanism
  • an async query engine
  • a distributed cache/database

Benchmark Summary

100k rows

  • equality: sync-preferred
  • sorted: sync-preferred

1M rows

Sorted:

  • sync query ≈ 316ms
  • real-worker total ≈ 112ms

10M rows

Equality:

  • sync query ≈ 142-165ms
  • real-worker ≈ 258-303ms
  • policy remains sync-preferred

Sorted:

  • sync query ≈ 3360-3430ms
  • real-worker total ≈ 1080-1110ms
  • rebuild ≈ 290-306ms
  • apply ≈ 608-650ms
  • fallback query ≈ 165-170ms
  • transfer/output ≈ 40MB

Worker Runtime Hardening

  • stabilized ESM/CJS worker resolution
  • lazy Node worker runtime loading
  • worker runtime smoke coverage
  • npm package/runtime validation
  • cleaned package artifact output

Observability Improvements

Expanded additive runtime diagnostics:

  • indexState
  • fallbackReason
  • backgroundRebuildScheduled
  • backgroundRebuildState
  • selectedIndex
  • scheduler decision metadata

Table-level diagnostics intentionally remain internal/debug-only for now.


Validation

Validated with:

  • test suite
  • type tests
  • worker runtime smoke tests
  • CodSpeed benchmarks
  • large background indexing benchmarks
  • npm package dry-run validation

Known Limitations

  • background indexing currently targets only large dirty sorted indexes that are SAB (SharedArrayBuffer)/zero-copy eligible
  • equality rebuilds remain sync-preferred
  • browser worker runtime is not officially supported yet
  • no public scheduler config exists yet
  • table diagnostics remain internal/debug-only

Notes

ColQL continues to evolve as a runtime-aware in-memory query engine focused on:

  • predictable performance
  • lifecycle correctness
  • explicit indexing
  • low-latency in-memory querying
  • production-oriented runtime behavior

v0.6.0 — Background Indexing Architecture

10 May 20:25
1eb57e6

ColQL v0.6.0

v0.6.0 lays the architectural groundwork for background indexing in ColQL.

This release focuses on lifecycle correctness, worker runtime infrastructure, stale-result safety, diagnostics, packaging stability, and large-table rebuild handling.

This is primarily an architecture/runtime-hardening release rather than a pure throughput release.


Highlights

Background indexing lifecycle

Added internal lifecycle management for index rebuilds:

  • fresh
  • dirty
  • queued
  • rebuilding
  • failed

Added:

  • generation/epoch tracking
  • stale-result discard
  • atomic apply/swap semantics
  • fallback query correctness
  • background rebuild diagnostics
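Generation tracking and stale-result discard can be sketched in a few lines. The state names come from the list above; the `IndexLifecycle` class and its methods are hypothetical and only illustrate the idea, not ColQL's internals.

```typescript
type IndexState = "fresh" | "dirty" | "queued" | "rebuilding" | "failed";

// Hypothetical lifecycle holder: a rebuild result is applied only if no
// mutation has bumped the generation since that rebuild started.
class IndexLifecycle {
  state: IndexState = "fresh";
  generation = 0;
  entries: Uint32Array = new Uint32Array(0);

  // Every mutation that affects the index bumps the generation.
  markDirty(): void {
    this.generation++;
    this.state = "dirty";
  }

  // Returns the generation the rebuild is based on.
  startRebuild(): number {
    this.state = "rebuilding";
    return this.generation;
  }

  // Swaps in the new entries in one step, or discards a stale result.
  applyResult(builtForGeneration: number, result: Uint32Array): boolean {
    if (builtForGeneration !== this.generation) return false; // stale: discard
    this.entries = result;
    this.state = "fresh";
    return true;
  }
}
```

While the index is not fresh, a query would fall back to a scan, which is how fallback correctness is preserved in this model.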

Worker runtime groundwork

Added:

  • bounded worker pool
  • chunk task queue
  • real worker_threads runtime integration
  • deterministic fake executor for tests
  • worker packaging/runtime smoke tests
  • ESM/CJS worker resolution validation

Background rebuild paths currently remain internal/runtime infrastructure.

Public query APIs remain synchronous.


Equality + sorted rebuild groundwork

Implemented internal background rebuild pipelines for:

  • equality indexes
  • sorted indexes

including:

  • typed/encoded worker-safe outputs
  • deterministic merge/apply paths
  • stale generation discard
  • mutation/delete invalidation handling

Benchmarks

Added new benchmark coverage for:

  • background equality rebuilds
  • background sorted rebuilds
  • worker startup/runtime overhead
  • stale-result discard
  • rebuild apply timing
  • sync vs mock-background vs real-worker rebuild paths

Important benchmark note

Current benchmarks show that worker rebuilds are not universally faster than synchronous rebuilds.

For smaller datasets, worker startup and transfer overhead may dominate.

The primary goal of the architecture is:

  • rebuild isolation
  • lifecycle correctness
  • large dirty-index handling

rather than universal speedups.


Serialization hardening

Clarified/documented:

  • serialized indexes are unsupported
  • lifecycle/job state is not serialized
  • restored tables contain data only
  • indexes must be recreated after restore

Packaging cleanup

  • removed hashed chunk-* dist artifacts
  • stabilized dist layout
  • validated worker artifacts inside npm tarballs
  • verified ESM/CJS worker startup from built output

Validation

Validated with:

  • npm test
  • npm run test:types
  • npm run build
  • npm run test:worker-runtime
  • npm run bench:codspeed
  • background indexing benchmarks
  • worker runtime benchmarks
  • npm pack dry-run

Known caveats

  • query APIs remain synchronous
  • automatic background scheduling is not enabled for normal user query paths yet
  • worker rebuild paths are still internal/runtime infrastructure
  • a documented non-blocking tsup CJS import.meta warning remains during build

v0.5.0 — Trust & Stability Hardening

08 May 23:31
6f5b911

v0.5.0 is a hardening release focused on correctness, API stability, serialization safety, and performance predictability.

This release does not expand ColQL into SQL, persistence, distributed coordination, or database-replacement territory. ColQL remains a process-local, in-memory columnar query engine for TypeScript.

Highlights

Column-scoped index invalidation

Updates now dirty only the indexes whose indexed columns actually changed.

This avoids unnecessary lazy rebuild latency after unrelated updates.

For example, updating a non-indexed column no longer dirties unrelated equality, sorted, or unique indexes.

Deletes still dirty row-position-sensitive indexes broadly for correctness.
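A minimal sketch of the invalidation rule, assuming a simple `IndexMeta` record per index (a hypothetical shape, not ColQL's internal representation) and treating every index kind as row-position-sensitive:

```typescript
interface IndexMeta {
  column: string;
  kind: "equality" | "sorted" | "unique";
  dirty: boolean;
}

// Updates dirty only the indexes whose indexed column actually changed.
function markDirtyAfterUpdate(
  indexes: IndexMeta[],
  changedColumns: ReadonlySet<string>
): void {
  for (const index of indexes) {
    if (changedColumns.has(index.column)) index.dirty = true;
  }
}

// Deletes shift row positions, so every position-based index is dirtied.
function markDirtyAfterDelete(indexes: IndexMeta[]): void {
  for (const index of indexes) index.dirty = true;
}
```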

Serialization hardening

v0.5.0 introduces stricter serialization validation and a numeric wire-format version independent from the package version.

The deserializer now rejects malformed metadata and corrupted payloads more predictably, including:

  • duplicate column metadata
  • invalid rowCount/capacity relations
  • invalid offsets and alignment
  • overlapping byte ranges
  • malformed dictionary metadata
  • invalid dictionary codes
  • corrupted numeric payloads such as NaN, Infinity, and -Infinity
  • serialized index metadata

Restore-time reconstruction failures are now wrapped as COLQL_INVALID_SERIALIZED_DATA.
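A few of the checks listed above can be sketched as follows. The `ColumnLayout` shape and `validateLayout` function are hypothetical and do not reflect ColQL's wire format; a real deserializer would raise a structured error such as COLQL_INVALID_SERIALIZED_DATA rather than return strings.

```typescript
interface ColumnLayout {
  name: string;
  byteOffset: number;
  byteLength: number;
  bytesPerElement: number;
}

// Returns a list of problems; an empty list means the layout passed these checks.
function validateLayout(
  columns: ColumnLayout[],
  rowCount: number,
  capacity: number,
  bufferLength: number
): string[] {
  const errors: string[] = [];
  const countsValid =
    Number.isInteger(rowCount) && Number.isInteger(capacity) &&
    rowCount >= 0 && rowCount <= capacity;
  if (!countsValid) errors.push("invalid rowCount/capacity relation");

  const seen = new Set<string>();
  for (const c of columns) {
    if (seen.has(c.name)) errors.push(`duplicate column: ${c.name}`);
    seen.add(c.name);
    if (c.byteOffset < 0 || c.byteOffset + c.byteLength > bufferLength) {
      errors.push(`out of bounds: ${c.name}`);
    }
    if (c.byteOffset % c.bytesPerElement !== 0) {
      errors.push(`misaligned offset: ${c.name}`);
    }
  }

  // After sorting by offset, any range starting before the previous one ends overlaps it.
  const sorted = [...columns].sort((a, b) => a.byteOffset - b.byteOffset);
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    if (sorted[i].byteOffset < prev.byteOffset + prev.byteLength) {
      errors.push(`overlapping ranges: ${prev.name}, ${sorted[i].name}`);
    }
  }
  return errors;
}
```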

Serialization compatibility note

Serialized snapshots created with v0.4.x are not guaranteed to be compatible with v0.5.0.

Until v1.0.0, ColQL snapshots are intended for process-local in-memory state transfer, not durable long-term storage.

Serialized indexes remain unsupported and are rejected during restore.

Type/API stability gate

v0.5.0 adds an explicit type-test gate.

Type-level behavior for query, mutation, index, explain, and onQuery APIs is now checked as part of CI.

Deterministic correctness fuzzer

This release adds a deterministic mutation/index/serialization fuzzer.

It compares ColQL behavior against a JS array oracle across long operation sequences involving:

  • insert
  • update
  • delete
  • updateBy
  • deleteBy
  • serialize
  • restore
  • reindex
  • query parity checks

This improves confidence in index correctness after mutations and restores.
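The oracle technique can be illustrated with a self-contained sketch. `TinyColumn` is a stand-in subject written for this example, not ColQL code, and the seeded generator is a small mulberry32-style PRNG; the point is only that a fixed seed replays the same operation sequence and every step is compared against a plain JS array.

```typescript
// Small seeded PRNG so every run with the same seed replays the same sequence.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Stand-in subject under test: a growable typed-array column with physical deletes.
class TinyColumn {
  private data = new Int32Array(8);
  length = 0;

  insert(value: number): void {
    if (this.length === this.data.length) {
      const grown = new Int32Array(this.data.length * 2);
      grown.set(this.data);
      this.data = grown;
    }
    this.data[this.length++] = value;
  }

  update(row: number, value: number): void {
    this.data[row] = value;
  }

  delete(row: number): void {
    this.data.copyWithin(row, row + 1, this.length); // shift later rows left
    this.length--;
  }

  toArray(): number[] {
    return Array.from(this.data.subarray(0, this.length));
  }
}

// Applies the same random operations to the subject and to an array oracle,
// checking parity after every step.
function runFuzz(seed: number, steps: number): { ok: boolean; finalLength: number } {
  const rand = seededRandom(seed);
  const subject = new TinyColumn();
  const oracle: number[] = [];
  for (let s = 0; s < steps; s++) {
    const op = Math.floor(rand() * 3);
    const value = Math.floor(rand() * 1000);
    if (op === 0 || oracle.length === 0) {
      subject.insert(value);
      oracle.push(value);
    } else {
      const row = Math.floor(rand() * oracle.length);
      if (op === 1) {
        subject.update(row, value);
        oracle[row] = value;
      } else {
        subject.delete(row);
        oracle.splice(row, 1);
      }
    }
    const actual = subject.toArray();
    const same = actual.length === oracle.length && actual.every((v, k) => v === oracle[k]);
    if (!same) return { ok: false, finalLength: oracle.length };
  }
  return { ok: true, finalLength: oracle.length };
}
```

Because the sequence is a pure function of the seed, a failing seed can be replayed exactly while debugging.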

Observability additions

QueryInfo now includes additional optional diagnostics fields.

QueryExplainPlan also includes selectedIndex.

These changes are additive and preserve existing onQuery compatibility.

explain() remains non-executing and does not rebuild dirty indexes.

Benchmarks and release protocol

v0.5.0 adds CodSpeed benchmark coverage for no-rebuild vs lazy-rebuild paths and updates the release benchmark protocol.

Validation

The release was validated with:

  • npm test — passed, 69 files / 238 tests
  • npm run test:types — passed
  • npm run build — passed
  • npm run bench:codspeed — passed
  • npm pack --dry-run with a clean temporary cache — passed

Tarball contents were confirmed to include only the expected package artifacts.

ColQL remains process-local, in-memory, non-durable, non-SQL, and not a database replacement.

v0.4.0 – Developer Confidence Release

03 May 23:07
e9cc0eb

🚀 ColQL v0.4.0 – Developer Confidence Release

This release focuses on making ColQL predictable, inspectable, and production-ready (within its scope).


✨ Key Features

🔍 Query Explain API

  • query.explain() provides non-executing diagnostics
  • Understand index usage, scan behavior, and predicate planning
  • Includes stable reasonCode system

🧪 Confidence Scenario Suite

  • Real-world API-like test flows under tests/scenarios/
  • Covers:
    • filtering
    • projections
    • mutations
    • dirty index lifecycle
    • serialization & restore
  • Ensures correctness beyond unit tests

📊 Real-world Benchmark

  • Session analytics workload benchmark
  • Demonstrates:
    • index vs full scan behavior
    • dirty index rebuild cost
    • repeated query performance

Run:

npm run benchmark:session-analytics
ROWS=100000 npm run benchmark:session-analytics
npm run benchmark:session-analytics -- --json

🧩 Flagship Example

  • examples/session-analytics.ts
  • Shows real usage pattern:
    • dataset generation
    • indexing
    • querying
    • explain()
    • mutation lifecycle

📚 Documentation Improvements

  • Decision guide (ColQL vs Array vs SQLite vs DuckDB)
  • Clear “When NOT to use ColQL”
  • Serialization and index lifecycle explained
  • Benchmark usage documented

🎯 Philosophy

This release does not add new query features.

Instead, it focuses on:

  • trust
  • debuggability
  • observability
  • real-world usability

⚠️ Scope Clarification

ColQL is:

  • in-memory
  • process-local
  • explicitly indexed

ColQL is NOT:

  • a database
  • distributed
  • a persistence layer

🔮 What’s next?

v0.5.0 will focus on:

  • API stabilization
  • type safety improvements
  • further polish toward v1.0.0

v0.3.0: Unique indexes, by-key helpers, and JS Array comparison benchmarks

03 May 00:10

v0.3.0

Overview

v0.3.0 focuses on making ColQL more practical for real-world backend usage by introducing data integrity guarantees, stable key-based operations, and a clear comparison against standard JS array usage.

This release strengthens correctness while improving performance visibility and developer ergonomics.


✨ Highlights

🔐 Unique Indexes

ColQL now supports unique indexes that enforce data integrity:

users.createUniqueIndex("id");
users.createUniqueIndex("email");

Guarantees:

  • Duplicate inserts are rejected
  • insertMany is all-or-nothing
  • Updates cannot violate uniqueness
  • Deleted keys can be reused
  • Unique lookups never return stale row positions

Supported:

  • numeric columns
  • dictionary columns

Not supported:

  • boolean columns
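The guarantees above can be modeled with a map from key to row position. The sketch below is a simplified stand-in (`UniqueKeyIndex` is a hypothetical class, not ColQL's implementation) that shows all-or-nothing batch inserts, key reuse after delete, and row positions that stay current after a physical delete.

```typescript
class UniqueKeyIndex<K extends string | number> {
  private rowByKey = new Map<K, number>();
  private keys: K[] = [];

  get size(): number {
    return this.keys.length;
  }

  // Validates the whole batch first, so a duplicate rejects every row in it.
  insertMany(batch: K[]): void {
    const pending = new Set<K>();
    for (const key of batch) {
      if (this.rowByKey.has(key) || pending.has(key)) {
        throw new Error(`duplicate key: ${String(key)}`);
      }
      pending.add(key);
    }
    for (const key of batch) {
      this.rowByKey.set(key, this.keys.length);
      this.keys.push(key);
    }
  }

  find(key: K): number | undefined {
    return this.rowByKey.get(key);
  }

  delete(key: K): boolean {
    const row = this.rowByKey.get(key);
    if (row === undefined) return false;
    this.keys.splice(row, 1);
    this.rowByKey.delete(key);
    // Later rows shifted left by one, so their recorded positions are refreshed.
    for (let i = row; i < this.keys.length; i++) this.rowByKey.set(this.keys[i], i);
    return true;
  }
}
```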

🔑 By-Key Helpers

Work with stable identifiers instead of rowIndex:

users.findBy("id", 123);
users.updateBy("email", "a@b.com", { active: false });
users.deleteBy("id", 123);

  • Require a unique index
  • Provide predictable single-row semantics
  • Avoid reliance on unstable internal row positions

🔄 JS Array Migration Helpers

Easier transition from JS arrays:

const users = fromRows(schema, rows);

users.firstWhere({ id: 1 });
users.countWhere({ active: true });
users.exists({ country: "TR" });

  • Thin wrappers over the query engine
  • Fully typed
  • Preserve lazy execution behavior

📊 JS Array vs ColQL Benchmarks

Added a comprehensive benchmark suite comparing:

  • JS object arrays
  • ColQL scan
  • equality index
  • sorted index
  • unique index

Across:

  • 1k / 100k / 1M rows
  • memory usage
  • filtering
  • projection + limit
  • lookup
  • range queries
  • mutations

Key observations:

  • ~6–7x lower memory usage at 1M rows
  • sub-millisecond indexed lookups
  • projection + limit significantly faster
  • JS arrays remain faster for simple full scans
  • filter(fn) is a full-scan escape hatch and can be expensive

Benchmarks are local reference results, not universal guarantees.


⚡ Performance Improvements

  • Optimized structured scan hot path
  • Reduced 1M-row scan latency (~40ms → ~20ms)
  • Bulk compaction for deleteMany
  • Avoid unnecessary unique index rebuilds
  • More accurate benchmark measurement (setup excluded)

🧪 Expanded Test Coverage

  • Real-world scenarios:
    • user directory (id/email uniqueness)
    • product catalog (SKU + range queries)
    • session/token registry
    • feature flags
  • Mixed operation sequences
  • Edge cases for unique indexes
  • JS array parity validation

🧠 Design Notes

  • Equality and sorted indexes are performance-only
  • Unique indexes enforce data integrity
  • Row indexes are not stable identifiers
  • Unique indexes are not serialized and must be recreated
  • Broad mutations may be slower than raw JS arrays due to safety guarantees

📌 When to Use ColQL

Use ColQL when:

  • Data is already in-process
  • Memory efficiency matters
  • Indexed lookups or range queries are required
  • You want structured querying without a database roundtrip

JS arrays may be better for:

  • Small datasets
  • Simple scans
  • Minimal logic

❗ Breaking Changes

None.


Notes

This release focuses on correctness, clarity, and real-world usability rather than adding heavy features or external integrations.

v0.2.0 — Query Ergonomics, Performance Improvements, and Real-World Validation

30 Apr 17:27
08f0864

🚀 v0.2.0 Release

This release focuses on developer experience, usability, and real-world validation without changing the core architecture.

ColQL is now validated with a real backend setup and large dataset scenarios.


✨ Highlights

🔍 Query Ergonomics

  • Added object-based queries: table.where({ age: { gt: 25 }, country: "TR" })
  • Strong TypeScript typing with operator safety per column type

🔁 Mutations

  • Added table-level wrappers:
    • updateMany(predicate, partial)
    • deleteMany(predicate)
  • Existing APIs preserved (no breaking changes)

🧠 Query Execution

  • Added filter(fn) as a full-scan escape hatch:
    • runs after structured predicates
    • does not use indexes
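The ordering described above (structured predicates first, callback second) can be shown with a plain-array sketch. `runQuery` and the `User` shape are hypothetical and stand in for the engine; the callback counter only makes the ordering observable.

```typescript
interface User {
  name: string;
  age: number;
  country: string;
}

// The callback only sees rows that already passed the structured predicate.
function runQuery(
  rows: User[],
  structured: (row: User) => boolean,
  callback: (row: User) => boolean
): { rows: User[]; callbackCalls: number } {
  const out: User[] = [];
  let callbackCalls = 0;
  for (const row of rows) {
    if (!structured(row)) continue; // structured predicates narrow the candidates
    callbackCalls++;
    if (callback(row)) out.push(row); // opaque function: evaluated row by row
  }
  return { rows: out, callbackCalls };
}
```

The more rows the structured predicates eliminate, the fewer times the callback runs, which is why the callback is best kept for conditions that cannot be expressed structurally.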

📊 Observability

  • Added optional onQuery hook for lightweight query instrumentation

⚡ Performance

  • Removed a performance regression introduced during development
  • Optimized query execution hot paths
  • Improved scan and index performance

1M rows (Fastify example)

  • indexed query: ~2ms
  • range query: ~25ms
  • full scan: ~25ms
  • filter(fn): ~200ms
  • 50 concurrent requests: ~3.5ms avg

🧪 Real-World Validation

Added a Fastify backend example with:

  • 1M in-memory dataset
  • full CRUD + query API
  • cold start + mutation validation
  • index rebuild verification
  • latency and stress tests
  • memory usage checks

📚 Documentation

  • Updated docs for:
    • object-based queries
    • mutation wrappers
    • filter(fn) behavior
    • TypeScript typing
  • Ensured index guarantees and planner behavior are clearly defined

🔒 Guarantees

  • No breaking changes
  • Index usage does not affect correctness
  • Planner decisions affect performance only
  • Dirty indexes are rebuilt before use
  • rowIndex is not a stable identifier

📦 Notes

  • Indexes are not serialized (must be recreated after restore)
  • filter(fn) is intentionally full-scan and should be used selectively
  • ColQL remains process-local (not distributed)

ColQL v0.1.0 — From Query Library to Full In-Memory Engine

28 Apr 22:30
9aca751

🚀 ColQL v0.1.0

ColQL is now a memory-efficient, indexed, and mutable in-memory columnar query engine for TypeScript.

This release marks a major milestone: ColQL evolves from a simple query utility into a complete execution engine with storage, indexing, and mutation support.


✨ Highlights

🧱 Columnar Storage (Chunked)

  • Compact TypedArray-based storage
  • Chunked layout for efficient mutations
  • Predictable memory usage

🔍 Indexing & Query Engine

  • Equality indexes (=, in)
  • Sorted indexes for range queries (>, <, >=, <=)
  • Cost-aware query planner
  • Predicate reordering
  • Projection pushdown
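As a rough illustration of how a sorted index can serve range predicates, the sketch below keeps a permutation of row indices ordered by value and answers a range with two binary searches. The function names are hypothetical and this is not ColQL's implementation.

```typescript
// Row indices ordered by (value, rowIndex); the column itself is left untouched.
function buildSortedIndex(values: Float64Array): Uint32Array {
  const order = new Uint32Array(values.length);
  for (let i = 0; i < order.length; i++) order[i] = i;
  order.sort((a, b) => values[a] - values[b] || a - b);
  return order;
}

// First position in `order` whose value is >= target.
function lowerBound(values: Float64Array, order: Uint32Array, target: number): number {
  let lo = 0;
  let hi = order.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (values[order[mid]] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Rows with min <= value < maxExclusive, returned in index order.
function rangeQuery(
  values: Float64Array,
  order: Uint32Array,
  min: number,
  maxExclusive: number
): Uint32Array {
  const start = lowerBound(values, order, min);
  const end = lowerBound(values, order, maxExclusive);
  return order.subarray(start, end);
}
```

The result is a view into the index, so no row data is copied until the caller materializes it.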

🗑️ Physical Deletes

users.delete(rowIndex);
users.where(...).delete();
users.deleteWhere(...);

  • Rows are physically removed (no tombstones)
  • Logical row order is preserved
  • Row indexes may shift after deletion

✏️ Updates & Predicate Mutations

users.update(rowIndex, { age: 30 });

users.where("status", "=", "active").update({ age: 25 });

users.updateWhere("age", ">", 18, {
  status: "active"
});

  • Partial updates
  • Predicate-based update/delete
  • All-or-nothing mutation behavior
  • Returns { affectedRows: number }

⚡ Index + Mutation Behavior

  • Indexes are marked dirty after mutations
  • Rebuilt lazily on next indexed query
  • First indexed query may pay rebuild cost
  • Subsequent queries remain fast
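The dirty-then-lazy-rebuild behavior can be modeled in a few lines. `LazyEqualityIndex` is a hypothetical stand-in, and `rebuildCount` exists only to make the rebuild cost observable in the example.

```typescript
class LazyEqualityIndex {
  private column: number[];
  private buckets = new Map<number, number[]>();
  private dirty = true;
  rebuildCount = 0;

  constructor(column: number[]) {
    this.column = column;
  }

  // Called after any mutation; no rebuild work happens here.
  markDirty(): void {
    this.dirty = true;
  }

  // The first lookup after a mutation pays the rebuild; later lookups do not.
  lookup(value: number): number[] {
    if (this.dirty) this.rebuild();
    return this.buckets.get(value) ?? [];
  }

  private rebuild(): void {
    this.buckets = new Map();
    this.column.forEach((value, row) => {
      const rows = this.buckets.get(value);
      if (rows) rows.push(row);
      else this.buckets.set(value, [row]);
    });
    this.dirty = false;
    this.rebuildCount++;
  }
}
```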

🛡️ Validation & Errors

  • Runtime validation for inserts, updates, and queries
  • Structured errors via ColQLError
  • Stable error codes

💾 Serialization

  • Tables can be serialized/deserialized
  • Indexes are not serialized (rebuild required)

⚡ Performance (example)

250,000 rows

delete first row:       ~0.8ms
delete middle row:      ~0.05ms
delete last row:        ~0.04ms
delete 1k random rows:  ~825ms

first indexed query:    ~30ms
subsequent query:       ~0.07ms

⚠️ Important Notes

  • rowIndex is not a stable identifier
  • Use an explicit id column for identity
  • toArray() materializes results (allocates memory)
  • Indexes increase memory usage (expected tradeoff)

📚 Documentation

Full documentation is available in the repository:

👉 docs/doc


🎯 Summary

ColQL v0.1.0 provides:

  • Typed schema-based tables
  • Columnar in-memory storage
  • Explicit indexes with automatic, cost-aware query planning
  • Safe and predictable mutations
  • Strong runtime validation
  • Clear performance and memory model

➡️ What’s next?

Future work may include:

  • batch operations
  • advanced indexing strategies
  • additional query capabilities

ColQL v0.0.6 — Core Engine Complete (Indexing + Query + Physical Delete)

28 Apr 19:38
1399aa3

🚀 ColQL v0.0.6

This release completes ColQL’s core execution engine by introducing indexing, query optimizations, and physical row deletion.

ColQL is now a fully capable in-memory columnar query engine with efficient storage, indexing, and mutation support.


🔍 Indexing & Query Engine

  • Equality indexes (=, in)
  • Sorted indexes for range queries (>, <, >=, <=)
  • Cost-aware query planner
  • Predicate reordering
  • Projection pushdown
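For the equality case, a minimal sketch of the idea is a map from value to the rows that hold it, which answers both `=` and `in` without scanning. The function names are hypothetical and the structure is a simplification, not ColQL's internal layout.

```typescript
// Maps each distinct value to the row positions where it occurs.
function buildEqualityIndex(column: readonly string[]): Map<string, number[]> {
  const index = new Map<string, number[]>();
  column.forEach((value, row) => {
    const rows = index.get(value);
    if (rows) rows.push(row);
    else index.set(value, [row]);
  });
  return index;
}

// Rows where column = value.
function lookupEq(index: Map<string, number[]>, value: string): number[] {
  return index.get(value) ?? [];
}

// Rows where column is any of `values`, returned in logical row order.
function lookupIn(index: Map<string, number[]>, values: readonly string[]): number[] {
  const rows: number[] = [];
  for (const value of Array.from(new Set(values))) {
    rows.push(...(index.get(value) ?? []));
  }
  return rows.sort((a, b) => a - b);
}
```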

⚡ Query Engine Improvements

  • Lazy execution
  • Early termination (limit)
  • Reduced intermediate allocations
  • Optimized scan paths

🧱 Storage Upgrade

ColQL now uses chunked columnar storage internally:

  • Replaces single-buffer column layout
  • Reduces memory movement for mutations
  • Preserves compact TypedArray-based storage
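Why chunking reduces memory movement can be seen in a small model: a delete shifts elements only inside the chunk that holds the row, instead of shifting the whole column. `ChunkedColumn`, its chunk size, and its variable-length chunks are assumptions made for this sketch; ColQL's actual chunk layout may differ.

```typescript
interface Chunk {
  data: Int32Array;
  length: number; // used slots in this chunk
}

class ChunkedColumn {
  private chunks: Chunk[] = [];
  private chunkSize: number;

  constructor(chunkSize = 4) {
    this.chunkSize = chunkSize;
  }

  get length(): number {
    return this.chunks.reduce((n, c) => n + c.length, 0);
  }

  push(value: number): void {
    let last: Chunk | undefined = this.chunks[this.chunks.length - 1];
    if (last === undefined || last.length === this.chunkSize) {
      last = { data: new Int32Array(this.chunkSize), length: 0 };
      this.chunks.push(last);
    }
    last.data[last.length++] = value;
  }

  // Translates a logical row into (chunk index, offset inside that chunk).
  private locate(row: number): [number, number] {
    let remaining = row;
    if (row >= 0) {
      for (let c = 0; c < this.chunks.length; c++) {
        if (remaining < this.chunks[c].length) return [c, remaining];
        remaining -= this.chunks[c].length;
      }
    }
    throw new RangeError(`row out of range: ${row}`);
  }

  get(row: number): number {
    const [c, offset] = this.locate(row);
    return this.chunks[c].data[offset];
  }

  // Physical delete: only the owning chunk moves data; logical order is preserved.
  delete(row: number): void {
    const [c, offset] = this.locate(row);
    const chunk = this.chunks[c];
    chunk.data.copyWithin(offset, offset + 1, chunk.length);
    chunk.length--;
    if (chunk.length === 0) this.chunks.splice(c, 1);
  }

  toArray(): number[] {
    const out: number[] = [];
    for (const c of this.chunks) {
      for (let i = 0; i < c.length; i++) out.push(c.data[i]);
    }
    return out;
  }
}
```

In this model the rows after a deleted row move down by one logical position, which matches the note that row indexes may change after deletion.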

🗑️ Physical Row Deletion

users.delete(rowIndex);

  • Rows are physically removed
  • No tombstones or compaction required
  • Logical row order is preserved
  • Row indexes after deletion may change

📊 Index Behavior After Delete

  • Indexes are marked dirty after delete
  • Rebuilt lazily on the next indexed query
  • First indexed query may be slower due to rebuild
  • Subsequent queries return to normal performance

⚡ Performance

250,000 rows

delete first row:       ~0.8ms
delete middle row:      ~0.05ms
delete last row:        ~0.04ms
delete 1k random rows:  ~825ms

first indexed query:    ~30ms
subsequent query:       ~0.07ms

💾 Memory Model

  • Columnar storage remains compact
  • Indexes increase memory usage (expected)
  • Query materialization (toArray) is temporary
  • Memory returns close to baseline after dropping indexes

⚠️ Important Notes

  • Row indexes are not stable identifiers
  • Use an explicit id column for stable identity
  • No runtime dependencies
  • No tombstones
  • No compaction step

🔧 Internal Improvements

  • Chunked storage replaces single-buffer columns
  • Delete benchmark and memory attribution improvements
  • Extended test coverage

ColQL v0.0.5 — Runtime validation and cost-aware indexing

28 Apr 12:16
1d7125a

ColQL now includes runtime validation, structured errors, and a cost-aware indexing system.

  • safer data handling (no silent corruption)
  • explicit equality indexes
  • selectivity-aware planner
  • large speedups for selective queries

v0.0.4

27 Apr 22:59
