feat(rocksdb): migrate SQLite indexing to RocksDB #64

Merged: rayandrew merged 1 commit into llnl:develop from rayandrew:feat/rocksdb on Apr 7, 2026

@rayandrew
Collaborator

Migrate SQLite Indexing to RocksDB

Summary

This PR migrates DFTracer indexing and provenance storage from SQLite sidecars to RocksDB-backed stores.

It also includes the follow-up correctness and recovery work needed to make the RocksDB path production-safe:

  • fixes transaction atomicity around file rebuilds
  • adds rollback-on-failure behavior for async transactional writes
  • checks iterator status() after prefix scans and covers that path with tests
  • hardens cache state handling in gzip/tar indexers
  • cleans up failed DB::Open() paths
  • improves RocksDB manager lifecycle coverage
  • captures executor context explicitly in RocksDB awaitables

Key Changes

Transaction correctness

  • IndexDatabase::delete_file_data() now respects active transaction batches for all deletes.
  • IndexDatabase::get_or_create_file_info() no longer mixes immediate deletes with batched replacement writes.
  • Added rollback_transaction() to index/provenance database wrappers.
  • Added TransactionScope RAII helper for rollback-on-exception behavior in:
    • index builder persistence
    • gzip index writes
    • tar index writes
    • provenance writes
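
The rollback-on-exception pattern listed above could look roughly like the following. This is a minimal sketch, not the code in this PR: FakeIndexDatabase and its begin_transaction(), put(), and commit_transaction() methods are hypothetical stand-ins, and only the names TransactionScope and rollback_transaction() come from the description.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the index/provenance database wrappers.
// Writes made inside a transaction are staged and only become visible on commit.
class FakeIndexDatabase {
public:
    void begin_transaction() { batch_.clear(); active_ = true; }
    void put(std::string key) {
        if (active_) batch_.push_back(std::move(key));
        else committed_.push_back(std::move(key));
    }
    void commit_transaction() {
        for (auto& key : batch_) committed_.push_back(std::move(key));
        batch_.clear();
        active_ = false;
    }
    void rollback_transaction() noexcept { batch_.clear(); active_ = false; }
    const std::vector<std::string>& committed() const { return committed_; }
    bool in_transaction() const { return active_; }

private:
    std::vector<std::string> batch_;
    std::vector<std::string> committed_;
    bool active_ = false;
};

// RAII guard: the destructor rolls back unless commit() completed.
template <typename Db>
class TransactionScope {
public:
    explicit TransactionScope(Db& db) : db_(db) { db_.begin_transaction(); }
    TransactionScope(const TransactionScope&) = delete;
    TransactionScope& operator=(const TransactionScope&) = delete;
    ~TransactionScope() {
        if (!done_) db_.rollback_transaction();
    }
    void commit() {
        assert(!done_ && "commit() called twice");
        db_.commit_transaction();
        done_ = true;  // only reached if the commit did not throw
    }

private:
    Db& db_;
    bool done_ = false;
};

// Returns true if both records were committed, false if the batch was rolled back.
inline bool write_file_records(FakeIndexDatabase& db, bool fail_midway) {
    try {
        TransactionScope<FakeIndexDatabase> scope(db);
        db.put("file/1/info");
        if (fail_midway) throw std::runtime_error("simulated write failure");
        db.put("file/1/checkpoint/0");
        scope.commit();
        return true;
    } catch (const std::runtime_error&) {
        return false;  // scope's destructor has already rolled back
    }
}
```

Because the guard rolls back in its destructor, every early return or exception between construction and commit() leaves the store without a partial batch.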

Iterator error handling

  • Prefix scans in IndexDatabase and ProvenanceDatabase now check iterator status() after iteration.
  • Extracted the scan loop into a shared internal helper for consistent behavior.
  • Added direct test coverage for non-OK iterator status handling.
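
As an illustration of why the post-loop status() check matters, here is a hedged sketch of a shared scan helper. It assumes an iterator exposing the Seek()/Valid()/Next()/key()/value()/status() subset of rocksdb::Iterator; FakeIterator, FakeStatus, and this scan_prefix signature are hypothetical and use std::string_view where RocksDB uses Slice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical status type with the one method the helper needs.
struct FakeStatus {
    bool ok_flag = true;
    bool ok() const { return ok_flag; }
};

// Hypothetical iterator over sorted rows that can simulate an I/O error
// when it reaches position `fail_at`.
class FakeIterator {
public:
    static constexpr std::size_t kNever = static_cast<std::size_t>(-1);

    FakeIterator(std::vector<std::pair<std::string, std::string>> rows,
                 std::size_t fail_at)
        : rows_(std::move(rows)), fail_at_(fail_at) {}

    void Seek(std::string_view target) {
        pos_ = 0;
        while (pos_ < rows_.size() && std::string_view(rows_[pos_].first) < target) {
            ++pos_;
        }
        check_failure();
    }
    // Like a real iterator, an error makes Valid() false, which is
    // indistinguishable from "end of data" unless status() is checked.
    bool Valid() const { return !failed_ && pos_ < rows_.size(); }
    void Next() { ++pos_; check_failure(); }
    std::string_view key() const { assert(Valid()); return rows_[pos_].first; }
    std::string_view value() const { assert(Valid()); return rows_[pos_].second; }
    FakeStatus status() const { return FakeStatus{!failed_}; }

private:
    void check_failure() { if (pos_ == fail_at_) failed_ = true; }

    std::vector<std::pair<std::string, std::string>> rows_;
    std::size_t fail_at_;
    std::size_t pos_ = 0;
    bool failed_ = false;
};

// Visits every row whose key starts with `prefix`.
// Returns false if the iterator stopped because of an error rather than
// because the prefix range was exhausted.
template <typename Iterator, typename Fn>
bool scan_prefix(Iterator& it, std::string_view prefix, Fn&& on_row) {
    for (it.Seek(prefix); it.Valid(); it.Next()) {
        std::string_view key = it.key();
        if (key.substr(0, prefix.size()) != prefix) break;
        on_row(key, it.value());
    }
    return it.status().ok();
}
```

Without the final status check, a scan that failed partway through would look like a shorter but successful result set.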

Cache/state hardening

  • GzipIndexer no longer uses 0 as an implicit “not cached” sentinel for metadata fields.
  • TarIndexer cache fields now use explicit state and synchronization instead of unsynchronized primitive members.
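
The two cache changes above can be illustrated together. The sketch below is an assumption about the general shape, not the actual GzipIndexer or TarIndexer code: MetadataCache and its methods are hypothetical names. It uses std::optional so that a legitimately stored 0 is distinguishable from "not cached", and a mutex so concurrent readers do not race on the field.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <optional>

// Hypothetical cache for one metadata field (for example, a line count).
class MetadataCache {
public:
    // Returns the cached value, calling `load` only if nothing is cached yet.
    // The loader runs under the lock, which keeps the sketch simple but
    // serializes concurrent first-time lookups.
    template <typename Loader>
    std::uint64_t num_lines(Loader&& load) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!num_lines_.has_value()) {
            num_lines_ = static_cast<std::uint64_t>(load());
        }
        assert(num_lines_.has_value());
        return *num_lines_;
    }

    bool is_cached() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return num_lines_.has_value();
    }

    void invalidate() {
        std::lock_guard<std::mutex> lock(mutex_);
        num_lines_.reset();
    }

private:
    mutable std::mutex mutex_;
    std::optional<std::uint64_t> num_lines_;  // empty means "not cached"
};
```

With a 0 sentinel, an empty file would trigger a fresh lookup on every call; with the optional, the value 0 is cached like any other.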

RocksDB/runtime cleanup

  • Failed RocksDatabase::open() now cleans up partially created handles.
  • RocksDB awaitables capture the executor context at creation time and resume consistently on that captured executor when available.
  • RocksDatabase now uses immutable internal read/write options helpers rather than mutable per-instance options state.
  • RocksDBManager lifecycle coverage was expanded for reset/shutdown/upgrade semantics.
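
For the failed-open cleanup in the first bullet, one common approach is to hold every intermediate handle in an owning smart pointer until the open has fully succeeded. The sketch below shows that idea only; FakeHandle, FakeDatabase, and the fail_at parameter are hypothetical, and the real RocksDatabase::open() may be structured differently.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Counts handles that are currently alive, so leaks are observable.
static int g_live_handles = 0;

// Hypothetical stand-in for a resource created during open
// (for example, a column family handle).
struct FakeHandle {
    FakeHandle() { ++g_live_handles; }
    ~FakeHandle() {
        assert(g_live_handles > 0);
        --g_live_handles;
    }
    FakeHandle(const FakeHandle&) = delete;
    FakeHandle& operator=(const FakeHandle&) = delete;
};

class FakeDatabase {
public:
    // Creates `count` handles. If `fail_at` is in [0, count), creation of
    // that handle is treated as a failure and nullptr is returned.
    static std::unique_ptr<FakeDatabase> open(int count, int fail_at) {
        std::vector<std::unique_ptr<FakeHandle>> handles;
        for (int i = 0; i < count; ++i) {
            if (i == fail_at) {
                // Early return: `handles` goes out of scope here and
                // destroys every handle created so far.
                return nullptr;
            }
            handles.push_back(std::make_unique<FakeHandle>());
        }
        return std::unique_ptr<FakeDatabase>(new FakeDatabase(std::move(handles)));
    }

private:
    explicit FakeDatabase(std::vector<std::unique_ptr<FakeHandle>> handles)
        : handles_(std::move(handles)) {}

    std::vector<std::unique_ptr<FakeHandle>> handles_;
};
```

Ownership transfers to the database object only on the success path, so no failure branch needs its own cleanup code.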

Build/test cleanup

  • Fixed RocksDB static/shared import handling in CMake.
  • Vendored CPM.cmake to avoid bootstrap download failures in CI.
  • Added/expanded targeted RocksDB/indexer tests:
    • rollback behavior
    • iterator error path handling
    • manager reset/shutdown/upgrade behavior

Testing

Passed locally:

  • full C++ test suite
  • Ubuntu 22.04 Docker test run

Focused regression checks included:

  • utilities/indexer/test_rocksdb_storage
  • utilities/indexer/test_scan_prefix
  • utilities/indexer/test_index_database
  • utilities/indexer/test_provenance_database
  • utilities/indexer/test_index_builder
  • binaries/test_dftracer_info
  • binaries/test_dftracer_index
  • binaries/test_dftracer_organize
  • binaries/test_dftracer_tar

Notes

  • This PR is the SQLite-to-RocksDB migration branch, with the associated correctness, recovery, and CI hardening needed to stabilize the new backend.
  • An attempted follow-up refactor of RocksDBManager::get_or_open() and some async capture rewrites was explicitly discarded because it caused broad runtime regressions. Those changes are not part of this PR.
  • This branch is intended to be squashed before merge.

Copilot AI review requested due to automatic review settings April 6, 2026 06:01

Copilot AI left a comment

Pull request overview

Migrates DFTracer indexing/provenance storage from SQLite sidecars to a RocksDB-backed .dftindex store, updating APIs, utilities, tests, build tooling, and docs to match the new storage model.

Changes:

  • Replaced .idx/.pidx sidecar concepts with root-local .dftindex store semantics across C/C++/Python APIs and tests.
  • Introduced RocksDB core utilities (async helpers, key codec, DB manager) and updated pipeline/executor “db pool” plumbing.
  • Hardened I/O + runtime behavior (ScopedFd RAII, iterator/status handling helpers, rpath/test runner adjustments, docs/CI updates).
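
The ScopedFd RAII wrapper mentioned in the last bullet is described only as closing the descriptor on all paths. A minimal POSIX sketch of such a wrapper follows; the member names (get, valid, release, reset) are assumptions and may not match the class added in this PR.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>
#include <utility>

// Move-only owner of a POSIX file descriptor; -1 means "no descriptor".
class ScopedFd {
public:
    ScopedFd() = default;
    explicit ScopedFd(int fd) noexcept : fd_(fd) {}
    ScopedFd(const ScopedFd&) = delete;
    ScopedFd& operator=(const ScopedFd&) = delete;
    ScopedFd(ScopedFd&& other) noexcept : fd_(other.release()) {}
    ScopedFd& operator=(ScopedFd&& other) noexcept {
        if (this != &other) reset(other.release());
        return *this;
    }
    ~ScopedFd() { reset(); }

    int get() const noexcept { return fd_; }
    bool valid() const noexcept { return fd_ >= 0; }

    // Gives up ownership without closing and returns the raw descriptor.
    int release() noexcept { return std::exchange(fd_, -1); }

    // Closes the currently owned descriptor (if any) and adopts `fd`.
    void reset(int fd = -1) noexcept {
        // Adopting the descriptor we already own would close it underneath us.
        assert(fd < 0 || fd != fd_);
        if (fd_ >= 0) ::close(fd_);
        fd_ = fd;
    }

private:
    int fd_ = -1;
};
```

A caller would wrap the result of open() immediately, for example `ScopedFd fd(::open(path, O_RDONLY));`, so that early returns and exceptions cannot leak the descriptor.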

Reviewed changes

Copilot reviewed 250 out of 281 changed files in this pull request and generated 4 comments.

File Description
tests/reader/test_reader_stream.cpp Updates reader API usage to get_index_path()
tests/reader/test_reader_formats.cpp Adjusts indexer expectations for RocksDB store lifecycle (exists/need_rebuild)
tests/reader/test_reader.cpp Updates indexer/reader getters to get_index_path()
tests/reader/test_reader.c Renames test comments for index_path terminology
tests/reader/test_basic_factory.cpp Aligns factory tests with .dftindex root and determine_index_path()
tests/python/test_trace_reader.py Updates Python tests to expect .dftindex store behavior
tests/python/test_reorganization_planner.py Switches index path inputs from .idx to environment index path
tests/binaries/test_dftracer_tar.cpp Updates tar binary tests to check .dftindex directory existence
tests/binaries/test_dftracer_server.cpp Improves environment skip check by binding to loopback
tests/binaries/test_dftracer_organize.cpp Updates organize tests to verify .dftindex directories exist
tests/binaries/test_dftracer_info.cpp Ensures binary tests set runtime library path for RocksDB deps
tests/CMakeLists.txt Adds rpath helper for tests to locate shared libs at runtime
src/dftracer/utils/utilities/replay/replay.cpp Renames idx_path to index_path for reader creation
src/dftracer/utils/utilities/reader/trace_reader.cpp Switches probing/usage to .dftindex path and index_path_
src/dftracer/utils/utilities/reader/internal/tar_reader.h Renames member/API from idx_path to index_path
src/dftracer/utils/utilities/reader/internal/reader_factory.cpp Renames factory arg to index_path and forwards to readers
src/dftracer/utils/utilities/reader/internal/reader_c.cpp Renames C API parameter and error message to index_path
src/dftracer/utils/utilities/reader/internal/gzip_reader.h Renames member/API from idx_path to index_path
src/dftracer/utils/utilities/reader/internal/gzip_reader.cpp Updates implementation/logging to use index_path
src/dftracer/utils/utilities/indexer/visitors/manifest_visitor.cpp Routes manifest persistence through IndexDatabase (no raw SQLite)
src/dftracer/utils/utilities/indexer/internal/transaction_scope.h Adds RAII transaction helper for rollback-on-failure
src/dftracer/utils/utilities/indexer/internal/tar/tar_indexer.h Migrates TAR indexer internals off SQLite; hardens caching with optionals/mutex
src/dftracer/utils/utilities/indexer/internal/tar/queries/query_archive_id.cpp Removes SQLite TAR query implementation
src/dftracer/utils/utilities/indexer/internal/tar/queries/insert_tar_file_record.cpp Removes SQLite TAR insert implementation
src/dftracer/utils/utilities/indexer/internal/tar/queries/insert_tar_checkpoint_record.cpp Removes SQLite TAR checkpoint insert implementation
src/dftracer/utils/utilities/indexer/internal/tar/queries/insert_file_record.cpp Removes SQLite file record insert implementation
src/dftracer/utils/utilities/indexer/internal/tar/queries/insert_archive_record.cpp Removes SQLite archive insert implementation
src/dftracer/utils/utilities/indexer/internal/tar/queries/insert_archive_metadata_record.cpp Removes SQLite archive metadata insert implementation
src/dftracer/utils/utilities/indexer/internal/sqlite/statement.h Removes deprecated forwarding header
src/dftracer/utils/utilities/indexer/internal/sqlite/database.h Removes deprecated forwarding header
src/dftracer/utils/utilities/indexer/internal/indexer_factory.cpp Generates .dftindex roots (via determine_index_path)
src/dftracer/utils/utilities/indexer/internal/indexer_c.cpp Renames C API arg to index_path
src/dftracer/utils/utilities/indexer/internal/helpers.h Adds normalize_index_root, renames validity check to directory-based
src/dftracer/utils/utilities/indexer/internal/helpers.cpp Implements .dftindex normalization + directory validity check
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_stored_file_info.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_schema_validity.cpp Removes SQLite gzip schema check
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_num_lines.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_max_bytes.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_file_id.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_checkpoint_size.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/query_checkpoint.cpp Removes SQLite gzip query implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/queries.h Removes SQLite gzip query header
src/dftracer/utils/utilities/indexer/internal/gzip/queries/insert_file_record.cpp Removes SQLite gzip insert implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/insert_file_metadata_record.cpp Removes SQLite gzip insert implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/insert_checkpoint_record.cpp Removes SQLite gzip insert implementation
src/dftracer/utils/utilities/indexer/internal/gzip/queries/delete_file_record.cpp Removes SQLite gzip delete implementation
src/dftracer/utils/utilities/indexer/internal/gzip/gzip_indexer.h Migrates gzip indexer off SQLite; adds explicit cache readiness flags
src/dftracer/utils/utilities/indexer/internal/checkpoint_size.h Reorders parameters in signature for checkpoint sizing
src/dftracer/utils/utilities/composites/file_merger_utility.cpp Renames effective index var; uses index_path for readers/line inputs
src/dftracer/utils/utilities/composites/dft/views/view_reader_utility.cpp Renames fluent builder with_idx_path to with_index_path
src/dftracer/utils/utilities/composites/dft/views/view_builder_utility.cpp Updates pruner + statistics queries to use IndexDatabase methods
src/dftracer/utils/utilities/composites/dft/statistics/trace_statistics.cpp Renames JSON field idx_path to index_path
src/dftracer/utils/utilities/composites/dft/statistics/statistics_aggregator_utility.cpp Switches async runner from sqlite to rocksdb; normalizes index roots
src/dftracer/utils/utilities/composites/dft/statistics/chunk_detail_scanner_utility.cpp Uses index_path when creating indexed read inputs
src/dftracer/utils/utilities/composites/dft/reorganize/reconstruction_planner.cpp Uses RocksDB provenance DB read-only mode + fid-scoped queries
src/dftracer/utils/utilities/composites/dft/reorganize/event_router.cpp Awaits async provenance flush; avoids capturing ref in spawned tasks
src/dftracer/utils/utilities/composites/dft/internal/utils.cpp Changes determine_index_path() to return root-local .dftindex dir
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_time_bounds.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_resolved_by_hash.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_metadata_lines.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_index_dimensions.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_hash_by_resolved.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_file_bloom_filters_batch.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_file_bloom_filter.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_event_ranges.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_chunk_bloom_filters_batch.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/query_chunk_bloom_filters.cpp Removes SQLite query implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_metadata_lines.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_index_dimension.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_hash_resolution.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_file_bloom_filter.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_event_range.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/insert_chunk_bloom_filter.cpp Removes SQLite insert implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_metadata_lines.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_hash_resolutions.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_file_bloom_filter.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_event_ranges.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_chunk_statistics.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_chunk_dimension_stats.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/queries/delete_chunk_bloom_filters.cpp Removes SQLite delete implementation
src/dftracer/utils/utilities/composites/dft/indexing/chunk_statistics.cpp Makes pid/tid formatting bounded/checkable with to_chars
src/dftracer/utils/utilities/composites/dft/indexing/chunk_indexer_utility.cpp Uses index_path for indexed reader input
src/dftracer/utils/utilities/composites/dft/event_collector_utility.cpp Uses index_path when creating readers
src/dftracer/utils/utilities/composites/dft/chunk_manifest_mapper_utility.cpp Uses index_path in chunk spec mapping
src/dftracer/utils/utilities/composites/dft/chunk_extractor_utility.cpp Uses index_path for indexed reader path
src/dftracer/utils/utilities/composites/dft/aggregators/chunk_mapper_utility.cpp Renames fluent builder to with_index_path
src/dftracer/utils/utilities/composites/dft/aggregators/chunk_aggregator_utility.cpp Updates index_dir derivation from index_path
src/dftracer/utils/utilities/call_tree/call_tree_mpi.cpp Uses .dftindex root instead of .idx
src/dftracer/utils/utilities/call_tree/call_tree_internal.cpp Updates comment + uses .dftindex root
src/dftracer/utils/server/viz_api.cpp Updates index path references to .dftindex store
src/dftracer/utils/server/trace_api.cpp Updates index path references + stats aggregator inputs
src/dftracer/utils/python/utilities/statistics_query.cpp Uses determine_index_path() for index lookup
src/dftracer/utils/python/utilities/statistics_aggregator.cpp Uses determine_index_path() for index lookup
src/dftracer/utils/python/utilities/reorganization_planner.cpp Renames output field idx_path to index_path
src/dftracer/utils/python/utilities/reconstruction_planner.cpp Updates docs to .dftindex terminology
src/dftracer/utils/python/utilities/metadata_collector.cpp Updates to .dftindex terminology + index path computation
src/dftracer/utils/python/utilities/comparator.cpp Uses .dftindex index paths in aggregation
src/dftracer/utils/python/utilities/aggregator.cpp Updates docs to .dftindex terminology
src/dftracer/utils/python/trace_reader_iterator.h Tracks background task future for safe iterator teardown
src/dftracer/utils/python/trace_reader_iterator.cpp Waits for producer completion on dealloc to avoid use-after-free
src/dftracer/utils/python/indexer.h Renames Python binding member idx_path to index_path
src/dftracer/utils/core/sqlite/error.cpp Removes SQLite error implementation
src/dftracer/utils/core/sqlite/database.cpp Removes SQLite database implementation
src/dftracer/utils/core/sqlite/async.cpp Removes SQLite async helpers (replaced by RocksDB async)
src/dftracer/utils/core/runtime.cpp Clears moved-from tasks post-await to reduce resource retention
src/dftracer/utils/core/rocksdb/key_codec.cpp Adds RocksDB key encoding/decoding helpers
src/dftracer/utils/core/rocksdb/async.cpp Adds RocksDB async helpers using executor db pool
src/dftracer/utils/core/pipeline/pipeline.cpp Renames sqlite pool config to db pool config
src/dftracer/utils/core/pipeline/executor.cpp Renames sqlite pool to db pool; adjusts shutdown/drain behavior
src/dftracer/utils/core/io/thread_pool_backend.h Adds pread callback submission API
src/dftracer/utils/core/io/thread_pool_backend.cpp Implements callback-based pread and completion dispatch
src/dftracer/utils/core/io/kqueue_thread_pool_backend.h Adds pread callback submission API
src/dftracer/utils/core/io/kqueue_thread_pool_backend.cpp Implements callback-based pread and completion dispatch
src/dftracer/utils/core/io/io_uring_backend.h Adds completion callback plumbing to io_uring backend
src/dftracer/utils/core/io/io_backend_sync.cpp Updates sync I/O doc comment (no SQLite VFS assumption)
src/dftracer/utils/core/io/epoll_thread_pool_backend.h Adds pread callback submission API
src/dftracer/utils/core/io/epoll_thread_pool_backend.cpp Implements callback-based pread and completion dispatch
src/dftracer/utils/core/env.cpp Adds environment utility and RocksDB open-files config
src/dftracer/utils/binaries/dftracer_tar.cpp Updates CLI wording/logging for .dftindex store
src/dftracer/utils/binaries/dftracer_split.cpp Awaits index builder and uses .dftindex root paths
src/dftracer/utils/binaries/dftracer_server.cpp Updates index dir help/comment to .dftindex stores
src/dftracer/utils/binaries/dftracer_reconstruct.cpp Uses .dftindex root for metadata/reader inputs
src/dftracer/utils/binaries/dftracer_organize.cpp Renames “sidecar” step to index store; awaits build tasks
src/dftracer/utils/binaries/dftracer_index.cpp Updates CLI docs to .dftindex terminology
src/dftracer/utils/binaries/dftracer_gen_fake_trace.cpp Updates verify flow to use .dftindex root paths
src/dftracer/utils/binaries/dftracer_event_count.cpp Updates index path variable naming/references
src/dftracer/utils/binaries/dftracer_comparator.cpp Updates comment + .dftindex path usage
src/dftracer/utils/binaries/dftracer_aggregator.cpp Awaits index builder and uses .dftindex root paths
setup.py Minor formatting cleanup
include/dftracer/utils/utilities/reader/trace_reader.h Updates docs and member name to index_path_
include/dftracer/utils/utilities/reader/internal/reader_factory.h Renames idx_path param to index_path
include/dftracer/utils/utilities/reader/internal/reader.h Renames API get_idx_path() to get_index_path() + C API signature
include/dftracer/utils/utilities/indexer/internal/scan_prefix.h Adds shared iterator prefix scan helper with status checking
include/dftracer/utils/utilities/indexer/internal/indexer_factory.h Updates docs/signature for .dftindex path
include/dftracer/utils/utilities/indexer/internal/indexer.h Renames index path getter + C API signature
include/dftracer/utils/utilities/indexer/index_builder_utility.h Renames build result idx_path to index_path
include/dftracer/utils/utilities/fileio/types/chunk_spec.h Renames chunk spec field to index_path
include/dftracer/utils/utilities/fileio/lines/sources/async_streaming_gz_line_generator.h Uses ScopedFd for fd lifetime safety
include/dftracer/utils/utilities/fileio/lines/sources/async_plain_file_line_generator.h Uses ScopedFd for fd lifetime safety
include/dftracer/utils/utilities/fileio/lines/sources/async_plain_file_bytes_generator.h Uses ScopedFd for fd lifetime safety
include/dftracer/utils/utilities/fileio/lines/line_types.h Renames idx_path to index_path in line read input
include/dftracer/utils/utilities/fileio/lines/line_bytes_range.h Updates docs to .dftindex example
include/dftracer/utils/utilities/composites/types.h Renames various composite inputs to index_path
include/dftracer/utils/utilities/composites/line_batch_processor_utility.h Updates inputs and iterator binding to index_path
include/dftracer/utils/utilities/composites/file_merger_utility.h Renames fluent builder parameter to match index_path
include/dftracer/utils/utilities/composites/dft/views/view_reader_utility.h Renames view reader input and builder method
include/dftracer/utils/utilities/composites/dft/views/view_builder_utility.h Renames view builder input and builder method
include/dftracer/utils/utilities/composites/dft/statistics/trace_statistics.h Renames stats field to index_path
include/dftracer/utils/utilities/composites/dft/statistics/statistics_aggregator_utility.h Renames stats input to index_path
include/dftracer/utils/utilities/composites/dft/statistics/statistics.h Updates docs to .dftindex terminology
include/dftracer/utils/utilities/composites/dft/statistics/chunk_detail_scanner_utility.h Renames scan input to index_path
include/dftracer/utils/utilities/composites/dft/reorganize/reorganization_planner.h Renames source file info to index_path
include/dftracer/utils/utilities/composites/dft/reorganize/provenance_tracker.h Makes flush async (CoroTask<void>)
include/dftracer/utils/utilities/composites/dft/metadata_collector_utility.h Renames inputs/outputs to index_path and updates docs
include/dftracer/utils/utilities/composites/dft/internal/utils.h Updates docs and signatures for .dftindex roots
include/dftracer/utils/utilities/composites/dft/internal/chunk_spec.h Updates spec conversion to index_path
include/dftracer/utils/utilities/composites/dft/indexing/chunk_statistics.h Updates storage docs (no SQLite assumption)
include/dftracer/utils/utilities/composites/dft/indexing/chunk_pruner_utility.h Renames pruner input field to index_path
include/dftracer/utils/utilities/composites/dft/indexing/chunk_indexer_utility.h Renames builder method to with_index_path
include/dftracer/utils/utilities/composites/dft/indexing/chunk_dimension_stats.h Updates docs (no SQLite assumption)
include/dftracer/utils/utilities/composites/dft/indexing/bloom_filter_cache.h Renames cache keying from idx_path to index_path
include/dftracer/utils/utilities/composites/dft/indexing/bloom_filter.h Updates docs to RocksDB blob storage terminology
include/dftracer/utils/utilities/composites/dft/comparator/comparison_config.h Updates docs to .dftindex terminology
include/dftracer/utils/utilities/composites/dft/chunk_extractor_utility.h Updates chunk spec mapping to index_path
include/dftracer/utils/utilities/composites/dft/aggregators/chunk_aggregator_utility.h Renames builder method to with_index_path
include/dftracer/utils/server/trace_index.h Renames cached index path to .dftindex root terminology
include/dftracer/utils/core/sqlite/vfs.h Removes SQLite VFS header
include/dftracer/utils/core/sqlite/statement.h Removes SQLite statement wrapper header
include/dftracer/utils/core/sqlite/error.h Removes SQLite error header
include/dftracer/utils/core/sqlite/database.h Removes SQLite database header
include/dftracer/utils/core/runtime.h Clears moved-from tasks post-await to reduce resource retention
include/dftracer/utils/core/rocksdb/key_codec.h Declares RocksDB key codec + builder
include/dftracer/utils/core/rocksdb/filesystem.h Declares DFTracer RocksDB filesystem/env helpers
include/dftracer/utils/core/rocksdb/db_manager.h Adds process-wide RocksDB instance manager
include/dftracer/utils/core/rocksdb/database.h Adds RocksDB DB wrapper API
include/dftracer/utils/core/pipeline/pipeline_config.h Renames config sqlite_pool_size to db_pool_size
include/dftracer/utils/core/pipeline/executor.h Renames sqlite pool to db pool; updates API name
include/dftracer/utils/core/io/io_backend.h Adds callback-based pread API to backends
include/dftracer/utils/core/env.h Declares Env helper + RocksDB tuning accessor
include/dftracer/utils/core/common/scoped_fd.h Adds RAII fd wrapper for safe close on all paths
include/dftracer/utils/core/common/constants.h Changes index extension constant to .dftindex
docs/source/utilities/indexer.rst Updates Python examples for .dftindex persistence
docs/source/quickstart.rst Updates quickstart to omit explicit .idx path
docs/source/installation.rst Removes SQLite dependency from install steps
docs/source/cpp_api/index.rst Removes sqlite from C++ API docs overview
docs/source/conf.py Removes unused import
docs/source/api/indexer.rst Updates Python Indexer signature docs to index_path
docs/scripts/generate_api_index.py Formatting cleanup and minor refactors
cmake/modules/LibraryHelpers.cmake Adds rpath emission for non-interface libraries
cmake/modules/InstallHelpers.cmake Removes SQLite dependency handling
cmake/modules/CPM.cmake Vendors CPM fallback and improves download error reporting
Makefile Adds optional ty check in python test target
CMakeLists.txt Adds ccache compiler launcher auto-detection
.github/workflows/python-publish.yaml Updates action versions and cibuildwheel version
.github/workflows/format-check.yaml Updates action versions and uv setup action

Comment thread src/dftracer/utils/core/rocksdb/key_codec.cpp Outdated
Comment thread src/dftracer/utils/utilities/indexer/internal/indexer_factory.cpp
Comment thread src/dftracer/utils/utilities/reader/trace_reader.cpp Outdated
Comment thread include/dftracer/utils/utilities/composites/dft/internal/utils.h
@rayandrew force-pushed the feat/rocksdb branch 2 times, most recently from 17c8cb6 to 41061bb on April 6, 2026 07:03
Replace SQLite-backed indexing and provenance storage with RocksDB-backed stores.

  Key changes:
  - add RocksDB async/database/db-manager/filesystem/key-codec layers
  - migrate index and provenance databases from SQLite to RocksDB
  - update index builder, trace reader, reorganize, view, stats, and comparator paths for RocksDB
  - harden transaction atomicity and rollback behavior with TransactionScope
  - add iterator status checking for prefix scans
  - harden gzip/tar indexer cache state and metadata handling
  - capture executor context in RocksDB awaitables
  - clean up failed RocksDB open paths and manager lifecycle behavior
  - vendor CPM 0.42.1 and update CI/build integration
  - refresh docs, Python bindings, and C++/Python test coverage for the new backend

  Validation:
  - full test suite passed
  - Ubuntu 22.04 Docker run passed
  - focused RocksDB/indexer regression tests passed.
@rayandrew merged commit f3be94e into llnl:develop on Apr 7, 2026
49 of 50 checks passed
@rayandrew deleted the feat/rocksdb branch on April 7, 2026 00:01
3 participants