55 commits
6ce7ea5
Squashed all previous commits
YCHuang2112sub Mar 8, 2025
98b6083
[Refactor] Standardize RDF Metadata and Fix Linker Errors
YCHuang2112sub Feb 15, 2026
eae580a
[Feature] Implement Pure MVCC RDF Persistence and Redirection
YCHuang2112sub Feb 15, 2026
0a2b240
[Fix] Refine Compaction Boundary Tracking and Range Deletions
YCHuang2112sub Feb 15, 2026
3a2e480
[Build] Update Makefiles and .gitignore for Development Stability
YCHuang2112sub Feb 15, 2026
9c24f95
[Doc] Update Paper Drafts and Project Reports
YCHuang2112sub Feb 15, 2026
351bfc2
Restore in-progress experimental configuration and tool updates
YCHuang2112sub Feb 15, 2026
9b8c200
docs: update README with experiment instructions and implementation d…
YCHuang2112sub Feb 17, 2026
1187474
docs: update README with experiment instructions and implementation d…
YCHuang2112sub Feb 17, 2026
ec35dce
Fix SuRF memory leaks: Implement proper virtual destructors for SuRF …
YCHuang2112sub Feb 17, 2026
cc68cd3
Merge branch 'ych17'
YCHuang2112sub Feb 17, 2026
91cd635
Update README: rename RaTer filters to Spry
YCHuang2112sub Feb 18, 2026
8cc02b8
Fix memory leak in SuRF_RDF by fixing getNumberOfTotalLevels indexing…
YCHuang2112sub Feb 21, 2026
f24725a
Fix indexing bug in PLRDF and PLRDF_t getNumberOfTotalLevels in sys_r…
YCHuang2112sub Feb 21, 2026
849b3a9
Fix SuRF key order issue by skipping point key duplicates after trunc…
YCHuang2112sub Feb 25, 2026
9aa2cbc
Refine RDF level logic and guards in table_cache.cc
YCHuang2112sub Feb 25, 2026
9fbbe32
Synchronize RDF level 0 guards across table cache and table reader
YCHuang2112sub Feb 25, 2026
3f55f96
Initialize e15_YCSB workload with simple example and runner
YCHuang2112sub Feb 25, 2026
eec8763
Cleanup and fix log directory mapping in e15_YCSB runner
YCHuang2112sub Feb 25, 2026
98b644d
Fix compilation errors in version_set.cc, refactor statistics reporti…
YCHuang2112sub Feb 25, 2026
4cef4ba
Fix compilation errors in utils_run_workload.h by uncommenting read c…
YCHuang2112sub Feb 25, 2026
b0c98d6
Implement categorized YCSB point query timing (existing, deleted, non…
YCHuang2112sub Feb 26, 2026
5b29245
Refine Perf IO measurement scope: reset at start of workload
YCHuang2112sub Feb 26, 2026
cc056f7
Refine Statistics scope: reset op.statistics at start of workload
YCHuang2112sub Feb 26, 2026
7eadc2a
Always enable system verifier tracking regardless of load_pq_workload…
YCHuang2112sub Feb 26, 2026
d732a87
Implement fetcher statistics reporting and fix log parsing script
YCHuang2112sub Feb 26, 2026
6b60571
update e15
YCHuang2112sub Feb 26, 2026
852db93
Fix SuRF_RDF range delete loss bug
YCHuang2112sub Feb 26, 2026
9f53d24
Revert LevelIterator RDF filtering and ensure Range Tombstone loading…
YCHuang2112sub Feb 26, 2026
29d809d
Refine RDF logic in Version::Get and TableCache, and add CHECKING_QUE…
YCHuang2112sub Feb 27, 2026
98b293b
Implement ForceLoadingRangeTombstonesFromSSTable flag and update YCSB…
YCHuang2112sub Feb 27, 2026
1082fc5
Implement Kahan summation for precise timing in system_verifier.h and…
YCHuang2112sub Feb 27, 2026
eabaed6
Merge branch 'ych19_rt_loading' into ych18
YCHuang2112sub Feb 27, 2026
4615198
Finalize ych18 merge: Refine RT loading flag and precision timing in …
YCHuang2112sub Feb 27, 2026
3fb44a6
Refactor KahanTimer to use unsigned long long for precise integer nan…
YCHuang2112sub Feb 27, 2026
5448e74
Integrate ych18 refinements: KahanTimer conversion and RT loading flags
YCHuang2112sub Feb 27, 2026
fb9454c
Clean up debug prints in surf.cc on ych18 after merge refinements
YCHuang2112sub Feb 27, 2026
7e7c5ed
Merge pull request #38 from SSD-Brandeis/ych18
YCHuang2112sub Feb 27, 2026
2bd5239
Restore overlapping range tombstone handling logic in surf.cc
YCHuang2112sub Feb 27, 2026
4d31865
Refactor RDF logging to sys_rdfilter.h and clean up self_rd_e15_YCSB
YCHuang2112sub Feb 28, 2026
60f4419
Fix SuRF overlapping range tombstone logic and cleanup redundant code
YCHuang2112sub Feb 28, 2026
d1e5012
quick update
YCHuang2112sub Feb 28, 2026
8aee341
Merge branch 'ych18'
YCHuang2112sub Feb 28, 2026
49b8efd
Fix compilation error: Add missing member function declarations to Ve…
YCHuang2112sub Feb 28, 2026
22efc41
Merge branch 'ych18'
YCHuang2112sub Feb 28, 2026
83ea069
Fix compilation error: Add missing member function declarations to Ve…
YCHuang2112sub Feb 28, 2026
6cc4dcb
Merge branch 'ych18'
YCHuang2112sub Feb 28, 2026
8c255a3
Fix compilation error: Add missing member function declarations to Ve…
YCHuang2112sub Feb 28, 2026
b29b81b
Merge branch 'ych18'
YCHuang2112sub Feb 28, 2026
a9e20a1
Merge branch 'main' of https://github.com/SSD-Brandeis/LSMRangeDeletes
YCHuang2112sub Feb 28, 2026
95db684
update utils_run_workload.h
YCHuang2112sub Feb 28, 2026
2a63c47
update parsing_log_file.py
YCHuang2112sub Feb 28, 2026
ed30b27
Merge pull request #39 from SSD-Brandeis/ych18
YCHuang2112sub Feb 28, 2026
8e23452
Update run_task scripts with use_surf_base parameter
YCHuang2112sub Mar 9, 2026
d6fbdb3
Merge pull request #40 from SSD-Brandeis/ych18
YCHuang2112sub Mar 9, 2026
22 changes: 22 additions & 0 deletions .gitignore
@@ -0,0 +1,22 @@
*.o
# *example*
*.tmp

# .vscode
.vscode/*

*callgrind*

# build
**/build/**

# log
**/*.log

# text
*.txt
secret.txt

tracing_strategy.md

workload_0p*
29 changes: 29 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,29 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [

// {
// "type": "node",
// "request": "launch",
// "name": "launch program",
// "skipFiles": [ "<node_internals>/**"],
// "preLaunchTask": "simple_example",
// "program": "${workspaceFolder}/simple_example",
// }
{
"name": "simple_example",
"type": "cppdbg",
"request": "launch",
"program": "${fileDirname}/../development/rocksdb/self_RD/simple_example",
"args": [">", "log"],
"stopAtEntry": false,
"cwd": "${fileDirname}",
"environment": [],
"externalConsole": false,
"MIMode": "gdb",
}
]
}
1 change: 1 addition & 0 deletions .vscode/log
@@ -0,0 +1 @@
Clearing system cache ...
29 changes: 29 additions & 0 deletions .vscode/tasks.json
@@ -0,0 +1,29 @@
{
"tasks": [

// {
// "type": "cppbuild",
// "label": "C/C++: g++ build active file",
// "command": "/usr/bin/g++",
// "args": [
// "-fdiagnostics-color=always",
// "-g",
// "${file}",
// "-o",
// "${fileDirname}/${fileBasenameNoExtension}"
// ],
// "options": {
// "cwd": "${fileDirname}"
// },
// "problemMatcher": [
// "$gcc"
// ],
// "group": {
// "kind": "build",
// "isDefault": true
// },
// "detail": "Task generated by Debugger."
// }
],
"version": "2.0.0"
}
61 changes: 59 additions & 2 deletions README.md
@@ -1,2 +1,59 @@
# LSMRangeDeletes
Making range deletes better in LSMs
# LSMRangeDeletes
Making range deletes better in LSMs.

## Building the Project
1. **Build RocksDB Core**:
Go to `./development/rocksdb/` and run:
```bash
make clean # Run if DB file relation is broken
cp ../make_config ./
make static_lib -j8
```

2. **Build Examples**:
Go to the experiment directory `./development/rocksdb/self_rd_e[number]_xxx` and run:
```bash
make simple_example -j8
```

## Running Experiments
Run the `simple_example` with specific parameters:
```bash
python run_task_256B_string_key.py
```

## Progress
The project implements several range delete filter (RDF) mechanisms in RocksDB to speed up lookups in LSM-trees under frequent range deletions.

### Implemented Filters
1. **Spry-Vec (Range Delete Filter with Vector / PLRDF_STRINGKEY)**:
- **Concept**: A per-level filter that tracks range tombstones at each level of the LSM-tree.
- **Mechanism**: Manages a set of non-overlapping range tombstones `[start, end)` per level. It uses a specialized `isEntryAlive` check to determine if a key is covered by any range tombstone at a given level.
- **Optimization**: Features automatic **Adjacent Range Merging** (e.g., merging `[10, 20)` and `[20, 30)` into `[10, 30)`) to minimize filter size and lookup overhead.

2. **Spry-Vec-Frag (Range Delete Filter with Vector and Fragmentation operation / SPLIT_PLRDF_STRINGKEY)**:
- An extension of Spry-Vec (PLRDF) that supports key-space splitting (by keeping all the newly arrived point queries that fall within older range tombstones) to refine range tracking within a level, which improves filtering efficiency on range-deleted keys.

3. **Top-Level-Spry (Top Level Range Delete Filter / TOP_LEVEL_RDF)**:
- **Concept**: A variant of Spry-Vec-Frag that keeps all the range tombstones on level 1 in the buffer. In theory, this is the fastest filter for point queries on range-deleted keys.

4. **Spry-Trie (Range Delete Filter with Trie / SuRF_RDF)**:
- **Concept**: A trie-based range delete filter that uses the LOUDS (Level-Order Unary Degree Sequence) representation to compress the trie.
- **Mechanism**: Uses a compressed trie to store range tombstones inside the buffer for each file. It allows for efficient "is deleted" checks by traversing the trie.
- **Optimization**: Compresses the trie to store range tombstones.

5. **Spry-Trie-Frag (Range Delete Filter with Trie and Fragmentation operation / SPLIT_SuRF_RDF)**:
- An extension of Spry-Trie that supports key-space splitting (by keeping all the newly arrived point queries that fall within older range tombstones) to refine range tracking within a level, which improves filtering efficiency on range-deleted keys. It strikes a balance between memory footprint and speed.

6. **SkyLineRDF (Skyline Range Delete Filter)**:
- **Concept**: A more sophisticated filter that accounts for the sequence numbers (timestamps) of range deletions.
- **Mechanism**: Uses a "skyline" approach to track the maximum sequence number for any given key range. It allows for precise "is alive" checks by comparing a point key's sequence number against the skyline's maximum at that position.
- **Optimization**: Efficiently aggregates overlapping range tombstones into a minimal set of non-overlapping "skyline" segments.
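
The Spry-Vec behavior described above (non-overlapping `[start, end)` ranges per level, adjacent range merging, and an `isEntryAlive`-style check) can be illustrated with a minimal Python sketch. This is not the C++ code in this repository: the class name `LevelRangeFilter` and the methods `add_range` and `is_entry_alive` are hypothetical, and integer keys are assumed for readability.

```python
from bisect import bisect_left, bisect_right


class LevelRangeFilter:
    """Hypothetical per-level set of non-overlapping [start, end) tombstones."""

    def __init__(self):
        self.starts = []  # sorted start keys
        self.ends = []    # ends[i] is the exclusive end paired with starts[i]

    def add_range(self, start, end):
        """Insert [start, end), merging with overlapping or adjacent ranges."""
        if start >= end:
            return  # ignore empty ranges
        # First stored range whose end is >= start (touching counts as adjacent).
        lo = bisect_left(self.ends, start)
        # One past the last stored range whose start is <= end.
        hi = bisect_right(self.starts, end)
        if lo < hi:
            # Ranges lo..hi-1 overlap or touch the new range: absorb them.
            start = min(start, self.starts[lo])
            end = max(end, self.ends[hi - 1])
        self.starts[lo:hi] = [start]
        self.ends[lo:hi] = [end]

    def is_entry_alive(self, key):
        """Return True if no stored tombstone covers the key."""
        i = bisect_right(self.starts, key) - 1
        return i < 0 or key >= self.ends[i]


level_filter = LevelRangeFilter()
level_filter.add_range(10, 20)
level_filter.add_range(20, 30)  # adjacent, so it merges into [10, 30)
level_filter.add_range(40, 50)  # disjoint, so it stays separate
```

Keeping the ranges sorted and non-overlapping is what lets both the merge and the lookup use binary search, so the lookup cost grows with the logarithm of the number of stored ranges rather than with the number of tombstones written.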

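The skyline idea in SkyLineRDF can be sketched in the same spirit. The Python below is an illustrative sketch only, under the assumption that a point entry survives when its sequence number is greater than the largest tombstone sequence number covering its key. The function names `build_skyline` and `is_alive` are hypothetical, and the quadratic construction is chosen for clarity rather than speed.

```python
from bisect import bisect_right


def build_skyline(tombstones):
    """Collapse (start, end, seq) tombstones into sorted, non-overlapping
    (start, end, max_seq) segments."""
    tombstones = list(tombstones)
    # Every tombstone boundary is a potential segment boundary.
    points = sorted({p for s, e, _ in tombstones for p in (s, e)})
    segments = []
    for left, right in zip(points, points[1:]):
        covering = [seq for s, e, seq in tombstones if s <= left and right <= e]
        if not covering:
            continue  # gap between tombstones
        top = max(covering)
        if segments and segments[-1][1] == left and segments[-1][2] == top:
            # Same height as the previous adjacent segment: extend it.
            segments[-1] = (segments[-1][0], right, top)
        else:
            segments.append((left, right, top))
    return segments


def is_alive(segments, key, seq):
    """Assumed rule: alive if uncovered or newer than the covering skyline."""
    starts = [s for s, _, _ in segments]
    i = bisect_right(starts, key) - 1
    if i < 0:
        return True
    _, end, max_seq = segments[i]
    return key >= end or seq > max_seq


skyline = build_skyline([(10, 30, 5), (20, 40, 8)])
```

In this example the two overlapping tombstones collapse into the segments `(10, 20, 5)` and `(20, 40, 8)`, so a key at 25 written with sequence number 6 is treated as deleted, while a key at 15 with the same sequence number is treated as alive.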

### Core Integration Refinements
- **SST Boundary Tracking**: Modified `FileMetaData` and the MANIFEST to track precise point key and range tombstone boundaries separately, enabling accurate clipping of filters at the SST level.
- **Flush-time Aggregation**: Range tombstones from multiple memtables are consolidated during the flush process to ensure consistent RDF state in Level 0.
- **Compaction Management**: Implemented a "Gather & Shift/Clip" strategy to efficiently update RDF metadata during compaction, ensuring that range deletes are correctly propagated or clipped across SST boundaries.
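
The clipping step mentioned above can be pictured with a small Python sketch. It is an assumption about the general idea only, not the repository's "Gather & Shift/Clip" implementation: the function `clip_ranges` is hypothetical, and the SST key range is modeled as a half-open window `[lower, upper)` over integer keys.

```python
def clip_ranges(ranges, lower, upper):
    """Clip [start, end) ranges to the half-open window [lower, upper).

    Ranges that fall entirely outside the window are dropped.
    """
    clipped = []
    for start, end in ranges:
        s, e = max(start, lower), min(end, upper)
        if s < e:  # keep only non-empty intersections
            clipped.append((s, e))
    return clipped


# Tombstones gathered from the compaction inputs (hypothetical values),
# clipped to one output file's key range.
per_file = clip_ranges([(5, 15), (20, 30), (40, 50)], 10, 25)
```

Here the first two ranges are trimmed to `(10, 15)` and `(20, 25)`, and the third is dropped because it lies outside the window.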

