range queries #74
Pull Request Overview
This PR introduces range query execution for all available indexes and adds supporting configuration and benchmark code. The changes include new range query tests for each index, the implementation of an execute_range_queries helper function, and new configuration options and command-line arguments related to range queries.
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/test_skiplist/skiplist_tests.cpp | Added range query test for SkipList index |
| tests/test_pgm/pgm_tests.cpp | Added range query test for PGM index |
| tests/test_lipp/lipp_tests.cpp | Added range query test for Lipp index |
| tests/test_leveldb/leveldb_tests.cpp | Added range query test for LevelDB index |
| tests/test_imprints/imprint_tests.cpp | Added range query test for Imprints index |
| tests/test_btree/btree_tests.cpp | Added range query test for BTree index |
| tests/test_art/art_tests.cpp | Added range query test for ART index |
| tests/test_alex/alex_tests.cpp | Added range query test for Alex index |
| src/bliss_bench.cpp | Extended benchmarking to include range query execution |
| src/bliss/util/execute.h | Added execute_range_queries function and included |
| src/bliss/util/config.h | Added range query related config options |
| src/bliss/util/args.h | Added new command-line options for range queries and selectivity factor |
Comments suppressed due to low confidence (2)
src/bliss/util/execute.h:56
- [nitpick] Clarify the intended scale of the selectivity factor: if selectivity is provided as a percentage, the computation may need to divide by 100 to calculate the correct average range size.
key_type avg_range_size = static_cast<key_type>(data_range * selectivity);
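One way to make the scale explicit, as a minimal sketch: the helper names below are hypothetical and not part of this PR, and the sketch assumes selectivity is meant to be a fraction in [0, 1], with a separate entry point for callers that think in percentages.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not in the PR): converts a selectivity given as a
// fraction in [0, 1] into an average range width over the key domain.
template <typename KeyType>
KeyType average_range_size(KeyType data_range, double selectivity) {
    if (selectivity < 0.0 || selectivity > 1.0) {
        throw std::invalid_argument("selectivity must be a fraction in [0, 1]");
    }
    return static_cast<KeyType>(static_cast<double>(data_range) * selectivity);
}

// Hypothetical variant for callers that pass a percentage in [0, 100].
template <typename KeyType>
KeyType average_range_size_from_percent(KeyType data_range,
                                        double selectivity_percent) {
    return average_range_size(data_range, selectivity_percent / 100.0);
}
```

Rejecting values outside [0, 1] would surface a percentage passed by mistake (for example 25 instead of 0.25) instead of silently producing ranges wider than the data.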
tests/test_skiplist/skiplist_tests.cpp:30
- [nitpick] Consider defining a constant for the 'Not implemented' error message to ensure consistency across all tests.
EXPECT_STREQ("Not implemented", e.what());
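A possible shape for that shared constant, sketched without GoogleTest so it stands alone. The names kNotImplementedMessage, UnsupportedIndex, and throws_not_implemented are hypothetical, and the exception type is an assumption, since the PR only shows the message comparison.

```cpp
#include <cstring>
#include <stdexcept>

// Hypothetical shared constant; the tests in this PR compare against the
// literal "Not implemented" in each file.
inline constexpr const char* kNotImplementedMessage = "Not implemented";

// Hypothetical stand-in for an index without range query support.
struct UnsupportedIndex {
    void range_query(int /*low*/, int /*high*/) const {
        // Assumption: the real indexes may throw a different exception type.
        throw std::runtime_error(kNotImplementedMessage);
    }
};

// Returns true when the index rejects range queries with the shared message.
bool throws_not_implemented(const UnsupportedIndex& index) {
    try {
        index.range_query(0, 10);
    } catch (const std::exception& e) {
        return std::strcmp(kNotImplementedMessage, e.what()) == 0;
    }
    return false;
}
```

In the existing tests the same constant could replace the literal, as in EXPECT_STREQ(kNotImplementedMessage, e.what()).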
bfa145c to 9b9c4ec
Added additional range query execution method.
size_t num_writes = std::round(config.write_factor * data.size());
size_t num_mixed = num_inserts - (num_preload + num_writes);
size_t num_reads = std::round(config.read_factor * data.size());
size_t num_ranges = std::round(config.range_query_factor * data.size());
If range_query_factor is intended to determine the number of range queries we perform, we should name it something like range_query_perc. Also, does the number of range queries need to be a factor of the data size?
I thought a factor would best reflect range query performance, since the number of entries in an index is not always the same.
Computing num_ranges as a factor of the data size measures relative performance. Please correct me if I'm wrong.
I also changed the variable name to range_query_perc!
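To illustrate the scaling being discussed, here is a minimal sketch. WorkloadConfig, OperationCounts, and scale_operation_counts are hypothetical names, and the sketch assumes range_query_perc is a fraction of the data size, as the multiplication in the diff suggests.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical subset of the benchmark configuration.
struct WorkloadConfig {
    double read_factor = 0.0;       // point reads as a fraction of data size
    double range_query_perc = 0.0;  // range queries as a fraction of data size
};

struct OperationCounts {
    std::size_t num_reads;
    std::size_t num_ranges;
};

// Scales each operation count with the number of keys, so a larger dataset
// receives proportionally more queries.
OperationCounts scale_operation_counts(const WorkloadConfig& config,
                                       std::size_t data_size) {
    const double n = static_cast<double>(data_size);
    return {
        static_cast<std::size_t>(std::llround(config.read_factor * n)),
        static_cast<std::size_t>(std::llround(config.range_query_perc * n)),
    };
}
```

With read_factor = 0.5 and range_query_perc = 0.1, a dataset of 1000 keys gets 500 reads and 100 range queries, and a dataset of 10000 keys gets ten times as many of each.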
key_type data_range = *std::max_element(data.begin(), data.end()) -
                      *std::min_element(data.begin(), data.end());
key_type avg_range_size = static_cast<key_type>(data_range * selectivity);
Let's use a better variable name here: call this selected_data_range or qualifying_data_range.
Adds range query execution for all indexes that support it. The code is modularized so that one function processes point queries and another processes range queries (where the index implements them).
Two additional inputs were added: num_range_queries and selectivity.
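As a rough sketch of how the two inputs could drive query generation: make_range_queries, the key_type alias, and the seed parameter are assumptions for illustration and are not taken from execute.h. Selectivity is treated as a fraction in [0, 1].

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

using key_type = std::uint64_t;  // assumption: the benchmark's key type may differ

// Hypothetical generator: builds num_range_queries [low, high] pairs, each
// spanning selectivity * (max_key - min_key) of the key domain.
std::vector<std::pair<key_type, key_type>> make_range_queries(
    const std::vector<key_type>& data, std::size_t num_range_queries,
    double selectivity, std::uint32_t seed) {
    std::vector<std::pair<key_type, key_type>> queries;
    if (data.empty()) return queries;

    // A single pass finds both ends of the key domain.
    const auto [min_it, max_it] = std::minmax_element(data.begin(), data.end());
    const key_type min_key = *min_it;
    const key_type data_range = *max_it - min_key;

    // Clamp so the width never exceeds the domain, even after rounding.
    const double scaled =
        static_cast<double>(data_range) * std::clamp(selectivity, 0.0, 1.0);
    const key_type qualifying_data_range =
        scaled >= static_cast<double>(data_range)
            ? data_range
            : static_cast<key_type>(scaled);

    // Pick start keys so every range stays inside [min_key, max_key].
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<key_type> start_dist(
        min_key, min_key + (data_range - qualifying_data_range));

    queries.reserve(num_range_queries);
    for (std::size_t i = 0; i < num_range_queries; ++i) {
        const key_type low = start_dist(rng);
        queries.emplace_back(low, low + qualifying_data_range);
    }
    return queries;
}
```

Each generated pair could then be passed to an index's range query method, with indexes that report "Not implemented" skipped by the caller.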