
range queries #74

Open
JslYoon wants to merge 6 commits into main from jslyoon-rangequery-execute

Contributor

@JslYoon JslYoon commented Apr 3, 2025

Adds range query execution for all indexes, where available. The code is now modularized: one function processes point queries and a separate function processes range queries (if any exist).
Two additional inputs were added: num_range_queries and selectivity.
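
As a rough illustration of the split described above, here is a minimal sketch that keeps point and range query execution in separate functions. It uses std::map as a stand-in index; the names `execute_point_queries_sketch` and `execute_range_queries_sketch`, the `Key` and `Index` aliases, and the inclusive-range convention are all assumptions for illustration, not the PR's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Key = std::uint64_t;
using Index = std::map<Key, Key>;  // stand-in index, not the PR's index interface

// Point queries: count how many of the lookup keys are present in the index.
std::size_t execute_point_queries_sketch(const Index& index,
                                         const std::vector<Key>& keys) {
    std::size_t hits = 0;
    for (Key k : keys) {
        hits += index.count(k);
    }
    return hits;
}

// Range queries: for each inclusive [lo, hi] pair, count the entries inside it.
std::size_t execute_range_queries_sketch(
    const Index& index, const std::vector<std::pair<Key, Key>>& ranges) {
    std::size_t matched = 0;
    for (const auto& r : ranges) {
        if (r.first > r.second) {
            continue;  // skip inverted ranges instead of iterating past the end
        }
        auto first = index.lower_bound(r.first);
        auto last = index.upper_bound(r.second);
        for (auto it = first; it != last; ++it) {
            ++matched;
        }
    }
    return matched;
}
```

With this layout a benchmark driver can call the range query function only when the workload contains range queries, and leave the point query path unchanged.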

@JslYoon JslYoon requested review from Copilot, ephoris and ramananeesh and removed request for ephoris and ramananeesh April 3, 2025 03:16

Copilot AI left a comment

Pull Request Overview

This PR introduces range query execution for all available indexes and adds supporting configuration and benchmark code. The changes include new range query tests for each index, the implementation of an execute_range_queries helper function, and new configuration options and command-line arguments related to range queries.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_skiplist/skiplist_tests.cpp | Added range query test for SkipList index |
| tests/test_pgm/pgm_tests.cpp | Added range query test for PGM index |
| tests/test_lipp/lipp_tests.cpp | Added range query test for Lipp index |
| tests/test_leveldb/leveldb_tests.cpp | Added range query test for LevelDB index |
| tests/test_imprints/imprint_tests.cpp | Added range query test for Imprints index |
| tests/test_btree/btree_tests.cpp | Added range query test for BTree index |
| tests/test_art/art_tests.cpp | Added range query test for ART index |
| tests/test_alex/alex_tests.cpp | Added range query test for Alex index |
| src/bliss_bench.cpp | Extended benchmarking to include range query execution |
| src/bliss/util/execute.h | Added execute_range_queries function and included |
| src/bliss/util/config.h | Added range query related config options |
| src/bliss/util/args.h | Added new command-line options for range queries and selectivity factor |

Comments suppressed due to low confidence (2)

src/bliss/util/execute.h:56

  • [nitpick] Clarify the intended scale of the selectivity factor: if selectivity is provided as a percentage, the computation may need to divide by 100 to calculate the correct average range size.
key_type avg_range_size = static_cast<key_type>(data_range * selectivity);
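
One way to make the scale explicit is to have the caller state whether selectivity is a fraction or a percentage. The sketch below is only an illustration: `SelectivityScale` and `range_size_from_selectivity` are hypothetical names that do not appear in the PR.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical enum: states which scale the selectivity value uses.
enum class SelectivityScale { Fraction, Percent };

// Hypothetical helper: converts a selectivity into a range width over the
// key domain. A percentage is divided by 100 before it is applied.
template <typename KeyType>
KeyType range_size_from_selectivity(KeyType data_range, double selectivity,
                                    SelectivityScale scale) {
    const double fraction =
        (scale == SelectivityScale::Percent) ? selectivity / 100.0 : selectivity;
    if (fraction < 0.0 || fraction > 1.0) {
        throw std::invalid_argument("selectivity is outside the expected scale");
    }
    return static_cast<KeyType>(static_cast<double>(data_range) * fraction);
}
```

Under these assumptions, a selectivity of 0.25 as a fraction and 25 as a percentage both give a range width of 250 for a data range of 1000.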

tests/test_skiplist/skiplist_tests.cpp:30

  • [nitpick] Consider defining a constant for the 'Not implemented' error message to ensure consistency across all tests.
EXPECT_STREQ("Not implemented", e.what());
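
A shared constant along the lines of this nitpick could look like the sketch below. The constant name `kNotImplementedMsg`, the stand-in function, and the checking helper are assumptions made for illustration; the sketch avoids GoogleTest so that it stays self-contained.

```cpp
#include <cstring>
#include <exception>
#include <stdexcept>

// Hypothetical shared constant; its name and location are assumptions.
constexpr const char* kNotImplementedMsg = "Not implemented";

// Stand-in for an index whose range query is not supported.
inline void unsupported_range_query() {
    throw std::logic_error(kNotImplementedMsg);
}

// Returns true only if fn throws a std::exception carrying the shared message.
template <typename Fn>
bool throws_not_implemented(Fn&& fn) {
    try {
        fn();
    } catch (const std::exception& e) {
        return std::strcmp(e.what(), kNotImplementedMsg) == 0;
    }
    return false;
}
```

In a GoogleTest suite the same constant could be passed as the first argument of EXPECT_STREQ, so every index test compares against one definition of the message.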

@JslYoon JslYoon force-pushed the jslyoon-rangequery-execute branch from bfa145c to 9b9c4ec Compare April 20, 2025 04:52
Contributor Author

JslYoon commented Apr 20, 2025

Added an additional range query execution method.
Selectivity now accepts a comma-separated list of selectivity factors.
All the changes are reflected in the bench.py scripts, where the default selectivity values are [0.01, 0.1, 0.25].
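
For illustration, a comma-separated selectivity argument could be parsed roughly as follows. This is a sketch only: `parse_selectivities` is a hypothetical name, and the validation that each factor lies in (0, 1] is an assumption based on the default values listed above.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parser; the PR's actual argument handling may differ.
std::vector<double> parse_selectivities(const std::string& csv) {
    std::vector<double> values;
    std::stringstream stream(csv);
    std::string token;
    while (std::getline(stream, token, ',')) {
        if (token.empty()) {
            continue;  // tolerate stray commas such as "0.1,,0.25"
        }
        // std::stod throws std::invalid_argument on non-numeric input.
        const double value = std::stod(token);
        if (value <= 0.0 || value > 1.0) {
            throw std::out_of_range("selectivity must be in (0, 1]");
        }
        values.push_back(value);
    }
    return values;
}
```

Calling `parse_selectivities("0.01,0.1,0.25")` yields three factors, and the benchmark could then run one range query pass per factor.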

Comment thread src/bliss_bench.cpp Outdated
size_t num_writes = std::round(config.write_factor * data.size());
size_t num_mixed = num_inserts - (num_preload + num_writes);
size_t num_reads = std::round(config.read_factor * data.size());
size_t num_ranges = std::round(config.range_query_factor * data.size());
Collaborator

If range_query_factor is intended to determine the number of range queries we perform, we should name it something like range_query_perc. Also, is the number of range queries required to be a factor of the data size?

Contributor Author

I thought a factor would best reflect range query performance, since the number of entries in an index is not always the same.
Calculating num_ranges as a factor of the data size would measure relative performance. Please correct me if I'm wrong.

I also changed the variable name to range_query_perc!
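
The trade-off discussed in this thread can be sketched as below: the number of range queries scales with the dataset size. The helper name `num_range_queries_for` is hypothetical, and treating range_query_perc as a fraction of data.size() (as the quoted diff does with range_query_factor) is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: derives the range query count from the data size.
// Assumes range_query_perc is a fraction (0.5 means half as many queries as keys).
std::size_t num_range_queries_for(double range_query_perc, std::size_t data_size) {
    if (range_query_perc < 0.0) {
        throw std::invalid_argument("range_query_perc must be non-negative");
    }
    const double scaled = range_query_perc * static_cast<double>(data_size);
    return static_cast<std::size_t>(std::llround(scaled));
}
```

With this scheme, datasets of different sizes receive proportionally sized workloads, which is the relative comparison argued for above; an absolute count would instead keep the workload fixed across datasets.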

Comment thread src/bliss/util/execute.h Outdated

key_type data_range = *std::max_element(data.begin(), data.end()) -
*std::min_element(data.begin(), data.end());
key_type avg_range_size = static_cast<key_type>(data_range * selectivity);
Collaborator

Let's use a better variable name here: call this selected_data_range or qualifying_data_range.
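
A sketch of the computation with the suggested name might look like this. It is an illustration, not the PR's code: the helper `compute_selected_data_range` is hypothetical, and it swaps the two separate std::max_element and std::min_element scans for a single std::minmax_element pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: the width of the key domain that one range query
// covers at the given selectivity (assumed to be a fraction in [0, 1]).
template <typename KeyType>
KeyType compute_selected_data_range(const std::vector<KeyType>& data,
                                    double selectivity) {
    assert(!data.empty());
    // One pass over the data finds both extremes.
    const auto bounds = std::minmax_element(data.begin(), data.end());
    const KeyType data_range = *bounds.second - *bounds.first;
    const KeyType selected_data_range =
        static_cast<KeyType>(static_cast<double>(data_range) * selectivity);
    return selected_data_range;
}
```

For keys {10, 110, 60} and a selectivity of 0.5, the full data range is 100 and the selected data range is 50.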

Contributor Author

Just updated!

@ramananeesh ramananeesh self-requested a review May 13, 2025 18:50

3 participants