
Dataset-generator: allow search_engine_type=vespa and default vespa_schema to collection_name #264

Open
radu-gheorghe wants to merge 3 commits into SeaseLtd:dataset-generator from radu-gheorghe:dataset-generator-vespa-small-adjustments


@radu-gheorghe

This should make it a bit easier to use dataset-generator with Vespa.

Before this change, search_engine_type=vespa didn't seem to work from the command line (uv run...).

A Vespa schema is the equivalent of a collection in Solr, so collection_name should be a good default for vespa_schema. I don't know why one would override it, but overriding is still possible.
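To illustrate the defaulting behavior described above, here is a rough sketch. It is not the code from this PR: the `GeneratorConfig` class and its `"solr"` default are hypothetical, and only the field names `search_engine_type`, `vespa_schema` and `collection_name` come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneratorConfig:
    """Hypothetical config holder; the real project may structure this differently."""

    collection_name: str
    search_engine_type: str = "solr"  # assumed default, for illustration only
    vespa_schema: Optional[str] = None

    def __post_init__(self) -> None:
        # Fall back to collection_name only when no schema was given explicitly,
        # so an explicit vespa_schema still wins.
        if self.search_engine_type == "vespa" and self.vespa_schema is None:
            self.vespa_schema = self.collection_name


defaulted = GeneratorConfig(collection_name="products", search_engine_type="vespa")
overridden = GeneratorConfig(
    collection_name="products",
    search_engine_type="vespa",
    vespa_schema="items",
)
```

With this shape, `defaulted.vespa_schema` falls back to the collection name, while `overridden.vespa_schema` keeps the explicit value.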

Didn't want to be intrusive, it's my first contribution 🙈

Let me know if you need me to change anything.

@radu-gheorghe
Author

Added another follow-up fix for num_queries_needed.

While testing the Vespa query_template flow, I noticed that lowering num_queries_needed did not limit the evaluation work the generator performed if resources/tmp/datastore.json already contained more queries from a previous run.

This patch should make the behavior more predictable.
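Roughly, the idea is to cap the queries taken from a previous run at num_queries_needed. The sketch below is an assumption about the shape of the fix, not the patch itself: `limit_queries` is a hypothetical helper, and it works on a plain list rather than on the actual layout of resources/tmp/datastore.json, which is not shown here.

```python
from typing import List, Sequence


def limit_queries(stored_queries: Sequence[str], num_queries_needed: int) -> List[str]:
    """Return at most num_queries_needed queries, preserving their stored order.

    Hypothetical helper: stored_queries stands in for whatever was loaded
    from a previous run.
    """
    if num_queries_needed < 0:
        raise ValueError("num_queries_needed must be non-negative")
    # Slicing never fails when fewer queries are stored than requested.
    return list(stored_queries[:num_queries_needed])


previous_run = ["laptop", "phone case", "usb cable"]
to_evaluate = limit_queries(previous_run, 2)
```

Here only the first two stored queries would be evaluated, even though the previous run left three behind.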
