CascadingReindexingTemplate Refactoring#19106
CascadingReindexingTemplate Refactoring#19106capistrant wants to merge 8 commits intoapache:masterfrom
Conversation
| this.defaultPartitioningRule = ReindexingPartitioningRule.syntheticRule( | ||
| defaultSegmentGranularity, | ||
| defaultPartitionsSpec, | ||
| defaultPartitioningVirtualColumns | ||
| ); |
There was a problem hiding this comment.
should this check if it is set on tuningConfig? or forbid defining it on the tuningConfig?
There was a problem hiding this comment.
by it do you mean partitionsSpec? I had planned on documenting it as being blown away by whatever your
- applicable rule defines
- default top level partitions spec is if no rule applies
alternatives to the above:
- force a tuningConfig to be defined top level with at least a partitions spec and use that as the default
- instead of silently blowing away the static tuningConfig partitions spec, throw an error at submit time saying you can't define it there if someone tries
| List<VirtualColumn> allVCs = new ArrayList<>(deletionVCs); | ||
|
|
||
| if (partitioningVCs != null && !partitioningVCs.isEmpty()) { | ||
| // Check for name collisions |
There was a problem hiding this comment.
VirtualColumns.create does a check like this too, though i guess you lose your more specific exception messaging here. I suppose you could catch the IAE thrown by that and make the nicer message that way, or leave a comment about why validation is being done externally here
There was a problem hiding this comment.
huh, I hadn't realized that create could handle this for me. I like delegating to that and enriching the message if thrown
Description
relates to #19092
After taking the initial implementation of the CascadingReindexingTemplate and associated RuleProvider concepts for a spin, some refactoring opportunities have been identified. I hope that these will make more conceptual sense to operators trying to configure this as well as make things generally easier to reason about.
Rule Type Refactoring
ReindexingSegmentGranularityRuleandReindexingTuningConfigRulehave been deleted.ReindexingPartitioningRule
This new rule type consolidates partitioning details into a single rule. This rule takes the place of
ReindexingSegmentGranularityRulein that it has special consideration for creating the base timeline of reindexing search intervals and that all search intervals created align to natural Segment Granularity boundaries being applied.The components of this rule are:
ReindexingIndexSpecRule
A new rule type that is scoped to the
IndexSpecthat goes in a task tuning config.CascadingReindexingTemplatenew top level fieldsI have considered combining the below 2 + defaultSegmentGranularity into a defaultPartitioningSomething top level field but did not do this as of now.
defaultPartitionsSpecRequired partitionsspec that will be applied in cases where a synthetic
ReindexingPartitioningRuleneeds to be generated to complete the reindexing timelinedefaultPartitioningVirtualColumnsOptional default partitioning virtual column expressions that would be used if your default partitionspec utilized the new functionality in #19061
tuningConfigCascadingReindexingTemplatenow has an optional top leveltuningConfigfield where an operator can provide aUserCompactionTaskQueryTuningConfigthat will apply to all tasks generated by the supervisor. Note that any partitions spec here is useless and will be blown away by either the defaultPartitionsSpec or the applicable ReindexingPartitioningRule's partition spec. IndexSpec however can take a default here that will apply in any case where there is not an IndexSpecRule that applies to the task being generatedRelease note
N/A for now but we will have a larger release note I can share encompassing all of the new reindexing rule provider stuff for druid 37.
-->
Key changed/added classes in this PR
CascadingReindexingTemplateReindexingConfigBuilderReindexingDeletionRuleOptimizerReindexingPartitioningRuleReindexingIndexSpecRuleThis PR has: