@airborne12
Member

@airborne12 airborne12 commented Dec 17, 2025

Summary

This PR implements the Multi-Analyzer Inverted Index feature, which allows creating multiple inverted indexes with different analyzers on a single column.

Key Features

  1. Multiple Indexes on a Single Column: Create multiple inverted indexes with different analyzers (standard, keyword, chinese, custom) on the same column.
  2. USING ANALYZER Syntax: Query with a specific analyzer using MATCH ... USING ANALYZER analyzer_name.
  3. Smart Index Selection: When the index for the specified analyzer has not been built, the query automatically falls back to the non-index path, so results stay correct.
  4. Analyzer Identity Detection: Prevents duplicate indexes with the same analyzer configuration.
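
Analyzer identity detection (feature 4) can be pictured as a registry keyed by (column, analyzer identity). The sketch below is illustrative only: `analyzer_identity` and `ColumnIndexRegistry` are hypothetical names, and the rule used to derive the identity (the `analyzer` property if set, otherwise `parser`, otherwise `none`) is an assumption for the example, not the logic implemented in this PR.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

using Properties = std::map<std::string, std::string>;

// Assumed identity rule (hypothetical): "analyzer" wins, then "parser", else "none".
inline std::string analyzer_identity(const Properties& props) {
    auto it = props.find("analyzer");
    if (it != props.end() && !it->second.empty()) return it->second;
    it = props.find("parser");
    if (it != props.end() && !it->second.empty()) return it->second;
    return "none";
}

// Tracks which (column, analyzer identity) pairs already have an index.
class ColumnIndexRegistry {
public:
    // Returns true if the index was registered, false if an index with the
    // same analyzer identity already exists on that column.
    bool try_add(const std::string& column, const Properties& props) {
        assert(!column.empty());
        return seen_.emplace(column, analyzer_identity(props)).second;
    }

private:
    std::set<std::pair<std::string, std::string>> seen_;
};
```

Under this assumed rule, two indexes on the same column are accepted as long as their identities differ, and a second index with the same identity is rejected.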

Use Cases

  • Multi-language search on same text column
  • Precision vs. recall trade-off (exact match vs. fuzzy search)
  • Autocomplete with edge_ngram while keeping standard search

Example

-- Create table with multiple indexes
CREATE TABLE articles (
    id INT,
    content TEXT,
    INDEX idx_std (content) USING INVERTED PROPERTIES("analyzer" = "std_analyzer"),
    INDEX idx_kw (content) USING INVERTED PROPERTIES("analyzer" = "kw_analyzer")
) ...;

-- Query with specific analyzer
SELECT * FROM articles WHERE content MATCH 'hello' USING ANALYZER std_analyzer;
SELECT * FROM articles WHERE content MATCH 'hello' USING ANALYZER kw_analyzer;
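
The fallback described under Smart Index Selection could look roughly like the following. This is a minimal sketch under stated assumptions: `IndexMeta`, `MatchPlan`, and `plan_match` are hypothetical names invented for illustration, and the real selection logic in the backend may weigh more than the analyzer name and a built flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical metadata for one inverted index on a column.
struct IndexMeta {
    std::string name;
    std::string analyzer;
    bool built;  // false while the index files do not exist yet
};

// Result of planning a MATCH ... USING ANALYZER predicate.
struct MatchPlan {
    bool use_index;
    std::string index_name;  // empty when falling back to the non-index path
};

// Picks the built index whose analyzer matches the requested one; otherwise
// falls back to evaluating the predicate without an index.
inline MatchPlan plan_match(const std::vector<IndexMeta>& indexes,
                            const std::string& analyzer) {
    assert(!analyzer.empty());
    for (const auto& idx : indexes) {
        if (idx.analyzer == analyzer && idx.built) {
            return MatchPlan{true, idx.name};
        }
    }
    return MatchPlan{false, ""};
}
```

With `idx_std` built and `idx_kw` not yet built, a query using `std_analyzer` would take the index path, while one using `kw_analyzer` would be planned on the non-index path.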

Test Plan

  • Backend unit tests: inverted_index_parser_test.cpp, inverted_index_iterator_test.cpp
  • Regression tests: test_multi_tokenize_index_not_built.groovy
  • Tested in both local and cloud mode

@Thearas
Contributor

Thearas commented Dec 17, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

@airborne12
Member Author

run buildall

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 36319 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit af7b8646ee4a01bba0e1e809fd27d48cf7219c2f, data reload: false

------ Round 1 ----------------------------------
q1	17629	4224	4044	4044
q2	2007	357	234	234
q3	10205	1331	723	723
q4	10210	851	309	309
q5	7513	2161	1891	1891
q6	189	166	135	135
q7	1004	856	715	715
q8	9350	1486	1178	1178
q9	7110	5371	5409	5371
q10	6834	2398	1977	1977
q11	519	326	305	305
q12	657	732	585	585
q13	17824	3636	3011	3011
q14	288	288	266	266
q15	594	524	529	524
q16	708	690	642	642
q17	708	853	501	501
q18	7610	7713	7950	7713
q19	1155	1004	644	644
q20	414	388	261	261
q21	4577	4262	4208	4208
q22	1110	1128	1082	1082
Total cold run time: 108215 ms
Total hot run time: 36319 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4262	4213	4266	4213
q2	326	429	342	342
q3	2328	2877	2402	2402
q4	1521	1808	1425	1425
q5	4845	4422	4672	4422
q6	217	164	123	123
q7	2006	1946	1782	1782
q8	2670	2668	2549	2549
q9	7489	7517	7507	7507
q10	3114	3254	2866	2866
q11	610	508	497	497
q12	629	706	560	560
q13	3277	3637	3031	3031
q14	258	280	272	272
q15	523	499	490	490
q16	634	650	600	600
q17	1111	1322	1326	1322
q18	7231	7004	6940	6940
q19	818	778	821	778
q20	1972	2010	1815	1815
q21	4648	4306	4066	4066
q22	1067	1023	985	985
Total cold run time: 51556 ms
Total hot run time: 48987 ms

@doris-robot

TPC-DS: Total hot run time: 179202 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit af7b8646ee4a01bba0e1e809fd27d48cf7219c2f, data reload: false

query5	4991	601	464	464
query6	341	236	228	228
query7	4217	462	275	275
query8	300	245	249	245
query9	8787	2544	2566	2544
query10	532	364	339	339
query11	15330	14771	15057	14771
query12	190	116	113	113
query13	1253	518	389	389
query14	6368	3069	2813	2813
query14_1	2697	2659	2697	2659
query15	212	199	181	181
query16	932	484	448	448
query17	1097	696	630	630
query18	2606	447	341	341
query19	233	221	198	198
query20	123	116	111	111
query21	225	141	117	117
query22	4038	4061	3968	3968
query23	16670	16203	15962	15962
query23_1	16041	16311	16175	16175
query24	7336	1663	1215	1215
query24_1	1285	1237	1250	1237
query25	559	467	423	423
query26	1243	271	162	162
query27	2765	461	317	317
query28	4509	2129	2153	2129
query29	805	539	453	453
query30	316	241	209	209
query31	806	700	599	599
query32	80	71	67	67
query33	541	346	289	289
query34	918	896	553	553
query35	768	841	733	733
query36	858	914	846	846
query37	128	90	77	77
query38	2843	2831	2838	2831
query39	758	771	709	709
query39_1	726	696	697	696
query40	251	136	120	120
query41	71	64	63	63
query42	113	105	106	105
query43	445	462	414	414
query44	1366	750	757	750
query45	196	192	187	187
query46	914	1006	636	636
query47	1674	1689	1608	1608
query48	335	344	258	258
query49	682	442	367	367
query50	672	303	231	231
query51	3843	3798	3841	3798
query52	105	110	104	104
query53	327	362	301	301
query54	297	276	271	271
query55	84	84	78	78
query56	330	331	319	319
query57	1126	1130	1074	1074
query58	290	259	260	259
query59	2402	2430	2362	2362
query60	338	332	313	313
query61	198	186	189	186
query62	715	672	628	628
query63	342	301	307	301
query64	5081	1418	1120	1120
query65	4026	3994	3980	3980
query66	1430	455	341	341
query67	15421	15126	14795	14795
query68	6074	1042	735	735
query69	529	356	317	317
query70	1062	1008	976	976
query71	400	313	289	289
query72	6097	4908	4873	4873
query73	660	580	309	309
query74	8823	8738	8652	8652
query75	3231	3218	2848	2848
query76	3808	1126	736	736
query77	515	403	297	297
query78	9679	9572	8790	8790
query79	1402	881	618	618
query80	707	653	564	564
query81	523	266	231	231
query82	222	130	111	111
query83	255	258	237	237
query84	269	113	104	104
query85	902	495	470	470
query86	381	312	283	283
query87	3030	3003	3029	3003
query88	4200	2296	2315	2296
query89	476	423	402	402
query90	2188	163	157	157
query91	166	160	145	145
query92	93	73	63	63
query93	1658	936	567	567
query94	488	302	280	280
query95	583	389	302	302
query96	614	499	217	217
query97	2262	2310	2231	2231
query98	217	199	189	189
query99	1287	1314	1238	1238
Total cold run time: 260081 ms
Total hot run time: 179202 ms

@doris-robot

ClickBench: Total hot run time: 27.21 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit af7b8646ee4a01bba0e1e809fd27d48cf7219c2f, data reload: false

query1	0.05	0.04	0.05
query2	0.10	0.05	0.05
query3	0.25	0.09	0.08
query4	1.62	0.11	0.11
query5	0.27	0.25	0.26
query6	1.20	0.63	0.63
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.57	0.53	0.52
query10	0.56	0.55	0.56
query11	0.16	0.12	0.11
query12	0.16	0.12	0.11
query13	0.63	0.62	0.60
query14	0.99	0.99	0.98
query15	0.83	0.79	0.80
query16	0.42	0.40	0.40
query17	1.08	1.09	1.01
query18	0.24	0.22	0.22
query19	1.93	1.75	1.79
query20	0.01	0.01	0.02
query21	15.45	0.28	0.15
query22	4.80	0.05	0.04
query23	16.12	0.30	0.10
query24	2.56	0.26	0.40
query25	0.10	0.05	0.06
query26	0.15	0.13	0.13
query27	0.06	0.06	0.05
query28	4.64	1.22	1.02
query29	12.59	3.98	3.26
query30	0.28	0.14	0.12
query31	2.83	0.64	0.39
query32	3.23	0.55	0.47
query33	3.02	3.03	3.05
query34	16.74	5.17	4.50
query35	4.64	4.53	4.58
query36	0.65	0.51	0.48
query37	0.11	0.07	0.07
query38	0.07	0.05	0.04
query39	0.05	0.03	0.03
query40	0.18	0.16	0.14
query41	0.09	0.03	0.03
query42	0.05	0.03	0.02
query43	0.05	0.04	0.03
Total cold run time: 99.62 s
Total hot run time: 27.21 s

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

};

- if (runtime_state->query_options().enable_profile) {
+ if (runtime_state != nullptr && runtime_state->query_options().enable_profile) {

Copilot AI Dec 18, 2025

The condition checks runtime_state != nullptr before accessing query_options(), but in the original code this check didn't exist. This suggests that runtime_state could be null in some scenarios. However, the query execution path that follows this block doesn't have similar null checks, which could lead to crashes if runtime_state is actually null in practice. Review whether runtime_state can legitimately be null here and add consistent null handling throughout the method if so.
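
One way to get the consistent null handling this comment asks for is to decide once, in a single helper, whether profiling applies, and to reuse that decision for the rest of the method. The sketch below uses stand-in `QueryOptions` and `RuntimeState` types and hypothetical helper names; it is not the Doris code, only an illustration of the pattern.

```cpp
#include <cassert>

// Stand-in types for illustration; the real classes carry far more state.
struct QueryOptions {
    bool enable_profile = false;
};

class RuntimeState {
public:
    explicit RuntimeState(bool enable_profile) { options_.enable_profile = enable_profile; }
    const QueryOptions& query_options() const { return options_; }

private:
    QueryOptions options_;
};

// Single place that decides whether profiling is on; safe for a null state.
inline bool profiling_enabled(const RuntimeState* state) {
    return state != nullptr && state->query_options().enable_profile;
}

// Hypothetical caller: evaluates the guard once and reuses the result, so no
// later statement dereferences a possibly-null pointer. Returns how many
// profile sections would be recorded (0 or 2).
inline int run_with_optional_profile(const RuntimeState* state) {
    const bool profile = profiling_enabled(state);
    int recorded = 0;
    if (profile) ++recorded;  // section opened before the work
    // ... work that does not touch `state` ...
    if (profile) ++recorded;  // section closed after the work
    assert(recorded == 0 || recorded == 2);
    return recorded;
}
```

Whether `runtime_state` can actually be null on this path is the open question; if it cannot, the simpler fix is to drop the new check instead.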

Copilot uses AI. Check for mistakes.

Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 15 comments.


Comment on lines +399 to +404
public String getAnalyzerIdentity() {
if (indexType != IndexDefinition.IndexType.INVERTED) {
return "";
}
return InvertedIndexUtil.buildAnalyzerIdentity(properties);
}

Copilot AI Dec 18, 2025

Missing test coverage for the new getAnalyzerIdentity method. This new public method returns analyzer identity for indexes but doesn't appear to be tested. Consider adding unit tests to verify it returns the correct identity for different property configurations.

std::string column_name = block.get_by_position(arguments[0]).name;
VLOG_DEBUG << "begin to execute match directly, column_name=" << column_name
<< ", match_query_str=" << match_query_str;
auto* analyzer_ctx = get_match_analyzer_ctx(context);

Copilot AI Dec 18, 2025

Unnecessary change removing blank line. This change removes a blank line that was separating the analyzer_ctx initialization from the source column extraction. The blank line improved readability by grouping related operations. Consider keeping it.

Suggested change
- auto* analyzer_ctx = get_match_analyzer_ctx(context);
+ auto* analyzer_ctx = get_match_analyzer_ctx(context);
+

Comment on lines +84 to +86
//waitAnalyzerReady(analyzerStandard)
//waitAnalyzerReady(analyzerEdge)
//waitAnalyzerReady(analyzerKeyword)

Copilot AI Dec 18, 2025

Commented-out code should be removed. These analyzer readiness check calls appear to be intentionally disabled but if they're not needed, they should be deleted rather than commented out. If they're needed for future use, consider documenting why they're disabled.

Comment on lines +141 to +161
/*sql "DROP TABLE IF EXISTS ${tableName}"
try {
sql "DROP INVERTED INDEX ANALYZER ${analyzerStandard}"
} catch (Exception ignored) {
}
try {
sql "DROP INVERTED INDEX ANALYZER ${analyzerEdge}"
} catch (Exception ignored) {
}
try {
sql "DROP INVERTED INDEX ANALYZER ${analyzerKeyword}"
} catch (Exception ignored) {
}
try {
sql "DROP INVERTED INDEX TOKENIZER ${tokenizerName}"
} catch (Exception ignored) {
}*/

Copilot AI Dec 18, 2025

Large block of commented-out cleanup code should be removed or uncommented. The finally block contains extensive cleanup logic that is commented out. If this cleanup is necessary, it should be active. If it's not needed (perhaps handled elsewhere), the commented code should be deleted to improve code clarity.

Comment on lines +2835 to +3025
indexDef.getIndexType() + " index for column (" + columnName + ") with "
+ (isNewIndexAnalyzer ? "analyzed" : "non-analyzed")
+ " type already exists.");

Copilot AI Dec 18, 2025

Inconsistent error message format. The error message format changes between line 2827-2828 (using "analyzer identity") versus lines 2835-2836 (using "analyzed"/"non-analyzed"). For consistency, consider using the analyzer identity format for both branches, or restructure to have uniform messaging.

Suggested change
- indexDef.getIndexType() + " index for column (" + columnName + ") with "
- + (isNewIndexAnalyzer ? "analyzed" : "non-analyzed")
- + " type already exists.");
+ indexDef.getIndexType() + " index for column (" + columnName + ") with analyzer "
+ + (isNewIndexAnalyzer ? "analyzed" : "non-analyzed")
+ + " already exists.");

Comment on lines 33 to +35
#include <memory>
#include <ostream>
#include <string>
#include <string_view>
#include <type_traits>

Copilot AI Dec 18, 2025

Unnecessary inclusion of includes in comment. The comment references "cstddef" and "ostream" which are no longer included after the changes. These appear to be removed includes that shouldn't be mentioned in the comment about what changed. Consider removing them from the diff context or clarifying the comment intent.

auto* column_slot_ref = assert_cast<VSlotRef*>(get_child(0).get());
std::string column_name = column_slot_ref->expr_name();
- auto it = std::find(column_names.begin(), column_names.end(), column_name);
+ auto it = std::ranges::find(column_names, column_name);

Copilot AI Dec 18, 2025

Inconsistent range-based for loop usage. The code changes lines 97, 106 to use "const auto&" for range-based loops but then uses "std::ranges::find" on line 175 which requires C++20 ranges. Consider using consistent modern C++ patterns throughout, or ensure all usages are compatible with the project's C++ standard.

Suggested change
- auto it = std::ranges::find(column_names, column_name);
+ auto it = std::find(column_names.begin(), column_names.end(), column_name);

"disable_auto_compaction" = "true"
)
"""
sleep(10000)

Copilot AI Dec 18, 2025

Hardcoded sleep of 10 seconds without explanation. This appears arbitrary and could make tests unnecessarily slow. Consider either using the existing waitAnalyzerReady mechanism (which is commented out above) or documenting why this specific duration is required.
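
The alternative to a fixed sleep is to poll a readiness condition with a deadline, which returns as soon as the condition holds and still bounds the wait. The regression test itself is Groovy; the sketch below shows the same idea in C++ for consistency with the backend snippets in this review. `wait_until` is a hypothetical helper, and the readiness predicate is left to the caller since the behavior of `waitAnalyzerReady` is not shown here.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls `ready` every `interval` until it returns true or `timeout` elapses.
// Returns true if the condition became true, false on timeout.
inline bool wait_until(const std::function<bool()>& ready,
                       std::chrono::milliseconds timeout,
                       std::chrono::milliseconds interval) {
    assert(interval.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (ready()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(interval);
    }
}
```

A test using this pattern would pass a predicate that checks whether the analyzer is usable and fail explicitly on timeout, instead of always paying the full 10 seconds.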

Comment on lines +96 to +114
protected MatchPredicate(MatchPredicate other) {
super(other);
op = other.op;
invertedIndexParser = other.invertedIndexParser;
invertedIndexParserMode = other.invertedIndexParserMode;
invertedIndexCharFilter = other.invertedIndexCharFilter;
invertedIndexParserLowercase = other.invertedIndexParserLowercase;
invertedIndexParserStopwords = other.invertedIndexParserStopwords;
invertedIndexAnalyzerName = other.invertedIndexAnalyzerName;
explicitAnalyzer = other.explicitAnalyzer;
}

/**
* use for Nereids ONLY
*/
public MatchPredicate(Operator op, Expr e1, Expr e2, Type retType,
NullableMode nullableMode, Index invertedIndex, boolean nullable) {
this(op, e1, e2, retType, nullableMode, invertedIndex, nullable, null);
}

Copilot AI Dec 18, 2025

Constructor order is unusual - simpler constructor delegates to more complex one which then delegates to another. Consider reordering: have the most complex constructor first, then simpler ones delegate to it. This would make the delegation chain clearer and follow common Java constructor patterns.

Comment on lines +80 to +87
// Fields for thrift serialization (restored from old version)
private String invertedIndexParser;
private String invertedIndexParserMode;
private Map<String, String> invertedIndexCharFilter;
private boolean invertedIndexParserLowercase = true;
private String invertedIndexParserStopwords = "";
private String invertedIndexAnalyzerName = "";
// Fields for SQL generation

Copilot AI Dec 18, 2025

Unnecessary comment about code that was moved. The comment "Fields for thrift serialization (restored from old version)" seems to reference implementation history rather than explaining the purpose of these fields. Consider updating to describe what these fields are used for functionally, or remove if self-explanatory.

Suggested change
- // Fields for thrift serialization (restored from old version)
- private String invertedIndexParser;
- private String invertedIndexParserMode;
- private Map<String, String> invertedIndexCharFilter;
- private boolean invertedIndexParserLowercase = true;
- private String invertedIndexParserStopwords = "";
- private String invertedIndexAnalyzerName = "";
- // Fields for SQL generation
+ // Inverted index analyzer configuration persisted via Thrift serialization
+ private String invertedIndexParser;
+ private String invertedIndexParserMode;
+ private Map<String, String> invertedIndexCharFilter;
+ private boolean invertedIndexParserLowercase = true;
+ private String invertedIndexParserStopwords = "";
+ private String invertedIndexAnalyzerName = "";
+ // Analyzer name explicitly specified in the SQL MATCH predicate, used for SQL generation

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 62.09% (131/211) 🎉
Increment coverage report
Complete coverage report

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35237 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 97fe72505992878ff3b9def8e30db8897ae74135, data reload: false

------ Round 1 ----------------------------------
q1	17619	4262	4070	4070
q2	2043	366	240	240
q3	10153	1311	714	714
q4	10216	817	319	319
q5	7524	2154	1930	1930
q6	189	169	134	134
q7	1013	849	746	746
q8	9370	1467	1176	1176
q9	7118	5302	5377	5302
q10	6892	2419	1984	1984
q11	516	332	296	296
q12	702	731	587	587
q13	17793	3736	3030	3030
q14	284	299	271	271
q15	595	528	517	517
q16	697	672	626	626
q17	699	868	535	535
q18	7895	7087	7032	7032
q19	1096	970	622	622
q20	404	357	247	247
q21	4276	3956	3928	3928
q22	1034	1014	931	931
Total cold run time: 108128 ms
Total hot run time: 35237 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4085	4098	4062	4062
q2	328	393	308	308
q3	2109	2664	2321	2321
q4	1366	1767	1276	1276
q5	4221	4616	4631	4616
q6	221	175	129	129
q7	2150	1970	1848	1848
q8	2761	2572	2525	2525
q9	7573	7600	7571	7571
q10	3064	3280	2818	2818
q11	596	537	493	493
q12	801	790	609	609
q13	3532	4052	3270	3270
q14	292	297	289	289
q15	544	532	514	514
q16	644	716	633	633
q17	1213	1493	1426	1426
q18	7919	7747	7449	7449
q19	997	999	903	903
q20	1971	2050	1885	1885
q21	4671	4270	4193	4193
q22	1122	1006	1007	1006
Total cold run time: 52180 ms
Total hot run time: 50144 ms

@doris-robot

TPC-DS: Total hot run time: 178191 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 97fe72505992878ff3b9def8e30db8897ae74135, data reload: false

query5	4365	574	440	440
query6	343	244	221	221
query7	4207	469	272	272
query8	325	262	253	253
query9	8772	2537	2550	2537
query10	512	385	343	343
query11	15294	14823	14566	14566
query12	202	120	115	115
query13	1271	512	403	403
query14	5900	3068	2754	2754
query14_1	2708	2639	2673	2639
query15	220	200	182	182
query16	887	490	432	432
query17	1146	698	606	606
query18	2442	449	355	355
query19	238	231	241	231
query20	119	113	110	110
query21	215	141	117	117
query22	3975	3874	3840	3840
query23	16548	16119	15955	15955
query23_1	16103	16092	15898	15898
query24	7303	1637	1201	1201
query24_1	1246	1237	1260	1237
query25	541	470	414	414
query26	1238	257	167	167
query27	2757	471	311	311
query28	4475	2162	2123	2123
query29	810	551	458	458
query30	315	250	220	220
query31	821	699	609	609
query32	76	70	70	70
query33	536	364	316	316
query34	903	919	547	547
query35	803	809	725	725
query36	844	902	824	824
query37	130	90	74	74
query38	2879	2828	2788	2788
query39	786	746	702	702
query39_1	698	716	706	706
query40	222	137	123	123
query41	68	64	62	62
query42	108	107	103	103
query43	428	439	411	411
query44	1338	754	742	742
query45	192	187	187	187
query46	875	974	608	608
query47	1639	1700	1608	1608
query48	314	327	260	260
query49	605	426	354	354
query50	663	292	223	223
query51	3819	3817	3808	3808
query52	108	109	99	99
query53	323	353	295	295
query54	277	249	242	242
query55	80	75	72	72
query56	291	303	302	302
query57	1127	1147	1085	1085
query58	278	256	251	251
query59	2355	2495	2445	2445
query60	330	315	284	284
query61	168	162	164	162
query62	703	678	611	611
query63	329	294	320	294
query64	5069	1464	1115	1115
query65	3993	3963	3957	3957
query66	1480	450	330	330
query67	15218	15012	14820	14820
query68	5066	1046	741	741
query69	516	366	319	319
query70	1113	989	963	963
query71	381	319	287	287
query72	6309	4949	4952	4949
query73	680	579	310	310
query74	8825	8849	8557	8557
query75	3167	3188	2811	2811
query76	3831	1136	737	737
query77	539	395	288	288
query78	9457	9590	8817	8817
query79	1791	881	618	618
query80	897	656	554	554
query81	530	273	235	235
query82	436	139	105	105
query83	258	252	246	246
query84	265	119	99	99
query85	919	507	463	463
query86	398	306	276	276
query87	3001	3085	2938	2938
query88	3280	2286	2289	2286
query89	463	427	393	393
query90	1980	163	160	160
query91	171	167	144	144
query92	69	70	64	64
query93	1189	917	568	568
query94	535	307	287	287
query95	584	334	362	334
query96	601	467	212	212
query97	2298	2324	2204	2204
query98	209	206	197	197
query99	1292	1301	1265	1265
Total cold run time: 256516 ms
Total hot run time: 178191 ms

@doris-robot

ClickBench: Total hot run time: 27.3 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 97fe72505992878ff3b9def8e30db8897ae74135, data reload: false

query1	0.06	0.05	0.05
query2	0.11	0.05	0.05
query3	0.26	0.09	0.09
query4	1.61	0.12	0.12
query5	0.29	0.26	0.24
query6	1.17	0.64	0.63
query7	0.04	0.03	0.03
query8	0.05	0.04	0.04
query9	0.57	0.51	0.51
query10	0.56	0.57	0.55
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.63	0.60	0.61
query14	0.99	0.98	0.98
query15	0.81	0.79	0.81
query16	0.40	0.40	0.40
query17	1.07	1.05	1.03
query18	0.24	0.22	0.22
query19	1.90	1.80	1.87
query20	0.02	0.01	0.01
query21	15.44	0.29	0.14
query22	4.80	0.05	0.05
query23	16.08	0.29	0.10
query24	1.35	0.74	0.22
query25	0.11	0.14	0.07
query26	0.14	0.13	0.13
query27	0.08	0.04	0.05
query28	5.15	1.21	1.04
query29	12.60	3.99	3.20
query30	0.28	0.13	0.11
query31	2.83	0.65	0.40
query32	3.23	0.56	0.46
query33	3.06	3.00	3.05
query34	16.70	5.18	4.57
query35	4.67	4.57	4.59
query36	0.66	0.51	0.49
query37	0.10	0.06	0.06
query38	0.08	0.05	0.03
query39	0.04	0.03	0.03
query40	0.17	0.15	0.13
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 98.84 s
Total hot run time: 27.3 s

@doris-robot

BE UT Coverage Report

Increment line coverage 81.40% (197/242) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.42% (18858/35304)
Line Coverage 39.27% (174997/445614)
Region Coverage 33.81% (135405/400478)
Branch Coverage 34.73% (58381/168078)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.87% (219/241) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.28% (25003/34594)
Line Coverage 59.03% (262682/444961)
Region Coverage 53.88% (218229/405044)
Branch Coverage 55.41% (93561/168866)

@airborne12
Member Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 11.06% (24/217) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 35344 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a07218f7a9216940781783cb8a38f1bcd94fe6e7, data reload: false

------ Round 1 ----------------------------------
q1	17626	4246	4036	4036
q2	2051	365	247	247
q3	10151	1342	744	744
q4	10209	852	323	323
q5	7492	2110	1964	1964
q6	188	176	140	140
q7	996	864	728	728
q8	9370	1499	1137	1137
q9	7233	5339	5358	5339
q10	6838	2388	1997	1997
q11	539	318	300	300
q12	650	740	563	563
q13	17773	3674	3013	3013
q14	290	293	283	283
q15	605	525	506	506
q16	712	675	640	640
q17	705	825	567	567
q18	8330	7122	7150	7122
q19	1097	963	600	600
q20	412	371	249	249
q21	4210	3883	3917	3883
q22	1034	1000	963	963
Total cold run time: 108511 ms
Total hot run time: 35344 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4060	4001	4031	4001
q2	324	400	308	308
q3	2133	2669	2285	2285
q4	1377	1776	1301	1301
q5	4241	4776	4681	4681
q6	234	176	129	129
q7	2064	1965	1840	1840
q8	2642	2524	2537	2524
q9	7709	7512	7513	7512
q10	3067	3295	2866	2866
q11	605	512	493	493
q12	679	742	643	643
q13	3594	4013	3359	3359
q14	286	292	269	269
q15	556	514	518	514
q16	684	696	668	668
q17	1198	1516	1461	1461
q18	7764	7676	7519	7519
q19	878	884	883	883
q20	1999	2094	1913	1913
q21	4869	4330	4121	4121
q22	1065	1003	996	996
Total cold run time: 52028 ms
Total hot run time: 50286 ms

@doris-robot

TPC-DS: Total hot run time: 176358 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a07218f7a9216940781783cb8a38f1bcd94fe6e7, data reload: false

query5	5206	596	448	448
query6	337	228	208	208
query7	4220	467	268	268
query8	301	252	232	232
query9	8769	2547	2574	2547
query10	540	403	350	350
query11	15558	14706	14927	14706
query12	183	118	129	118
query13	1265	494	380	380
query14	6511	3033	2774	2774
query14_1	2662	2632	2626	2626
query15	232	207	185	185
query16	849	474	472	472
query17	1085	704	608	608
query18	2725	447	389	389
query19	236	261	211	211
query20	125	120	120	120
query21	228	143	123	123
query22	3980	4153	3963	3963
query23	16497	16101	15894	15894
query23_1	16024	16061	16012	16012
query24	7349	1663	1221	1221
query24_1	1235	1204	1254	1204
query25	595	508	441	441
query26	1254	293	178	178
query27	2713	469	327	327
query28	4464	2154	2147	2147
query29	843	573	466	466
query30	315	252	216	216
query31	849	715	622	622
query32	85	71	71	71
query33	560	365	307	307
query34	902	893	543	543
query35	777	823	732	732
query36	857	912	821	821
query37	132	99	87	87
query38	2827	2842	2818	2818
query39	791	749	723	723
query39_1	700	711	700	700
query40	237	148	127	127
query41	72	68	68	68
query42	112	108	110	108
query43	430	428	417	417
query44	1333	750	767	750
query45	199	193	185	185
query46	890	985	640	640
query47	1700	1688	1626	1626
query48	323	341	269	269
query49	659	457	366	366
query50	667	303	237	237
query51	3842	3842	3822	3822
query52	111	116	104	104
query53	327	353	305	305
query54	313	276	263	263
query55	79	79	74	74
query56	313	351	302	302
query57	1144	1118	1073	1073
query58	284	250	269	250
query59	2421	2523	2372	2372
query60	305	308	288	288
query61	161	158	154	154
query62	716	652	638	638
query63	330	295	304	295
query64	4921	1337	995	995
query65	4031	3992	3944	3944
query66	1384	437	314	314
query67	15323	15167	15252	15167
query68	8279	989	734	734
query69	496	345	304	304
query70	1053	1003	993	993
query71	371	294	280	280
query72	6113	4955	2574	2574
query73	689	597	323	323
query74	8981	8753	8591	8591
query75	3221	3166	2806	2806
query76	3872	1108	728	728
query77	554	405	286	286
query78	9488	9714	8748	8748
query79	1621	867	603	603
query80	733	663	552	552
query81	521	269	232	232
query82	207	139	104	104
query83	266	259	238	238
query84	272	122	108	108
query85	951	513	470	470
query86	386	287	297	287
query87	3033	3009	3041	3009
query88	3274	2294	2271	2271
query89	469	417	390	390
query90	2254	166	161	161
query91	167	169	145	145
query92	86	74	69	69
query93	1539	954	565	565
query94	499	327	283	283
query95	567	332	365	332
query96	609	476	211	211
query97	2260	2294	2267	2267
query98	219	198	188	188
query99	1297	1297	1220	1220
Total cold run time: 261424 ms
Total hot run time: 176358 ms

@doris-robot

ClickBench: Total hot run time: 27.21 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a07218f7a9216940781783cb8a38f1bcd94fe6e7, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.09	0.08
query4	1.61	0.11	0.11
query5	0.28	0.25	0.27
query6	1.19	0.64	0.63
query7	0.02	0.02	0.02
query8	0.06	0.05	0.04
query9	0.57	0.52	0.50
query10	0.56	0.55	0.57
query11	0.15	0.12	0.11
query12	0.15	0.12	0.12
query13	0.62	0.61	0.61
query14	0.99	0.98	0.99
query15	0.82	0.80	0.81
query16	0.40	0.39	0.39
query17	1.08	1.08	0.99
query18	0.24	0.22	0.22
query19	1.96	1.78	1.77
query20	0.02	0.02	0.02
query21	15.48	0.30	0.14
query22	4.84	0.06	0.05
query23	16.08	0.29	0.11
query24	1.02	0.27	0.48
query25	0.12	0.09	0.05
query26	0.14	0.14	0.12
query27	0.08	0.08	0.06
query28	3.60	1.26	1.03
query29	12.57	4.06	3.17
query30	0.28	0.14	0.11
query31	2.82	0.64	0.39
query32	3.23	0.54	0.46
query33	3.03	3.05	3.05
query34	16.98	5.22	4.53
query35	4.60	4.65	4.54
query36	0.64	0.50	0.49
query37	0.10	0.06	0.07
query38	0.07	0.04	0.03
query39	0.05	0.03	0.03
query40	0.17	0.15	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.04
Total cold run time: 97.2 s
Total hot run time: 27.21 s

@doris-robot

BE UT Coverage Report

Increment line coverage 81.40% (197/242) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.42% (18861/35308)
Line Coverage 39.26% (174893/445488)
Region Coverage 33.82% (135437/400442)
Branch Coverage 34.73% (58360/168016)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.87% (219/241) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.79% (25528/34596)
Line Coverage 61.15% (272015/444806)
Region Coverage 55.92% (226475/404991)
Branch Coverage 57.90% (97730/168788)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 61.17% (178/291) 🎉
Increment coverage report
Complete coverage report

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31753 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 78b4cc1ea4c8e9bc56beffec234031185ccc0d58, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17619	4309	4056	4056
q2	2034	353	239	239
q3	10159	1273	729	729
q4	10203	829	311	311
q5	7543	2117	1885	1885
q6	185	164	137	137
q7	945	807	669	669
q8	9307	1392	1147	1147
q9	5104	4747	4657	4657
q10	6767	1817	1409	1409
q11	505	306	282	282
q12	690	740	597	597
q13	17796	3782	3051	3051
q14	287	289	270	270
q15	578	508	505	505
q16	686	676	648	648
q17	698	691	615	615
q18	6750	6353	6406	6353
q19	1267	957	613	613
q20	409	373	263	263
q21	3085	2540	2336	2336
q22	1070	1022	981	981
Total cold run time: 103687 ms
Total hot run time: 31753 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4091	4029	4018	4018
q2	315	385	295	295
q3	2121	2568	2243	2243
q4	1322	1755	1327	1327
q5	4045	4012	4075	4012
q6	209	174	128	128
q7	1929	1816	1881	1816
q8	2696	2379	2441	2379
q9	7432	7272	7139	7139
q10	2526	2736	2279	2279
q11	554	504	454	454
q12	692	735	624	624
q13	3548	4145	3406	3406
q14	296	315	298	298
q15	553	522	496	496
q16	664	684	660	660
q17	1133	1316	1428	1316
q18	8125	7884	7911	7884
q19	853	886	917	886
q20	2056	2234	1857	1857
q21	4636	4578	4310	4310
q22	1133	1079	982	982
Total cold run time: 50929 ms
Total hot run time: 48809 ms

@doris-robot

TPC-DS: Total hot run time: 172363 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 78b4cc1ea4c8e9bc56beffec234031185ccc0d58, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4429	583	422	422
query6	332	218	221	218
query7	4215	462	272	272
query8	348	250	239	239
query9	8748	2607	2657	2607
query10	515	357	343	343
query11	15239	15105	14873	14873
query12	171	117	112	112
query13	1275	495	395	395
query14	5576	2969	2703	2703
query14_1	2605	2562	2607	2562
query15	206	198	177	177
query16	981	488	457	457
query17	1103	686	581	581
query18	2438	449	342	342
query19	227	232	192	192
query20	122	117	114	114
query21	210	142	119	119
query22	3922	3883	4022	3883
query23	16158	15651	15391	15391
query23_1	15417	15499	15664	15499
query24	7425	1597	1192	1192
query24_1	1221	1196	1239	1196
query25	555	476	430	430
query26	824	270	174	174
query27	2731	460	300	300
query28	4522	2237	2216	2216
query29	756	556	456	456
query30	316	247	211	211
query31	796	627	566	566
query32	79	69	72	69
query33	544	336	293	293
query34	899	886	531	531
query35	750	789	695	695
query36	846	834	849	834
query37	172	97	77	77
query38	2791	2738	2659	2659
query39	773	765	748	748
query39_1	705	690	705	690
query40	209	125	116	116
query41	65	65	63	63
query42	100	99	97	97
query43	433	468	429	429
query44	1353	755	751	751
query45	190	182	173	173
query46	874	959	613	613
query47	1340	1471	1328	1328
query48	315	320	255	255
query49	607	420	330	330
query50	655	285	211	211
query51	3776	3868	3820	3820
query52	101	104	92	92
query53	294	328	270	270
query54	286	253	240	240
query55	76	76	69	69
query56	308	291	303	291
query57	1015	1014	909	909
query58	261	245	242	242
query59	2169	2214	2116	2116
query60	326	317	296	296
query61	204	165	155	155
query62	389	346	318	318
query63	295	260	268	260
query64	4315	1293	983	983
query65	3832	3695	3730	3695
query66	1420	430	317	317
query67	14944	15182	14808	14808
query68	7721	1007	733	733
query69	502	342	312	312
query70	1070	959	962	959
query71	366	297	268	268
query72	5971	2600	3656	2600
query73	774	744	345	345
query74	8737	8720	8632	8632
query75	2935	2815	2454	2454
query76	3806	1081	661	661
query77	535	378	286	286
query78	9667	9914	9189	9189
query79	1249	951	607	607
query80	665	581	489	489
query81	507	269	228	228
query82	219	149	117	117
query83	267	256	236	236
query84	260	125	104	104
query85	892	517	451	451
query86	382	300	291	291
query87	2832	2883	2715	2715
query88	3302	2323	2306	2306
query89	409	341	333	333
query90	2032	157	150	150
query91	175	172	140	140
query92	70	69	63	63
query93	1133	967	574	574
query94	582	334	295	295
query95	566	318	299	299
query96	617	480	214	214
query97	2337	2360	2303	2303
query98	226	207	199	199
query99	568	554	506	506
Total cold run time: 251934 ms
Total hot run time: 172363 ms

@doris-robot

ClickBench: Total hot run time: 27.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 78b4cc1ea4c8e9bc56beffec234031185ccc0d58, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.09	0.10
query4	1.61	0.12	0.11
query5	0.27	0.26	0.25
query6	1.15	0.65	0.65
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.57	0.50	0.51
query10	0.55	0.55	0.55
query11	0.15	0.11	0.11
query12	0.17	0.13	0.12
query13	0.61	0.60	0.60
query14	0.99	0.99	0.97
query15	0.81	0.78	0.80
query16	0.39	0.42	0.40
query17	1.06	1.07	1.02
query18	0.24	0.21	0.21
query19	1.93	1.87	1.88
query20	0.02	0.01	0.02
query21	15.43	0.27	0.14
query22	4.73	0.05	0.05
query23	15.85	0.30	0.10
query24	1.44	0.78	0.71
query25	0.12	0.07	0.07
query26	0.14	0.13	0.13
query27	0.08	0.04	0.05
query28	4.33	1.05	0.88
query29	12.62	3.96	3.20
query30	0.29	0.14	0.12
query31	2.81	0.61	0.38
query32	3.24	0.55	0.46
query33	3.01	2.98	2.98
query34	16.81	5.18	4.42
query35	4.47	4.48	4.50
query36	0.66	0.52	0.48
query37	0.11	0.06	0.06
query38	0.07	0.04	0.04
query39	0.05	0.03	0.03
query40	0.18	0.15	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97.63 s
Total hot run time: 27.41 s

@doris-robot

BE UT Coverage Report

Increment line coverage 75.56% (204/270) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.41% (18976/35530)
Line Coverage 39.29% (176148/448367)
Region Coverage 33.82% (136186/402715)
Branch Coverage 34.78% (58869/169266)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 84.01% (226/269) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.96% (25689/34734)
Line Coverage 61.39% (274538/447174)
Region Coverage 56.27% (228959/406877)
Branch Coverage 58.17% (98793/169828)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 61.72% (179/290) 🎉
Increment coverage report
Complete coverage report

…imization

This PR addresses multiple issues in the multi-analyzer inverted index feature:

P0 Fixes:
- Fix select_best_reader single reader bug: now checks analyzer key match
  before returning, preventing wrong index usage when specified analyzer
  index is not built
- Unify analyzer key normalization: empty string normalizes to "__default__",
  "none" stays distinct (critical for correct index selection)
- Fix resolveAnalyzerIdentity silent exception: added logging for debugging

P1 Fixes:
- Eliminate double normalization in inverted_index_iterator.cpp
- Simplify MatchPredicate analyzer logic with effectiveAnalyzerName method
- Refactor Match subclasses using Template Method Pattern: added
  createInstance() abstract method, unified withChildren() in base class

P2 Fixes:
- Extract analyzerSqlFragment to InvertedIndexUtil.buildAnalyzerSqlFragment()
- Improve error log level for user-specified analyzer not found

Test:
- Update regression test for cloud mode compatibility using isCloudMode()
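
The two P0 selection fixes above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual BE code: the `readers` dictionary and the helper signatures are assumptions made for illustration, and only the `"__default__"` key and the special handling of `"none"` come from the commit message.

```python
from typing import Dict, Optional

DEFAULT_ANALYZER_KEY = "__default__"


def normalize_analyzer_key(name: str) -> str:
    """Map an empty analyzer name to the default key; "none" stays distinct."""
    return name if name else DEFAULT_ANALYZER_KEY


def select_best_reader(readers: Dict[str, str], requested: str) -> Optional[str]:
    """Return the reader built for the requested analyzer.

    `readers` is a hypothetical map from normalized analyzer key to reader.
    None tells the caller to fall back to the non-index path.
    """
    key = normalize_analyzer_key(requested)
    # Even when only one reader exists, its key must match. Returning the
    # sole reader unconditionally would evaluate the query on the wrong index.
    return readers.get(key)
```

With only `idx_std` built, a query that asks for `kw_analyzer` gets `None` from this sketch and would take the non-index path instead of silently using the `std_analyzer` index.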

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

Apply clang-format to fix spacing before inline comments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

Use BUILD INDEX ON table (without index name) in cloud mode since
cloud mode doesn't support specifying index name. Both modes now
execute the full test including index building and post-build queries.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32219 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 66cdc7da64e205052153a9cb8433c93c91ebd96a, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17616	4241	4084	4084
q2	2051	357	246	246
q3	10139	1283	727	727
q4	10228	841	332	332
q5	7985	2132	1872	1872
q6	243	172	144	144
q7	997	804	666	666
q8	9306	1426	1191	1191
q9	4962	4646	4630	4630
q10	6835	1820	1431	1431
q11	540	294	293	293
q12	753	741	592	592
q13	17809	3838	3093	3093
q14	295	295	269	269
q15	605	519	507	507
q16	677	694	628	628
q17	687	794	565	565
q18	6652	6359	6905	6359
q19	1310	1030	672	672
q20	436	385	268	268
q21	3215	2647	2611	2611
q22	1148	1116	1039	1039
Total cold run time: 104489 ms
Total hot run time: 32219 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4387	4247	4320	4247
q2	339	408	344	344
q3	2264	2781	2459	2459
q4	1465	1963	1440	1440
q5	4639	4468	4258	4258
q6	212	170	130	130
q7	2204	1862	1765	1765
q8	2683	2424	2467	2424
q9	7233	7308	6980	6980
q10	2364	2741	2396	2396
q11	566	476	472	472
q12	696	762	595	595
q13	3658	4149	3060	3060
q14	264	284	262	262
q15	535	478	490	478
q16	631	659	600	600
q17	1124	1330	1387	1330
q18	7311	7259	7088	7088
q19	857	821	835	821
q20	1885	1966	1784	1784
q21	4843	4393	4181	4181
q22	1051	1019	983	983
Total cold run time: 51211 ms
Total hot run time: 48097 ms

@doris-robot

TPC-DS: Total hot run time: 172377 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 66cdc7da64e205052153a9cb8433c93c91ebd96a, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4397	591	425	425
query6	341	223	220	220
query7	4210	455	257	257
query8	341	248	234	234
query9	8769	2644	2675	2644
query10	506	360	318	318
query11	15082	15066	14913	14913
query12	170	114	116	114
query13	1237	470	375	375
query14	5958	2933	2731	2731
query14_1	2623	2612	2592	2592
query15	195	195	178	178
query16	976	477	443	443
query17	1085	689	583	583
query18	2444	450	342	342
query19	236	246	202	202
query20	124	120	117	117
query21	216	147	122	122
query22	3880	4014	3826	3826
query23	15869	15591	15265	15265
query23_1	15337	15420	15371	15371
query24	7361	1568	1170	1170
query24_1	1188	1170	1174	1170
query25	568	473	427	427
query26	1241	271	160	160
query27	2769	455	288	288
query28	4538	2137	2120	2120
query29	801	558	469	469
query30	318	240	210	210
query31	785	629	547	547
query32	78	71	68	68
query33	578	346	298	298
query34	908	867	527	527
query35	737	778	688	688
query36	876	938	812	812
query37	131	90	77	77
query38	2650	2727	2684	2684
query39	782	760	741	741
query39_1	712	722	732	722
query40	230	134	122	122
query41	70	68	70	68
query42	104	100	106	100
query43	464	423	427	423
query44	1293	737	718	718
query45	192	189	176	176
query46	849	955	598	598
query47	1376	1441	1372	1372
query48	325	340	246	246
query49	628	422	340	340
query50	656	271	210	210
query51	3755	3828	3741	3741
query52	106	109	99	99
query53	299	332	275	275
query54	295	268	262	262
query55	81	78	71	71
query56	295	315	337	315
query57	1011	999	887	887
query58	270	255	250	250
query59	2173	2172	2101	2101
query60	314	318	289	289
query61	158	156	165	156
query62	408	351	338	338
query63	302	264	267	264
query64	4972	1305	960	960
query65	3774	3719	3735	3719
query66	1457	407	294	294
query67	15096	14661	15582	14661
query68	8246	993	702	702
query69	496	338	305	305
query70	1063	951	932	932
query71	379	298	271	271
query72	5909	3374	3448	3374
query73	765	718	299	299
query74	8754	8740	8601	8601
query75	2864	2812	2448	2448
query76	3475	1051	628	628
query77	608	365	280	280
query78	9657	10021	9183	9183
query79	1235	899	581	581
query80	631	554	472	472
query81	513	262	233	233
query82	210	144	111	111
query83	263	247	241	241
query84	258	114	102	102
query85	909	513	468	468
query86	385	306	324	306
query87	2862	2845	2731	2731
query88	3108	2217	2204	2204
query89	384	342	336	336
query90	2229	149	144	144
query91	168	159	144	144
query92	83	68	62	62
query93	1780	884	521	521
query94	557	324	285	285
query95	569	377	300	300
query96	586	449	199	199
query97	2306	2397	2292	2292
query98	219	196	201	196
query99	606	570	526	526
Total cold run time: 252923 ms
Total hot run time: 172377 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 68.82% (192/279) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.08% (19002/35801)
Line Coverage 39.14% (176162/450085)
Region Coverage 33.71% (136454/404769)
Branch Coverage 34.74% (58989/169786)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 83.09% (231/278) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.88% (25853/34995)
Line Coverage 61.30% (275133/448851)
Region Coverage 56.29% (230151/408884)
Branch Coverage 58.17% (99088/170335)

…alyzer

When FE sends analyzer_name = "__default__" (which occurs when there's no
index and no explicit analyzer specified), BE's create_analyzer function
should treat it the same as an empty analyzer name and use the default
builtin analyzer based on parser_type.

Previously, this caused "Policy not found with name: __default__" errors
because __default__ was being looked up as a custom analyzer policy.
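
A minimal sketch of the intended resolution order, in Python rather than the BE's C++. The two lookup tables and the `create_analyzer` signature here are hypothetical stand-ins; only the `__default__` handling and the error text are taken from the description above.

```python
DEFAULT_ANALYZER_KEY = "__default__"

# Hypothetical stand-ins for the builtin analyzers (chosen by parser_type)
# and for user-defined analyzer policies.
BUILTIN_ANALYZERS = {"english": "builtin:english", "chinese": "builtin:chinese"}
CUSTOM_POLICIES = {"std_analyzer": "custom:std_analyzer"}


def create_analyzer(analyzer_name: str, parser_type: str) -> str:
    """Resolve an analyzer, treating "__default__" like an empty name."""
    if analyzer_name in ("", DEFAULT_ANALYZER_KEY):
        # No explicit analyzer: use the builtin one for this parser type
        # instead of looking "__default__" up as a custom policy.
        return BUILTIN_ANALYZERS[parser_type]
    try:
        return CUSTOM_POLICIES[analyzer_name]
    except KeyError:
        raise LookupError(f"Policy not found with name: {analyzer_name}") from None
```

In this sketch `create_analyzer("__default__", "english")` and `create_analyzer("", "english")` return the same builtin analyzer, while an unknown custom name still raises the policy-not-found error.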

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 9.82% (28/285) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 31666 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 2f8c568d6936a39c2053ce0bf2e4d3472f1eeb7f, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17696	4185	4033	4033
q2	2023	365	246	246
q3	10141	1281	723	723
q4	10211	874	324	324
q5	7505	2066	1892	1892
q6	196	179	148	148
q7	933	806	679	679
q8	9284	1454	1204	1204
q9	4943	4699	4547	4547
q10	6849	1813	1418	1418
q11	498	303	287	287
q12	723	756	590	590
q13	17784	3831	3119	3119
q14	296	293	291	291
q15	609	528	507	507
q16	677	681	638	638
q17	677	829	491	491
q18	6787	6503	6363	6363
q19	1100	971	622	622
q20	406	386	248	248
q21	3018	2418	2331	2331
q22	1036	1037	965	965
Total cold run time: 103392 ms
Total hot run time: 31666 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4124	4037	4093	4037
q2	336	398	312	312
q3	2112	2673	2227	2227
q4	1342	1749	1343	1343
q5	4082	4023	4056	4023
q6	210	172	132	132
q7	1893	1835	1871	1835
q8	2689	2634	2408	2408
q9	7449	7035	7105	7035
q10	2513	2740	2298	2298
q11	561	484	474	474
q12	729	800	614	614
q13	3609	4049	3431	3431
q14	324	323	294	294
q15	544	524	510	510
q16	641	698	665	665
q17	1169	1431	1411	1411
q18	8304	7677	7975	7677
q19	849	904	842	842
q20	1978	2125	1927	1927
q21	4712	4534	4347	4347
q22	1143	1028	962	962
Total cold run time: 51313 ms
Total hot run time: 48804 ms

@doris-robot

TPC-DS: Total hot run time: 172462 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2f8c568d6936a39c2053ce0bf2e4d3472f1eeb7f, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4423	558	432	432
query6	356	228	223	223
query7	4227	462	272	272
query8	344	263	247	247
query9	8756	2634	2653	2634
query10	544	391	327	327
query11	15128	15181	14776	14776
query12	172	116	113	113
query13	1263	489	389	389
query14	5811	2988	2735	2735
query14_1	2653	2660	2657	2657
query15	204	193	174	174
query16	988	474	478	474
query17	1112	698	578	578
query18	2470	464	390	390
query19	253	224	204	204
query20	120	123	117	117
query21	218	138	126	126
query22	4061	4034	3974	3974
query23	16061	15474	15315	15315
query23_1	15642	15365	15524	15365
query24	7359	1557	1176	1176
query24_1	1201	1167	1219	1167
query25	563	476	427	427
query26	1258	277	161	161
query27	2750	447	293	293
query28	4544	2140	2145	2140
query29	823	565	470	470
query30	311	251	216	216
query31	812	634	549	549
query32	83	70	79	70
query33	572	347	289	289
query34	891	882	519	519
query35	723	826	677	677
query36	887	924	856	856
query37	130	95	80	80
query38	2736	2777	2619	2619
query39	779	776	731	731
query39_1	716	729	720	720
query40	221	135	115	115
query41	65	62	64	62
query42	109	105	102	102
query43	463	443	427	427
query44	1310	730	729	729
query45	185	178	172	172
query46	838	950	584	584
query47	1396	1412	1368	1368
query48	302	326	247	247
query49	607	418	344	344
query50	629	284	200	200
query51	3856	3745	3768	3745
query52	115	113	98	98
query53	309	324	275	275
query54	278	258	264	258
query55	77	77	72	72
query56	285	281	292	281
query57	1007	1022	924	924
query58	321	264	245	245
query59	1952	2160	2165	2160
query60	319	313	300	300
query61	161	163	161	161
query62	398	362	320	320
query63	301	266	270	266
query64	5005	1293	989	989
query65	3795	3762	3725	3725
query66	1451	429	302	302
query67	15039	14570	15479	14570
query68	8047	986	706	706
query69	505	338	302	302
query70	1056	970	928	928
query71	350	298	275	275
query72	6064	3381	3406	3381
query73	747	722	303	303
query74	8856	8731	8567	8567
query75	2838	2782	2452	2452
query76	3351	1037	636	636
query77	522	374	280	280
query78	9858	9821	9141	9141
query79	1501	923	580	580
query80	652	579	476	476
query81	518	278	234	234
query82	218	141	112	112
query83	266	263	235	235
query84	256	120	95	95
query85	917	506	451	451
query86	382	327	315	315
query87	2880	2923	2768	2768
query88	3182	2239	2238	2238
query89	384	371	328	328
query90	2064	158	151	151
query91	179	165	146	146
query92	77	68	61	61
query93	1124	934	530	530
query94	569	325	279	279
query95	580	326	305	305
query96	598	469	208	208
query97	2345	2413	2291	2291
query98	227	203	202	202
query99	590	573	497	497
Total cold run time: 253142 ms
Total hot run time: 172462 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 68.93% (193/280) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.03% (18992/35811)
Line Coverage 39.12% (176124/450271)
Region Coverage 33.66% (136298/404878)
Branch Coverage 34.71% (58951/169847)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 82.80% (231/279) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.97% (25954/35088)
Line Coverage 61.38% (276027/449673)
Region Coverage 56.18% (230000/409378)
Branch Coverage 58.15% (99202/170608)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 65.26% (186/285) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 82.80% (231/279) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.97% (25954/35088)
Line Coverage 61.38% (276021/449673)
Region Coverage 56.18% (230000/409378)
Branch Coverage 58.14% (99199/170608)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 65.26% (186/285) 🎉
Increment coverage report
Complete coverage report

…tive

Normalize analyzer, normalizer, and policy names to lowercase for
case-insensitive matching between table creation and query time.
This ensures that users can use any case when specifying analyzer
names (e.g., 'MyAnalyzer' vs 'myanalyzer') and they will match correctly.

Changes:
- FE: Normalize analyzer/normalizer names in InvertedIndexUtil
- FE: Normalize policy names in IndexPolicyMgr using normalizeKey()
- FE: Normalize explicit analyzer in MatchPredicate and Match expressions
- BE: Add normalize_name() helper in IndexPolicyMgr for consistent lookup
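
The effect of the lowercase normalization can be sketched as follows. This is an illustrative Python model, not the FE or BE implementation: the `PolicyRegistry` class and its methods are hypothetical, and only the idea of a `normalize_name()` helper applied on both the store and lookup paths comes from the change list above.

```python
from typing import Dict, Optional


def normalize_name(name: str) -> str:
    """Lowercase a name so that 'MyAnalyzer' and 'myanalyzer' compare equal."""
    return name.lower()


class PolicyRegistry:
    """Hypothetical registry that stores and looks up policies by normalized name."""

    def __init__(self) -> None:
        self._policies: Dict[str, dict] = {}

    def add(self, name: str, policy: dict) -> None:
        # Normalize at creation time ...
        self._policies[normalize_name(name)] = policy

    def get(self, name: str) -> Optional[dict]:
        # ... and again at query time, so both sides agree on the key.
        return self._policies.get(normalize_name(name))
```

A policy registered as `MyAnalyzer` is then found by a query that names `myanalyzer` or `MYANALYZER`, because both sides go through the same helper.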

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

4 participants