From 6aee49978967d61ab1554595e300325f863d9ca5 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Wed, 15 Jan 2025 10:28:36 +0200
Subject: [PATCH 01/25] related content, edits
---
.../query/arg-max-aggregation-function.md | 53 +++++++++++--------
.../kusto/query/array-sort-asc-function.md | 4 +-
.../kusto/query/array-sort-desc-function.md | 4 +-
.../kusto/query/avg-aggregation-function.md | 11 ++--
.../kusto/query/avgif-aggregation-function.md | 15 ++++--
.../binary-all-and-aggregation-function.md | 13 +++--
.../binary-all-or-aggregation-function.md | 14 +++--
.../binary-all-xor-aggregation-function.md | 11 +++-
.../query/buildschema-aggregation-function.md | 52 +++++++-----------
.../kusto/query/count-aggregation-function.md | 37 +++++++------
.../count-distinct-aggregation-function.md | 12 ++++-
.../count-distinctif-aggregation-function.md | 13 +++--
.../query/countif-aggregation-function.md | 33 ++++++------
.../query/dcount-aggregation-function.md | 11 +++-
.../kusto/query/dcount-intersect-plugin.md | 4 ++
.../query/dcountif-aggregation-function.md | 11 +++-
.../kusto/query/hll-aggregation-function.md | 28 ++++++----
.../query/hll-if-aggregation-function.md | 26 +++++----
18 files changed, 218 insertions(+), 134 deletions(-)
diff --git a/data-explorer/kusto/query/arg-max-aggregation-function.md b/data-explorer/kusto/query/arg-max-aggregation-function.md
index 498d1151e7..cd9a9dd41f 100644
--- a/data-explorer/kusto/query/arg-max-aggregation-function.md
+++ b/data-explorer/kusto/query/arg-max-aggregation-function.md
@@ -3,7 +3,7 @@ title: arg_max() (aggregation function)
description: Learn how to use the arg_max() aggregation function to find a row in a table that maximizes the input expression.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 11/11/2024
+ms.date: 01/15/2025
---
# arg_max() (aggregation function)
@@ -37,7 +37,9 @@ Returns a row in the table that maximizes the specified expression *ExprToMaximi
## Examples
-Find the maximum latitude of a storm event in each state.
+### Find maximum latitude
+
+The following example finds the maximum latitude of a storm event in each state.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -49,6 +51,8 @@ StormEvents
| summarize arg_max(BeginLat, BeginLocation) by State
```
+**Output**
+
The results table displays only the first 10 rows.
| State | BeginLat | BeginLocation |
@@ -65,9 +69,11 @@ The results table displays only the first 10 rows.
| TEXAS | 36.4607 | DARROUZETT |
| ... | ... | ... |
-Find the last time an event with a direct death happened in each state, showing all the columns.
+### Find the last fatal event in each state
+
+The following example finds the last time an event with a direct death happened in each state, showing all the columns.
-The query first filters the events to only include those where there was at least one direct death. Then the query returns the entire row with the most recent StartTime.
+The query first filters the events to include only those events where there was at least one direct death. Then the query returns the entire row with the most recent `StartTime`.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -80,21 +86,25 @@ StormEvents
| summarize arg_max(StartTime, *) by State
```
-The results table displays only the first 10 rows and first 3 columns.
+**Output**
+
+The results table displays only the first 10 rows and first three columns.
-| State | StartTime | EndTime | ... |
-| -------------- | -------------------- | -------------------- | --- |
-| GUAM | 2007-01-27T11:15:00Z | 2007-01-27T11:30:00Z | ... |
-| MASSACHUSETTS | 2007-02-03T22:00:00Z | 2007-02-04T10:00:00Z | ... |
+| State | StartTime | EndTime | ... |
+|--|--|--|--|
+| GUAM | 2007-01-27T11:15:00Z | 2007-01-27T11:30:00Z | ... |
+| MASSACHUSETTS | 2007-02-03T22:00:00Z | 2007-02-04T10:00:00Z | ... |
| AMERICAN SAMOA | 2007-02-17T13:00:00Z | 2007-02-18T11:00:00Z | ... |
-| IDAHO | 2007-02-17T13:00:00Z | 2007-02-17T15:00:00Z | ... |
-| DELAWARE | 2007-02-25T13:00:00Z | 2007-02-26T01:00:00Z | ... |
-| WYOMING | 2007-03-10T17:00:00Z | 2007-03-10T17:00:00Z | ... |
-| NEW MEXICO | 2007-03-23T18:42:00Z | 2007-03-23T19:06:00Z | ... |
-| INDIANA | 2007-05-15T14:14:00Z | 2007-05-15T14:14:00Z | ... |
-| MONTANA | 2007-05-18T14:20:00Z | 2007-05-18T14:20:00Z | ... |
-| LAKE MICHIGAN | 2007-06-07T13:00:00Z | 2007-06-07T13:00:00Z | ... |
-|... | ... | ...| ... |
+| IDAHO | 2007-02-17T13:00:00Z | 2007-02-17T15:00:00Z | ... |
+| DELAWARE | 2007-02-25T13:00:00Z | 2007-02-26T01:00:00Z | ... |
+| WYOMING | 2007-03-10T17:00:00Z | 2007-03-10T17:00:00Z | ... |
+| NEW MEXICO | 2007-03-23T18:42:00Z | 2007-03-23T19:06:00Z | ... |
+| INDIANA | 2007-05-15T14:14:00Z | 2007-05-15T14:14:00Z | ... |
+| MONTANA | 2007-05-18T14:20:00Z | 2007-05-18T14:20:00Z | ... |
+| LAKE MICHIGAN | 2007-06-07T13:00:00Z | 2007-06-07T13:00:00Z | ... |
+| ... | ... | ... | ... |
+
+### Handle nulls
The following example demonstrates null handling.
@@ -125,7 +135,7 @@ datatable(Fruit: string, Color: string, Version: int) [
## Comparison to max()
-The arg_max() function differs from the [max() function](max-aggregation-function.md). The arg_max() function allows you to return additional columns along with the maximum value, and [max()](max-aggregation-function.md) only returns the maximum value itself.
+The arg_max() function differs from the [max() function](max-aggregation-function.md). The arg_max() function allows you to return other columns along with the maximum value, and [max()](max-aggregation-function.md) only returns the maximum value itself.
### Examples
@@ -133,7 +143,7 @@ The arg_max() function differs from the [max() function](max-aggregation-functio
Find the last time an event with a direct death happened, showing all the columns in the table.
-The query first filters the events to only include those where there was at least one direct death. Then the query returns the entire row with the most recent (maximum) StartTime.
+The query first filters the events to only include events where there was at least one direct death. Then the query returns the entire row with the most recent (maximum) StartTime.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -156,7 +166,7 @@ The results table returns all the columns for the row containing the highest val
Find the last time an event with a direct death happened.
-The query filters events to only include those where there is at least one direct death, and then returns the maximum value for StartTime.
+The query filters events to only include events where there is at least one direct death, and then returns the maximum value for StartTime.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -177,7 +187,8 @@ The results table returns the maximum value of StartTime, without returning othe
## Related content
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [arg_min function](arg-min-aggregation-function.md)
* [max function](max-aggregation-function.md)
-* [min function](min-aggregation-function.md)
* [avg function](avg-aggregation-function.md)
* [percentile function](percentiles-aggregation-function.md)
diff --git a/data-explorer/kusto/query/array-sort-asc-function.md b/data-explorer/kusto/query/array-sort-asc-function.md
index 04dab236ad..7fcb2c957a 100644
--- a/data-explorer/kusto/query/array-sort-asc-function.md
+++ b/data-explorer/kusto/query/array-sort-asc-function.md
@@ -153,4 +153,6 @@ print array_sort_asc(dynamic([null,"blue","yellow","green",null]), false)
## Related content
-To sort the first array in descending order, use [array_sort_desc()](array-sort-desc-function.md).
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [array_sort_desc()](array-sort-desc-function.md)
+
diff --git a/data-explorer/kusto/query/array-sort-desc-function.md b/data-explorer/kusto/query/array-sort-desc-function.md
index 453d02fe68..348b2be90a 100644
--- a/data-explorer/kusto/query/array-sort-desc-function.md
+++ b/data-explorer/kusto/query/array-sort-desc-function.md
@@ -155,4 +155,6 @@ print array_sort_desc(dynamic([null,"blue","yellow","green",null]), false)
## Related content
-To sort the first array in ascending order, use [array_sort_asc()](array-sort-asc-function.md).
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [array_sort_asc()](array-sort-asc-function.md)
+
diff --git a/data-explorer/kusto/query/avg-aggregation-function.md b/data-explorer/kusto/query/avg-aggregation-function.md
index 3c1c527a08..877e5ab925 100644
--- a/data-explorer/kusto/query/avg-aggregation-function.md
+++ b/data-explorer/kusto/query/avg-aggregation-function.md
@@ -3,7 +3,7 @@ title: avg() (aggregation function)
description: Learn how to use the avg() function to calculate the average value of an expression.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# avg() (aggregation function)
@@ -31,7 +31,7 @@ Returns the average value of *expr* across the group.
## Example
-This example returns the average number of damaged crops per state.
+The following example returns the average number of damaged crops per state.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -61,6 +61,7 @@ The results table shown includes only the first 10 rows.
## Related content
-* [min function](min-aggregation-function.md)
-* [max function](max-aggregation-function.md)
-* [percentile function](percentiles-aggregation-function.md)
\ No newline at end of file
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [min() (aggregation function)](min-aggregation-function.md)
+* [max() (aggregation function)](max-aggregation-function.md)
+* [percentile(), percentiles() (aggregation function)](percentiles-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/avgif-aggregation-function.md b/data-explorer/kusto/query/avgif-aggregation-function.md
index 9e63761dd7..93f9b6293a 100644
--- a/data-explorer/kusto/query/avgif-aggregation-function.md
+++ b/data-explorer/kusto/query/avgif-aggregation-function.md
@@ -3,7 +3,7 @@ title: avgif() (aggregation function)
description: Learn how to use the avgif() function to return the average value of an expression where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# avgif() (aggregation function)
@@ -24,7 +24,7 @@ Calculates the [average](avg-aggregation-function.md) of *expr* in records for w
| Name | Type | Required | Description |
|--|--|--|--|
| *expr* | `string` | :heavy_check_mark: | The expression used for aggregation calculation. Records with `null` values are ignored and not included in the calculation. |
-| *predicate* | `string` | :heavy_check_mark: | The predicate that if true, the *expr* calculated value will be added to the average. |
+| *predicate* | `string` | :heavy_check_mark: | The predicate that, if true, causes the calculated value of *expr* to be added to the average. |
## Returns
@@ -32,7 +32,7 @@ Returns the average value of *expr* in records where *predicate* evaluates to `t
## Example
-This example calculates the average damage by state in cases where there was any damage.
+The following example calculates the average damage by state in cases where there was any damage.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -44,6 +44,8 @@ StormEvents
| summarize Averagedamage=tolong(avg( DamageCrops)),AverageWhenDamage=tolong(avgif(DamageCrops,DamageCrops >0)) by State
```
+**Output**
+
The results table shown includes only the first 10 rows.
| State | Averagedamage | Averagewhendamage |
@@ -59,3 +61,10 @@ The results table shown includes only the first 10 rows.
| NEBRASKA | 21366 | 187726 |
| NEW YORK | 5 | 10000 |
| ... | ... | ... |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [avg() (aggregation function)](avg-aggregation-function.md)
+* [minif() (aggregation function)](minif-aggregation-function.md)
+* [maxif() (aggregation function)](maxif-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/binary-all-and-aggregation-function.md b/data-explorer/kusto/query/binary-all-and-aggregation-function.md
index b91d362fd0..bb24aa2512 100644
--- a/data-explorer/kusto/query/binary-all-and-aggregation-function.md
+++ b/data-explorer/kusto/query/binary-all-and-aggregation-function.md
@@ -2,13 +2,13 @@
title: binary_all_and() (aggregation function)
description: Learn how to use the binary_all_and() function to aggregate values using the binary AND operation.
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# binary_all_and() (aggregation function)
> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)]
-Accumulates values using the binary `AND` operation for each summarization group, or in total if a group is not specified.
+Accumulates values using the binary `AND` operation for each summarization group, or in total if a group isn't specified.
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
@@ -26,7 +26,7 @@ Accumulates values using the binary `AND` operation for each summarization group
## Returns
-Returns an aggregated value using the binary `AND` operation over records for each summarization group, or in total if a group is not specified.
+Returns an aggregated value using the binary `AND` operation over records for each summarization group, or in total if a group isn't specified.
## Example
@@ -53,3 +53,10 @@ datatable(num:long)
|result|
|---|
|CAFEF00D|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [binary_all_or() (aggregation function)](binary-all-or-aggregation-function.md)
+* [binary_all_xor() (aggregation function)](binary-all-xor-aggregation-function.md)
+* [binary_and()](binary-and-function.md)
diff --git a/data-explorer/kusto/query/binary-all-or-aggregation-function.md b/data-explorer/kusto/query/binary-all-or-aggregation-function.md
index 415044824f..1ad5af1ac0 100644
--- a/data-explorer/kusto/query/binary-all-or-aggregation-function.md
+++ b/data-explorer/kusto/query/binary-all-or-aggregation-function.md
@@ -3,13 +3,13 @@ title: binary_all_or() (aggregation function)
description: Learn how to use the binary_all_or() function to aggregate values using the binary OR operation.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# binary_all_or() (aggregation function)
> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)]
-Accumulates values using the binary `OR` operation for each summarization group, or in total if a group is not specified.
+Accumulates values using the binary `OR` operation for each summarization group, or in total if a group isn't specified.
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
@@ -27,7 +27,7 @@ Accumulates values using the binary `OR` operation for each summarization group,
## Returns
-Returns an aggregated value using the binary `OR` operation over records for each summarization group, or in total if a group is not specified.
+Returns an aggregated value using the binary `OR` operation over records for each summarization group, or in total if a group isn't specified.
## Example
@@ -54,3 +54,11 @@ datatable(num:long)
|result|
|---|
|CAFEF00D|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [binary_all_and() (aggregation function)](binary-all-and-aggregation-function.md)
+* [binary_all_xor() (aggregation function)](binary-all-xor-aggregation-function.md)
+* [binary_and()](binary-and-function.md)
+* [binary_or()](binary-or-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/binary-all-xor-aggregation-function.md b/data-explorer/kusto/query/binary-all-xor-aggregation-function.md
index adf1e93c79..9c566a766e 100644
--- a/data-explorer/kusto/query/binary-all-xor-aggregation-function.md
+++ b/data-explorer/kusto/query/binary-all-xor-aggregation-function.md
@@ -3,7 +3,7 @@ title: binary_all_xor() (aggregation function)
description: Learn how to use the binary_all_xor() function to aggregate values using the binary XOR operation.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# binary_all_xor() (aggregation function)
@@ -27,7 +27,7 @@ Accumulates values using the binary `XOR` operation for each summarization group
## Returns
-Returns a value that is aggregated using the binary `XOR` operation over records for each summarization group, or in total if a group is not specified.
+Returns a value that is aggregated using the binary `XOR` operation over records for each summarization group, or in total if a group isn't specified.
## Example
@@ -54,3 +54,10 @@ datatable(num:long)
|results|
|--|
|CAFEF00D|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [binary_all_or() (aggregation function)](binary-all-or-aggregation-function.md)
+* [binary_all_and() (aggregation function)](binary-all-and-aggregation-function.md)
+* [binary_xor()](binary-xor-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/buildschema-aggregation-function.md b/data-explorer/kusto/query/buildschema-aggregation-function.md
index d3e7e4e508..6dda61c1c2 100644
--- a/data-explorer/kusto/query/buildschema-aggregation-function.md
+++ b/data-explorer/kusto/query/buildschema-aggregation-function.md
@@ -3,7 +3,7 @@ title: buildschema() (aggregation function)
description: Learn how to use the buildschema() function to build a table schema from a dynamic expression.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# buildschema() (aggregation function)
@@ -30,7 +30,7 @@ Builds the minimal schema that admits all values of *DynamicExpr*.
Returns the minimal schema that admits all values of *DynamicExpr*.
> [!TIP]
-> If the input is a JSON string, use the [parse_json()](parse-json-function.md) function to convert the JSON to a [dynamic](scalar-data-types/dynamic.md) value. Otherwise, an error may occur.
+> If the input is a JSON string, use the [parse_json()](parse-json-function.md) function to convert the JSON to a [dynamic](scalar-data-types/dynamic.md) value. Otherwise, an error might occur.
## Example
@@ -60,35 +60,19 @@ datatable(value: dynamic) [
|--|
|{"x":["long","string"],"y":["double",{"w":"string"}],"z":{"`indexer`":["long","string"]},"t":{"`indexer`":"string"}}|
-The resulting schema tells us that:
-
-* The root object is a container with four properties named x, y, z, and t.
-* The property called `x` is of type *long* or of type *string*.
-* The property called `y` ii of type *double*, or another container with a property called `w` of type *string*.
-* The `indexer` keyword indicates that `z` and `t` are arrays.
-* Each item in the array `z` is of type *long* or of type *string*.
-* `t` is an array of strings.
-* Every property is implicitly optional, and any array may be empty.
-
-### Schema model
-
-The syntax of the returned schema is:
-
-Container ::= '{' Named-type* '}';
-Named-type: := (name | '"`indexer`"') ':' Type;
-Type ::= Primitive-type | Union-type | Container;
-Union-type ::= '[' Type* ']';
-Primitive-type ::= "long" | "string" | ...;
-
-The values are equivalent to a subset of TypeScript type annotations, encoded as a Kusto dynamic value.
-In TypeScript, the example schema would be:
-
-```typescript
-var someobject:
-{
- x?: (number | string),
- y?: (number | { w?: string}),
- z?: { [n:number] : (long | string)},
- t?: { [n:number]: string }
-}
-```
+### Schema breakdown
+
+In the resulting schema:
+
+* The root object is a container with four properties named `x`, `y`, `z`, and `t`.
+* Property `x` is either type *long* or type *string*.
+* Property `y` is either type *double* or another container with a property `w` of type *string*.
+* Property `z` is an array, indicated by the `indexer` keyword, where each item can be either type *long* or type *string*.
+* Property `t` is an array, indicated by the `indexer` keyword, where each item is a *string*.
+* Every property is implicitly optional, and any array might be empty.
+
+## Related content
+
+* [Best practices for schema management](../management/management-best-practices.md)
+* [getschema operator](getschema-operator.md)
+* [infer_storage_schema plugin](infer-storage-schema-plugin.md)
diff --git a/data-explorer/kusto/query/count-aggregation-function.md b/data-explorer/kusto/query/count-aggregation-function.md
index d2ea0c1710..09eeadadd9 100644
--- a/data-explorer/kusto/query/count-aggregation-function.md
+++ b/data-explorer/kusto/query/count-aggregation-function.md
@@ -3,14 +3,13 @@ title: count() (aggregation function)
description: Learn how to use the count() function to count the number of records in a group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
monikerRange: "microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel "
---
# count() (aggregation function)
> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)]
-
Counts the number of records per summarization group, or total if summarization is done without grouping.
[!INCLUDE [ignore-nulls](../includes/ignore-nulls.md)]
@@ -32,7 +31,7 @@ Returns a count of the records per summarization group, or in total if summariza
## Example
-This example returns a count of events in states:
+The following example returns the number of events in each state:
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -46,22 +45,26 @@ StormEvents
**Output**
-|State|Count|
-|---|---|
-|TEXAS |4701|
-|KANSAS |3166|
-|IOWA |2337|
-|ILLINOIS |2022|
-|MISSOURI |2016|
-|GEORGIA |1983|
-|MINNESOTA |1881|
-|WISCONSIN |1850|
-|NEBRASKA |1766|
-|NEW YORK |1750|
-|...|...|
+| State | Count |
+|--|--|
+| TEXAS | 4701 |
+| KANSAS | 3166 |
+| IOWA | 2337 |
+| ILLINOIS | 2022 |
+| MISSOURI | 2016 |
+| GEORGIA | 1983 |
+| MINNESOTA | 1881 |
+| WISCONSIN | 1850 |
+| NEBRASKA | 1766 |
+| NEW YORK | 1750 |
+| ... | ... |
::: moniker range="microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel"
## Related content
-* [bin_at()](bin-at-function.md#bin_at) rounds values down to a fixed-size bin, which can be used to aggregate data, such as by time unit.
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [countof()](countof-function.md)
+* [countif() (aggregation function)](countif-aggregation-function.md)
+* [count_distinct() (aggregation function)](count-distinct-aggregation-function.md)
+* [bin_at()](bin-at-function.md#bin_at)
::: moniker-end
\ No newline at end of file
diff --git a/data-explorer/kusto/query/count-distinct-aggregation-function.md b/data-explorer/kusto/query/count-distinct-aggregation-function.md
index e5a84fb42f..580be78486 100644
--- a/data-explorer/kusto/query/count-distinct-aggregation-function.md
+++ b/data-explorer/kusto/query/count-distinct-aggregation-function.md
@@ -3,7 +3,7 @@ title: count_distinct() (aggregation function) - (preview)
description: Learn how to use the count_distinct() (aggregation function) to count unique values specified by a scalar expression per summary group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# count_distinct() (aggregation function) - (preview)
@@ -21,7 +21,7 @@ To count only records for which a predicate returns `true`, use the [count_disti
> [!NOTE]
>
-> * This function is limited to 100M unique values. An attempt to apply the function on an expression returning too many values will produce a runtime error (HRESULT: 0x80DA0012).
+> * This function is limited to 100M unique values. An attempt to apply the function on an expression returning too many values produces a runtime error (HRESULT: 0x80DA0012).
:::moniker range="azure-data-explorer"
> * Function performance can be degraded when operating on multiple data sources from different clusters.
::: moniker-end
@@ -71,3 +71,11 @@ StormEvents
| PENNSYLVANIA | 25 |
| GEORGIA | 24 |
| NORTH CAROLINA | 23 |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [count_distinctif() (aggregation function)](count-distinctif-aggregation-function.md)
+* [count() (aggregation function)](count-aggregation-function.md)
+* [countof()](countof-function.md)
+* [countif() (aggregation function)](countif-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/count-distinctif-aggregation-function.md b/data-explorer/kusto/query/count-distinctif-aggregation-function.md
index 4c11f28f29..f85a3f2e4f 100644
--- a/data-explorer/kusto/query/count-distinctif-aggregation-function.md
+++ b/data-explorer/kusto/query/count-distinctif-aggregation-function.md
@@ -3,7 +3,7 @@ title: count_distinctif() (aggregation function) - (preview)
description: Learn how to use the count_distinctif() function to count unique values of a scalar expression in records for which the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# count_distinctif() (aggregation function) - (preview)
@@ -17,7 +17,7 @@ If you only need an estimation of unique values count, we recommend using the le
> [!NOTE]
>
-> * This function is limited to 100M unique values. An attempt to apply the function on an expression returning too many values will produce a runtime error (HRESULT: 0x80DA0012).
+> * This function is limited to 100M unique values. An attempt to apply the function on an expression returning too many values produces a runtime error (HRESULT: 0x80DA0012).
:::moniker range="azure-data-explorer"
> * Function performance can be degraded when operating on multiple data sources from different clusters.
::: moniker-end
@@ -44,7 +44,7 @@ Integer value indicating the number of unique values of *expr* per summary group
## Example
-This example shows how many types of death-causing storm events happened in each state. Only storm events with a nonzero count of deaths will be counted.
+This example shows how many types of death-causing storm events happened in each state. Only storm events with a nonzero count of deaths are counted.
:::moniker range="azure-data-explorer"
> [!NOTE]
@@ -70,3 +70,10 @@ StormEvents
| OKLAHOMA | 10 |
| NEW YORK | 9 |
| KANSAS | 9 |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [count_distinct() (aggregation function)](count-distinct-aggregation-function.md)
+* [countif() (aggregation function)](countif-aggregation-function.md)
+* [dcountif() (aggregation function)](dcountif-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/countif-aggregation-function.md b/data-explorer/kusto/query/countif-aggregation-function.md
index 15ec3a008c..a9305d50d9 100644
--- a/data-explorer/kusto/query/countif-aggregation-function.md
+++ b/data-explorer/kusto/query/countif-aggregation-function.md
@@ -3,7 +3,7 @@ title: countif() (aggregation function)
description: Learn how to use the countif() function to count the rows where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# countif() (aggregation function)
@@ -49,23 +49,23 @@ StormEvents
The results table shown includes only the first 10 rows.
-| State | TotalCount | TotalWithDamage |
-| -------------------- | ---------- | --------------- |
-| TEXAS | 4701 | 72 |
-| KANSAS | 3166 | 70 |
-| IOWA | 2337 | 359 |
-| ILLINOIS | 2022 | 35 |
-| MISSOURI | 2016 | 78 |
-| GEORGIA | 1983 | 17 |
-| MINNESOTA | 1881 | 37 |
-| WISCONSIN | 1850 | 75 |
-| NEBRASKA | 1766 | 201 |
-| NEW YORK | 1750 | 1 |
+| State | TotalCount | TotalWithDamage |
+|--|--|--|
+| TEXAS | 4701 | 72 |
+| KANSAS | 3166 | 70 |
+| IOWA | 2337 | 359 |
+| ILLINOIS | 2022 | 35 |
+| MISSOURI | 2016 | 78 |
+| GEORGIA | 1983 | 17 |
+| MINNESOTA | 1881 | 37 |
+| WISCONSIN | 1850 | 75 |
+| NEBRASKA | 1766 | 201 |
+| NEW YORK | 1750 | 1 |
| ... | ... | ... |
### Count based on string length
-This example shows the number of names with more than 4 letters.
+This example shows the number of names with more than four letters.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -92,4 +92,7 @@ T
## Related content
-[count()](count-aggregation-function.md) function, which counts rows without predicate expression.
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [count_distinctif() (aggregation function) - (preview)](count-distinctif-aggregation-function.md)
+* [dcountif() (aggregation function)](dcountif-aggregation-function.md)
+* [count()](count-aggregation-function.md)
diff --git a/data-explorer/kusto/query/dcount-aggregation-function.md b/data-explorer/kusto/query/dcount-aggregation-function.md
index f8d7fb1e8d..e650ff22f2 100644
--- a/data-explorer/kusto/query/dcount-aggregation-function.md
+++ b/data-explorer/kusto/query/dcount-aggregation-function.md
@@ -3,7 +3,7 @@ title: dcount() (aggregation function)
description: Learn how to use the dcount() function to return an estimate of the number of distinct values of an expression within a group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# dcount() (aggregation function)
@@ -14,7 +14,7 @@ Calculates an estimate of the number of distinct values that are taken by a scal
[!INCLUDE [ignore-nulls](../includes/ignore-nulls.md)]
> [!NOTE]
-> The `dcount()` aggregation function is primarily useful for estimating the cardinality of huge sets. It trades accuracy for performance, and may return a result that varies between executions. The order of inputs may have an effect on its output.
+> The `dcount()` aggregation function is primarily useful for estimating the cardinality of huge sets. It trades accuracy for performance, and might return a result that varies between executions. The order of inputs might have an effect on its output.
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
@@ -70,3 +70,10 @@ The results table shown includes only the first 10 rows.
## Estimation accuracy
[!INCLUDE [data-explorer-estimation-accuracy](../includes/estimation-accuracy.md)]
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [dcountif() (aggregation function)](dcountif-aggregation-function.md)
+* [count()](count-aggregation-function.md)
+* [count_distinct() (aggregation function)](count-distinct-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/dcount-intersect-plugin.md b/data-explorer/kusto/query/dcount-intersect-plugin.md
index fae0d14978..90c600fb01 100644
--- a/data-explorer/kusto/query/dcount-intersect-plugin.md
+++ b/data-explorer/kusto/query/dcount-intersect-plugin.md
@@ -63,3 +63,7 @@ range x from 1 to 100 step 1
|evenNumbers|even_and_mod3|even_and_mod3_and_mod5|
|---|---|---|
|50|16|3|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/dcountif-aggregation-function.md b/data-explorer/kusto/query/dcountif-aggregation-function.md
index 1b1a00a125..a08b7c15f0 100644
--- a/data-explorer/kusto/query/dcountif-aggregation-function.md
+++ b/data-explorer/kusto/query/dcountif-aggregation-function.md
@@ -3,7 +3,7 @@ title: dcountif() (aggregation function)
description: Learn how to use the dcountif() function to return an estimate of the number of distinct values of an expression for rows where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# dcountif() (aggregation function)
@@ -34,7 +34,7 @@ Estimates the number of distinct values of *expr* for rows in which *predicate*
Returns an estimate of the number of distinct values of *expr* for rows in which *predicate* evaluates to `true`.
> [!TIP]
-> `dcountif()` may return an error in cases where all, or none of the rows pass the `Predicate` expression.
+> `dcountif()` might return an error in cases where all or none of the rows pass the `Predicate` expression.
## Example
@@ -71,3 +71,10 @@ The results table shown includes only the first 10 rows.
## Estimation accuracy
[!INCLUDE [data-explorer-estimation-accuracy](../includes/estimation-accuracy.md)]
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [dcount() (aggregation function)](dcount-aggregation-function.md)
+* [countif() (aggregation function)](countif-aggregation-function.md)
+* [count_distinctif() (aggregation function)](count-distinctif-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/hll-aggregation-function.md b/data-explorer/kusto/query/hll-aggregation-function.md
index 206aa31612..5ed2f31c13 100644
--- a/data-explorer/kusto/query/hll-aggregation-function.md
+++ b/data-explorer/kusto/query/hll-aggregation-function.md
@@ -3,27 +3,18 @@ title: hll() (aggregation function)
description: Learn how to use the hll() function to calculate the results of the dcount() function.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# hll() (aggregation function)
> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)]
-The `hll()` function is a way to estimate the number of unique values in a set of values. It does this by calculating intermediate results for aggregation within the [summarize](summarize-operator.md) operator for a group of data using the [`dcount`](dcount-aggregation-function.md) function.
+The `hll()` function is a way to estimate the number of unique values in a set of values. It does so by calculating intermediate results for aggregation within the [summarize](summarize-operator.md) operator for a group of data using the [`dcount`](dcount-aggregation-function.md) function.
Read about the [underlying algorithm (*H*yper*L*og*L*og) and the estimation accuracy](#estimation-accuracy).
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
-> [!TIP]
->
->- Use the [hll_merge](hll-merge-function.md) function to merge the results of multiple `hll()` functions.
->- Use the [dcount_hll](dcount-hll-function.md) function to calculate the number of distinct values from the output of the `hll()` or `hll_merge` functions.
-
-> [!IMPORTANT]
->The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you may want to create a daily unique users summary, which can then be used to calculate weekly counts.
-> However, the precise binary representation of these results may change over time. There's no guarantee that these functions will produce identical results for identical inputs, and therefore we don't advise relying on them.
-
## Syntax
`hll` `(`*expr* [`,` *accuracy*]`)`
@@ -41,6 +32,12 @@ Read about the [underlying algorithm (*H*yper*L*og*L*og) and the estimation accu
Returns the intermediate results of distinct count of *expr* across the group.
+> [!NOTE]
+> - The results of `hll()`, `hll_if()`, and `hll_merge()` can be stored and later retrieved. For example, you might want to create a daily unique user summary, which can then be used to calculate weekly counts.
+> However, the precise binary representation of these results might change over time. There's no guarantee that these functions produce identical results for identical inputs, and therefore we don't advise relying on them.
+> - Use the [hll_merge](hll-merge-function.md) function to merge the results of multiple `hll()` functions.
+> - Use the [dcount_hll](dcount-hll-function.md) function to calculate the number of distinct values from the output of the `hll()` or `hll_merge` functions.
+
## Example
In the following example, the `hll()` function is used to estimate the number of unique values of the `DamageProperty` column within each 10-minute time bin of the `StartTime` column.
@@ -55,6 +52,8 @@ StormEvents
| summarize hll(DamageProperty) by bin(StartTime,10m)
```
+**Output**
+
The results table shown includes only the first 10 rows.
| StartTime | hll_DamageProperty |
@@ -72,3 +71,10 @@ The results table shown includes only the first 10 rows.
## Estimation accuracy
[!INCLUDE [data-explorer-estimation-accuracy](../includes/estimation-accuracy.md)]
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [Using hll() and tdigest()](using-hll-tdigest.md)
+* [hll_if() (aggregation function)](hll-if-aggregation-function.md)
+* [hll_merge()](hll-merge-function.md)
diff --git a/data-explorer/kusto/query/hll-if-aggregation-function.md b/data-explorer/kusto/query/hll-if-aggregation-function.md
index e8789f3dd8..7f092be552 100644
--- a/data-explorer/kusto/query/hll-if-aggregation-function.md
+++ b/data-explorer/kusto/query/hll-if-aggregation-function.md
@@ -3,7 +3,7 @@ title: hll_if() (aggregation function)
description: Learn how to use the hll_if() function to calculate the intermediate results of the dcount() function.
ms.reviewer: ziham
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# hll_if() (aggregation function)
@@ -15,10 +15,6 @@ Read about the [underlying algorithm (*H*yper*L*og*L*og) and the estimation accu
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
-> [!IMPORTANT]
-> The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you may want to create a daily unique users summary, which can then be used to calculate weekly counts.
-> However, the precise binary representation of these results may change over time. There's no guarantee that these functions will produce identical results for identical inputs, and therefore we don't advise relying on them.
-
## Syntax
`hll_if` `(`*expr*, *predicate* [`,` *accuracy*]`)`
@@ -37,13 +33,16 @@ Read about the [underlying algorithm (*H*yper*L*og*L*og) and the estimation accu
Returns the intermediate results of distinct count of *Expr* for which *Predicate* evaluates to `true`.
-> [!TIP]
->
-> - You can use the aggregation function [`hll_merge`](hll-merge-aggregation-function.md) to merge more than one `hll` intermediate result. Only works with `hll` output only.
-> - You can use [`dcount_hll`](dcount-hll-function.md), to calculate the distinct count from `hll`,`hll_merge`, or `hll_if` aggregation functions.
+> [!NOTE]
+> - The results of `hll()`, `hll_if()`, and `hll_merge()` can be stored and later retrieved. For example, you might want to create a daily unique user summary, which can then be used to calculate weekly counts.
+> However, the precise binary representation of these results might change over time. There's no guarantee that these functions produce identical results for identical inputs, and therefore we don't advise relying on them.
+> - Use the [`hll_merge`](hll-merge-aggregation-function.md) function to merge more than one `hll` intermediate result. It works only with `hll` output.
+> - Use [`dcount_hll`](dcount-hll-function.md) to calculate the distinct count from the `hll`, `hll_merge`, or `hll_if` aggregation functions.
## Examples
+The following query returns the number of unique flood event sources in Iowa and Kansas. It uses the `hll_if()` function to include only flood events.
+
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
> Run the query
@@ -56,6 +55,8 @@ StormEvents
| project State, SourcesOfFloodEvents = dcount_hll(hll_flood)
```
+**Output**
+
|State|SourcesOfFloodEvents|
|---|---|
|KANSAS|11|
@@ -70,3 +71,10 @@ StormEvents
| 2 | Slow | 0.4 |
| 3 | Slow | 0.28 |
| 4 | Slowest | 0.2 |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [Using hll() and tdigest()](using-hll-tdigest.md)
+* [hll() (aggregation function)](hll-aggregation-function.md)
+* [hll_merge()](hll-merge-function.md)
From 66029c4e40de2f32dc6d717ebe87b33b0aa99922 Mon Sep 17 00:00:00 2001
From: Yifat Schachter
Date: Wed, 15 Jan 2025 10:32:22 +0200
Subject: [PATCH 02/25] Materialized views monitoring
---
.../materialized-views-monitoring.md | 103 ++++++++++++++----
1 file changed, 81 insertions(+), 22 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 5c21b9bdc3..d0b498dbdd 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -12,7 +12,7 @@ ms.date: 08/11/2024
Monitor the materialized view's health in the following ways:
::: moniker range="azure-data-explorer"
* Monitor [materialized view metrics](/azure/data-explorer/using-metrics#materialized-view-metrics) in the Azure portal.
- * The materialized view age metric `MaterializedViewAgeSeconds` should be used to monitor the freshness of the view. This one should be the primary metric to monitor.
+ * The materialized view age metric `MaterializedViewAgeSeconds` should be used to monitor the freshness of the view. This should be the primary metric to monitor.
::: moniker-end
* Monitor the `IsHealthy` property returned from [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
* Check for failures using [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures).
@@ -23,33 +23,44 @@ Monitor the materialized view's health in the following ways:
## Troubleshooting unhealthy materialized views
-The `MaterializedViewHealth` metric indicates whether a materialized view is healthy. Before a materialized view becomes unhealthy, its age, noted by the `MaterializedViewAgeSeconds` metric, gradually increases.
+If the materialized view is not keeping up with the ingestion rate and not able to materialize all newly ingested data in a timely manner, the `MaterializedViewAge` metric will gradually increase, and the `MaterializedViewHealth` metric will show that the view is unhealthy.
+You can follow the recommendations below to troubleshoot why the materialized view is unhealthy:
-A materialized view can become unhealthy for any or all of the following reasons:
+1. Check how many materialized views are defined on the cluster. If there are several, then the concurrency in which they run depends on the cluster's current materialized views capacity. You can check the current capacity by running the following command:
-* The materialization process is failing. The [MaterializedViewResult metric](#materializedviewresult-metric) and the [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures) command can help identify the root cause of the failure.
-* The system may have automatically disabled the materialized view, due to changes to the source table. You can check if the view is disabled by checking the `IsEnabled` column returned from [`.show materialized-view` command](materialized-view-show-command.md#show-materialized-views). See more details in [materialized views limitations and known issues](materialized-views-limitations.md#the-materialized-view-source)
-* The database doesn't have sufficient capacity to materialize all incoming data on-time. In this case, there may not be failures in execution. However, the view's age gradually increases, since it isn't able to keep up with the ingestion rate. There could be several root causes for this situation:
- * There are more materialized views in the database, and the database doesn't have sufficient capacity to run all views. See [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to change the default settings for number of materialized views executed concurrently.
- * Materialization is slow because the newly ingested data intersects with a large portion of the view and there are many records to update in each materialization cycle. To learn more about why this impacts the view's performance, see [how materialized views work](materialized-view-overview.md#how-materialized-views-work).
+ ```kusto
+ .show capacity
+ | where Resource == "MaterializedView"
+ | project Resource, Total, Consumed
+ ```
-## MaterializedViewResult metric
+ |Resource|Total|Consumed|
+ |---|---|---|
+ |MaterializedView|1|0|
-The `MaterializedViewResult` metric provides information about the result of a materialization cycle, and can be used to identify issues in the materialized view health status. The metric includes the `Database` and `MaterializedViewName` and a `Result` dimension.
+   The value in `Total` shows how many materialized views can run concurrently, while `Consumed` shows how many are currently running. The `Total` value is based on the [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy). The policy specifies the minimum and maximum number of concurrent operations, and the system chooses the current concurrency based on the cluster's available resources. This decision is conservative: the system increases concurrency only if the cluster's CPU stays under a threshold for some period of time. You can override the system's decision and increase the concurrency of materialization processes by setting the minimum concurrent operations in the policy:
-The `Result` dimension can have one of the following values:
-
-* **Success**: Materialization completed successfully.
-* **SourceTableNotFound**: Source table of the materialization view was dropped. The materialized view is automatically disabled as a result.
-* **SourceTableSchemaChange**: The schema of the source table has changed in a way that isn't compatible with the materialized view definition (materialized view query doesn't match the materialized view schema). The materialized view is automatically disabled as a result.
-* **InsufficientCapacity**: The database doesn't have sufficient capacity to materialize the materialized view. This can either indicate missing [ingestion capacity](../capacity-policy.md#ingestion-capacity) or missing [materialized views capacity](../capacity-policy.md#materialized-views-capacity-policy). Insufficient capacity failures can be transient, but if they reoccur often we recommend scaling out the database or increasing relevant capacity in the policy.
-* **InsufficientResources:** The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. This failure may be transient, but if it reoccurs try scaling the database up or out.
+ ```kusto
+ .alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
+ ```
- * If the materialization process hits memory limits, the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) limits can be increased to support more memory or CPU for the materialization process to consume.
-
- For example, the following command will alter the materialized views workload group to use a max of 64 gigabytes (GB) of memory per node during materialization (the default value is 15 GB):
+ If you explicitly change this policy, you should monitor the cluster's health and verify other workloads are not impacted by this change.
+
+1. Check if there are failures during materialization process using [`.show materialized-view failures command`](materialized-view-show-failures-command.md#show-materialized-view-failures).
+ * If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more recommendations below about how to troubleshoot transient failures.
+ * If the error is permanent (for example, a change in the schema of the source table that makes it incompatible with the materialized view), the system will automatically disable the materialized view. You can identify such a situation by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [journal](../journal.md) for the disable event. See more details about this in the [.create materialized-view command](materialized-view-create.md#supported-properties).
- ~~~kusto
+1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md). You can analyze workloads of a specific view using the following command (replace `DatabaseName` and `ViewName`):
+
+ ```kusto
+ .show commands-and-queries
+ | where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
+ ```
+
+ * Check the memory consumption in the `MemoryPeak` column. The materialization process is limited to 15GB memory peak by default. If the queries / commands run by the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views workload group to use a max of 64 gigabytes (GB) of memory per node during materialization:
+
+
+ ```kusto
.alter-merge workload_group ['$materialized-views'] ```
{
"RequestLimitsPolicy": {
@@ -58,11 +69,59 @@ The `Result` dimension can have one of the following values:
}
}
} ```
- ~~~
> [!NOTE]
> MaxMemoryPerQueryPerNode can't be set to more than 50% of the total memory of each node.
+ * Check if the materialization process is hitting cold cache. If the view is not fully in hot cache, materialization can hit disk misses, which significantly slows down materialization. You can read more about [hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md). The following command, for example, shows cache statistics in the past day:
+
+
+ ```kusto
+ .show commands-and-queries
+ | where ClientActivityId startswith "DN.MaterializedViews;ViewName"
+ | where StartedOn > ago(1d)
+ | extend HotCacheHits = tolong(CacheStatistics.Shards.Hot.HitBytes),
+ HotCacheMisses = tolong(CacheStatistics.Shards.Hot.MissBytes),
+ HotCacheRetreived = tolong(CacheStatistics.Shards.Hot.RetrieveBytes),
+ ColdCacheHits = tolong(CacheStatistics.Shards.Cold.HitBytes),
+ ColdCacheMisses = tolong(CacheStatistics.Shards.Cold.MissBytes),
+ ColdCacheRetreived = tolong(CacheStatistics.Shards.Cold.RetrieveBytes)
+ | summarize HotCacheHits = format_bytes(sum(HotCacheHits)),
+ HotCacheMisses = format_bytes(sum(HotCacheMisses)),
+ HotCacheRetreived = format_bytes(sum(HotCacheRetreived)),
+ ColdCacheHits =format_bytes(sum(ColdCacheHits)),
+ ColdCacheMisses = format_bytes(sum(ColdCacheMisses)),
+ ColdCacheRetreived = format_bytes(sum(ColdCacheRetreived))
+ ```
+
+ |HotCacheHits|HotCacheMisses|HotCacheRetreived|ColdCacheHits|ColdCacheMisses|ColdCacheRetreived|
+ |---|---|---|---|---|---|
+ |26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
+
+ * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized part of the view in order to find intersections with the "delta". See more "delta" and "materialized part" in [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
+
+1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, you can consider the following changes to the definition of view, if they are applicable to your scenario:
+ * Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of data scanned from the view, **as long as there is no late arriving data in this column**. See more in [Performance tips](materialized-view-create.md#performance-tips).
+ * Use a `lookback` as part of the view definition. Read more about `lookback` in [create materialized view properties](../../includes/materialized-view-create-properties.md).
+
+1. Check the cluster's ingestion utilization metric. If the cluster doesn't have sufficient ingestion capacity, materialization cannot run. The `MaterializedViewResult` metric will show `InsufficientCapacity` in this case. You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity) (less recommended).
+
+1. If none of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient capacity and/or resources to materialize all data on time. You can consider the following options in this case:
+   * Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy. Therefore, if you would like to give the cluster more resources to accommodate materialized views, you need to set the minimum instance count accordingly.
+   * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high cardinality key from the materialized view's group by keys, for example. All views will be based on the same source table, and each will filter by `SourceTable | where hash(key, number_of_views) == i`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) to union all the materialized views, and use that function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore if the single view is failing due to memory limits, this approach can help.
+
+## MaterializedViewResult metric
+
+The `MaterializedViewResult` metric provides information about the result of a materialization cycle, and can be used to identify issues in the materialized view health status. The metric includes the `Database` and `MaterializedViewName` and a `Result` dimension.
+
+The `Result` dimension can have one of the following values:
+
+* **Success**: Materialization completed successfully.
+* **SourceTableNotFound**: The source table of the materialized view was dropped. The materialized view is automatically disabled as a result.
+* **SourceTableSchemaChange**: The schema of the source table has changed in a way that isn't compatible with the materialized view definition (materialized view query doesn't match the materialized view schema). The materialized view is automatically disabled as a result.
+* **InsufficientCapacity**: The cluster doesn't have sufficient capacity to materialize the materialized view, due to lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). Insufficient capacity failures can be transient, but if they reoccur often, we recommend scaling out the cluster or increasing relevant capacity in the policy.
+* **InsufficientResources**: The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. This failure might be transient, but if it reoccurs, try scaling the cluster up or out, and/or following the suggestions in the [troubleshooting section](#troubleshooting-unhealthy-materialized-views).
+
## Materialized views in follower databases
Materialized views can be defined in [follower databases](materialized-views-limitations.md#follower-databases). However, the monitoring of these materialized views should be based on the leader database, where the materialized view is defined. Specifically:
From 70ab707386523ecf86b5ed72735b415c78f77013 Mon Sep 17 00:00:00 2001
From: Yifat Schachter
Date: Wed, 15 Jan 2025 10:46:55 +0200
Subject: [PATCH 03/25] typos
---
.../materialized-views-monitoring.md | 23 ++++++++++---------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index d0b498dbdd..eb883afb81 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -23,7 +23,7 @@ Monitor the materialized view's health in the following ways:
## Troubleshooting unhealthy materialized views
-If the materialized view is not keeping up with the ingestion rate and not able to materialize all newly ingested data in a timely manner, the `MaterializedViewAge` metric will gradually increase, and the `MaterializedViewHealth` metric will show that the view is unhealthy.
+If the materialized view doesn't keep up with the ingestion rate and isn't able to materialize all newly ingested data in a timely manner, the `MaterializedViewAgeSeconds` metric will gradually increase, and the `MaterializedViewHealth` metric will show that the view is unhealthy.
You can follow the recommendations below to troubleshoot why the materialized view is unhealthy:
1. Check how many materialized views are defined on the cluster. If there are several, then the concurrency in which they run depends on the cluster's current materialized views capacity. You can check the current capacity by running the following command:
@@ -48,7 +48,7 @@ You can follow the recommendations below to troubleshoot why the materialized vi
1. Check if there are failures during materialization process using [`.show materialized-view failures command`](materialized-view-show-failures-command.md#show-materialized-view-failures).
* If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more recommendations below about how to troubleshoot transient failures.
- * If the error is permanent (for example, a change in the schema of the source table that makes it incompatible with the materialized view), the system will automatically disable the materialized view. You can identify such a situation by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [journal](../journal.md) for the disable event. See more details about this in the [.create materialized-view command](materialized-view-create.md#supported-properties).
+ * If the error is permanent (for example, a change in the schema of the source table that makes it incompatible with the materialized view), the system will automatically disable the materialized view. You can identify this case by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md) for the disable event. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md). You can analyze workloads of a specific view using the following command (replace `DatabaseName` and `ViewName`):
@@ -57,7 +57,7 @@ You can follow the recommendations below to troubleshoot why the materialized vi
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
- * Check the memory consumption in the `MemoryPeak` column. The materialization process is limited to 15GB memory peak by default. If the queries / commands run by the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views workload group to use a max of 64 gigabytes (GB) of memory per node during materialization:
+   * Check the memory consumption in the `MemoryPeak` column. The materialization process is limited to a 15 GB memory peak per node by default. If the queries or commands run by the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in the materialization process. For example, the following command alters the materialized views workload group to use a maximum of 64 GB memory peak per node during materialization:
```kusto
@@ -68,15 +68,16 @@ You can follow the recommendations below to troubleshoot why the materialized vi
"Value": 68719241216
}
}
- } ```
+ }
+ ```
> [!NOTE]
> MaxMemoryPerQueryPerNode can't be set to more than 50% of the total memory of each node.
- * Check if the materialization process is hitting cold cache. If the view is not fully in hot cache, materialization can hit disk misses, which significantly slows down materialization. You can read more about [hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md). The following command, for example, shows cache statistics in the past day:
+   * Check if the materialization process is hitting cold cache. If the view is not fully in hot cache, materialization can hit disk misses, which significantly slow down materialization. Read more about [hot and cold cache and caching policy](../cache-policy.md) and the [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md). The following command, for example, shows cache statistics of the materialization process for view `ViewName` in the past day:
-
- ```kusto
+
+ ```kusto
.show commands-and-queries
| where ClientActivityId startswith "DN.MaterializedViews;ViewName"
| where StartedOn > ago(1d)
@@ -92,15 +93,15 @@ You can follow the recommendations below to troubleshoot why the materialized vi
ColdCacheHits =format_bytes(sum(ColdCacheHits)),
ColdCacheMisses = format_bytes(sum(ColdCacheMisses)),
ColdCacheRetreived = format_bytes(sum(ColdCacheRetreived))
- ```
+ ```
|HotCacheHits|HotCacheMisses|HotCacheRetreived|ColdCacheHits|ColdCacheMisses|ColdCacheRetreived|
|---|---|---|---|---|---|
|26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
- * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized part of the view in order to find intersections with the "delta". See more "delta" and "materialized part" in [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
+ * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized part of the view in order to find intersections with the "delta". See more about "delta" and "materialized part" in [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
-1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, you can consider the following changes to the definition of view, if they are applicable to your scenario:
+1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, consider the following changes to the definition of view, if they are applicable to your scenario:
* Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of data scanned from the view, **as long as there is no late arriving data in this column**. See more in [Performance tips](materialized-view-create.md#performance-tips).
* Use a `lookback` as part of the view definition. Read more about `lookback` in [create materialized view properties](../../includes/materialized-view-create-properties.md).
@@ -108,7 +109,7 @@ You can follow the recommendations below to troubleshoot why the materialized vi
1. If neither of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient capacity and/or resources to materialize all data on time. You can consider the following options in this case:
* Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy. Therefore, if you would like to give the cluster more resources to accommodate for materialized views, you need to set the minimum instance count accordingly.
- * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high cardinality key from the materialized view's group by keys, for example. All views will be based on same source table, and each will filter by `SourceTable | where hash(key, number_of_views) == i`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) to union between all materialized views, and use that function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore if the single view is failing due to memory limits, this approach can help.
+ * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high cardinality key from the materialized view's group by keys, for example. All views will be based on same source table, and each will filter by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) to union between all materialized views, and use that function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore if the single view is failing due to memory limits, this approach can help.
## MaterializedViewResult metric
From cf2b62adffa30bf4f79c27608add5791bde58c89 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Wed, 15 Jan 2025 15:51:36 +0200
Subject: [PATCH 04/25] edits
---
.../query/hll-merge-aggregation-function.md | 21 +++++++-----
.../query/make-bag-aggregation-function.md | 13 +++++---
.../query/make-bag-if-aggregation-function.md | 15 +++++++--
.../query/make-list-aggregation-function.md | 6 ++--
.../make-list-if-aggregation-function.md | 8 +++--
.../query/make-set-aggregation-function.md | 7 ++--
.../query/make-set-if-aggregation-function.md | 16 +++++----
.../kusto/query/max-aggregation-function.md | 7 ++--
.../kusto/query/maxif-aggregation-function.md | 33 +++++++++++--------
.../kusto/query/minif-aggregation-function.md | 33 +++++++++++--------
.../query/percentiles-aggregation-function.md | 4 ++-
.../percentilesw-aggregation-function.md | 8 ++++-
12 files changed, 112 insertions(+), 59 deletions(-)
diff --git a/data-explorer/kusto/query/hll-merge-aggregation-function.md b/data-explorer/kusto/query/hll-merge-aggregation-function.md
index ff1f46bb52..ff2d8c2dd7 100644
--- a/data-explorer/kusto/query/hll-merge-aggregation-function.md
+++ b/data-explorer/kusto/query/hll-merge-aggregation-function.md
@@ -3,7 +3,7 @@ title: hll_merge() (aggregation function)
description: Learn how to use the hll_merge() function to merge HLL results into a single HLL value.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# hll_merge() (aggregation function)
@@ -13,15 +13,11 @@ Merges HLL results across the group into a single HLL value.
> [!NOTE]
> You can't merge hll values that were created using different accuracy values. For more information, see [hll()](hll-aggregation-function.md).
-
+
[!INCLUDE [data-explorer-agg-function-summarize-note](../includes/agg-function-summarize-note.md)]
For more information, see the [underlying algorithm (*H*yper*L*og*L*og) and estimation accuracy](#estimation-accuracy).
-> [!IMPORTANT]
-> The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you may want to create a daily unique users summary, which can then be used to calculate weekly counts.
-> However, the precise binary representation of these results may change over time. There's no guarantee that these functions will produce identical results for identical inputs, and therefore we don't advise relying on them.
-
## Syntax
`hll_merge` `(`*hll*`)`
@@ -38,8 +34,10 @@ For more information, see the [underlying algorithm (*H*yper*L*og*L*og) and esti
The function returns the merged HLL values of *hll* across the group.
-> [!TIP]
-> Use the [dcount_hll](dcount-hll-function.md) function to calculate the `dcount` from [hll()](hll-aggregation-function.md) and [hll_merge()](hll-merge-aggregation-function.md) aggregation functions.
+> [!NOTE]
+> - The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you might want to create a daily unique user summary, which can then be used to calculate weekly counts.
+> However, the precise binary representation of these results might change over time. There's no guarantee that these functions produce identical results for identical inputs, and therefore we don't advise relying on them.
+> - Use the [dcount_hll](dcount-hll-function.md) function to calculate the `dcount` from [hll()](hll-aggregation-function.md) and [hll_merge()](hll-merge-aggregation-function.md) aggregation functions.
## Example
@@ -67,3 +65,10 @@ The results show only the first five results in the array.
## Estimation accuracy
[!INCLUDE [data-explorer-estimation-accuracy](../includes/estimation-accuracy.md)]
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [Using hll() and tdigest()](using-hll-tdigest.md)
+* [hll() (aggregation function)](hll-aggregation-function.md)
+* [hll_if() (aggregation function)](hll-if-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/make-bag-aggregation-function.md b/data-explorer/kusto/query/make-bag-aggregation-function.md
index a27d8f7048..ad5379170b 100644
--- a/data-explorer/kusto/query/make-bag-aggregation-function.md
+++ b/data-explorer/kusto/query/make-bag-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_bag() (aggregation function)
description: Learn how to use the make_bag() aggregation function to create a dynamic JSON property bag.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# make_bag() (aggregation function)
@@ -33,8 +33,8 @@ Creates a `dynamic` JSON property bag (dictionary) of all the values of *expr* i
## Returns
-Returns a `dynamic` JSON property bag (dictionary) of all the values of *Expr* in the group, which are property bags. Non-dictionary values will be skipped.
-If a key appears in more than one row, an arbitrary value, out of the possible values for this key, will be selected.
+Returns a `dynamic` JSON property bag (dictionary) of all the values of *Expr* in the group, which are property bags. Nondictionary values are skipped.
+If a key appears in more than one row, an arbitrary value, out of the possible values for this key, is selected.
## Example
@@ -91,4 +91,9 @@ T
## Related content
-[bag_unpack()](bag-unpack-plugin.md).
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [make_bag_if() (aggregation function)](make-bag-if-aggregation-function.md)
+* [bag_unpack()](bag-unpack-plugin.md)
+* [bag_pack()](pack-function.md)
+* [make_list() (aggregation function)](make-list-aggregation-function.md)
+* [parse_json()](parse-json-function.md)
diff --git a/data-explorer/kusto/query/make-bag-if-aggregation-function.md b/data-explorer/kusto/query/make-bag-if-aggregation-function.md
index acb817b961..1a8d992a08 100644
--- a/data-explorer/kusto/query/make-bag-if-aggregation-function.md
+++ b/data-explorer/kusto/query/make-bag-if-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_bag_if() (aggregation function)
description: Learn how to use the make_bag_if() function to create a dynamic JSON property bag of expression values where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# make_bag_if() (aggregation function)
@@ -31,8 +31,8 @@ Creates a `dynamic` JSON property bag (dictionary) of *expr* values in records f
## Returns
-Returns a `dynamic` JSON property bag (dictionary) of *expr* values in records for which *predicate* evaluates to `true`. Non-dictionary values will be skipped.
-If a key appears in more than one row, an arbitrary value, out of the possible values for this key, will be selected.
+Returns a `dynamic` JSON property bag (dictionary) of *expr* values in records for which *predicate* evaluates to `true`. Nondictionary values are skipped.
+If a key appears in more than one row, an arbitrary value, out of the possible values for this key, is selected.
> [!NOTE]
> This function without the predicate is similar to [`make_bag`](make-bag-aggregation-function.md).
@@ -89,3 +89,12 @@ T
|prop01|prop03|
|---|---|
|val_a|val_c|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [make_bag() (aggregation function)](make-bag-aggregation-function.md)
+* [bag_unpack()](bag-unpack-plugin.md)
+* [bag_pack()](pack-function.md)
+* [make_list_if() (aggregation function)](make-list-if-aggregation-function.md)
+* [parse_json()](parse-json-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/make-list-aggregation-function.md b/data-explorer/kusto/query/make-list-aggregation-function.md
index e21daeb97c..1748331ee5 100644
--- a/data-explorer/kusto/query/make-list-aggregation-function.md
+++ b/data-explorer/kusto/query/make-list-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_list() (aggregation function)
description: Learn how to use the make_list() function to create a dynamic JSON object array of all the values of the expressions in the group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
adobe-target: true
---
# make_list() (aggregation function)
@@ -146,4 +146,6 @@ shapes
## Related content
-[`make_list_if`](make-list-if-aggregation-function.md) operator is similar to `make_list`, except it also accepts a predicate.
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [`make_list_if`](make-list-if-aggregation-function.md)
+* [make_bag() (aggregation function)](make-bag-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/make-list-if-aggregation-function.md b/data-explorer/kusto/query/make-list-if-aggregation-function.md
index b3b5710474..6c6c7ea650 100644
--- a/data-explorer/kusto/query/make-list-if-aggregation-function.md
+++ b/data-explorer/kusto/query/make-list-if-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_list_if() (aggregation function)
description: Learn how to use the make_list_if() aggregation function to create a dynamic JSON object of expression values where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# make_list_if() (aggregation function)
@@ -31,7 +31,7 @@ Creates a `dynamic` array of *expr* values in the group for which *predicate* ev
## Returns
-Returns a `dynamic` array of *expr* vlaues in the group for which *predicate* evaluates to `true`.
+Returns a `dynamic` array of *expr* values in the group for which *predicate* evaluates to `true`.
If the input to the `summarize` operator isn't sorted, the order of elements in the resulting array is undefined.
If the input to the `summarize` operator is sorted, the order of elements in the resulting array tracks that of the input.
@@ -64,4 +64,6 @@ T
## Related content
-[`make_list`](make-list-aggregation-function.md) function, which does the same, without predicate expression.
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [`make_list`](make-list-aggregation-function.md)
+* [make_bag_if() (aggregation function)](make-bag-if-aggregation-function.md)
diff --git a/data-explorer/kusto/query/make-set-aggregation-function.md b/data-explorer/kusto/query/make-set-aggregation-function.md
index 8a8e46a873..78c4ac2c27 100644
--- a/data-explorer/kusto/query/make-set-aggregation-function.md
+++ b/data-explorer/kusto/query/make-set-aggregation-function.md
@@ -98,5 +98,8 @@ datatable (Val: int, Arr1: dynamic)
## Related content
-* Use [`mv-expand`](mv-expand-operator.md) operator for the opposite function.
-* [`make_set_if`](make-set-if-aggregation-function.md) operator is similar to `make_set`, except it also accepts a predicate.
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [`make_set_if`](make-set-if-aggregation-function.md)
+* [`make_list`](make-list-aggregation-function.md)
+* [make_bag() (aggregation function)](make-bag-aggregation-function.md)
+* [`mv-expand`](mv-expand-operator.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/make-set-if-aggregation-function.md b/data-explorer/kusto/query/make-set-if-aggregation-function.md
index 0a0c89ab70..593b180d36 100644
--- a/data-explorer/kusto/query/make-set-if-aggregation-function.md
+++ b/data-explorer/kusto/query/make-set-if-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_set_if() (aggregation function)
description: Learn how to use the make_set_if() function to create a dynamic JSON object of a set of distinct values that an expression takes where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# make_set_if() (aggregation function)
@@ -36,13 +36,9 @@ Returns a `dynamic` array of the set of distinct values that *expr* takes in rec
> [!TIP]
> To only count the distinct values, use [dcountif()](dcountif-aggregation-function.md).
-## Related content
-
-[`make_set`](make-set-aggregation-function.md) function, which does the same, without predicate expression.
-
## Example
-The following example shows a list of names with more than 4 letters.
+The following example shows a list of names with more than four letters.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -66,3 +62,11 @@ T
|set_name|
|----|
|["George", "Ringo"]|
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [make_set() (aggregation function)](make-set-aggregation-function.md)
+* [make_list_if() (aggregation function)](make-list-if-aggregation-function.md)
+* [make_bag_if() (aggregation function)](make-bag-if-aggregation-function.md)
+* [`mv-expand`](mv-expand-operator.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/max-aggregation-function.md b/data-explorer/kusto/query/max-aggregation-function.md
index 4df6fe5326..80807690fa 100644
--- a/data-explorer/kusto/query/max-aggregation-function.md
+++ b/data-explorer/kusto/query/max-aggregation-function.md
@@ -3,7 +3,7 @@ title: max() (aggregation function)
description: Learn how to use the max() function to find the maximum value of the expression in the table.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 11/11/2024
+ms.date: 01/15/2025
---
# max() (aggregation function)
@@ -30,11 +30,11 @@ Finds the maximum value of the expression in the table.
Returns the value in the table that maximizes the specified expression.
> [!TIP]
-> This gives you the max on its own. If you want to see other columns in addition to the max, use [arg_max](arg-max-aggregation-function.md).
+> This function gives you the max on its own. If you want to see other columns in addition to the max, use [arg_max](arg-max-aggregation-function.md).
## Example
-This example returns the last record in a table by querying the maximum value for StartTime.
+The following example returns the last record in a table by querying the maximum value for StartTime.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -54,6 +54,7 @@ StormEvents
## Related content
+* [Aggregation function types at a glance](aggregation-functions.md)
* [arg_max](arg-max-aggregation-function.md)
* [min function](min-aggregation-function.md)
* [avg function](avg-aggregation-function.md)
diff --git a/data-explorer/kusto/query/maxif-aggregation-function.md b/data-explorer/kusto/query/maxif-aggregation-function.md
index 6af235602b..e4feb534e8 100644
--- a/data-explorer/kusto/query/maxif-aggregation-function.md
+++ b/data-explorer/kusto/query/maxif-aggregation-function.md
@@ -3,7 +3,7 @@ title: maxif() (aggregation function)
description: Learn how to use the maxif() function to calculate the maximum value of an expression where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# maxif() (aggregation function)
@@ -51,16 +51,23 @@ StormEvents
The results table shown includes only the first 10 rows.
-| State | MaxDamageNoCasualties |
-| -------------------- | --------------------- |
-| TEXAS | 25000000 |
-| KANSAS | 37500000 |
-| IOWA | 15000000 |
-| ILLINOIS | 5000000 |
-| MISSOURI | 500005000 |
-| GEORGIA | 344000000 |
-| MINNESOTA | 38390000 |
-| WISCONSIN | 45000000 |
-| NEBRASKA | 4000000 |
-| NEW YORK | 26000000 |
+| State | MaxDamageNoCasualties |
+|--|--|
+| TEXAS | 25000000 |
+| KANSAS | 37500000 |
+| IOWA | 15000000 |
+| ILLINOIS | 5000000 |
+| MISSOURI | 500005000 |
+| GEORGIA | 344000000 |
+| MINNESOTA | 38390000 |
+| WISCONSIN | 45000000 |
+| NEBRASKA | 4000000 |
+| NEW YORK | 26000000 |
| ... | ... |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [minif() (aggregation function)](minif-aggregation-function.md)
+* [max_of()](max-of-function.md)
+* [arg_max() (aggregation function)](arg-max-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/minif-aggregation-function.md b/data-explorer/kusto/query/minif-aggregation-function.md
index 103d00b793..9fbbde9982 100644
--- a/data-explorer/kusto/query/minif-aggregation-function.md
+++ b/data-explorer/kusto/query/minif-aggregation-function.md
@@ -3,7 +3,7 @@ title: minif() (aggregation function)
description: Learn how to use the minif() function to return the minimum value of an expression where the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# minif() (aggregation function)
@@ -52,16 +52,23 @@ StormEvents
The results table shown includes only the first 10 rows.
-| State | MinDamageWithCasualties |
-| -------------- | ----------------------- |
-| TEXAS | 8000 |
-| KANSAS | 5000 |
-| IOWA | 45000 |
-| ILLINOIS | 100000 |
-| MISSOURI | 10000 |
-| GEORGIA | 500000 |
-| MINNESOTA | 200000 |
-| WISCONSIN | 10000 |
-| NEW YORK | 25000 |
-| NORTH CAROLINA | 15000 |
+| State | MinDamageWithCasualties |
+|--|--|
+| TEXAS | 8000 |
+| KANSAS | 5000 |
+| IOWA | 45000 |
+| ILLINOIS | 100000 |
+| MISSOURI | 10000 |
+| GEORGIA | 500000 |
+| MINNESOTA | 200000 |
+| WISCONSIN | 10000 |
+| NEW YORK | 25000 |
+| NORTH CAROLINA | 15000 |
| ... | ... |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [maxif() (aggregation function)](maxif-aggregation-function.md)
+* [min_of()](min-of-function.md)
+* [arg_max() (aggregation function)](arg-max-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/percentiles-aggregation-function.md b/data-explorer/kusto/query/percentiles-aggregation-function.md
index 96fee8b894..fe6012a2d2 100644
--- a/data-explorer/kusto/query/percentiles-aggregation-function.md
+++ b/data-explorer/kusto/query/percentiles-aggregation-function.md
@@ -3,7 +3,7 @@ title: percentile(), percentiles()
description: Learn how to use the percentile(), percentiles() functions to calculate estimates for nearest rank percentiles.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# percentile(), percentiles() (aggregation function)
@@ -211,4 +211,6 @@ The percentiles aggregate provides an approximate value using [T-Digest](https:/
## Related content
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [percentilew(), percentilesw() (aggregation function)](percentilesw-aggregation-function.md)
* [avg function](avg-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/percentilesw-aggregation-function.md b/data-explorer/kusto/query/percentilesw-aggregation-function.md
index 617336941b..9b498a136f 100644
--- a/data-explorer/kusto/query/percentilesw-aggregation-function.md
+++ b/data-explorer/kusto/query/percentilesw-aggregation-function.md
@@ -3,7 +3,7 @@ title: percentilew(), percentilesw()
description: Learn how to use the percentilew(), percentilesw() functions to calculate weighted percentiles.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/15/2025
---
# percentilew(), percentilesw() (aggregation function)
@@ -177,3 +177,9 @@ latencyTable
| percentile_LatencyBucket |
|---|---|---|
| [20, 20, 40] |
+
+## Related content
+
+* [Aggregation function types at a glance](aggregation-functions.md)
+* [percentile(), percentiles() (aggregation function)](percentiles-aggregation-function.md)
+* [avg function](avg-aggregation-function.md)
\ No newline at end of file
From e2de7c24181dc3fc4f1d6c68f7b223588ad315f5 Mon Sep 17 00:00:00 2001
From: Yifat Schachter
Date: Wed, 15 Jan 2025 23:55:03 +0200
Subject: [PATCH 05/25] changes
---
.../materialized-views-monitoring.md | 27 ++++++++++++-------
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index eb883afb81..447c3e2bba 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -23,10 +23,10 @@ Monitor the materialized view's health in the following ways:
## Troubleshooting unhealthy materialized views
-If the materialized view doesn't keep up with the ingestion rate and isn't able to materialize all newly ingested data in a timely manner, the `MaterializedViewAge` metric will gradually increase, and the `MaterializedViewHealth` metric will show that the view is unhealthy.
-You can follow the recommendations below to troubleshoot why the materialized view is unhealthy:
+If the `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that the view is unhealthy,
+you can follow the recommendations below to troubleshoot why the materialized view is unhealthy:
-1. Check how many materialized views are defined on the cluster. If there are several, then the concurrency in which they run depends on the cluster's current materialized views capacity. You can check the current capacity by running the following command:
+1. Check how many materialized views are defined on the cluster, and what the current capacity for materialized views is:
```kusto
.show capacity
@@ -38,7 +38,10 @@ You can follow the recommendations below to troubleshoot why the materialized vi
|---|---|---|
|MaterializedView|1|0|
- The value in `Total` shows how many materialized views can run concurrently, while `Consumed` shows how many are currently running. The `Total` value is based on the [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy). The policy specifies the minimum and maximum concurrent operations, and the system chooses the current concurrency based on the cluster's available resources. This decision is conservative - the system will increase concurrency only if the cluster's CPU is under a threshold throughout some period of time. You can override the system's decision and increase concurrency of materialization processes by setting the minimum concurrent operations in the policy:
+ If there are several materialized views in the cluster, then the concurrency in which they run depends on the value of
+ `Total` in the output of the command above. The `Consumed` column shows how many are currently running. The [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy) specifies the minimum and maximum concurrent operations, and the system
+ chooses the current concurrency, noted in `Total`, based on the cluster's available resources. You can override the system's
+ decision and increase concurrency of materialization processes by setting the minimum concurrent operations in the policy:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
@@ -47,17 +50,17 @@ You can follow the recommendations below to troubleshoot why the materialized vi
If you explicitly change this policy, you should monitor the cluster's health and verify other workloads are not impacted by this change.
1. Check if there are failures during materialization process using [`.show materialized-view failures command`](materialized-view-show-failures-command.md#show-materialized-view-failures).
+ * If the error is permanent, the system will automatically disable the materialized view. You can identify this case by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md) for the disable event. This can happen, for example, if there's a change in the schema of the source table that makes it incompatible with the materialized view. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
* If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more recommendations below about how to troubleshoot transient failures.
- * If the error is permanent (for example, a change in the schema of the source table that makes it incompatible with the materialized view), the system will automatically disable the materialized view. You can identify this case by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md) for the disable event. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
-1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md). You can analyze workloads of a specific view using the following command (replace `DatabaseName` and `ViewName`):
+1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md) (replace `DatabaseName` and `ViewName` to filter on a specific view):
```kusto
.show commands-and-queries
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
- * Check the memory consumption in the `MemoryPeak` column. The materialization process is limited to 15GB memory peak per node by default. If the queries or commands run by the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views workload group to use a max of 64GB memory peak per node during materialization:
+ * Check the memory consumption in the `MemoryPeak` column and whether there are operations that failed due to hitting memory limits (for example, [runaway queries](../../concepts/runaway-queries.md)). The materialization process is limited to 15GB memory peak per node by default. If the queries or commands during the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views workload group to use a max of 64GB memory peak per node during materialization:
```kusto
@@ -74,7 +77,7 @@ You can follow the recommendations below to troubleshoot why the materialized vi
> [!NOTE]
> MaxMemoryPerQueryPerNode can't be set to more than 50% of the total memory of each node.
- * Check if the materialization process is hitting cold cache. If the view is not fully in hot cache, materialization can hit disk misses, which significantly slow down materialization. You can read more about [hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md). The following command, for example, shows cache statistics of the materialization process for view `ViewName` in the past day:
+ * Check if the materialization process is hitting cold cache. The following command, for example, shows cache statistics of the materialization process for view `ViewName` in the past day:
```kusto
@@ -94,18 +97,22 @@ You can follow the recommendations below to troubleshoot why the materialized vi
ColdCacheMisses = format_bytes(sum(ColdCacheMisses)),
ColdCacheRetreived = format_bytes(sum(ColdCacheRetreived))
```
-
+
|HotCacheHits|HotCacheMisses|HotCacheRetreived|ColdCacheHits|ColdCacheMisses|ColdCacheRetreived|
|---|---|---|---|---|---|
|26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
+ If the view is not fully in hot cache, materialization can hit disk misses, which significantly slow down materialization.
+ Increasing the caching policy for the materialized view will help avoiding cache misses. You can read more about
+ [hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
+
* Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized part of the view in order to find intersections with the "delta". See more about "delta" and "materialized part" in [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, consider the following changes to the definition of view, if they are applicable to your scenario:
* Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of data scanned from the view, **as long as there is no late arriving data in this column**. See more in [Performance tips](materialized-view-create.md#performance-tips).
* Use a `lookback` as part of the view definition. Read more about `lookback` in [create materialized view properties](../../includes/materialized-view-create-properties.md).
-1. Check the cluster's ingestion utilization metric. If the cluster doesn't have sufficient ingestion capacity, materialization cannot run. The `MaterializedViewResult` metric will show `InsufficientCapacity` in this case. You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity) (less recommended).
+1. Check if the `MaterializedViewResult` metric shows `InsufficientCapacity` values. This indicates that the cluster doesn't have sufficient ingestion capacity, which should also be noted in the cluster's [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters). You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity) (less recommended).
1. If neither of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient capacity and/or resources to materialize all data on time. You can consider the following options in this case:
* Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy. Therefore, if you would like to give the cluster more resources to accommodate for materialized views, you need to set the minimum instance count accordingly.
From 57af969ec38edeb57cbce49af7591600412e8a38 Mon Sep 17 00:00:00 2001
From: Yifat Schachter
Date: Thu, 16 Jan 2025 09:28:54 +0200
Subject: [PATCH 06/25] cosmetics
---
.../materialized-views-monitoring.md | 82 +++++++++++++------
1 file changed, 58 insertions(+), 24 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 447c3e2bba..bdd26745ae 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -23,8 +23,8 @@ Monitor the materialized view's health in the following ways:
## Troubleshooting unhealthy materialized views
-If the `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that the view is unhealthy,
-you can follow the recommendations below to troubleshoot why the materialized view is unhealthy:
+If the value of `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that
+the view is unhealthy, you can follow the recommendations below to identify the root cause:
1. Check how many materialized views there are on the cluster, and what is the current capacity for materialized views:
@@ -38,10 +38,12 @@ you can follow the recommendations below to troubleshoot why the materialized vi
|---|---|---|
|MaterializedView|1|0|
- If there are several materialized views in the cluster, then the concurrency in which they run depends on the value of
- `Total` in the command above. The `Consumed` column shows how many are currently running. The [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy), specifies the minimum and maximum concurrent operations, and system
- chooses the current concurrency, noted in `Total`, based on the cluster's available resources. You can override the system's
- decision and increase concurrency of materialization processes by setting the minimum concurrent operations in the policy:
+ If there are several materialized views in the cluster, then the concurrency in which they run depends on the value of the
+ `Total` capacity. The `Consumed` column shows how many are currently running. The
+ [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy), specifies the minimum and
+ maximum concurrent operations, and system chooses the current concurrency, noted in `Total`, based on the cluster's
+ available resources. You can override the system's decision and increase concurrency of materialization processes by
+ setting the minimum concurrent operations in the policy:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
@@ -50,17 +52,28 @@ you can follow the recommendations below to troubleshoot why the materialized vi
If you explicitly change this policy, you should monitor the cluster's health and verify other workloads are not impacted by this change.
1. Check if there are failures during materialization process using [`.show materialized-view failures command`](materialized-view-show-failures-command.md#show-materialized-view-failures).
- * If the error is permanent, the system will automatically disable the materialized view. You can identify this case by checking the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md) for the disable event. This can happen, for example, if there's a change in the schema of the source table that makes it incompatible with the materialized view. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
- * If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more recommendations below about how to troubleshoot transient failures.
-
-1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md) (replace `DatabaseName` and `ViewName` to filter on a specific view):
+ * If the error is permanent, the system will automatically disable the materialized view. You can identify this case by checking
+ the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md)
+ for the disable event. This can happen, for example, if there's a change in the schema of the source table that makes it incompatible
+ with the materialized view. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
+ * If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the
+ operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more
+ recommendations below about how to troubleshoot transient failures.
+
+1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md)
+ (replace `DatabaseName` and `ViewName` to filter on a specific view):
```kusto
.show commands-and-queries
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
- * Check the memory consumption in the `MemoryPeak` column and whether there are operations that failed due to hitting memory limits (for example, [runaway queries](../../concepts/runaway-queries.md)). The materialization process is limited to 15GB memory peak per node by default. If the queries or commands during the materialization process exceed this value, materialization fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group) to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views workload group to use a max of 64GB memory peak per node during materialization:
+ * Check the memory consumption in the `MemoryPeak` column and whether there are operations that failed due to hitting memory limits
+ (for example, [runaway queries](../../concepts/runaway-queries.md)). The materialization process is limited to 15GB memory peak
+ per node by default. If the queries or commands executed during the materialization process exceed this value, materialization
+ fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group)
+ to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views
+ workload group to use a max of 64GB memory peak per node during materialization:
```kusto
@@ -77,7 +90,8 @@ you can follow the recommendations below to troubleshoot why the materialized vi
> [!NOTE]
> MaxMemoryPerQueryPerNode can't be set to more than 50% of the total memory of each node.
- * Check if the materialization process is hitting cold cache. The following command, for example, shows cache statistics of the materialization process for view `ViewName` in the past day:
+ * Check if the materialization process is hitting cold cache. The following command, for example, shows cache statistics of the
+ materialization process for view `ViewName` in the past day:
```kusto
@@ -102,21 +116,41 @@ you can follow the recommendations below to troubleshoot why the materialized vi
|---|---|---|---|---|---|
|26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
- If the view is not fully in hot cache, materialization can hit disk misses, which significantly slow down materialization.
+ If the view is not fully in hot cache, materialization can hit disk misses, which significantly slows down materialization.
Increasing the caching policy for the materialized view will help avoiding cache misses. You can read more about
[hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
- * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized part of the view in order to find intersections with the "delta". See more about "delta" and "materialized part" in [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
-
-1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, consider the following changes to the definition of view, if they are applicable to your scenario:
- * Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of data scanned from the view, **as long as there is no late arriving data in this column**. See more in [Performance tips](materialized-view-create.md#performance-tips).
- * Use a `lookback` as part of the view definition. Read more about `lookback` in [create materialized view properties](../../includes/materialized-view-create-properties.md).
-
-1. Check if the `MaterializedViewResult` metric shows `InsufficientCapacity` values. This indicates that the cluster doesn't have sufficient ingestion capacity, which should also be noted in the cluster's [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters). You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity) (less recommended).
-
-1. If neither of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient capacity and/or resources to materialize all data on time. You can consider the following options in this case:
- * Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy. Therefore, if you would like to give the cluster more resources to accommodate for materialized views, you need to set the minimum instance count accordingly.
- * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high cardinality key from the materialized view's group by keys, for example. All views will be based on same source table, and each will filter by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) to union between all materialized views, and use that function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore if the single view is failing due to memory limits, this approach can help.
+ * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is
+ high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized
+ part of the view in order to find intersections with the "delta". See more about "delta" and "materialized part" in
+ [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for
+ minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
+
+1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, consider the following
+ changes to the definition of view, if they are applicable to your scenario:
+ * Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of
+ data scanned from the view, **as long as there is no late arriving data in this column**. See more in
+ [Performance tips](materialized-view-create.md#performance-tips).
+ * Use a `lookback` as part of the view definition. Read more about `lookback` in
+ [create materialized view properties](../../includes/materialized-view-create-properties.md).
+
+1. Check if the `MaterializedViewResult` metric shows `InsufficientCapacity` values. This indicates that the cluster doesn't have
+ sufficient ingestion capacity, which should also be noted in the cluster's [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters).
+ You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity)
+ (less recommended).
+
+1. If neither of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient
+ capacity and/or resources to materialize all data on time. You can consider the following options in this case:
+ * Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option)
+ does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy.
+ Therefore, if you would like to give the cluster more resources to accommodate for materialized views, you need to set the minimum
+ instance count accordingly.
+ * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high
+ cardinality key from the materialized view's group by keys, for example. All views will be based on same source table, and each
+ will filter by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define
+ a [stored function](../../query/schema-entities/stored-functions.md) to union between all materialized views, and use that
+ function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore
+ if the single view is failing due to memory limits, this approach can help.
## MaterializedViewResult metric
From 8362d3f806adf38dc869b597958c4a8f91f05062 Mon Sep 17 00:00:00 2001
From: Yifat Schachter
Date: Thu, 16 Jan 2025 13:11:39 +0200
Subject: [PATCH 07/25] pr comments
---
.../materialized-views/materialized-views-monitoring.md | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index bdd26745ae..84f7651c28 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -130,7 +130,8 @@ the view is unhealthy, you can follow the recommendations below to identify the
changes to the definition of view, if they are applicable to your scenario:
* Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of
data scanned from the view, **as long as there is no late arriving data in this column**. See more in
- [Performance tips](materialized-view-create.md#performance-tips).
+ [Performance tips](materialized-view-create.md#performance-tips). Note that this change requires creating a new materialized view,
+ as updates to group by keys of an existing view aren't supported.
* Use a `lookback` as part of the view definition. Read more about `lookback` in
[create materialized view properties](../../includes/materialized-view-create-properties.md).
From 060730456c3493dc23981bd0d53c1cf63ea187cf Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 26 Jan 2025 13:28:30 +0200
Subject: [PATCH 08/25] Troubleshooting materialized views
---
.../materialized-views-monitoring.md | 162 ++++++++----------
1 file changed, 75 insertions(+), 87 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 84f7651c28..76bf142260 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -3,7 +3,7 @@ title: Monitor materialized views
description: This article describes how to monitor materialized views.
ms.reviewer: yifats
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 01/26/2025
---
# Monitor materialized views
@@ -11,22 +11,27 @@ ms.date: 08/11/2024
Monitor the materialized view's health in the following ways:
::: moniker range="azure-data-explorer"
-* Monitor [materialized view metrics](/azure/data-explorer/using-metrics#materialized-view-metrics) in the Azure portal.
- * The materialized view age metric `MaterializedViewAgeSeconds` should be used to monitor the freshness of the view. This should be the primary metric to monitor.
+* Monitor [materialized views metrics](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) in the [Azure portal](https://portal.azure.com/) with [Azure Monitor](/azure/data-explorer/monitor-data-explorer-reference#metrics).
::: moniker-end
+:::moniker range="microsoft-fabric"
+* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
+::: moniker-end
+
+ * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
* Monitor the `IsHealthy` property returned from [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
* Check for failures using [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures).
> [!NOTE]
>
-> Materialization never skips any data, even if there are constant failures. The view is always guaranteed to return the most up-to-date snapshot of the query, based on all records in the source table. Constant failures will significantly degrade query performance, but won't cause incorrect results in view queries.
+> Materialization never skips any data, even if there are constant failures. The view is always guaranteed to return the most up-to-date snapshot of the query, based on all records in the source table. Constant failures significantly degrade query performance, but don't cause incorrect results in view queries.
## Troubleshooting unhealthy materialized views
-If the value of `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that
-the view is unhealthy, you can follow the recommendations below to identify the root cause:
+If the `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that the view is unhealthy, follow these recommendations to identify the root cause:
+
+:::moniker range="azure-data-explorer"
-1. Check how many materialized views there are on the cluster, and what is the current capacity for materialized views:
+1. Use the [.show capacity](../show-capacity-command.md) command to check the number of materialized views and the current capacity for materialized views:
```kusto
.show capacity
@@ -34,48 +39,36 @@ the view is unhealthy, you can follow the recommendations below to identify the
| project Resource, Total, Consumed
```
+ **Output**
+
|Resource|Total|Consumed|
|---|---|---|
|MaterializedView|1|0|
- If there are several materialized views in the cluster, then the concurrency in which they run depends on the value of the
- `Total` capacity. The `Consumed` column shows how many are currently running. The
- [materialized view capacity policy](../capacity-policy.md#materialized-views-capacity-policy), specifies the minimum and
- maximum concurrent operations, and system chooses the current concurrency, noted in `Total`, based on the cluster's
- available resources. You can override the system's decision and increase concurrency of materialization processes by
- setting the minimum concurrent operations in the policy:
+ If there are many materialized views, the concurrency level depends on the capacity shown in the `Total` column. The `Consumed` column shows how many materialized views are currently running. The [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) specifies the minimum and maximum number of concurrent operations. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. You can override the system's decision and increase the concurrency of materialization processes by setting the minimum number of concurrent operations in the policy. The following example sets the minimum number of concurrent operations to 3:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
```
- If you explicitly change this policy, you should monitor the cluster's health and verify other workloads are not impacted by this change.
+ If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
+::: moniker-end
+
+1. Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
+ * If the error is permanent, the system automatically disables the materialized view. To verify whether it's disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
-1. Check if there are failures during materialization process using [`.show materialized-view failures command`](materialized-view-show-failures-command.md#show-materialized-view-failures).
- * If the error is permanent, the system will automatically disable the materialized view. You can identify this case by checking
- the `IsEnabled` column in the [.show materialized-view](materialized-view-show-command.md), and by checking the [Journal](../journal.md)
- for the disable event. This can happen, for example, if there's a change in the schema of the source table that makes it incompatible
- with the materialized view. See more details in the [.create materialized-view command](materialized-view-create.md#supported-properties).
- * If the failure is transient (for example, hitting memory limits, query timeout), the system will automatically retry the
- operation, but such failures can delay the materialization and result in an increase in the materialized view age. See more
- recommendations below about how to troubleshoot transient failures.
+ An example of a permanent failure is a change in the schema of the source table that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
+ * If the failure is transient, the system automatically retries the operation, but the failure can delay the materialization and increase the materialized view age. This type of failure occurs, for example, when hitting memory limits or when a query times out. See the following steps for recommendations on how to troubleshoot transient failures.
+
+1. Analyze the materialization process using the [.show commands-and-queries](../commands-and-queries.md) command. Replace *DatabaseName* and *ViewName* to filter for a specific view:
-1. Analyze the materialization process using [.show commands-and-queries command](../commands-and-queries.md)
- (replace `DatabaseName` and `ViewName` to filter on a specific view):
-
```kusto
.show commands-and-queries
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
- * Check the memory consumption in the `MemoryPeak` column and whether there are operations that failed due to hitting memory limits
- (for example, [runaway queries](../../concepts/runaway-queries.md)). The materialization process is limited to 15GB memory peak
- per node by default. If the queries or commands executed during the materialization process exceed this value, materialization
- fails due to memory limits. You can alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group)
- to increase the memory peak per node in materialization process. For example, the following command will alter the materialized views
- workload group to use a max of 64GB memory peak per node during materialization:
-
-
+ * Check the memory consumption in the `MemoryPeak` column to identify any operations that failed due to hitting memory limits, such as [runaway queries](../../concepts/runaway-queries.md). By default, the materialization process is limited to a 15-GB memory peak per node. If the queries or commands executed during the materialization process exceed this value, materialization fails due to memory limits. To increase the memory peak per node, alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group). The following example alters the materialized views workload group to use a maximum memory peak of 64 GB per node during materialization:
+
```kusto
.alter-merge workload_group ['$materialized-views'] ```
{
@@ -88,89 +81,84 @@ the view is unhealthy, you can follow the recommendations below to identify the
```
> [!NOTE]
- > MaxMemoryPerQueryPerNode can't be set to more than 50% of the total memory of each node.
+ > `MaxMemoryPerQueryPerNode` can't be set to more than 50% of the total memory of each node.
- * Check if the materialization process is hitting cold cache. The following command, for example, shows cache statistics of the
- materialization process for view `ViewName` in the past day:
+ * Check if the materialization process is hitting cold cache. The following example shows cache statistics over the past day for the materialization process of the view `ViewName`:
-
```kusto
.show commands-and-queries
| where ClientActivityId startswith "DN.MaterializedViews;ViewName"
| where StartedOn > ago(1d)
| extend HotCacheHits = tolong(CacheStatistics.Shards.Hot.HitBytes),
HotCacheMisses = tolong(CacheStatistics.Shards.Hot.MissBytes),
- HotCacheRetreived = tolong(CacheStatistics.Shards.Hot.RetrieveBytes),
+ HotCacheRetrieved = tolong(CacheStatistics.Shards.Hot.RetrieveBytes),
ColdCacheHits = tolong(CacheStatistics.Shards.Cold.HitBytes),
ColdCacheMisses = tolong(CacheStatistics.Shards.Cold.MissBytes),
- ColdCacheRetreived = tolong(CacheStatistics.Shards.Cold.RetrieveBytes)
+ ColdCacheRetrieved = tolong(CacheStatistics.Shards.Cold.RetrieveBytes)
| summarize HotCacheHits = format_bytes(sum(HotCacheHits)),
HotCacheMisses = format_bytes(sum(HotCacheMisses)),
- HotCacheRetreived = format_bytes(sum(HotCacheRetreived)),
+ HotCacheRetrieved = format_bytes(sum(HotCacheRetrieved)),
ColdCacheHits =format_bytes(sum(ColdCacheHits)),
ColdCacheMisses = format_bytes(sum(ColdCacheMisses)),
- ColdCacheRetreived = format_bytes(sum(ColdCacheRetreived))
+ ColdCacheRetrieved = format_bytes(sum(ColdCacheRetrieved))
```
-
- |HotCacheHits|HotCacheMisses|HotCacheRetreived|ColdCacheHits|ColdCacheMisses|ColdCacheRetreived|
+
+ **Output**
+
+ |HotCacheHits|HotCacheMisses|HotCacheRetrieved|ColdCacheHits|ColdCacheMisses|ColdCacheRetrieved|
|---|---|---|---|---|---|
|26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
- If the view is not fully in hot cache, materialization can hit disk misses, which significantly slows down materialization.
- Increasing the caching policy for the materialized view will help avoiding cache misses. You can read more about
- [hot and cold cache and caching policy](../cache-policy.md) and how to [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
-
- * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics`. If the number of scanned extents is
- high, and the `MinDataScannedTime` is old, this indicates the materialization cycle requires scanning all, or most, of the materialized
- part of the view in order to find intersections with the "delta". See more about "delta" and "materialized part" in
- [How materialized views work](materialized-view-overview.md#how-materialized-views-work). Below are several recommendations for
- minimizing the intersection with the "delta", and therefore reducing the amount of data scanned in materialization cycles.
-
-1. If the analysis above shows that each materialization cycle scans much data, potentially cold cache as well, consider the following
- changes to the definition of view, if they are applicable to your scenario:
- * Include a `datetime` group by key in the view definition. A `datetime` group by key can significantly reduce the amount of
- data scanned from the view, **as long as there is no late arriving data in this column**. See more in
- [Performance tips](materialized-view-create.md#performance-tips). Note that this change requires creating a new materialized view,
- as updates to group by keys of an existing view aren't supported.
- * Use a `lookback` as part of the view definition. Read more about `lookback` in
- [create materialized view properties](../../includes/materialized-view-create-properties.md).
-
-1. Check if the `MaterializedViewResult` metric shows `InsufficientCapacity` values. This indicates that the cluster doesn't have
- sufficient ingestion capacity, which should also be noted in the cluster's [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters).
- You can increase ingestion capacity by scaling the cluster, or by altering the cluster's [ingestion capacity policy](../capacity-policy.md#ingestion-capacity)
- (less recommended).
-
-1. If neither of the above suggestions work, and the view is still unhealthy, this indicates that the cluster doesn't have sufficient
- capacity and/or resources to materialize all data on time. You can consider the following options in this case:
- * Scale out the cluster by increasing the min instance count. [Optimized autoscale](../../../manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option)
- does not take materialized views into consideration and does not scale out the cluster automatically if materialized views are unhealthy.
- Therefore, if you would like to give the cluster more resources to accommodate for materialized views, you need to set the minimum
- instance count accordingly.
- * Split the materialized view into several smaller views, each covering a subset of the data. You can split based on some high
- cardinality key from the materialized view's group by keys, for example. All views will be based on same source table, and each
- will filter by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define
- a [stored function](../../query/schema-entities/stored-functions.md) to union between all materialized views, and use that
- function in queries. This option might consume more CPU, but will reduce the memory peak in materialization cycles, and therefore
- if the single view is failing due to memory limits, this approach can help.
+ If the view isn’t fully in the hot cache, materialization can experience disk misses, significantly slowing down the process.
+
+ Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
+
+ * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). See the following recommendations for ways to reduce the amount of data scanned in materialization cycles by minimizing the intersection with the *delta*.
+
+1. If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
+ * Include a `datetime` group-by key in the view definition. This can significantly reduce the amount of data scanned from the view, **as long as there is no late-arriving data in this column**. For more information, see [Performance tips](materialized-view-create.md#performance-tips). You need to create a new materialized view since updates to group-by keys aren't supported.
+ * Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
+:::moniker range="azure-data-explorer"
+
+1. Check whether there's enough ingestion capacity by checking if either the [`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
+
+::: moniker-end
+:::moniker range="microsoft-fabric"
+
+1. Check whether there's enough ingestion capacity by checking if the [`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available.
+
+::: moniker-end
+
+1. If the materialized view is still unhealthy, then the service doesn't have sufficient capacity and/or resources to materialize all the data on time. Consider the following options:
+:::moniker range="azure-data-explorer"
+ * Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
+::: moniker-end
+:::moniker range="microsoft-fabric"
+
+ * Scale out the Eventhouse to provide the Eventhouse with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
+::: moniker-end
+ * Divide the materialized view into several smaller views, each covering a subset of the data. For instance, you can split them based on a high-cardinality key from the materialized view's group-by keys. All views are based on the same source table, and each view filters by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) that [unions](../../query/union-operator.md) all the smaller materialized views. Use this function in queries to access the combined data.
+
+ While splitting the view might consume more CPU, it reduces the memory peak in materialization cycles. This can help if the single view is failing due to memory limits.
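
The split described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names `SourceTable`, `Key`, `Timestamp`, `ViewPart0`, `ViewPart1`, and `UnionedView` are hypothetical, the `arg_max` aggregation stands in for whatever aggregation the original view uses, and the sketch assumes a split into two views. Run each command separately.

```kusto
// Hypothetical names throughout; replace them with your own table, key, and view names.
// First half of the data: rows whose key hashes to bucket 0 out of 2.
.create materialized-view ViewPart0 on table SourceTable
{
    SourceTable
    | where hash(Key, 2) == 0
    | summarize arg_max(Timestamp, *) by Key
}

// Second half of the data: rows whose key hashes to bucket 1 out of 2.
.create materialized-view ViewPart1 on table SourceTable
{
    SourceTable
    | where hash(Key, 2) == 1
    | summarize arg_max(Timestamp, *) by Key
}

// Stored function that unions the smaller views so that queries see the combined data.
.create-or-alter function UnionedView()
{
    union ViewPart0, ViewPart1
}
```

Queries would then call `UnionedView()` instead of referencing a single view. Because every key hashes to exactly one bucket, each key is aggregated in exactly one of the smaller views, so the union doesn't produce duplicate groups.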
## MaterializedViewResult metric
-The `MaterializedViewResult` metric provides information about the result of a materialization cycle, and can be used to identify issues in the materialized view health status. The metric includes the `Database` and `MaterializedViewName` and a `Result` dimension.
+The `MaterializedViewResult` metric provides information about the result of a materialization cycle and can be used to identify issues in the materialized view health status. The metric includes the `Database`, `MaterializedViewName`, and `Result` dimensions.
The `Result` dimension can have one of the following values:
-
-* **Success**: Materialization completed successfully.
-* **SourceTableNotFound**: Source table of the materialization view was dropped. The materialized view is automatically disabled as a result.
-* **SourceTableSchemaChange**: The schema of the source table has changed in a way that isn't compatible with the materialized view definition (materialized view query doesn't match the materialized view schema). The materialized view is automatically disabled as a result.
-* **InsufficientCapacity**: The cluster doesn't have sufficient capacity to materialize the materialized view, due to lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). Insufficient capacity failures can be transient, but if they reoccur often we recommend scaling out the cluster or increasing relevant capacity in the policy.
-* **InsufficientResources:** The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. This failure may be transient, but if it reoccurs try scaling the cluster up or out, and/or following the suggestions in the [troubleshooting section](#troubleshooting-unhealthy-materialized-views).
+
+* **Success**: The materialization completed successfully.
+* **SourceTableNotFound**: The source table of the materialized view was dropped, so the materialized view is disabled automatically.
+* **SourceTableSchemaChange**: The schema of the source table changed in a way that isn’t compatible with the materialized view definition. Since the materialized view query no longer matches the materialized view schema, the materialized view is disabled automatically.
+* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing the relevant capacity in the policy.
+* **InsufficientResources**: The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. While insufficient resource errors might be transient, if they reoccur often, try scaling up or scaling out. For more ideas, see [Troubleshooting unhealthy materialized views](#troubleshooting-unhealthy-materialized-views).
## Materialized views in follower databases
Materialized views can be defined in [follower databases](materialized-views-limitations.md#follower-databases). However, the monitoring of these materialized views should be based on the leader database, where the materialized view is defined. Specifically:
::: moniker range="azure-data-explorer"
-* [Metrics](/azure/data-explorer/using-metrics#materialized-view-metrics) related to materialized view execution (`MaterializedViewResult`, `MaterializedViewExtentsRebuild`) are only present in the leader database. Metrics related to monitoring (`MaterializedViewAgeSeconds`, `MaterializedViewHealth`, `MaterializedViewRecordsInDelta`) will also appear in the follower databases.
+* [Metrics](/azure/data-explorer/using-metrics#materialized-view-metrics) related to materialized view execution (`MaterializedViewResult`, `MaterializedViewExtentsRebuild`) are only present in the leader database. Metrics related to monitoring (`MaterializedViewAgeSeconds`, `MaterializedViewHealth`, `MaterializedViewRecordsInDelta`) also appear in the follower databases.
::: moniker-end
* The [.show materialized-view failures command](materialized-view-show-failures-command.md) only works in the leader database.
From 38c9c7af5dc382886253fe5c02ea62259095f5e2 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 26 Jan 2025 13:48:40 +0200
Subject: [PATCH 09/25] edits
---
.../materialized-views-monitoring.md | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 76bf142260..4e407a2efd 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -112,7 +112,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
If the view isn’t fully in the hot cache, materialization can experience disk misses, significantly slowing down the process.
Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
-
+
 * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). See the following recommendations for ways to reduce the amount of data scanned in materialization cycles by minimizing the intersection with the *delta*.
1. If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
@@ -120,7 +120,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
:::moniker range="azure-data-explorer"
-1. Check whether there's enough ingestion capacity by checking if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](../../../monitor-data-explorer-reference.md#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
+1. Check whether there's enough ingestion capacity by checking if either the [`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
::: moniker-end
:::moniker range="microsoft-fabric"
@@ -130,13 +130,13 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
::: moniker-end
1. If the materialized view is still unhealthy, then the service doesn't have sufficient capacity and/or resources to materialize all the data on time. Consider the following options:
-:::moniker range="azure-data-explorer"
+ :::moniker range="azure-data-explorer"
* Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
-::: moniker-end
-:::moniker range="microsoft-fabric"
-
+ ::: moniker-end
+ :::moniker range="microsoft-fabric"
+
* Scale out the Eventhouse to provide the Eventhouse with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
-::: moniker-end
+ ::: moniker-end
 * Divide the materialized view into several smaller views, each covering a subset of the data. For instance, you can split them based on a high-cardinality key from the materialized view's group-by keys. All views are based on the same source table, and each view filters by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) that [unions](../../query/union-operator.md) all the smaller materialized views. Use this function in queries to access the combined data.
 While splitting the view might consume more CPU, it reduces the memory peak in materialization cycles. This can help if the single view is failing due to memory limits.
@@ -166,7 +166,6 @@ Materialized views can be defined in [follower databases](materialized-views-lim
**Materialized views resource consumption:** the resources consumed by the materialized views materialization process can be tracked using the [`.show commands-and-queries`](../commands-and-queries.md) command. Filter the records for a specific view using the following (replace `DatabaseName` and `ViewName`):
-
```kusto
.show commands-and-queries
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
From 2b68a1460809b4106cbb31f24324b674d4138216 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 26 Jan 2025 13:51:56 +0200
Subject: [PATCH 10/25] edits
---
.../materialized-views/materialized-views-monitoring.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 4e407a2efd..022d60c928 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -131,7 +131,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
1. If the materialized view is still unhealthy, then the service doesn't have sufficient capacity and/or resources to materialize all the data on time. Consider the following options:
:::moniker range="azure-data-explorer"
- * Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling.md#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
+ * Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
::: moniker-end
:::moniker range="microsoft-fabric"
From 84683affcc466da2a97a647b359ea090dc774480 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 26 Jan 2025 14:12:24 +0200
Subject: [PATCH 11/25] edits
---
.../materialized-views-monitoring.md | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 022d60c928..1e63daef54 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -17,8 +17,8 @@ Monitor the materialized view's health in the following ways:
* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
::: moniker-end
- * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
-* Monitor the `IsHealthy` property returned from [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
+ * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
+* Monitor the `IsHealthy` property using [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
* Check for failures using [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures).
> [!NOTE]
@@ -45,18 +45,17 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
|---|---|---|
|MaterializedView|1|0|
- If there are many materialized views, concurrency level depends on the capacity shown in the `Total` column. The `Consumed` column shows how many materialized views are currently running. The [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) specifies the minimum and maximum number of concurrent operations. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. You can override the system's decision and increase the concurrency of materialization processes by setting the minimum number of concurrent operations in the policy. The following example changes the minimum concurrent operations to 3:
+ * If there are many materialized views, concurrency level depends on the capacity shown in the `Total` column while the `Consumed` column shows how many materialized views are currently running. The [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) specifies the minimum and maximum number of concurrent operations. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. You can override the system's decision and increase the concurrency of materialization processes by setting the minimum number of concurrent operations in the policy. The following example changes the minimum concurrent operations to 3:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
```
- If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
+ * If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
::: moniker-end
1. Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
 * If the error is permanent, the system automatically disables the materialized view. To verify whether it's disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
-
An example of a permanent failure is a change in the schema of the source table that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
 * If the failure is transient, the system automatically retries the operation, but the failure can delay the materialization and result in an increase in the materialized view age. This type of failure occurs, for example, when hitting memory limits or with a query time-out. See the following recommendations on how to troubleshoot transient failures.
@@ -112,8 +111,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
If the view isn’t fully in the hot cache, materialization can experience disk misses, significantly slowing down the process.
Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
-
- * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). See the following recommendations for ways to reduce the amount of data scanned in materialized cycles by minimizing the intersection with the *delta*.
+ * Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). The following recommendations provide ways to reduce the amount of data scanned in materialization cycles by minimizing the intersection with the *delta*.
1. If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
 * Include a `datetime` group-by key in the view definition. This can significantly reduce the amount of data scanned from the view, **as long as there is no late-arriving data in this column**. For more information, see [Performance tips](materialized-view-create.md#performance-tips). You need to create a new materialized view since updates to group-by keys aren't supported.
From cb0b0c7b2962e48ed4b801c5ae313ba9f104d9b7 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 09:54:45 +0200
Subject: [PATCH 12/25] moniker issue
---
.../materialized-views-monitoring.md | 20 +++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 1e63daef54..f05e445065 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -3,7 +3,7 @@ title: Monitor materialized views
description: This article describes how to monitor materialized views.
ms.reviewer: yifats
ms.topic: reference
-ms.date: 01/26/2025
+ms.date: 01/27/2025
---
# Monitor materialized views
@@ -28,7 +28,7 @@ Monitor the materialized view's health in the following ways:
## Troubleshooting unhealthy materialized views
If the `MaterializedViewAge` metric constantly increases, and the `MaterializedViewHealth` metric shows that the view is unhealthy, follow these recommendations to identify the root cause:
-
+
:::moniker range="azure-data-explorer"
1. Check the number of materialized views using the [.show capacity](../show-capacity-command.md) command and the current capacity for materialized views:
@@ -53,7 +53,6 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
::: moniker-end
-
1. Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
 * If the error is permanent, the system automatically disables the materialized view. To verify whether it's disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
An example of a permanent failure is a change in the schema of the source table that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
@@ -118,13 +117,11 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
:::moniker range="azure-data-explorer"
-1. Check whether there's enough ingestion capacity by checking if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
-
+1. Check whether there's enough ingestion capacity by checking if either the [`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
::: moniker-end
:::moniker range="microsoft-fabric"
-1. Check whether there's enough ingestion capacity by checking if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the resources available.
-
+1. Check whether there's enough ingestion capacity by checking if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
::: moniker-end
1. If the materialized view is still unhealthy, then the service doesn't have sufficient capacity and/or resources to materialize all the data on time. Consider the following options:
@@ -132,23 +129,26 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
::: moniker-end
:::moniker range="microsoft-fabric"
-
* Scale out the Eventhouse to provide the Eventhouse with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
::: moniker-end
* Divide the materialized view into several smaller views, each covering a subset of the data. For instance, you can split them based on a high cardinality key from the materialized view's group-by keys. All views are based on same source table, and each view filters by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) that [unions](../../query/union-operator.md) all the smaller materialized views. Use this function in queries to access the combined data.
- While splitting the view might consume more CPUs, it reduces the memory peak in materialization cycles. This can help if the single view is failing due to memory limits.
+ While splitting the view might consume more CPUs, it reduces the memory peak in materialization cycles. Reducing the memory peak can help if the single view is failing due to memory limits.
## MaterializedViewResult metric
The `MaterializedViewResult` metric provides information about the result of a materialization cycle and can be used to identify issues in the materialized view health status. The metric includes the `Database` and `MaterializedViewName` and a `Result` dimension.
The `Result` dimension can have one of the following values:
-
+
* **Success**: The materialization completed successfully.
* **SourceTableNotFound**: The source table of the materialized view was dropped, so the materialized view is disabled automatically.
* **SourceTableSchemaChange**: The schema of the source table changed in a way that isn’t compatible with the materialized view definition. Since the materialized view query no longer matches the materialized view schema, the materialized view is disabled automatically.
+:::moniker range="azure-data-explorer"
* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing the relevant capacity in the policy.
+::: moniker-end
+* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of ingestion capacity. While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing capacity. For more information, see [Plan your capacity size](/fabric/enterprise/plan-capacity).
+:::moniker range="microsoft-fabric"
* **InsufficientResources:** The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. While insufficient resource errors might be transient, if they reoccur often, try scaling up or scaling out. For more ideas, see [Troubleshooting unhealthy materialized views](#troubleshooting-unhealthy-materialized-views).
## Materialized views in follower databases
From 972a9c2df05631c26ca8847a0f8310f6f36340c7 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 10:15:29 +0200
Subject: [PATCH 13/25] edits
---
.../materialized-views/materialized-views-monitoring.md | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index f05e445065..052ae4f9a9 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -12,12 +12,15 @@ ms.date: 01/27/2025
Monitor the materialized view's health in the following ways:
::: moniker range="azure-data-explorer"
* Monitor [materialized views metrics](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) in the [Azure portal](https://portal.azure.com/) with [Azure Monitor](/azure/data-explorer/monitor-data-explorer-reference#metrics).
+
+ * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
::: moniker-end
:::moniker range="microsoft-fabric"
* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
-::: moniker-end
* Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
+::: moniker-end
+
* Monitor the `IsHealthy` property using [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
* Check for failures using [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures).
@@ -147,8 +150,9 @@ The `Result` dimension can have one of the following values:
:::moniker range="azure-data-explorer"
* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing the relevant capacity in the policy.
::: moniker-end
-* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of ingestion capacity. While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing capacity. For more information, see [Plan your capacity size](/fabric/enterprise/plan-capacity).
:::moniker range="microsoft-fabric"
+* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of ingestion capacity. While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing capacity. For more information, see [Plan your capacity size](/fabric/enterprise/plan-capacity).
+::: moniker-end
* **InsufficientResources:** The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. While insufficient resource errors might be transient, if they reoccur often, try scaling up or scaling out. For more ideas, see [Troubleshooting unhealthy materialized views](#troubleshooting-unhealthy-materialized-views).
## Materialized views in follower databases
From 1925e318a3517b489d1bcfa73a3cc0e919df679b Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 11:25:29 +0200
Subject: [PATCH 14/25] edits
---
.../materialized-views-monitoring.md | 28 +++++++++----------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 052ae4f9a9..755fb23943 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -3,7 +3,7 @@ title: Monitor materialized views
description: This article describes how to monitor materialized views.
ms.reviewer: yifats
ms.topic: reference
-ms.date: 01/27/2025
+ms.date: 01/28/2025
---
# Monitor materialized views
@@ -11,14 +11,12 @@ ms.date: 01/27/2025
Monitor the materialized view's health in the following ways:
::: moniker range="azure-data-explorer"
-* Monitor [materialized views metrics](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) in the [Azure portal](https://portal.azure.com/) with [Azure Monitor](/azure/data-explorer/monitor-data-explorer-reference#metrics).
+* Monitor [materialized views metrics](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) in the [Azure portal](https://portal.azure.com/) with [Azure Monitor](/azure/data-explorer/monitor-data-explorer-reference#metrics). Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
- * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
::: moniker-end
:::moniker range="microsoft-fabric"
-* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
+* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. Use the materialized view age metric, `MaterializedViewAgeSeconds` as the primary metric to monitor the freshness of the view. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
- * Use the materialized view age metric, `MaterializedViewAgeSeconds`, as the primary metric to monitor the freshness of the view.
::: moniker-end
* Monitor the `IsHealthy` property using [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
@@ -34,7 +32,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
:::moniker range="azure-data-explorer"
-1. Check the number of materialized views using the [.show capacity](../show-capacity-command.md) command and the current capacity for materialized views:
+* Check the number of materialized views using the [.show capacity](../show-capacity-command.md) command and the current capacity for materialized views:
```kusto
.show capacity
@@ -56,12 +54,12 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
::: moniker-end
-1. Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
+* Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
* If the error is permanent, the system automatically disables the materialized view. To verify whether its disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
An example of a permanent failure is a change in the schema of the source table that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
* If the failure is transient, the system automatically retries the operation, but the failure can delay the materialization and result in an increase in the materialized view age. This type of failure occurs, for example, when hitting memory limits or with a query time-out. For more recommendations, see the following recommendations on how to troubleshoot transient failures.
-1. Analyze the materialization process using the [.show commands-and-queries](../commands-and-queries.md) command. Replace *Databasename* and *ViewName* to filter for a specific view:
+* Analyze the materialization process using the [.show commands-and-queries](../commands-and-queries.md) command. Replace *Databasename* and *ViewName* to filter for a specific view:
```kusto
.show commands-and-queries
@@ -115,28 +113,28 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
* Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). The following recommendations provide ways to reduce the amount of data scanned in materialized cycles by minimizing the intersection with the *delta*.
-1. If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
+* If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
* Include a `datetime` group-by key in the view definition. This can significantly reduce the view's scanned data scanned, **as long as there is no late arriving data in this column**. For more information, see [Performance tips](materialized-view-create.md#performance-tips). You need to create a new materialized view since updates to group-by keys aren't supported.
* Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
:::moniker range="azure-data-explorer"
-1. Check whether there's enough ingestion capacity by checking if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
+* Check whether there's enough ingestion capacity by checking if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
::: moniker-end
:::moniker range="microsoft-fabric"
-1. Check whether there's enough ingestion capacity by checking if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
+* Check whether there's enough ingestion capacity by checking if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
::: moniker-end
-1. If the materialized view is still unhealthy, then the service doesn't have sufficient capacity and/or resources to materialize all the data on time. Consider the following options:
+* If the materialized view is still unhealthy, then the service doesn't have sufficient capacity or resources to materialize all the data on time. Consider the following options:
:::moniker range="azure-data-explorer"
* Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
::: moniker-end
:::moniker range="microsoft-fabric"
- * Scale out the Eventhouse to provide the Eventhouse with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
+ * Scale out the Eventhouse to provide it with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
::: moniker-end
- * Divide the materialized view into several smaller views, each covering a subset of the data. For instance, you can split them based on a high cardinality key from the materialized view's group-by keys. All views are based on same source table, and each view filters by `SourceTable | where hash(key, number_of_views) == i` where `i ∈ {0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) that [unions](../../query/union-operator.md) all the smaller materialized views. Use this function in queries to access the combined data.
+ * Divide the materialized view into several smaller views, each covering a subset of the data. For instance, you can split them based on a high cardinality key from the materialized view's group-by keys. All views are based on the same source table, and each view filters by `SourceTable | where hash(key, number_of_views) == i`, where `i` is part of the set `{0,1,…,number_of_views-1}`. Then, you can define a [stored function](../../query/schema-entities/stored-functions.md) that [unions](../../query/union-operator.md) all the smaller materialized views. Use this function in queries to access the combined data.
- While splitting the view might consume more CPUs, it reduces the memory peak in materialization cycles. Reducing the memory peak can help if the single view is failing due to memory limits.
+ While splitting the view might increase CPU usage, it reduces the memory peak in materialization cycles. Reducing the memory peak can help if the single view is failing due to memory limits.
## MaterializedViewResult metric
From eb633a4473fccac783d28f116ec53da0501a063b Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 12:58:43 +0200
Subject: [PATCH 15/25] edits
---
.../materialized-views-monitoring.md | 25 ++++++++++---------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 755fb23943..8cb5788b51 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -54,10 +54,11 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* If you explicitly change this policy, monitor the cluster's health and ensure that other workloads aren't affected by this change.
::: moniker-end
+
* Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
* If the error is permanent, the system automatically disables the materialized view. To verify whether its disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
- An example of a permanent failure is a change in the schema of the source table that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
- * If the failure is transient, the system automatically retries the operation, but the failure can delay the materialization and result in an increase in the materialized view age. This type of failure occurs, for example, when hitting memory limits or with a query time-out. For more recommendations, see the following recommendations on how to troubleshoot transient failures.
+ An example of a permanent failure is a change in the source table schema that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
+ * If the failure is transient, the system automatically retries the operation. However, the failure can delay the materialization and increase the age of the materialized view. This type of failure occurs, for example, when hitting memory limits or with a query time-out. See the following recommendations for more ways to troubleshoot transient failures.
* Analyze the materialization process using the [.show commands-and-queries](../commands-and-queries.md) command. Replace *Databasename* and *ViewName* to filter for a specific view:
@@ -66,7 +67,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
- * Check the memory consumption in the `MemoryPeak` column to identify any operations that failed due to hitting memory limits, such as, [runaway queries](../../concepts/runaway-queries.md). By default, the materialization process is limited to a 15-GB memory peak per node. If the queries or commands executed during the materialization process exceed this value, materialization fails due to memory limits. To increase the memory peak per node, alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group). The following example alters the materialized views workload group to use a max of 64-GB memory peak per node during materialization:
+ * Check the memory consumption in the `MemoryPeak` column to identify any operations that failed due to hitting memory limits, such as [runaway queries](../../concepts/runaway-queries.md). By default, the materialization process is limited to a 15-GB memory peak per node. If the queries or commands executed during the materialization process exceed this value, the materialization fails due to memory limits. To increase the memory peak per node, alter the [$materialized-views workload group](../workload-groups.md#materialized-views-workload-group). The following example alters the materialized views workload group to use a maximum of 64-GB memory peak per node during materialization:
```kusto
.alter-merge workload_group ['$materialized-views'] ```
@@ -80,9 +81,9 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
```
> [!NOTE]
- > `MaxMemoryPerQueryPerNode` can't be set to more than 50% of the total memory of each node.
+ > `MaxMemoryPerQueryPerNode` can't exceed 50% of the total memory available on each node.
- * Check if the materialization process is hitting cold cache. The following example shows cache statistics over the past day for the materialization process of the view `ViewName`:
+ * Check if the materialization process is hitting cold cache. The following example shows cache statistics over the past day for the materialized view, `ViewName`:
```kusto
.show commands-and-queries
@@ -108,26 +109,26 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
|---|---|---|---|---|---|
|26 GB|0 Bytes|0 Bytes|1 GB|0 Bytes|866 MB|
- If the view isn’t fully in the hot cache, materialization can experience disk misses, significantly slowing down the process.
+ * If the view isn’t fully in the hot cache, materialization can experience disk misses, significantly slowing down the process.
- Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
+ * Increasing the caching policy for the materialized view helps avoid cache misses. For more information, see [hot and cold cache and caching policy](../cache-policy.md) and [.alter materialized-view policy caching command](../alter-materialized-view-cache-policy-command.md).
* Check if the materialization is scanning old records by checking the `ScannedExtentsStatistics` with the [.show queries](../show-queries-command.md) command. If the number of scanned extents is high and the `MinDataScannedTime` is old, the materialization cycle needs to scan all, or most, of the *materialized* part of the view. The scan is needed to find intersections with the *delta*. For more information about the *delta* and the *materialized* part, see [How materialized views work](materialized-view-overview.md#how-materialized-views-work). The following recommendations provide ways to reduce the amount of data scanned in materialized cycles by minimizing the intersection with the *delta*.
-* If the materialization cycle scans a large amount of data, potentially including cold cache, consider the following changes to the materialized view definition:
- * Include a `datetime` group-by key in the view definition. This can significantly reduce the view's scanned data scanned, **as long as there is no late arriving data in this column**. For more information, see [Performance tips](materialized-view-create.md#performance-tips). You need to create a new materialized view since updates to group-by keys aren't supported.
+* If the materialization cycle scans a large amount of data, potentially including cold cache, consider making the following changes to the materialized view definition:
+ * Include a `datetime` group-by key in the view definition. This can significantly reduce the amount of data scanned, **as long as there is no late-arriving data in this column**. For more information, see [Performance tips](materialized-view-create.md#performance-tips). You need to create a new materialized view since updates to group-by keys aren't supported.
* Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
:::moniker range="azure-data-explorer"
-* Check whether there's enough ingestion capacity by checking if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
+* Check whether there's enough ingestion capacity by verifying if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) show `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
::: moniker-end
:::moniker range="microsoft-fabric"
-* Check whether there's enough ingestion capacity by checking if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
+* Check whether there's enough ingestion capacity by verifying if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
::: moniker-end
* If the materialized view is still unhealthy, then the service doesn't have sufficient capacity or resources to materialize all the data on time. Consider the following options:
:::moniker range="azure-data-explorer"
- * Scale out the cluster by increasing the min instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
+ * Scale out the cluster by increasing the minimum instance count. [Optimized autoscale](/azure/data-explorer/manage-cluster-horizontal-scaling#optimized-autoscale-recommended-option) doesn't take materialized views into consideration and doesn't scale out the cluster automatically if materialized views are unhealthy. You need to set the minimum instance count to provide the cluster with more resources to accommodate materialized views.
::: moniker-end
:::moniker range="microsoft-fabric"
* Scale out the Eventhouse to provide it with more resources to accommodate materialized views. For more information, see [Enable minimum consumption](/fabric/real-time-intelligence/manage-monitor-eventhouse#enable-minimum-consumption).
From 467692b9e519d8ee0d4af1ba29e0cbfeee48b4b0 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 14:10:12 +0200
Subject: [PATCH 16/25] spaces
---
.../materialized-views/materialized-views-monitoring.md | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 8cb5788b51..0eab3f4bbf 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -18,8 +18,8 @@ Monitor the materialized view's health in the following ways:
* Monitor [materialized view metrics](/fabric/real-time-intelligence/monitor-metrics#metric-specific-dimension-column) in your Microsoft Fabric workspace. Use the materialized view age metric, `MaterializedViewAgeSeconds` as the primary metric to monitor the freshness of the view. For more information, see [Enable monitoring in your workspace](/fabric/get-started/enable-workspace-monitoring).
::: moniker-end
-
* Monitor the `IsHealthy` property using [`.show materialized-view`](materialized-view-show-command.md#show-materialized-views).
+
* Check for failures using [`.show materialized-view failures`](materialized-view-show-failures-command.md#show-materialized-view-failures).
> [!NOTE]
@@ -119,11 +119,11 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
* Use a `lookback` as part of the view definition. For more information, see [.create materialized view supported properties](materialized-view-create.md#supported-properties).
:::moniker range="azure-data-explorer"
-* Check whether there's enough ingestion capacity by verifying if either the[`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) show `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
+* Check whether there's enough ingestion capacity by verifying if either the [`MaterializedViewResult` metric](#materializedviewresult-metric) or [IngestionUtilization metric](/azure/data-explorer/monitor-data-explorer-reference#supported-metrics-for-microsoftkustoclusters) show `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources (preferred) or by altering the [ingestion capacity policy](../capacity-policy.md#ingestion-capacity).
::: moniker-end
:::moniker range="microsoft-fabric"
-* Check whether there's enough ingestion capacity by verifying if the[`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
+* Check whether there's enough ingestion capacity by verifying if the [`MaterializedViewResult` metric](#materializedviewresult-metric) shows `InsufficientCapacity` values. You can increase ingestion capacity by scaling the available resources.
::: moniker-end
* If the materialized view is still unhealthy, then the service doesn't have sufficient capacity or resources to materialize all the data on time. Consider the following options:
@@ -144,7 +144,9 @@ The `MaterializedViewResult` metric provides information about the result of a m
The `Result` dimension can have one of the following values:
* **Success**: The materialization completed successfully.
+
* **SourceTableNotFound**: The source table of the materialized view was dropped, so the materialized view is disabled automatically.
+
* **SourceTableSchemaChange**: The schema of the source table changed in a way that isn’t compatible with the materialized view definition. Since the materialized view query no longer matches the materialized view schema, the materialized view is disabled automatically.
:::moniker range="azure-data-explorer"
* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of [ingestion capacity](../capacity-policy.md#ingestion-capacity). While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing the relevant capacity in the policy.
@@ -152,6 +154,7 @@ The `Result` dimension can have one of the following values:
:::moniker range="microsoft-fabric"
* **InsufficientCapacity**: The instance doesn't have sufficient capacity to materialize the materialized view, due to a lack of ingestion capacity. While insufficient capacity failures can be transient, if they reoccur often, try scaling out the instance or increasing capacity. For more information, see [Plan your capacity size](/fabric/enterprise/plan-capacity).
::: moniker-end
+
* **InsufficientResources:** The database doesn't have sufficient resources (CPU/memory) to materialize the materialized view. While insufficient resource errors might be transient, if they reoccur often, try scaling up or scaling out. For more ideas, see [Troubleshooting unhealthy materialized views](#troubleshooting-unhealthy-materialized-views).
## Materialized views in follower databases
From b34a59347c03389684169cc24eb3f553cee7ad7f Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 15:16:42 +0200
Subject: [PATCH 17/25] spaces
---
.../materialized-views-monitoring.md | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 0eab3f4bbf..2b2df18cd2 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -46,7 +46,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
|---|---|---|
|MaterializedView|1|0|
- * If there are many materialized views, concurrency level depends on the capacity shown in the `Total` column while the `Consumed` column shows how many materialized views are currently running. The [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) specifies the minimum and maximum number of concurrent operations. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. You can override the system's decision and increase the concurrency of materialization processes by setting the minimum number of concurrent operations in the policy. The following example changes the minimum concurrent operations to 3:
+ * The concurrency level of multiple materialized views depends on the capacity shown in the `Total` column, while the `Consumed` column shows the number of materialized views currently running. You can use the [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to specify the minimum and maximum number of concurrent operations, overriding the system's default concurrency level. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. The following example overrides the system's decision and changes the minimum concurrent operations to 3:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
@@ -56,8 +56,8 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
::: moniker-end
* Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
- * If the error is permanent, the system automatically disables the materialized view. To verify whether its disabled, use the [.show materialized-view](materialized-view-show-command.md) command to check if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
- An example of a permanent failure is a change in the source table schema that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
+ * If the error is permanent, the system automatically disables the materialized view. To check if it's disabled, use the [.show materialized-view](materialized-view-show-command.md) command and see if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
+ An example of a permanent failure is a source table schema change that makes the table incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
* If the failure is transient, the system automatically retries the operation. However, the failure can delay the materialization and increase the age of the materialized view. This type of failure occurs, for example, when hitting memory limits or with a query time-out. See the following recommendations for more ways to troubleshoot transient failures.
* Analyze the materialization process using the [.show commands-and-queries](../commands-and-queries.md) command. Replace *Databasename* and *ViewName* to filter for a specific view:
@@ -174,3 +174,8 @@ Materialized views can be defined in [follower databases](materialized-views-lim
.show commands-and-queries
| where Database == "DatabaseName" and ClientActivityId startswith "DN.MaterializedViews;ViewName;"
```
+
+## Related content
+
+* [Materialized views](materialized-view-overview.md)
+*
\ No newline at end of file
From f7f059c6a530bbf0a95d1ce4a13c59791dd4a659 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 15:22:26 +0200
Subject: [PATCH 18/25] edits
---
.../materialized-views/materialized-views-monitoring.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 2b2df18cd2..5a096c4ab3 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -178,4 +178,4 @@ Materialized views can be defined in [follower databases](materialized-views-lim
## Related content
* [Materialized views](materialized-view-overview.md)
-*
\ No newline at end of file
+* [Materialized views use cases](materialized-view-use-cases.md)
\ No newline at end of file
From 1357ea134045a0ca899b464e5a853266be6c9b17 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 17:28:40 +0200
Subject: [PATCH 19/25] Update
data-explorer/kusto/query/array-sort-asc-function.md
---
data-explorer/kusto/query/array-sort-asc-function.md | 1 -
1 file changed, 1 deletion(-)
diff --git a/data-explorer/kusto/query/array-sort-asc-function.md b/data-explorer/kusto/query/array-sort-asc-function.md
index 7fcb2c957a..e5611e235d 100644
--- a/data-explorer/kusto/query/array-sort-asc-function.md
+++ b/data-explorer/kusto/query/array-sort-asc-function.md
@@ -155,4 +155,3 @@ print array_sort_asc(dynamic([null,"blue","yellow","green",null]), false)
* [Aggregation function types at a glance](aggregation-functions.md)
* [array_sort_desc()](array-sort-desc-function.md)
-*
From 97b34106ff336c4540c4afb60bb273be274cab75 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 28 Jan 2025 17:29:16 +0200
Subject: [PATCH 20/25] Update
data-explorer/kusto/query/array-sort-desc-function.md
---
data-explorer/kusto/query/array-sort-desc-function.md | 1 -
1 file changed, 1 deletion(-)
diff --git a/data-explorer/kusto/query/array-sort-desc-function.md b/data-explorer/kusto/query/array-sort-desc-function.md
index 348b2be90a..f972cd42f2 100644
--- a/data-explorer/kusto/query/array-sort-desc-function.md
+++ b/data-explorer/kusto/query/array-sort-desc-function.md
@@ -157,4 +157,3 @@ print array_sort_desc(dynamic([null,"blue","yellow","green",null]), false)
* [Aggregation function types at a glance](aggregation-functions.md)
* [array_sort_asc()](array-sort-asc-function.md)
-*
From 42c4255a40386dd005b8d54016618c49ede7f1c6 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 2 Feb 2025 21:16:20 +0200
Subject: [PATCH 21/25] edits
---
.../kusto/query/array-sort-asc-function.md | 50 +++++++++++-------
.../kusto/query/array-sort-desc-function.md | 52 +++++++++++--------
.../kusto/query/hll-aggregation-function.md | 4 +-
.../query/hll-if-aggregation-function.md | 4 +-
.../kusto/query/hll-merge-function.md | 13 +++--
5 files changed, 75 insertions(+), 48 deletions(-)
diff --git a/data-explorer/kusto/query/array-sort-asc-function.md b/data-explorer/kusto/query/array-sort-asc-function.md
index 7fcb2c957a..d88a112c6a 100644
--- a/data-explorer/kusto/query/array-sort-asc-function.md
+++ b/data-explorer/kusto/query/array-sort-asc-function.md
@@ -3,7 +3,7 @@ title: array_sort_asc()
description: Learn how to use the array_sort_asc() function to sort arrays in ascending order.
ms.reviewer: slneimer
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 02/02/2025
---
# array_sort_asc()
@@ -32,14 +32,20 @@ Returns the same number of arrays as in the input, with the first array sorted i
`null` is returned for every array that differs in length from the first one.
-If an array contains elements of different types, it's sorted in the following order:
+An array that contains elements of different types is sorted in the following order:
* Numeric, `datetime`, and `timespan` elements
* String elements
* Guid elements
* All other elements
-## Example 1 - Sorting two arrays
+## Examples
+
+The examples in this section show how to use the syntax to help you get started.
+
+### Sort two arrays
+
+The following example sorts the initial array, `array1`, in ascending order. It then sorts `array2` to match the new order of `array1`.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -59,17 +65,19 @@ print array_sort_asc(array1,array2)
|[1,2,3,4,5]|["a","e","b","c","d"]|
> [!NOTE]
-> The output column names are generated automatically, based on the arguments to the function. To assign different names to the output columns, use the following syntax: `... | extend (out1, out2) = array_sort_asc(array1,array2)`
+> The output column names are generated automatically, based on the arguments to the function. To assign different names to the output columns, use the following syntax: `... | extend (out1, out2) = array_sort_asc(array1,array2)`.
+
+### Sort substrings
-## Example 2 - Sorting substrings
+The following example sorts a list of names in ascending order. It saves a list of names to a variable, `Names`, which is then split into an array and sorted in ascending order. The query returns the names in ascending order.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-let Names = "John,Paul,George,Ringo";
+let Names = "John,Paul,Jane,Kao";
let SortedNames = strcat_array(array_sort_asc(split(Names, ",")), ",");
print result = SortedNames
```
@@ -78,9 +86,11 @@ print result = SortedNames
|result|
|---|
-|George,John,Paul,Ringo|
+|Jane,John,Kao,Paul|
-## Example 3 - Combining summarize and array_sort_asc
+### Combine summarize and array_sort_asc
+
+The following example uses the `summarize` operator and the `array_sort_asc` function to organize and sort commands by user in chronological order.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -111,43 +121,43 @@ datatable(command:string, command_time:datetime, user_id:string)
|user2|[
"rm",
"pwd"
]|
> [!NOTE]
-> If your data may contain `null` values, use [make_list_with_nulls](make-list-with-nulls-aggregation-function.md) instead of [make_list](make-list-aggregation-function.md).
+> If your data might contain `null` values, use [make_list_with_nulls](make-list-with-nulls-aggregation-function.md) instead of [make_list](make-list-aggregation-function.md).
-## Example 4 - Controlling location of `null` values
+### Control location of `null` values
By default, `null` values are put last in the sorted array. However, you can control it explicitly by adding a `bool` value as the last argument to `array_sort_asc()`.
-Example with default behavior:
+The following example shows the default behavior:
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-print array_sort_asc(dynamic([null,"blue","yellow","green",null]))
+print result=array_sort_asc(dynamic([null,"blue","yellow","green",null]))
```
**Output**
-|print_0|
+|result|
|---|
|["blue","green","yellow",null,null]|
-Example with non-default behavior:
+The following example uses nondefault behavior through the `false` parameter, which specifies that nulls should be placed at the beginning of the array:
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-print array_sort_asc(dynamic([null,"blue","yellow","green",null]), false)
+print result=array_sort_asc(dynamic([null,"blue","yellow","green",null]), false)
```
**Output**
-|`print_0`|
+|result|
|---|
|[null,null,"blue","green","yellow"]|
@@ -155,4 +165,4 @@ print array_sort_asc(dynamic([null,"blue","yellow","green",null]), false)
* [Aggregation function types at a glance](aggregation-functions.md)
* [array_sort_desc()](array-sort-desc-function.md)
-*
+* [strcat_array()](strcat-array-function.md)
diff --git a/data-explorer/kusto/query/array-sort-desc-function.md b/data-explorer/kusto/query/array-sort-desc-function.md
index 348b2be90a..2054e1bf4b 100644
--- a/data-explorer/kusto/query/array-sort-desc-function.md
+++ b/data-explorer/kusto/query/array-sort-desc-function.md
@@ -3,7 +3,7 @@ title: array_sort_desc()
description: Learn how to use the array_sort_desc() function to sort arrays in descending order.
ms.reviewer: slneimer
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 02/02/2025
---
# array_sort_desc()
@@ -34,14 +34,20 @@ Returns the same number of arrays as in the input, with the first array sorted i
`null` is returned for every array that differs in length from the first one.
-If an array contains elements of different types, it's sorted in the following order:
+An array that contains elements of different types is sorted in the following order:
* Numeric, `datetime`, and `timespan` elements
* String elements
* Guid elements
* All other elements
-## Example 1 - Sorting two arrays
+## Examples
+
+The examples in this section show how to use the syntax to help you get started.
+
+### Sort two arrays
+
+The following example sorts the initial array, `array1`, in descending order. It then sorts `array2` to match the new order of `array1`.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -61,17 +67,19 @@ print array_sort_desc(array1,array2)
|[5,4,3,2,1]|["d","c","b","e","a"]|
> [!NOTE]
-> The output column names are generated automatically, based on the arguments to the function. To assign different names to the output columns, use the following syntax: `... | extend (out1, out2) = array_sort_desc(array1,array2)`
+> The output column names are generated automatically, based on the arguments to the function. To assign different names to the output columns, use the following syntax: `... | extend (out1, out2) = array_sort_desc(array1,array2)`.
+
+### Sort substrings
-## Example 2 - Sorting substrings
+The following example sorts a list of names in descending order. It saves a list of names to a variable, `Names`, which is then split into an array and sorted in descending order. The query returns the names in descending order.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-let Names = "John, Paul, George, Ringo";
+let Names = "John,Paul,Jane,Kayo";
let SortedNames = strcat_array(array_sort_desc(split(Names, ",")), ",");
print result = SortedNames
```
@@ -80,9 +88,11 @@ print result = SortedNames
|result|
|---|
-|Ringo, Paul, John, George|
+|Paul,Kayo,John,Jane|
-## Example 3 - Combining summarize and array_sort_desc
+### Combine summarize and array_sort_desc
+
+The following example uses the `summarize` operator and the `array_sort_desc` function to organize and sort commands by user in descending chronological order.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -113,43 +123,43 @@ datatable(command:string, command_time:datetime, user_id:string)
|user2|[
"pwd",
"rm"
]|
> [!NOTE]
-> If your data may contain `null` values, use [make_list_with_nulls](make-list-with-nulls-aggregation-function.md) instead of [make_list](make-list-aggregation-function.md).
+> If your data can contain `null` values, use [make_list_with_nulls](make-list-with-nulls-aggregation-function.md) instead of [make_list](make-list-aggregation-function.md).
-## Example 4 - Controlling location of `null` values
+### Control location of `null` values
-By default, `null` values are put last in the sorted array. However, you can control it explicitly by adding a `bool` value as the last argument to `array_sort_desc()`.
+By default, `null` values are put last in the sorted array. However, you can control it explicitly by adding a `bool` value as the last argument to `array_sort_desc()`.
-Example with default behavior:
+The following example shows the default behavior:
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-print array_sort_desc(dynamic([null,"blue","yellow","green",null]))
+print result=array_sort_desc(dynamic([null,"blue","yellow","green",null]))
```
**Output**
-|`print_0`|
+|result|
|---|
|["yellow","green","blue",null,null]|
-Example with nondefault behavior:
+The following example uses nondefault behavior through the `false` parameter, which specifies that nulls should be placed at the beginning of the array:
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
-> Run the query
+> Run the query
::: moniker-end
```kusto
-print array_sort_desc(dynamic([null,"blue","yellow","green",null]), false)
+print result=array_sort_desc(dynamic([null,"blue","yellow","green",null]), false)
```
**Output**
-|`print_0`|
+|result|
|---|
|[null,null,"yellow","green","blue"]|
@@ -157,4 +167,4 @@ print array_sort_desc(dynamic([null,"blue","yellow","green",null]), false)
* [Aggregation function types at a glance](aggregation-functions.md)
* [array_sort_asc()](array-sort-asc-function.md)
-*
+* [strcat_array()](strcat-array-function.md)
diff --git a/data-explorer/kusto/query/hll-aggregation-function.md b/data-explorer/kusto/query/hll-aggregation-function.md
index 5ed2f31c13..080dc6d082 100644
--- a/data-explorer/kusto/query/hll-aggregation-function.md
+++ b/data-explorer/kusto/query/hll-aggregation-function.md
@@ -3,7 +3,7 @@ title: hll() (aggregation function)
description: Learn how to use the hll() function to calculate the results of the dcount() function.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 01/15/2025
+ms.date: 02/02/2025
---
# hll() (aggregation function)
@@ -77,4 +77,4 @@ The results table shown includes only the first 10 rows.
* [Aggregation function types at a glance](aggregation-functions.md)
* [Using hll() and tdigest()](using-hll-tdigest.md)
* [hll_if() (aggregation function)](hll-if-aggregation-function.md)
-* [hll_merge()](hll-merge-function.md)
+* [hll_merge() (aggregation function)](hll-merge-aggregation-function.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/hll-if-aggregation-function.md b/data-explorer/kusto/query/hll-if-aggregation-function.md
index 7f092be552..2745feefed 100644
--- a/data-explorer/kusto/query/hll-if-aggregation-function.md
+++ b/data-explorer/kusto/query/hll-if-aggregation-function.md
@@ -3,7 +3,7 @@ title: hll_if() (aggregation function)
description: Learn how to use the hll_if() function to calculate the intermediate results of the dcount() function.
ms.reviewer: ziham
ms.topic: reference
-ms.date: 01/15/2025
+ms.date: 02/02/2025
---
# hll_if() (aggregation function)
@@ -77,4 +77,4 @@ StormEvents
* [Aggregation function types at a glance](aggregation-functions.md)
* [Using hll() and tdigest()](using-hll-tdigest.md)
* [hll() (aggregation function)](hll-aggregation-function.md)
-* [hll_merge()](hll-merge-function.md)
+* [hll_merge() (aggregation function)](hll-merge-aggregation-function.md)
diff --git a/data-explorer/kusto/query/hll-merge-function.md b/data-explorer/kusto/query/hll-merge-function.md
index 4bf1473ca0..17e17b03b9 100644
--- a/data-explorer/kusto/query/hll-merge-function.md
+++ b/data-explorer/kusto/query/hll-merge-function.md
@@ -3,7 +3,7 @@ title: hll_merge()
description: Learn how to use the hll_merge() function toe merge HLL results.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 02/02/2025
---
# hll_merge()
@@ -14,8 +14,8 @@ Merges HLL results. This is the scalar version of the aggregate version [`hll_me
Read about the [underlying algorithm (*H*yper*L*og*L*og) and estimation accuracy](#estimation-accuracy).
> [!IMPORTANT]
-> The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you may want to create a daily unique users summary, which can then be used to calculate weekly counts.
-> However, the precise binary representation of these results may change over time. There's no guarantee that these functions will produce identical results for identical inputs, and therefore we don't advise relying on them.
+> The results of hll(), hll_if(), and hll_merge() can be stored and later retrieved. For example, you might want to create a daily unique users summary, which can then be used to calculate weekly counts.
+> However, the precise binary representation of these results can change over time. There's no guarantee that these functions produce identical results for identical inputs, and therefore we don't advise relying on them.
## Syntax
@@ -59,3 +59,10 @@ range x from 1 to 10 step 1
## Estimation accuracy
[!INCLUDE [data-explorer-estimation-accuracy](../includes/estimation-accuracy.md)]
+
+## Related content
+
+* [Using hll() and tdigest()](using-hll-tdigest.md)
+* [hll() (aggregation function)](hll-aggregation-function.md)
+* [hll_if() (aggregation function)](hll-if-aggregation-function.md)
+* [hll_merge() (aggregation function)](hll-merge-aggregation-function.md)
\ No newline at end of file
From b4d5a01a74e540924f093ecbc2f0ebc9d8bea7cf Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 3 Feb 2025 12:18:10 +0200
Subject: [PATCH 22/25] Edits
---
data-explorer/kusto/query/array-sort-asc-function.md | 4 ++--
data-explorer/kusto/query/array-sort-desc-function.md | 2 +-
.../kusto/query/count-distinct-aggregation-function.md | 4 ++--
.../query/count-distinctif-aggregation-function.md | 4 ++--
data-explorer/kusto/query/dcount-intersect-plugin.md | 6 ++++--
.../kusto/query/make-list-aggregation-function.md | 10 ++++++----
6 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/data-explorer/kusto/query/array-sort-asc-function.md b/data-explorer/kusto/query/array-sort-asc-function.md
index 340813d66f..63d7076791 100644
--- a/data-explorer/kusto/query/array-sort-asc-function.md
+++ b/data-explorer/kusto/query/array-sort-asc-function.md
@@ -3,7 +3,7 @@ title: array_sort_asc()
description: Learn how to use the array_sort_asc() function to sort arrays in ascending order.
ms.reviewer: slneimer
ms.topic: reference
-ms.date: 02/02/2025
+ms.date: 02/03/2025
---
# array_sort_asc()
@@ -144,7 +144,7 @@ print result=array_sort_asc(dynamic([null,"blue","yellow","green",null]))
|---|
|["blue","green","yellow",null,null]|
-The following example uses nondefault behavior through the `false` parameter, which specifies that nulls should be placed at the beginning of the array:
+The following example shows nondefault behavior using the `false` parameter, which specifies that nulls are placed at the beginning of the array.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
diff --git a/data-explorer/kusto/query/array-sort-desc-function.md b/data-explorer/kusto/query/array-sort-desc-function.md
index 2054e1bf4b..d00d70050e 100644
--- a/data-explorer/kusto/query/array-sort-desc-function.md
+++ b/data-explorer/kusto/query/array-sort-desc-function.md
@@ -146,7 +146,7 @@ print result=array_sort_desc(dynamic([null,"blue","yellow","green",null]))
|---|
|["yellow","green","blue",null,null]|
-The following example uses nondefault behavior through the `false` parameter, which specifies that nulls should be placed at the beginning of the array:
+The following example shows nondefault behavior using the `false` parameter, which specifies that nulls are placed at the beginning of the array.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
diff --git a/data-explorer/kusto/query/count-distinct-aggregation-function.md b/data-explorer/kusto/query/count-distinct-aggregation-function.md
index ad12a8280f..ee861be28f 100644
--- a/data-explorer/kusto/query/count-distinct-aggregation-function.md
+++ b/data-explorer/kusto/query/count-distinct-aggregation-function.md
@@ -3,7 +3,7 @@ title: count_distinct() (aggregation function) - (preview)
description: Learn how to use the count_distinct() (aggregation function) to count unique values specified by a scalar expression per summary group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 01/15/2025
+ms.date: 02/03/2025
---
# count_distinct() (aggregation function) - (preview)
@@ -45,7 +45,7 @@ Long integer value indicating the number of unique values of *expr* per summary
## Example
-This example shows how many types of storm events happened in each state.
+The following example shows how many types of storm events happened in each state.
:::moniker range="azure-data-explorer"
Function performance can be degraded when operating on multiple data sources from different clusters.
diff --git a/data-explorer/kusto/query/count-distinctif-aggregation-function.md b/data-explorer/kusto/query/count-distinctif-aggregation-function.md
index f85a3f2e4f..f8715c3827 100644
--- a/data-explorer/kusto/query/count-distinctif-aggregation-function.md
+++ b/data-explorer/kusto/query/count-distinctif-aggregation-function.md
@@ -3,7 +3,7 @@ title: count_distinctif() (aggregation function) - (preview)
description: Learn how to use the count_distinctif() function to count unique values of a scalar expression in records for which the predicate evaluates to true.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 01/15/2025
+ms.date: 02/03/2025
---
# count_distinctif() (aggregation function) - (preview)
@@ -44,7 +44,7 @@ Integer value indicating the number of unique values of *expr* per summary group
## Example
-This example shows how many types of death-causing storm events happened in each state. Only storm events with a nonzero count of deaths are counted.
+The following example shows how many types of death-causing storm events happened in each state. Only storm events with a nonzero count of deaths are counted.
:::moniker range="azure-data-explorer"
> [!NOTE]
diff --git a/data-explorer/kusto/query/dcount-intersect-plugin.md b/data-explorer/kusto/query/dcount-intersect-plugin.md
index 90c600fb01..24e76ed919 100644
--- a/data-explorer/kusto/query/dcount-intersect-plugin.md
+++ b/data-explorer/kusto/query/dcount-intersect-plugin.md
@@ -3,7 +3,7 @@ title: dcount_intersect plugin
description: Learn how to use the dcount_intersect plugin to calculate the intersection between N sets based on hyper log log (hll) values.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 02/03/2025
---
# dcount_intersect plugin
@@ -66,4 +66,6 @@ range x from 1 to 100 step 1
## Related content
-* [Aggregation function types at a glance](aggregation-functions.md)
\ No newline at end of file
+* [dcount() (aggregation function)](dcount-aggregation-function.md)
+* [hll() (aggregation function)](hll-aggregation-function.md)
+* [evaluate plugin operator](evaluate-operator.md)
\ No newline at end of file
diff --git a/data-explorer/kusto/query/make-list-aggregation-function.md b/data-explorer/kusto/query/make-list-aggregation-function.md
index 1748331ee5..9ceab645ff 100644
--- a/data-explorer/kusto/query/make-list-aggregation-function.md
+++ b/data-explorer/kusto/query/make-list-aggregation-function.md
@@ -3,7 +3,7 @@ title: make_list() (aggregation function)
description: Learn how to use the make_list() function to create a dynamic JSON object array of all the values of the expressions in the group.
ms.reviewer: alexans
ms.topic: reference
-ms.date: 01/15/2025
+ms.date: 02/03/2025
adobe-target: true
---
# make_list() (aggregation function)
@@ -45,9 +45,11 @@ If the input to the `summarize` operator is sorted, the order of elements in the
## Examples
+The examples in this section show how to use the syntax to help you get started.
+
### One column
-The following example makes a list out of a single column:
+The following example uses the datatable, `shapes`, to return a list of shapes in a single column.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -79,7 +81,7 @@ shapes
### Using the 'by' clause
-The following example runs a query using the `by` clause:
+The following example uses the `make_list` function and the `by` clause to create two lists of objects grouped by whether they have an even or odd number of sides.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
@@ -112,7 +114,7 @@ shapes
### Packing a dynamic object
-The following examples show how to [pack](pack-function.md) a dynamic object in a column before making it a list.
+The following examples show how to [pack](pack-function.md) a dynamic object in a column before making it a list. The query returns a boolean column, `isEvenSideCount`, indicating whether the side count is even or odd, and a `mylist` column that contains lists of packed bags in each category.
:::moniker range="azure-data-explorer"
> [!div class="nextstepaction"]
From 0ddf6128634ff879d55d5700f63a284e08bfd020 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 4 Feb 2025 11:03:46 +0200
Subject: [PATCH 23/25] edits
---
.../materialized-views/materialized-views-monitoring.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index 5a096c4ab3..c340591d8c 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -3,7 +3,7 @@ title: Monitor materialized views
description: This article describes how to monitor materialized views.
ms.reviewer: yifats
ms.topic: reference
-ms.date: 01/28/2025
+ms.date: 02/04/2025
---
# Monitor materialized views
@@ -46,7 +46,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
|---|---|---|
|MaterializedView|1|0|
- * The concurrency level of multiple materialized views depends on the capacity shown in the `Total` column, while the `Consumed` column shows the number of materialized views currently running. You can use the [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to specify the minimum and maximum number of concurrent operations, overriding the system's default concurrency level. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. The following example overrides the system's decision and changes the minimum concurrent operations to 3:
+ * The number of materialized views that can run concurrently depends on the capacity shown in the `Total` column, while the `Consumed` column shows the number of materialized views currently running. You can use the [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to specify the minimum and maximum number of concurrent operations, overriding the system's default concurrency level. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. The following example overrides the system's decision and changes the minimum concurrent operations from 1 to 3:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'
@@ -56,7 +56,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
::: moniker-end
* Check if there are failures during the materialization process using [.show materialized-view failures](materialized-view-show-failures-command.md#show-materialized-view-failures).
- * If the error is permanent, the system automatically disables the materialized view. To check if its disabled, use the [.show materialized-view](materialized-view-show-command.md) command and see if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
+ * If the error is permanent, the system automatically disables the materialized view. To check if it's disabled, use the [.show materialized-view](materialized-view-show-command.md) command and see if the value in the `IsEnabled` column is `false`. Then check the [Journal](../journal.md) for the disabled event with the [.show journal](../journal.md#show-journal) command.
An example of a permanent failure is a source table schema change that makes it incompatible with the materialized view. For more information, see [.create materialized-view command](materialized-view-create.md#supported-properties).
* If the failure is transient, the system automatically retries the operation. However, the failure can delay the materialization and increase the age of the materialized view. This type of failure occurs, for example, when hitting memory limits or with a query time-out. See the following recommendations for more ways to troubleshoot transient failures.
From 4e9442dc2e2c1cbfefa97d5f9fa4e2d950762e5b Mon Sep 17 00:00:00 2001
From: Stacyrch140 <102548089+Stacyrch140@users.noreply.github.com>
Date: Tue, 4 Feb 2025 12:33:58 -0500
Subject: [PATCH 24/25] Fix typo in 'latitude' heading
---
data-explorer/kusto/query/arg-max-aggregation-function.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/data-explorer/kusto/query/arg-max-aggregation-function.md b/data-explorer/kusto/query/arg-max-aggregation-function.md
index cd9a9dd41f..75efc64aa0 100644
--- a/data-explorer/kusto/query/arg-max-aggregation-function.md
+++ b/data-explorer/kusto/query/arg-max-aggregation-function.md
@@ -37,7 +37,7 @@ Returns a row in the table that maximizes the specified expression *ExprToMaximi
## Examples
-### Find maximum latitute
+### Find maximum latitude
The following example finds the maximum latitude of a storm event in each state.
From b02c6ffea35acc1563de886bf61df06a1d35235e Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Tue, 4 Feb 2025 22:19:50 +0200
Subject: [PATCH 25/25] edits
---
.../materialized-views/materialized-views-monitoring.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
index c340591d8c..4107723d8d 100644
--- a/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
+++ b/data-explorer/kusto/management/materialized-views/materialized-views-monitoring.md
@@ -32,7 +32,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
:::moniker range="azure-data-explorer"
-* Check the number of materialized views using the [.show capacity](../show-capacity-command.md) command and the current capacity for materialized views:
+* Check the number of materialized views on the cluster and the current capacity for materialized views:
```kusto
.show capacity
@@ -46,7 +46,7 @@ If the `MaterializedViewAge` metric constantly increases, and the `MaterializedV
|---|---|---|
|MaterializedView|1|0|
- * The number of materialized views that can run concurrently depends on the capacity shown in the `Total` column, while the `Consumed` column shows the number of materialized views currently running. You can use the [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to specify the minimum and maximum number of concurrent operations, overriding the system's default concurrency level. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. The following example overrides the system's decision and changes the minimum concurrent operations from 1 to 3:
+ * The number of materialized views that can run concurrently depends on the capacity shown in the `Total` column, while the `Consumed` column shows the number of materialized views currently running. You can use the [Materialized views capacity policy](../capacity-policy.md#materialized-views-capacity-policy) to specify the minimum and maximum number of concurrent operations, overriding the system's default concurrency level. The system determines the current concurrency, shown in `Total`, based on the cluster's available resources. The following example overrides the system's decision and changes the minimum concurrent operations from one to three:
```kusto
.alter-merge cluster policy capacity '{ "MaterializedViewsCapacity": { "ClusterMinimumConcurrentOperations": 3 } }'