Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ bin/
obj/
*.DotSettings*
.DS_Store
.aider*
2 changes: 2 additions & 0 deletions README.md
@@ -62,3 +62,5 @@ Would result in following JSON:
* [x] MatchPhrasePrefix query
* [x] Exists query
* [x] Type query
* [x] Wildcard query
* [x] `rewrite` and `boost` parameters
3 changes: 3 additions & 0 deletions docs/_config.yml
@@ -0,0 +1,3 @@
theme: jekyll-theme-minimal
title: Elasticsearch.FSharp DSL
description: Documentation for the Elasticsearch.FSharp DSL library.
194 changes: 194 additions & 0 deletions docs/aggregations.md
@@ -0,0 +1,194 @@
# Aggregations

Aggregations let you group documents and compute statistics over them. To use them, add an `Aggs` element to your `Search` request; it takes a list of `AggsFieldsBody` values.

## F# DSL Example: Simple Average Aggregation

```fsharp
open Elasticsearch.FSharp.DSL

let query =
    Search [
        Aggs [
            NamedAgg (
                "average_price",         // Name of the aggregation
                Avg [ AggField "price" ] // Aggregation type and parameters
            )
        ]
        Query MatchAll
    ]
```

## JSON Output

```json
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "average_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}
```

## Aggregation Structures (`AggsFieldsBody`)

- `NamedAgg of string * AggBody`: A simple named aggregation.
- `NamedComplexAgg of string * AggBody * AggsFieldsBody list`: A named aggregation that can have sub-aggregations.
- `FilterAgg of string * QueryBody * AggBody`: An aggregation that operates on a filtered subset of documents.
- `FilterComplexAgg of string * QueryBody * AggBody * AggsFieldsBody list`: A filtered aggregation that can have sub-aggregations.
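
As a minimal sketch of the simplest filtered form, `FilterAgg` pairs a name with a query and a single aggregation. The aggregation name and the field names (`popularity`, `sales_count`) are illustrative assumptions, and the `Match` query shape is borrowed from the nested example further down this page:

```fsharp
open Elasticsearch.FSharp.DSL

// Hypothetical index with `popularity` and `sales_count` fields.
let query =
    Search [
        Aggs [
            FilterAgg (
                "popular_sales",                                           // Name of the aggregation
                Match ("popularity", [MatchQueryField.MatchQuery "high"]), // Only matching documents are aggregated
                Sum [ AggField "sales_count" ]                             // Aggregation applied to the filtered subset
            )
        ]
        Query MatchAll
    ]
```

The top-level `Query` still decides which hits are returned; the filter only narrows the documents that feed this one aggregation.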

## Aggregation Types (`AggBody`)

- `Avg of AggParam list`: Calculates the average of a numeric field.
- `WeightedAvg of AggParam list`: Calculates a weighted average.
- `Max of AggParam list`: Finds the maximum value.
- `Min of AggParam list`: Finds the minimum value.
- `Sum of AggParam list`: Calculates the sum.
- `Stats of AggParam list`: Returns multiple statistics (min, max, sum, count, avg).
- `AggTerms of AggParam list`: A multi-bucket aggregation that creates buckets based on field values.
- `AggDateHistogram of AggParam list`: A multi-bucket aggregation that builds buckets based on time intervals.
- `ValueCount of AggParam list`: Counts the number of documents that have a value for a field.

## Aggregation Parameters (`AggParam`)

- `AggField of string`: The field to aggregate on.
- `AggScript of ScriptField list`: Use a script to generate values for aggregation.
- `AggValue of string`: A specific value (usage depends on aggregation type).
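- `AggWeight of AggWeightConfig`: For weighted averages, specifies where the value and the weight come from. `AggWeightConfig` has three cases:
  - `WeightField of string`: the field containing the weight.
  - `WeightValueField of string`: the field containing the value to average.
  - `Weight of string`: a numeric weight, passed as a string.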
- `AggInterval of string`: For date histograms (e.g., "month", "1d", "1h").
- `AggFormat of string`: Format for date histograms or other formatted values (e.g., "yyyy-MM-dd").
- `AggSize of int`: For terms aggregations, the number of buckets to return.
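
To show how several of these parameters combine, here is a hedged sketch of a date histogram with a `Stats` sub-aggregation. It uses only the cases listed above; the field names (`created_at`, `price`) and aggregation names are assumptions, and the exact JSON keys the serializer emits for `AggInterval` are not shown on this page, so no output is claimed here:

```fsharp
open Elasticsearch.FSharp.DSL

// Hypothetical index with a `created_at` date field and a numeric `price` field.
let query =
    Search [
        Aggs [
            NamedComplexAgg (
                "sales_per_month",            // Outer aggregation name
                AggDateHistogram [
                    AggField "created_at"     // Date field to bucket on
                    AggInterval "month"       // One bucket per month
                    AggFormat "yyyy-MM-dd"    // Date format applied to bucket keys
                ],
                [
                    // min, max, sum, count and avg of `price` within each bucket
                    NamedAgg ("price_stats", Stats [ AggField "price" ])
                ]
            )
        ]
    ]
```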

## Example: Weighted Average Aggregation

```fsharp
let query =
    Search [
        Aggs [
            NamedAgg (
                "weighted_score_avg",
                WeightedAvg [
                    AggWeight (WeightValueField "score") // Field containing the value
                    AggWeight (WeightField "weight")     // Field containing the weight
                ]
            )
        ]
    ]
```

JSON Output:
```json
{
  "aggs": {
    "weighted_score_avg": {
      "weighted_avg": {
        "value": { "field": "score" },
        "weight": { "field": "weight" }
      }
    }
  }
}
```

## Example: Value Count Aggregation

```fsharp
let query =
    Search [
        Aggs [
            NamedAgg (
                "type_count",
                ValueCount [ AggField "document_type.keyword" ]
            )
        ]
    ]
```

JSON Output:
```json
{
  "aggs": {
    "type_count": {
      "value_count": {
        "field": "document_type.keyword"
      }
    }
  }
}
```

## Complex (Nested) Aggregations

Aggregations can be nested using `NamedComplexAgg` or `FilterComplexAgg`.

```fsharp
open Elasticsearch.FSharp.DSL

let query =
    Search [
        Aggs [
            NamedComplexAgg (
                "products_by_category", // Outer aggregation name
                AggTerms [ AggField "category.keyword"; AggSize 10 ], // Outer aggregation: terms on category
                [ // Inner (sub) aggregations
                    NamedAgg (
                        "average_price_in_category", // Inner agg name
                        Avg [ AggField "price" ] // Inner agg: average price
                    )
                    FilterComplexAgg (
                        "sales_for_popular_items_in_category", // Inner filtered agg name
                        Match ("popularity", [MatchQueryField.MatchQuery "high"]), // Filter for this inner agg
                        Sum [ AggField "sales_count" ], // Aggregation (sum of sales)
                        [ // Further nested aggregation within the filtered one
                            NamedAgg (
                                "avg_rating_for_popular_sales",
                                Avg [ AggField "rating" ]
                            )
                        ]
                    )
                ]
            )
        ]
        Query MatchAll
    ]
```

JSON Output (structure based on `Complex aggs serializes correctly` test):
```json
{
  "query": { "match_all": {} },
  "aggs": {
    "products_by_category": {
      "terms": { "field": "category.keyword", "size": 10 },
      "aggs": {
        "average_price_in_category": {
          "avg": { "field": "price" }
        },
        "sales_for_popular_items_in_category": {
          "filter": { "match": { "popularity": { "query": "high" } } },
          "aggs": {
            "sales_for_popular_items_in_category": {
              "sum": { "field": "sales_count" },
              "aggs": {
                "avg_rating_for_popular_sales": {
                  "avg": { "field": "rating" }
                }
              }
            }
          }
        }
      }
    }
  }
}
```
*Note: The current serializer repeats the `FilterComplexAgg` name inside its own "aggs" block, so `sales_for_popular_items_in_category` appears twice in the JSON output. The example above reflects this behavior as observed in tests.*
27 changes: 27 additions & 0 deletions docs/index.md
@@ -0,0 +1,27 @@
# Welcome to Elasticsearch.FSharp DSL Documentation

This library provides an F# domain-specific language (DSL) for constructing Elasticsearch queries.

## Getting Started

This documentation provides examples of how to use the DSL to construct various Elasticsearch requests. Ensure you have the `Elasticsearch.FSharp` package installed in your project.

## Documentation Sections

- [Search DSL Overview](./search-dsl.md): Learn about the core components of a search request.
- [Queries](./queries/index.md): Detailed information on various query types.
- [Aggregations](./aggregations.md): How to use aggregations.
- [Sorting](./sort.md): Sorting your search results.
- [Pagination](./pagination.md): Controlling 'from' and 'size' of results.
- [Script Fields](./script-fields.md): Using script fields in your search.
- [Source Filtering](./source-filtering.md): Controlling which parts of the source document are returned.
- [Mapping Generation](./mapping.md): Generating Elasticsearch mappings from F# types.

## Examples

Throughout this documentation, you will find F# code examples and the corresponding JSON output they generate for Elasticsearch. All F# examples assume you have the relevant namespaces open, primarily:

```fsharp
open Elasticsearch.FSharp.DSL
open Elasticsearch.FSharp.DSL.Serialization // For toJson function if used directly
```
151 changes: 151 additions & 0 deletions docs/mapping.md
@@ -0,0 +1,151 @@
# Mapping Generation

The library provides utilities to generate Elasticsearch mapping definitions from F# types decorated with specific attributes. This is useful for setting up your indices with the correct field types and properties before indexing data.

## Defining an Entity

Use attributes from the `Elasticsearch.FSharp.Mapping.Attributes` namespace to define how your F# types map to Elasticsearch fields.

```fsharp
open Elasticsearch.FSharp.Mapping.Attributes

[<ElasticType("custom_entity_name")>] // Optional: sets the Elasticsearch type name.
// If includeTypeName=true in ToJson, this is used. Otherwise, _doc or no type.
type TestEntity = {
    [<ElasticField("long")>] // Specifies the Elasticsearch field type
    id: int64

    [<ElasticField("text")>]
    [<ElasticSubField("raw", fieldType = "keyword")>] // Defines a multi-field for 'title'
    [<ElasticSubField("en", fieldType = "text", analyzer = "english")>]
    title: string

    [<ElasticField("text", name = "overridden_field_name")>] // 'name' overrides the F# property name in JSON
    originalPropertyName: string

    [<ElasticField("keyword", ignoreAbove = 256u)>]
    tag: string

    [<ElasticField("integer", ignoreMalformed = true)>]
    count: int
}
```

## Generating Mapping JSON

Use `generateElasticMapping` from `Elasticsearch.FSharp.Mapping.DSL` and `ToJson()` or `ToPutMappingsJson()` from `Elasticsearch.FSharp.Mapping.Json`.

```fsharp
open Elasticsearch.FSharp.Mapping.DSL
open Elasticsearch.FSharp.Mapping.Json

// Generate the mapping structure from the F# type
let mapping = generateElasticMapping typeof<TestEntity>

// Generates JSON for creating an index with mapping (includes "mappings" and type name wrapper by default)
let createIndexJson = mapping.ToJson()

// Generates JSON for creating an index, excluding the type name wrapper under "mappings"
let createIndexJsonNoType = mapping.ToJson(includeTypeName=false)

// To get JSON for the PUT mapping API (just the "properties" part, one string per top-level property)
let putMappingJsonParts = mapping.ToPutMappingsJson()
```

## Example JSON Output (`mapping.ToJson(includeTypeName=false)`)

For the `TestEntity` above, `mapping.ToJson(includeTypeName=false)` would produce something like:

```json
{
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "title": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" },
          "en": { "type": "text", "analyzer": "english" }
        }
      },
      "overridden_field_name": { "type": "text" },
      "tag": { "type": "keyword", "ignore_above": 256 },
      "count": { "type": "integer", "ignore_malformed": true }
    }
  }
}
```
If `mapping.ToJson()` (or `mapping.ToJson(includeTypeName=true)`) is used and the `ElasticType` attribute is present, its value (`custom_entity_name`) wraps the `properties` object. If `ElasticType` is absent, `_doc` is used as the wrapper.

## Attributes

- `[<ElasticType("type_name")>]`: (Optional) Applied to the record/class type. Specifies the Elasticsearch document type name. This is used if `ToJson(includeTypeName=true)` is called.
- `[<ElasticField("field_type", ...)>]`: Applied to record fields/properties.
  - `fieldType: string`: The Elasticsearch data type (e.g., "text", "keyword", "long", "date", "object", "nested").
  - `name: string`: (Optional) Overrides the F# field name in the generated JSON mapping.
  - `ignoreAbove: uint32`: (Optional) For `keyword` fields, sets `ignore_above`.
  - `ignoreMalformed: bool`: (Optional) For numeric, date, and geo fields, sets `ignore_malformed`.
  - `useProperties: bool`: (Optional) For object types, indicates that the properties of the complex type should be mapped. Defaults to `true` if the field is a record/class type.
  - `maxDepth: int`: (Optional) For recursive types, limits the depth of property mapping to prevent infinite recursion.
- `[<ElasticSubField("sub_field_name", ...)>]`: Applied to record fields/properties to define multi-fields (sub-fields).
  - `subFieldName: string`: The name of the sub-field (e.g., "raw", "english").
  - `fieldType: string`: The Elasticsearch type for this sub-field.
  - `analyzer: string`: (Optional) Analyzer for text sub-fields.
  - `ignoreMalformed: bool`: (Optional) For sub-fields.
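
The object-related options are not exercised by `TestEntity`, so the following is a sketch under assumptions rather than output verified against the library: the `Address` and `Customer` types are hypothetical, and the named arguments follow the parameter names listed above.

```fsharp
open Elasticsearch.FSharp.Mapping.Attributes

// Hypothetical nested record; its fields are expected to become
// sub-properties of the parent's `address` field.
type Address = {
    [<ElasticField("keyword")>]
    city: string

    [<ElasticField("keyword", ignoreAbove = 32u)>]
    postalCode: string
}

// Hypothetical parent record with an object-typed field.
type Customer = {
    [<ElasticField("long")>]
    id: int64

    // useProperties = true asks the generator to map Address's own fields
    [<ElasticField("object", useProperties = true)>]
    address: Address
}
```

For self-referencing types, `maxDepth` would be added to the same attribute to cap how far the generator descends.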

## Including Index Settings

You can add index-level settings (like `number_of_shards` or `analysis` configurations) to the mapping programmatically:

```fsharp
open System.Collections.Generic
open Elasticsearch.FSharp.Utility // For Json.makeObject, etc. if building complex settings

let mappingWithSettings =
    { mapping with // Assuming 'mapping' is from generateElasticMapping
        Settings =
            [
                "number_of_shards", MappingSetting.Setting "5" // Simple string setting
                "number_of_replicas", MappingSetting.Setting "1"
                "analysis", MappingSetting.AnalyzerBlock ( // For complex JSON blocks like analyzers
                    Json.makeObject [
                        Json.makeKeyValue "analyzer" (Json.makeObject [
                            Json.makeKeyValue "my_custom_analyzer" (Json.makeObject [
                                Json.makeKeyValue "tokenizer" (Json.quoteString "standard")
                                Json.makeKeyValue "filter" (Json.makeArray [Json.quoteString "lowercase"])
                            ])
                        ])
                    ]
                )
            ]
            |> List.map (fun (k, v) -> KeyValuePair(k, v))
            |> Dictionary
            |> Some
    }

let mappingJsonWithSettings = mappingWithSettings.ToJson(includeTypeName=false)
```

JSON Output with settings (`mappingJsonWithSettings`):
```json
{
  "settings": {
    "number_of_shards": "5",
    "number_of_replicas": "1",
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      // ... properties from TestEntity ...
    }
  }
}
```

Refer to `tests/Elasticsearch.FSharp.Tests/Mapping.fs` for more detailed examples, including mapping of recursive types and various attribute usages.