From 96a440d60039fc5f2f1aa3095db6e652c9de210a Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 15:44:05 +0300
Subject: [PATCH 01/51] docs: reorganize integrations page to focus on alert forwarding
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove sinks/destinations from integrations overview
- Restructure page to highlight "Forward Alerts to Robusta"
- Add dedicated section for Prometheus, Nagios, and SolarWinds
- Simplify description to focus on alert forwarding and AI analysis

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 40 +++++++++++++++---------------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index df7f65faf..f34d7bdaf 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -4,7 +4,7 @@ Integrations Overview
 ==========================


-Robusta can receive alerts from many sources and send them to many destinations.
+Robusta can receive alerts from many monitoring systems and enrich them with AI analysis.


 .. grid::
@@ -17,43 +17,37 @@ Robusta can receive alerts from many sources and send them to many destinations.

         Analyze alerts with `Holmes GPT `_.

-    .. grid-item-card:: :octicon:`book;1em;` Data Sources
+    .. grid-item-card:: :octicon:`book;1em;` Forward Alerts to Robusta
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/index
         :link-type: doc

-        Send data to Robusta from Prometheus, AlertManager, Grafana, Thanos and others.
+        Send alerts to Robusta from Prometheus, AlertManager, Grafana, Nagios, SolarWinds and others.

-    .. grid-item-card:: :octicon:`book;1em;` Sinks (Destinations)
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: sinks/index
-        :link-type: doc
-
-        Send notifications from Robusta to 15+ integrations like Slack, MS Teams, and Email.

+Forward Alerts to Robusta
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Popular Sinks (Destinations)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. grid:: 1 1 2 4
+.. grid:: 1 1 2 3
     :gutter: 3

-    .. grid-item-card:: :octicon:`cpu;1em;` MS Teams
+    .. grid-item-card:: :octicon:`pulse;1em;` Prometheus & AlertManager
         :class-card: sd-bg-light sd-bg-text-light
-        :link: sinks/ms-teams
+        :link: alertmanager-integration/index
         :link-type: doc

-    .. grid-item-card:: :octicon:`cpu;1em;` Slack
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: sinks/slack
-        :link-type: doc
+        In-cluster, centralized, or managed Prometheus services

-    .. grid-item-card:: :octicon:`cpu;1em;` Jira
+    .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
-        :link: sinks/jira
+        :link: alertmanager-integration/nagios
         :link-type: doc

-    .. grid-item-card:: :octicon:`cpu;1em;` Robusta UI
+        Forward Nagios alerts to Robusta via webhook
+
+    .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
-        :link: sinks/RobustaUI
+        :link: alertmanager-integration/solarwinds
         :link-type: doc
+
+        Forward SolarWinds alerts to Robusta via webhook

From e6945b485676187f619d77f08599f8dbbf5ceabc Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 15:44:49 +0300
Subject: [PATCH 02/51] docs: move sinks from Integrations to Notifications & Routing section
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move sinks reference from Integrations nav to Notifications & Routing nav
- Place sinks as first item in Notifications & Routing section
- Remove sinks from Integrations section to reduce confusion
- Keep logical grouping of alert destinations with notification routing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/index.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/index.rst b/docs/index.rst
index 1a1c85dc4..401a0b724 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -33,7 +33,6 @@

    configuration/index
    🪄 AI Analysis - HolmesGPT
-   🔔 Sinks
    🔥 Prometheus/AlertManager
    Cost Savings - KRR
    K8s Misconfigurations - Popeye
@@ -44,6 +43,7 @@
    :caption: 🔔 Notifications & Routing
    :hidden:

+   🔔 Sinks
    notification-routing/configuring-sinks
    Routing (Scopes)
    Grouping (Slack Threads)

From 1a00d804f1fc50d8c8f5b7aa372851936d1af746 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 16:09:45 +0300
Subject: [PATCH 03/51] docs: fix integration page hierarchy and remove redundancy
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Redesign integrations overview to be truly high-level with categories
- Rename "Integrating with Prometheus" to "Alert Sources" for broader scope
- Remove duplicate content between overview and subpages
- Focus overview on integration categories rather than detailed setup
- Update navigation labels to match new page structure
- Make user journey clearer: overview → specific alert source setup

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 .../alertmanager-integration/index.rst | 20 ++++++----
 docs/configuration/index.rst           | 37 +++++++++++--------
 docs/index.rst                         |  2 +-
 run_runner_locally.sh                  |  2 +
 4 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/docs/configuration/alertmanager-integration/index.rst b/docs/configuration/alertmanager-integration/index.rst
index cbaf70406..f6feac74f 100644
--- a/docs/configuration/alertmanager-integration/index.rst
+++ b/docs/configuration/alertmanager-integration/index.rst
@@ -1,7 +1,7 @@
 :hide-toc:

-Integrating with Prometheus
+Alert Sources
 ================================

 .. toctree::
    :hidden:
@@ -22,18 +22,22 @@ Integrating with Prometheus
    solarwinds


-Robusta works best when integrated with Prometheus and AlertManager. When properly setup, Robusta will:
+Robusta can receive alerts from various monitoring systems. Choose the integration that matches your monitoring setup:

-1. Show your existing Prometheus alerts in Robusta, enriched with extra information
-2. Fetch relevant metrics from Prometheus and show them on related alerts
-3. Fetch metrics from Prometheus and show them in the Robusta UI (optional, only relevant for UI users)
+**Prometheus/AlertManager** - The most popular choice. When integrated with Prometheus, Robusta will:

-If you installed Robusta's :ref:`Embedded Prometheus Stack`, then everything is pre-integrated and not setup is necessary. If not, you will need follow a guide below.
+1. Show your existing Prometheus alerts, enriched with extra information
+2. Fetch relevant metrics from Prometheus and show them on related alerts
+3. Display metrics in the Robusta UI (optional, only relevant for UI users)
+
+**Other Systems** - Robusta also supports webhook-based integrations for legacy and enterprise monitoring systems.
+
+If you installed Robusta's :ref:`Embedded Prometheus Stack`, then Prometheus is pre-integrated and no setup is necessary. Otherwise, choose a guide below.

 .. _alertmanager-setup-options:

-Setup Instructions
-^^^^^^^^^^^^^^^^^^
+Prometheus & AlertManager Setup
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 .. grid:: 1 1 2 3
     :gutter: 3
diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index f34d7bdaf..8d17308ca 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -3,51 +3,58 @@
 Integrations Overview
 ==========================

+Robusta connects to your existing monitoring infrastructure to receive alerts and enrich them with AI analysis and automated responses.

-Robusta can receive alerts from many monitoring systems and enrich them with AI analysis.
-
+Key Integration Categories
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 .. grid::
     :gutter: 3

-    .. grid-item-card:: :octicon:`book;1em;` AI Analysis
+    .. grid-item-card:: :octicon:`pulse;1em;` Alert Sources
         :class-card: sd-bg-light sd-bg-text-light
-        :link: holmesgpt/index
+        :link: alertmanager-integration/index
         :link-type: doc

-        Analyze alerts with `Holmes GPT `_.
+        Connect monitoring systems like Prometheus, Nagios, and SolarWinds to forward alerts to Robusta.

-    .. grid-item-card:: :octicon:`book;1em;` Forward Alerts to Robusta
+    .. grid-item-card:: :octicon:`brain;1em;` AI Analysis
         :class-card: sd-bg-light sd-bg-text-light
-        :link: alertmanager-integration/index
+        :link: holmesgpt/index
         :link-type: doc

-        Send alerts to Robusta from Prometheus, AlertManager, Grafana, Nagios, SolarWinds and others.
+        Automatically investigate alerts using Holmes GPT with access to logs, metrics, and Kubernetes context.

+    .. grid-item-card:: :octicon:`tools;1em;` Additional Tools
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: resource-recommender
+        :link-type: doc

-Forward Alerts to Robusta
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+        Extend Robusta with cost optimization (KRR) and cluster misconfiguration detection.
+
+Popular Alert Sources
+^^^^^^^^^^^^^^^^^^^^^^

 .. grid:: 1 1 2 3
     :gutter: 3

-    .. grid-item-card:: :octicon:`pulse;1em;` Prometheus & AlertManager
+    .. grid-item-card:: :octicon:`pulse;1em;` Prometheus
         :class-card: sd-bg-light sd-bg-text-light
-        :link: alertmanager-integration/index
+        :link: alertmanager-integration/alert-manager
         :link-type: doc

-        In-cluster, centralized, or managed Prometheus services
+        Most popular - works with any Prometheus setup

     .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/nagios
         :link-type: doc

-        Forward Nagios alerts to Robusta via webhook
+        Legacy monitoring systems via webhook

     .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/solarwinds
         :link-type: doc

-        Forward SolarWinds alerts to Robusta via webhook
+        Enterprise monitoring systems via webhook
diff --git a/docs/index.rst b/docs/index.rst
index 401a0b724..0634b77cc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -33,7 +33,7 @@

    configuration/index
    🪄 AI Analysis - HolmesGPT
-   🔥 Prometheus/AlertManager
+   🔥 Alert Sources
    Cost Savings - KRR
    K8s Misconfigurations - Popeye
    configuration/exporting/exporting-data
diff --git a/run_runner_locally.sh b/run_runner_locally.sh
index b915e7b2a..3dd8880bd 100755
--- a/run_runner_locally.sh
+++ b/run_runner_locally.sh
@@ -89,3 +89,5 @@ export REPO_LOCAL_BASE_DIR=./deployment/git_playbooks
 export INSTALLATION_NAMESPACE=default

 mirrord exec -f mirrord.json -- poetry run python3 -m robusta.runner.main
+#mirrord exec -f mirrord.json -- poetry run memray run -m robusta.runner.main
+#poetry run python3 -m robusta.runner.main
\ No newline at end of file

From 6453e545382dfd3967611bbabef9afda1ea09bb3 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 17:39:09 +0300
Subject: [PATCH 04/51] docs: major reorganization - separate Alert Sources, AI Analysis, and Automation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Rename "Integrations" section to "Alert Sources" - much clearer purpose
- Create dedicated "AI Analysis" section for Holmes GPT only
- Move KRR, Popeye, and data export to "Automation" section
- Update Alert Sources overview to focus purely on monitoring integrations
- Remove mixed content categories that confused users
- Improve logical flow: Alert Sources → AI Analysis → Notifications → Automation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 51 +++++++++++++-----------------------
 docs/index.rst               | 18 ++++++++-----
 2 files changed, 30 insertions(+), 39 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index 8d17308ca..a00add0ea 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -1,60 +1,45 @@
 :hide-toc:

-Integrations Overview
+Alert Sources Overview
 ==========================

-Robusta connects to your existing monitoring infrastructure to receive alerts and enrich them with AI analysis and automated responses.
+Connect your existing monitoring systems to Robusta to receive alerts for enrichment, analysis, and automated responses.

-Key Integration Categories
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Robusta supports alerts from various monitoring platforms through different integration methods:

-.. grid::
-    :gutter: 3
-
-    .. grid-item-card:: :octicon:`pulse;1em;` Alert Sources
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: alertmanager-integration/index
-        :link-type: doc
-
-        Connect monitoring systems like Prometheus, Nagios, and SolarWinds to forward alerts to Robusta.
-
-    .. grid-item-card:: :octicon:`brain;1em;` AI Analysis
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: holmesgpt/index
-        :link-type: doc
+**Prometheus/AlertManager Integration** - The most common setup, supporting:
+- In-cluster and external Prometheus instances
+- Managed Prometheus services (AWS, Azure, Google Cloud)
+- Prometheus-compatible systems (VictoriaMetrics, Thanos, Mimir)

-        Automatically investigate alerts using Holmes GPT with access to logs, metrics, and Kubernetes context.
+**Webhook-based Integrations** - For legacy and enterprise monitoring systems:
+- Nagios
+- SolarWinds
+- Any system that can send HTTP webhooks

-    .. grid-item-card:: :octicon:`tools;1em;` Additional Tools
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: resource-recommender
-        :link-type: doc
-
-        Extend Robusta with cost optimization (KRR) and cluster misconfiguration detection.
-
-Popular Alert Sources
-^^^^^^^^^^^^^^^^^^^^^^
+Getting Started
+^^^^^^^^^^^^^^^

 .. grid:: 1 1 2 3
     :gutter: 3

-    .. grid-item-card:: :octicon:`pulse;1em;` Prometheus
+    .. grid-item-card:: :octicon:`pulse;1em;` Prometheus & AlertManager
         :class-card: sd-bg-light sd-bg-text-light
-        :link: alertmanager-integration/alert-manager
+        :link: alertmanager-integration/index
         :link-type: doc

-        Most popular - works with any Prometheus setup
+        Most popular - comprehensive setup guide for all Prometheus variants

     .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/nagios
         :link-type: doc

-        Legacy monitoring systems via webhook
+        Legacy monitoring systems via webhook integration

     .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/solarwinds
         :link-type: doc

-        Enterprise monitoring systems via webhook
+        Enterprise monitoring systems via webhook integration
diff --git a/docs/index.rst b/docs/index.rst
index 0634b77cc..9254421d3 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -28,15 +28,18 @@

 .. toctree::
    :maxdepth: 4
-   :caption: 🔌 Integrations
+   :caption: 🚨 Alert Sources
    :hidden:

    configuration/index
-   🪄 AI Analysis - HolmesGPT
-   🔥 Alert Sources
-   Cost Savings - KRR
-   K8s Misconfigurations - Popeye
-   configuration/exporting/exporting-data
+   🔥 Prometheus & AlertManager
+
+.. toctree::
+   :maxdepth: 4
+   :caption: 🤖 AI Analysis
+   :hidden:
+
+   configuration/holmesgpt/index

 .. toctree::
    :maxdepth: 4
@@ -57,6 +60,9 @@
    :hidden:

    playbook-reference/index
+   Cost Savings - KRR
+   K8s Misconfigurations - Popeye
+   configuration/exporting/exporting-data

 .. toctree::
    :maxdepth: 4

From d26f620102e569bfc875d8de00af436b94a5ae68 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 18:03:56 +0300
Subject: [PATCH 05/51] docs: improve notifications navigation with proper hierarchy
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Create new notifications overview page as entry point
- Reorder navigation: Overview → Configuring Sinks → Sink Reference
- Fix information hierarchy so users start with concepts, not specific sinks
- Prevent sphinx menu expansion pushing basic setup pages down
- Add clear getting started flow and common workflows
- Update sink reference title to match new structure

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/sinks/index.rst  |  2 +-
 docs/index.rst                      |  3 +-
 docs/notification-routing/index.rst | 70 +++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+), 2 deletions(-)
 create mode 100644 docs/notification-routing/index.rst

diff --git a/docs/configuration/sinks/index.rst b/docs/configuration/sinks/index.rst
index a50c2f9a5..522288a4e 100644
--- a/docs/configuration/sinks/index.rst
+++ b/docs/configuration/sinks/index.rst
@@ -2,7 +2,7 @@

 :hide-toc:

-Sinks Reference
+Sink Reference
 ==================

 .. toctree::
diff --git a/docs/index.rst b/docs/index.rst
index 9254421d3..57f09d7be 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -46,8 +46,9 @@
    :caption: 🔔 Notifications & Routing
    :hidden:

-   🔔 Sinks
+   notification-routing/index
    notification-routing/configuring-sinks
+   📧 Sink Reference
    Routing (Scopes)
    Grouping (Slack Threads)
    notification-routing/routing-by-time
diff --git a/docs/notification-routing/index.rst b/docs/notification-routing/index.rst
new file mode 100644
index 000000000..db36c96b4
--- /dev/null
+++ b/docs/notification-routing/index.rst
@@ -0,0 +1,70 @@
+:hide-toc:
+
+Notifications & Routing Overview
+=================================
+
+Robusta can send notifications to various destinations and route them intelligently based on alert type, namespace, severity, and more.
+
+Key Concepts
+^^^^^^^^^^^^
+
+**Sinks** - Destinations where notifications are sent (Slack, Teams, Email, etc.)
+
+**Routing** - Rules that determine which alerts go to which sinks
+
+**Grouping** - Thread alerts together to reduce noise (especially in Slack)
+
+**Silencing** - Temporarily disable specific notifications
+
+Getting Started
+^^^^^^^^^^^^^^^
+
+.. grid:: 1 1 2 2
+    :gutter: 3
+
+    .. grid-item-card:: :octicon:`gear;1em;` Configure Sinks
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: configuring-sinks
+        :link-type: doc
+
+        Start here - learn how to set up your first notification destination
+
+    .. grid-item-card:: :octicon:`mail;1em;` Popular Sinks
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: ../configuration/sinks/slack
+        :link-type: doc
+
+        Slack, Teams, Email - quick setup for common destinations
+
+    .. grid-item-card:: :octicon:`workflow;1em;` Smart Routing
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: routing-with-scopes
+        :link-type: doc
+
+        Send different alerts to different teams automatically
+
+    .. grid-item-card:: :octicon:`comment-discussion;1em;` Reduce Noise
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: notification-grouping
+        :link-type: doc
+
+        Group related alerts in Slack threads to avoid spam
+
+Common Workflows
+^^^^^^^^^^^^^^^^
+
+**Basic Setup**: Configure a single sink (like Slack) to receive all notifications
+
+**Team Routing**: Route alerts to different channels based on namespace or labels
+
+**Noise Reduction**: Enable grouping and silencing for cleaner notifications
+
+**Advanced Routing**: Use scopes and filters for complex routing scenarios
+
+Next Steps
+^^^^^^^^^^
+
+1. **Configure Your First Sink** - Start with :doc:`configuring-sinks`
+2. **Choose Your Destination** - Browse all available sinks in :doc:`../configuration/sinks/index`
+3. **Set Up Routing** - Configure intelligent routing with :doc:`routing-with-scopes`
+4. **Reduce Noise** - Enable grouping with :doc:`notification-grouping`
\ No newline at end of file

From 2db93410f04e87930272caf46e7d92dfbac6dceb Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:06:39 +0300
Subject: [PATCH 06/51] docs: move Nagios and SolarWinds to top-level Alert Sources
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move Nagios and SolarWinds from Prometheus subpages to top-level Alert Sources
- Remove confusing navigation hierarchy where non-Prometheus systems appeared under Prometheus
- Clean up Prometheus page to focus purely on Prometheus/AlertManager integrations
- Remove "Other Alerting Systems" section since they're now top-level
- Create clearer navigation: Overview → Prometheus → Nagios → SolarWinds

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 .../alertmanager-integration/index.rst | 30 +------------------
 docs/index.rst                         |  2 ++
 2 files changed, 3 insertions(+), 29 deletions(-)

diff --git a/docs/configuration/alertmanager-integration/index.rst b/docs/configuration/alertmanager-integration/index.rst
index f6feac74f..3b739fa7c 100644
--- a/docs/configuration/alertmanager-integration/index.rst
+++ b/docs/configuration/alertmanager-integration/index.rst
@@ -18,20 +18,14 @@ Alert Sources
    embedded-prometheus
    troubleshooting-alertmanager
    alert-custom-prometheus
-   nagios
-   solarwinds


-Robusta can receive alerts from various monitoring systems. Choose the integration that matches your monitoring setup:
-
-**Prometheus/AlertManager** - The most popular choice. When integrated with Prometheus, Robusta will:
+Robusta works best when integrated with Prometheus and AlertManager. When properly setup, Robusta will:

 1. Show your existing Prometheus alerts, enriched with extra information
 2. Fetch relevant metrics from Prometheus and show them on related alerts
 3. Display metrics in the Robusta UI (optional, only relevant for UI users)

-**Other Systems** - Robusta also supports webhook-based integrations for legacy and enterprise monitoring systems.
-
 If you installed Robusta's :ref:`Embedded Prometheus Stack`, then Prometheus is pre-integrated and no setup is necessary. Otherwise, choose a guide below.

 .. _alertmanager-setup-options:
@@ -107,25 +101,3 @@ Prometheus & AlertManager Setup

         All-in-one package of Robusta + kube-prometheus-stack (optional)

-
-Other Alerting Systems
-^^^^^^^^^^^^^^^^^^^^^^
-
-Robusta can also receive alerts from non-prometheus monitoring tools like Nagios and SolarWinds:
-
-.. grid:: 1 1 2 3
-    :gutter: 3
-
-    .. grid-item-card:: :octicon:`bell;1em;` Nagios
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: nagios
-        :link-type: doc
-
-        Send Nagios alerts to Robusta using a webhook-based integration.
-
-    .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
-        :class-card: sd-bg-light sd-bg-text-light
-        :link: solarwinds
-        :link-type: doc
-
-        Forward SolarWinds alerts to Robusta via webhook for centralized visibility.
\ No newline at end of file
diff --git a/docs/index.rst b/docs/index.rst
index 57f09d7be..f23949769 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -33,6 +33,8 @@

    configuration/index
    🔥 Prometheus & AlertManager
+   🔔 Nagios
+   ๐ŸŒ SolarWinds

 .. toctree::
    :maxdepth: 4

From 7909e2d320d898fdd4bbe4719c5387dad527fd1e Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:08:31 +0300
Subject: [PATCH 07/51] docs: make Alert Sources overview more concise and actionable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove verbose explanations and get straight to the point
- Lead with clear action: "Choose your monitoring system"
- Make card descriptions more direct and useful
- Add practical note about webhook compatibility for other systems
- Cut word count by ~60% while improving clarity

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index a00add0ea..0b2d04f20 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -3,22 +3,9 @@
 Alert Sources Overview
 ==========================

-Connect your existing monitoring systems to Robusta to receive alerts for enrichment, analysis, and automated responses.
+Forward alerts from your monitoring system to Robusta for enrichment and automation.

-Robusta supports alerts from various monitoring platforms through different integration methods:
-
-**Prometheus/AlertManager Integration** - The most common setup, supporting:
-- In-cluster and external Prometheus instances
-- Managed Prometheus services (AWS, Azure, Google Cloud)
-- Prometheus-compatible systems (VictoriaMetrics, Thanos, Mimir)
-
-**Webhook-based Integrations** - For legacy and enterprise monitoring systems:
-- Nagios
-- SolarWinds
-- Any system that can send HTTP webhooks
-
-Getting Started
-^^^^^^^^^^^^^^^
+Choose your monitoring system:

 .. grid:: 1 1 2 3
     :gutter: 3
@@ -28,18 +15,20 @@ Getting Started
         :link: alertmanager-integration/index
         :link-type: doc

-        Most popular - comprehensive setup guide for all Prometheus variants
+        **Most popular** - Works with any Prometheus setup (in-cluster, managed services, VictoriaMetrics, etc.)

     .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/nagios
         :link-type: doc

-        Legacy monitoring systems via webhook integration
+        **Legacy systems** - Forward alerts via webhook

     .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/solarwinds
         :link-type: doc

-        Enterprise monitoring systems via webhook integration
+        **Enterprise monitoring** - Forward alerts via webhook
+
+Don't see your system? Robusta accepts alerts from any system that can send HTTP webhooks.
From 70ace66b6284dc85d3a8fd0b0e279d31a18f8596 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:09:27 +0300
Subject: [PATCH 08/51] docs: remove negative labels like 'legacy' and 'enterprise'
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Change 'Legacy systems' to 'Nagios monitoring'
- Change 'Enterprise monitoring' to 'SolarWinds monitoring'
- Keep descriptions neutral and product-focused
- Avoid terms that might make users feel bad about their tech choices

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index 0b2d04f20..9635672ad 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -22,13 +22,13 @@ Choose your monitoring system:
         :link: alertmanager-integration/nagios
         :link-type: doc

-        **Legacy systems** - Forward alerts via webhook
+        **Nagios monitoring** - Forward alerts via webhook

     .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/solarwinds
         :link-type: doc

-        **Enterprise monitoring** - Forward alerts via webhook
+        **SolarWinds monitoring** - Forward alerts via webhook

 Don't see your system? Robusta accepts alerts from any system that can send HTTP webhooks.
From 8f447cda8680ef8fe2afc14e9eae3575b558e13c Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:10:45 +0300
Subject: [PATCH 09/51] docs: add helpful link to webhook documentation for other systems
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Link 'HTTP webhooks' to webhook triggers documentation
- Provides users with concrete next steps if their system isn't listed
- Makes the fallback option actionable rather than just informational

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index 9635672ad..d1904f63e 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -15,7 +15,7 @@ Choose your monitoring system:
         :link: alertmanager-integration/index
         :link-type: doc

-        **Most popular** - Works with any Prometheus setup (in-cluster, managed services, VictoriaMetrics, etc.)
+        **Most popular** - Works with any Prometheus-compatible stack

     .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
@@ -31,4 +31,4 @@ Choose your monitoring system:

         **SolarWinds monitoring** - Forward alerts via webhook

-Don't see your system? Robusta accepts alerts from any system that can send HTTP webhooks.
+Don't see your system? Robusta accepts alerts from any system that can send :doc:`HTTP webhooks <../playbook-reference/triggers/webhook>`.
From a4310029ecc2d72ab9ad1033fe3e3c74dd422791 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:13:09 +0300
Subject: [PATCH 10/51] docs: fix webhook link to point to correct API documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Link HTTP webhooks to exporting-data.rst which contains POST /api/alerts endpoint
- This is the correct endpoint for sending alerts from any system via HTTP
- Previous link was to webhook triggers which is for different use case

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/configuration/index.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst
index d1904f63e..e59f6b131 100644
--- a/docs/configuration/index.rst
+++ b/docs/configuration/index.rst
@@ -15,20 +15,20 @@ Choose your monitoring system:
         :link: alertmanager-integration/index
         :link-type: doc

-        **Most popular** - Works with any Prometheus-compatible stack
+        **Most popular**. Works with any Prometheus-compatible stack

     .. grid-item-card:: :octicon:`bell;1em;` Nagios
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/nagios
         :link-type: doc

-        **Nagios monitoring** - Forward alerts via webhook
+        **Nagios monitoring**. Forward alerts via webhook

     .. grid-item-card:: :octicon:`bell;1em;` SolarWinds
         :class-card: sd-bg-light sd-bg-text-light
         :link: alertmanager-integration/solarwinds
         :link-type: doc

-        **SolarWinds monitoring** - Forward alerts via webhook
+        **SolarWinds monitoring**. Forward alerts via webhook

-Don't see your system? Robusta accepts alerts from any system that can send :doc:`HTTP webhooks <../playbook-reference/triggers/webhook>`.
+Don't see your system? Robusta accepts alerts from any system that can send :doc:`HTTP webhooks `.
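For a monitoring system that is not listed, the commit message above names a `POST /api/alerts` endpoint. The following is a minimal sketch of building such a request, assuming a JSON body and bearer-token authentication. Only the path comes from the commit message; the host, the key, the payload fields, and the helper name `build_alert_request` are hypothetical and are not taken from this patch.

```python
import json
import urllib.request


def build_alert_request(base_url: str, api_key: str, alert: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one alert as JSON."""
    # "/api/alerts" is the path named in the commit message; the payload
    # shape and the bearer-token header are illustrative assumptions.
    body = json.dumps({"alerts": [alert]}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/alerts",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    request = build_alert_request(
        "https://robusta.example.invalid",  # hypothetical host
        "EXAMPLE_KEY",  # hypothetical credential
        {"title": "Disk almost full", "source": "custom-monitor"},  # hypothetical fields
    )
    # Show what would be sent without performing any network call.
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; it is left out here so the sketch runs without network access. The real field names should be checked against the page this patch links to.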
From a8178f7533597f3119ec2c39e3ce95b945f37d15 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:13:57 +0300
Subject: [PATCH 11/51] docs: move webhook API to Alert Sources section where it belongs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move exporting-data.rst from Automation to Alert Sources as "Custom Webhooks"
- This contains the POST /api/alerts endpoint for sending alerts TO Robusta
- Makes it discoverable for users wanting to integrate custom monitoring systems
- Logical placement with other alert source integrations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/index.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/index.rst b/docs/index.rst
index f23949769..832a7cb2a 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -35,6 +35,7 @@
    🔥 Prometheus & AlertManager
    🔔 Nagios
    ๐ŸŒ SolarWinds
+   🔗 Custom Webhooks

 .. toctree::
    :maxdepth: 4
@@ -65,7 +66,6 @@
    playbook-reference/index
    Cost Savings - KRR
    K8s Misconfigurations - Popeye
-   configuration/exporting/exporting-data

 .. toctree::
    :maxdepth: 4

From a1cfef61696dfdfd559f98ab021d3999eb29c5ee Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:35:43 +0300
Subject: [PATCH 12/51] docs: switch from horizontal tabs to left sidebar navigation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove navigation.tabs and navigation.tabs.sticky features
- Add navigation.sections and navigation.expand for better sidebar
- This provides more room for additional top-level sections
- Makes navigation more scalable for future section additions

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/conf.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/conf.py b/docs/conf.py
index 948ce3e83..a126ecd6f 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -273,8 +273,8 @@
     "features": [
         "navigation.instant",
         "navigation.top",
-        "navigation.tabs",
-        "navigation.tabs.sticky",
+        "navigation.sections",
+        "navigation.expand",
         "search.share",
         "toc.follow",
         "toc.sticky",

From 2fdd287c04f1c07276aca68178ae9129ed6853c0 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 20:56:54 +0300
Subject: [PATCH 13/51] switch to sidebar navigation

---
 docs/conf.py                                                | 4 +---
 docs/index.rst                                              | 2 --
 docs/setup-robusta/installation/all-in-one-installation.rst | 1 -
 .../installation/extend-prometheus-installation.rst         | 1 -
 docs/setup-robusta/installation/standalone-installation.rst | 1 -
 5 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/docs/conf.py b/docs/conf.py
index a126ecd6f..e05cbeb91 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -273,13 +273,11 @@
     "features": [
         "navigation.instant",
         "navigation.top",
-        "navigation.sections",
-        "navigation.expand",
         "search.share",
         "toc.follow",
         "toc.sticky",
     ],
-    "globaltoc_collapse": False,
+    "globaltoc_collapse": True,
     "social": [
         {
             "icon": "fontawesome/brands/github",
diff --git a/docs/index.rst b/docs/index.rst
index 832a7cb2a..01c72914c 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,5 +1,3 @@
-:hide-navigation:
-:hide-toc:

 .. toctree::
    :maxdepth: 4
diff --git a/docs/setup-robusta/installation/all-in-one-installation.rst b/docs/setup-robusta/installation/all-in-one-installation.rst
index 345e7cb5c..f17323dbd 100644
--- a/docs/setup-robusta/installation/all-in-one-installation.rst
+++ b/docs/setup-robusta/installation/all-in-one-installation.rst
@@ -1,5 +1,4 @@
 :tocdepth: 2
-:globaltoc_collapse: false

 .. _install-all-in-one:

diff --git a/docs/setup-robusta/installation/extend-prometheus-installation.rst b/docs/setup-robusta/installation/extend-prometheus-installation.rst
index 980c6625e..fcfd7e49b 100644
--- a/docs/setup-robusta/installation/extend-prometheus-installation.rst
+++ b/docs/setup-robusta/installation/extend-prometheus-installation.rst
@@ -1,5 +1,4 @@
 :tocdepth: 2
-:globaltoc_collapse: false

 .. _install-existing-prometheus:

diff --git a/docs/setup-robusta/installation/standalone-installation.rst b/docs/setup-robusta/installation/standalone-installation.rst
index 87a68a686..05c15dc3b 100644
--- a/docs/setup-robusta/installation/standalone-installation.rst
+++ b/docs/setup-robusta/installation/standalone-installation.rst
@@ -1,5 +1,4 @@
 :tocdepth: 2
-:globaltoc_collapse: false

 .. _install-barebones:

From 745ed9776550b53ece13b43ad6b61e5eddf71166 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 15 Jul 2025 21:14:09 +0300
Subject: [PATCH 14/51] clarify on pro features

---
 docs/configuration/alertmanager-integration/nagios.rst     | 3 +++
 docs/configuration/alertmanager-integration/solarwinds.rst | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/docs/configuration/alertmanager-integration/nagios.rst b/docs/configuration/alertmanager-integration/nagios.rst
index 0ec64bf48..dd7db7940 100644
--- a/docs/configuration/alertmanager-integration/nagios.rst
+++ b/docs/configuration/alertmanager-integration/nagios.rst
@@ -1,6 +1,9 @@
 Nagios Integration with Robusta
 ===============================

+.. note::
+    This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version.
+
 This guide explains how to set up Nagios to send alert webhooks to Robusta.

 Requirements
diff --git a/docs/configuration/alertmanager-integration/solarwinds.rst b/docs/configuration/alertmanager-integration/solarwinds.rst
index b691d5f08..6885bf414 100644
--- a/docs/configuration/alertmanager-integration/solarwinds.rst
+++ b/docs/configuration/alertmanager-integration/solarwinds.rst
@@ -1,6 +1,9 @@
 SolarWinds Integration with Robusta
 ===================================

+.. note::
+    This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version.
+
 This guide explains how to configure SolarWinds to send alert webhooks directly to Robusta.
Requirements From 7993d67c2c4869be16a255a36fcb584f767330eb Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Tue, 15 Jul 2025 21:14:22 +0300 Subject: [PATCH 15/51] put robusta pro features in the right place --- .../exporting/custom-webhooks.rst | 59 ++++++++++++++++ .../exporting/exporting-data.rst | 15 +++-- .../exporting/robusta-pro-features.rst | 67 +++++++++++++++++++ docs/index.rst | 32 +++++---- 4 files changed, 151 insertions(+), 22 deletions(-) create mode 100644 docs/configuration/exporting/custom-webhooks.rst create mode 100644 docs/configuration/exporting/robusta-pro-features.rst diff --git a/docs/configuration/exporting/custom-webhooks.rst b/docs/configuration/exporting/custom-webhooks.rst new file mode 100644 index 000000000..8e5c174c2 --- /dev/null +++ b/docs/configuration/exporting/custom-webhooks.rst @@ -0,0 +1,59 @@ +Custom Webhooks +=============== + +Send alerts to Robusta from any monitoring system that supports HTTP webhooks. + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Overview +-------- + +Robusta can receive alerts from any monitoring system that can send HTTP webhooks. This makes it easy to integrate with systems like Nagios, SolarWinds, or custom monitoring solutions. + +Webhook Endpoint +---------------- + +Send alerts to Robusta using the following endpoint: + +.. code-block:: bash + + POST https://api.robusta.dev/api/alerts + +Authentication +-------------- + +You'll need your API key and account ID: + +1. **Account ID**: Found in your ``generated_values.yaml`` file +2. **API Key**: Generate this in the Robusta platform under **Settings** → **API Keys** → **New API Key** + +For detailed API documentation including request format, authentication, and examples, see :doc:`Alert History Import and Export API `. + +Quick Example +------------- + +Here's a simple example of sending a custom alert: + +.. 
code-block:: bash + + curl --location --request POST 'https://api.robusta.dev/api/alerts' \ + --header 'Authorization: Bearer YOUR_API_KEY' \ + --header 'Content-Type: application/json' \ + --data-raw '{ + "account_id": "YOUR_ACCOUNT_ID", + "alerts": [ + { + "title": "Test Service Down", + "description": "The Test Service is not responding.", + "source": "monitoring-system", + "priority": "high", + "aggregation_key": "test-service-issues" + } + ] + }' + +Next Steps +---------- + +For complete API documentation including all available fields and response formats, see :doc:`Alert History Import and Export API `. \ No newline at end of file diff --git a/docs/configuration/exporting/exporting-data.rst b/docs/configuration/exporting/exporting-data.rst index 516e0e7a9..d772cf6f3 100644 --- a/docs/configuration/exporting/exporting-data.rst +++ b/docs/configuration/exporting/exporting-data.rst @@ -1,12 +1,17 @@ Alert History Import and Export API ============================================== -The Robusta SaaS platform exposes several HTTP APIs: +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. -* :ref:`API to export alerts ` -* :ref:`API to fetch aggregate alert statistics ` -* :ref:`API to send alerts ` -* :ref:`API to send configuration changes ` +The Robusta SaaS platform exposes several HTTP APIs for exporting data and sending alerts: + +* :ref:`API to export alerts ` - Export historical alert data +* :ref:`API to fetch aggregate alert statistics ` - Get aggregated alert statistics +* :ref:`API to send alerts ` - Send custom alerts programmatically +* :ref:`API to send configuration changes ` - Track configuration changes + +For a simpler webhook integration guide, see :doc:`Custom Webhooks `. There is an quick-start `Prometheus report-generator `_ on GitHub that demonstrates how to use the export APIs. 
diff --git a/docs/configuration/exporting/robusta-pro-features.rst b/docs/configuration/exporting/robusta-pro-features.rst new file mode 100644 index 000000000..1ed09381f --- /dev/null +++ b/docs/configuration/exporting/robusta-pro-features.rst @@ -0,0 +1,67 @@ +Robusta Pro Features +==================== + +.. note:: + These features are available with the Robusta SaaS platform and self-hosted commercial plans. They are not available in the open-source version. + +Robusta Pro provides a comprehensive monitoring platform that includes the open-source runner plus a full SaaS UI, advanced integrations, and enterprise APIs. Most users choose Robusta Pro to get the complete Robusta experience with all capabilities and minimal setup. + +Custom Alert Ingestion +----------------------- + +Send alerts to Robusta from any monitoring system using HTTP webhooks. + +:doc:`Custom Webhooks ` + Send alerts from any system that supports HTTP webhooks, including custom monitoring solutions. + +:doc:`Nagios Integration <../alertmanager-integration/nagios>` + Forward alerts from Nagios to Robusta for enrichment and automation. + +:doc:`SolarWinds Integration <../alertmanager-integration/solarwinds>` + Configure SolarWinds to send alert webhooks directly to Robusta. + +Data Export and Reporting APIs +------------------------------- + +Export alert history and generate reports using Robusta's REST APIs. + +:doc:`Alert History Import and Export API ` + Comprehensive API for exporting alert history, generating reports, and sending custom alerts programmatically. 
+ +Features include: + +* **Alert Export API**: Export historical alert data with filtering by time range, alert name, and account +* **Alert Reporting API**: Get aggregated statistics and counts for different alert types +* **Custom Alert API**: Send alerts programmatically from external systems +* **Configuration Changes API**: Track configuration changes in your environment + +AI Analysis +----------- + +Robusta Pro includes advanced AI-powered investigation capabilities to help you understand and resolve alerts faster. + +:doc:`AI Analysis (Holmes GPT) <../holmesgpt/index>` + Use AI to investigate Kubernetes alerts, analyze logs, and get remediation suggestions automatically. + +Additional Pro Features +----------------------- + +Beyond the APIs and AI analysis listed above, Robusta Pro includes: + +* **Full SaaS UI**: Complete web interface for managing alerts, playbooks, and configuration +* **Managed Prometheus Alerts**: Create and customize Prometheus alerts with templates, without needing to know PromQL +* **Advanced Analytics**: Historical alert data, trends, and reporting dashboards +* **Enterprise Support**: Dedicated support for production deployments + +For more details on the differences between open-source and SaaS, see :doc:`Open Source vs SaaS <../../how-it-works/oss-vs-saas>`. + +Getting Started +--------------- + +To access these features: + +1. **Robusta SaaS**: `Sign up for free `_ to get started with the full platform +2. **Self-hosted Commercial**: Contact support@robusta.dev for enterprise plans with self-hosted UI +3. **API Access**: Generate API keys in the Robusta platform under **Settings** → **API Keys** + +For detailed API documentation and examples, see :doc:`Alert History Import and Export API `. \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index 01c72914c..633ddabcd 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,16 +1,11 @@ +:hide-toc: .. 
toctree:: - :maxdepth: 4 - :caption: Home + :maxdepth: 1 + :caption: 📖 Overview :hidden: self - -.. toctree:: - :maxdepth: 4 - :caption: How it works - :hidden: - how-it-works/architecture how-it-works/oss-vs-saas how-it-works/coverage @@ -33,14 +28,8 @@ 🔥 Prometheus & AlertManager 🔔 Nagios 🌐 SolarWinds - 🔗 Custom Webhooks + 🔗 Custom Webhooks -.. toctree:: - :maxdepth: 4 - :caption: 🤖 AI Analysis - :hidden: - - configuration/holmesgpt/index .. toctree:: :maxdepth: 4 @@ -54,6 +54,15 @@ Cost Savings - KRR K8s Misconfigurations - Popeye +.. toctree:: + :maxdepth: 4 + :caption: 💼 Robusta Pro Features + :hidden: + + configuration/exporting/robusta-pro-features + configuration/holmesgpt/index + configuration/exporting/exporting-data + .. toctree:: :maxdepth: 4 :caption: Help @@ -74,8 +72,8 @@ contributing community-tutorials -Better Prometheus Alerts (and more) for Kubernetes -===================================================== +Welcome to Robusta +==================== .. grid:: 1 1 2 2 :margin: 0 From 8fde0e54c28f0e086f0b2d690016f32370135cad Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Mon, 21 Jul 2025 06:36:06 +0300 Subject: [PATCH 16/51] docs --- .../alertmanager-integration/index.rst | 64 ++++++------------- .../exporting/custom-webhooks.rst | 6 +- .../exporting/robusta-pro-features.rst | 18 +++--- docs/configuration/index.rst | 16 ++--- docs/how-it-works/oss-vs-saas.rst | 15 ++--- docs/index.rst | 57 ++++++++++------- 6 files changed, 79 insertions(+), 97 deletions(-) diff --git a/docs/configuration/alertmanager-integration/index.rst b/docs/configuration/alertmanager-integration/index.rst index 3b739fa7c..4412a9e69 100644 --- a/docs/configuration/alertmanager-integration/index.rst +++ b/docs/configuration/alertmanager-integration/index.rst @@ -1,8 +1,8 @@ :hide-toc: -Alert Sources -================================ +Prometheus & AlertManager +========================= .. 
toctree:: :hidden: :maxdepth: 1 @@ -16,88 +16,60 @@ Alert Sources victoria-metrics grafana-alert-manager embedded-prometheus - troubleshooting-alertmanager - alert-custom-prometheus -Robusta works best when integrated with Prometheus and AlertManager. When properly setup, Robusta will: +Connect Robusta to your Prometheus setup to get enriched alerts with logs, events, and metrics. -1. Show your existing Prometheus alerts, enriched with extra information -2. Fetch relevant metrics from Prometheus and show them on related alerts -3. Display metrics in the Robusta UI (optional, only relevant for UI users) +**Already using Robusta's embedded Prometheus?** No setup needed - skip this page. -If you installed Robusta's :ref:`Embedded Prometheus Stack`, then Prometheus is pre-integrated and no setup is necessary. Otherwise, choose a guide below. - -.. _alertmanager-setup-options: - -Prometheus & AlertManager Setup -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +**Choose your setup:** .. grid:: 1 1 2 3 :gutter: 3 - .. grid-item-card:: :octicon:`book;1em;` In-cluster Prometheus + .. grid-item-card:: Prometheus in same cluster :class-card: sd-bg-light sd-bg-text-light :link: alert-manager :link-type: doc - Prometheus, running in the same K8s cluster as Robusta - - .. grid-item-card:: :octicon:`book;1em;` Centralized Prometheus + .. grid-item-card:: Prometheus outside cluster :class-card: sd-bg-light sd-bg-text-light :link: outofcluster-prometheus :link-type: doc - Prometheus, Thanos, Mimir, etc, not running in the same K8s cluster as Robusta - - .. grid-item-card:: :octicon:`book;1em;` Azure Managed Prometheus - :class-card: sd-bg-light sd-bg-text-light - :link: azure-managed-prometheus - :link-type: doc - - Azure Monitor managed service for Prometheus - - .. grid-item-card:: :octicon:`book;1em;` AWS Managed Prometheus + .. 
grid-item-card:: AWS Managed Prometheus :class-card: sd-bg-light sd-bg-text-light :link: eks-managed-prometheus :link-type: doc - Amazon Managed Service for Prometheus - - .. grid-item-card:: :octicon:`book;1em;` Coralogix + .. grid-item-card:: Azure Managed Prometheus :class-card: sd-bg-light sd-bg-text-light - :link: coralogix_managed_prometheus + :link: azure-managed-prometheus :link-type: doc - Coralogix Managed Prometheus - - .. grid-item-card:: :octicon:`book;1em;` Google Managed Prometheus + .. grid-item-card:: Google Managed Prometheus :class-card: sd-bg-light sd-bg-text-light :link: google-managed-prometheus :link-type: doc - Google Managed Prometheus (GMP) + .. grid-item-card:: Coralogix + :class-card: sd-bg-light sd-bg-text-light + :link: coralogix_managed_prometheus + :link-type: doc - .. grid-item-card:: :octicon:`book;1em;` Victoria Metrics + .. grid-item-card:: VictoriaMetrics :class-card: sd-bg-light sd-bg-text-light :link: victoria-metrics :link-type: doc - VictoriaMetrics, running in the same K8s cluster as Robusta - - - .. grid-item-card:: :octicon:`book;1em;` Grafana AlertManager + .. grid-item-card:: Grafana Alerts :class-card: sd-bg-light sd-bg-text-light :link: grafana-alert-manager :link-type: doc - Special instructions when using Grafana alerts - - .. grid-item-card:: :octicon:`book;1em;` Embedded Prometheus + .. grid-item-card:: Install Prometheus with Robusta :class-card: sd-bg-light sd-bg-text-light :link: embedded-prometheus :link-type: doc - All-in-one package of Robusta + kube-prometheus-stack (optional) - diff --git a/docs/configuration/exporting/custom-webhooks.rst b/docs/configuration/exporting/custom-webhooks.rst index 8e5c174c2..265c76570 100644 --- a/docs/configuration/exporting/custom-webhooks.rst +++ b/docs/configuration/exporting/custom-webhooks.rst @@ -1,15 +1,15 @@ Custom Webhooks =============== -Send alerts to Robusta from any monitoring system that supports HTTP webhooks. 
+Send alerts from any monitoring system to Robusta via HTTP webhooks. .. note:: - This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + This feature requires Robusta Pro (SaaS or self-hosted commercial plans). Overview -------- -Robusta can receive alerts from any monitoring system that can send HTTP webhooks. This makes it easy to integrate with systems like Nagios, SolarWinds, or custom monitoring solutions. +Forward alerts from any system that can send HTTP POST requests. Robusta will automatically enrich these alerts with Kubernetes context and apply your automation rules. Webhook Endpoint ---------------- diff --git a/docs/configuration/exporting/robusta-pro-features.rst b/docs/configuration/exporting/robusta-pro-features.rst index 1ed09381f..6c01f5ba0 100644 --- a/docs/configuration/exporting/robusta-pro-features.rst +++ b/docs/configuration/exporting/robusta-pro-features.rst @@ -4,7 +4,7 @@ Robusta Pro Features .. note:: These features are available with the Robusta SaaS platform and self-hosted commercial plans. They are not available in the open-source version. -Robusta Pro provides a comprehensive monitoring platform that includes the open-source runner plus a full SaaS UI, advanced integrations, and enterprise APIs. Most users choose Robusta Pro to get the complete Robusta experience with all capabilities and minimal setup. +Robusta Pro adds a web UI, additional integrations, and enterprise APIs to the open-source engine. Available as SaaS (we handle hosting) or self-hosted on-premise. Custom Alert Ingestion ----------------------- @@ -38,20 +38,20 @@ Features include: AI Analysis ----------- -Robusta Pro includes advanced AI-powered investigation capabilities to help you understand and resolve alerts faster. +Optional AI-powered alert investigation using HolmesGPT. 
-:doc:`AI Analysis (Holmes GPT) <../holmesgpt/index>` - Use AI to investigate Kubernetes alerts, analyze logs, and get remediation suggestions automatically. +:doc:`AI Analysis (HolmesGPT) <../holmesgpt/index>` + Automatically analyze Kubernetes alerts, logs, and metrics. Get potential root causes and remediation suggestions. Additional Pro Features ----------------------- -Beyond the APIs and AI analysis listed above, Robusta Pro includes: +Additional capabilities in Robusta Pro: -* **Full SaaS UI**: Complete web interface for managing alerts, playbooks, and configuration -* **Managed Prometheus Alerts**: Create and customize Prometheus alerts with templates, without needing to know PromQL -* **Advanced Analytics**: Historical alert data, trends, and reporting dashboards -* **Enterprise Support**: Dedicated support for production deployments +* **Web UI**: Manage alerts, playbooks, and configuration through a browser interface +* **Alert Templates**: Create Prometheus alerts without writing PromQL +* **Historical Data**: Query alert history and trends +* **Enterprise Support**: Production support and SLA options For more details on the differences between open-source and SaaS, see :doc:`Open Source vs SaaS <../../how-it-works/oss-vs-saas>`. diff --git a/docs/configuration/index.rst b/docs/configuration/index.rst index e59f6b131..727635d10 100644 --- a/docs/configuration/index.rst +++ b/docs/configuration/index.rst @@ -1,11 +1,11 @@ :hide-toc: -Alert Sources Overview -========================== +Alert Sources +============= -Forward alerts from your monitoring system to Robusta for enrichment and automation. +Connect your monitoring system to Robusta. When alerts fire, Robusta automatically enriches them with context and applies your automation rules. -Choose your monitoring system: +**Choose your setup:** .. grid:: 1 1 2 3 :gutter: 3 @@ -15,20 +15,20 @@ Choose your monitoring system: :link: alertmanager-integration/index :link-type: doc - **Most popular**. 
Works with any Prometheus-compatible stack + Standard Prometheus integration. Works with any PromQL-based stack. .. grid-item-card:: :octicon:`bell;1em;` Nagios :class-card: sd-bg-light sd-bg-text-light :link: alertmanager-integration/nagios :link-type: doc - **Nagios monitoring**. Forward alerts via webhook + Forward Nagios alerts via webhook integration. .. grid-item-card:: :octicon:`bell;1em;` SolarWinds :class-card: sd-bg-light sd-bg-text-light :link: alertmanager-integration/solarwinds :link-type: doc - **SolarWinds monitoring**. Forward alerts via webhook + Forward SolarWinds alerts via webhook integration. -Don't see your system? Robusta accepts alerts from any system that can send :doc:`HTTP webhooks `. +**Other systems?** Robusta accepts alerts from any monitoring system via :doc:`HTTP webhooks `. diff --git a/docs/how-it-works/oss-vs-saas.rst b/docs/how-it-works/oss-vs-saas.rst index 32820956a..4e2174114 100644 --- a/docs/how-it-works/oss-vs-saas.rst +++ b/docs/how-it-works/oss-vs-saas.rst @@ -1,18 +1,15 @@ Open Source vs SaaS ################################ -There are several ways to use Robusta: +Robusta has three deployment options: -- Robusta OSS: Send data to external destinations like Slack. No UI. -- Robusta OSS + `SaaS UI `_: The full experience. -- Robusta OSS + Self-hosted UI: The on-prem experience. +- **Open Source**: MIT-licensed engine that sends alerts to Slack, Teams, etc. No web UI. +- **SaaS**: Open source engine + hosted web UI with additional features. +- **Self-hosted**: Open source engine + on-premise web UI (enterprise plans). -Which option is right for me? -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +**Which should I choose?** -Most people use Robusta with the SaaS platform enabled. This gives you the full Robusta experience, with all capabilities and minimum hassle. - -That said, the choice is yours. You can use the open-source without the SaaS platform, or you can self-host the SaaS platform via our enterprise plans. 
+Most teams use the SaaS option for the complete feature set without infrastructure overhead. The open source version works well if you only need basic alert routing to external systems. Pricing ^^^^^^^^^^^^ diff --git a/docs/index.rst b/docs/index.rst index 633ddabcd..fa039a954 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -24,11 +24,11 @@ :caption: 🚨 Alert Sources :hidden: - configuration/index - 🔥 Prometheus & AlertManager - 🔔 Nagios - 🌐 SolarWinds - 🔗 Custom Webhooks + Overview + Prometheus & AlertManager + Nagios + SolarWinds + Custom Webhooks .. toctree:: @@ -36,7 +36,7 @@ :caption: 🔔 Notifications & Routing :hidden: - notification-routing/index + Overview notification-routing/configuring-sinks 📧 Sink Reference Routing (Scopes) @@ -51,6 +51,7 @@ :hidden: playbook-reference/index + configuration/alertmanager-integration/alert-custom-prometheus Cost Savings - KRR K8s Misconfigurations - Popeye @@ -62,10 +63,11 @@ configuration/exporting/robusta-pro-features configuration/holmesgpt/index configuration/exporting/exporting-data + configuration/alertmanager-integration/troubleshooting-alertmanager .. toctree:: :maxdepth: 4 - :caption: Help + :caption: ❓ Help :hidden: help @@ -75,25 +77,36 @@ Welcome to Robusta ==================== -.. grid:: 1 1 2 2 +.. grid:: 1 1 1 2 :margin: 0 :padding: 0 + :gutter: 3 .. grid-item:: - Robusta extends Prometheus/VictoriaMetrics/Coralogix (and more) with features like: + Robusta enriches Prometheus alerts with Kubernetes context. 
- * :doc:`Smart Grouping ` - reduce notification spam with Slack threads 🧵 - * :ref:`AI Investigation ` - Kickstart your alert investigations with AI (optional) - * :ref:`Alert Enrichment ` - see pods log and other data alongside your alerts - * :ref:`Self-Healing ` - define auto-remediation rules for faster fixes - * :ref:`Advanced Routing ` by team, namespace, k8s metadata and more - * :ref:`K8s Problem-Detection ` - alert on OOMKills or failing Jobs without PromQL - * :ref:`Change Tracking ` - correlate alerts and Kubernetes rollouts - * :ref:`Auto-Resolve ` - send alerts, resolve them when updated (e.g. in Jira) - * :ref:`Dozens of Integrations ` - Slack, Teams, Jira, and more + When Prometheus alerts fire, Robusta automatically adds pod logs, events, and metrics. Connect to your existing Prometheus setup. - Bring your own Prometheus or install our :ref:`preconfigured bundle `. + .. raw:: html + +
+ + **What happens when an alert fires:** + + * **Auto-investigate** - Logs, events, and metrics attached + * **AI analysis** - Root causes and fixes suggested + * **Smart routing** - Right team, right channel + * **Self-healing** - Automatic remediation rules + * **Change tracking** - Correlate alerts with deployments + + .. raw:: html + +
+ + **Works with your existing setup** + + Connect to your existing Prometheus or install our all-in-one bundle (based on kube-prometheus-stack). .. grid-item:: @@ -119,10 +132,10 @@ Welcome to Robusta .. image:: /images/jira_example.png :width: 800px -Who uses Robusta? -------------------------------------- +Ready to get started? +--------------------- -Robusta is used in production by hundreds of teams, from cloud-native pioneers to the Fortune 500. +Join hundreds of teams already running Robusta in production. .. button-ref:: ../setup-robusta/installation/index :color: primary From ac0498625c25aae3db0272d93eafd49fbac3887e Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Mon, 21 Jul 2025 15:26:15 +0300 Subject: [PATCH 17/51] Update index.rst --- docs/index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/index.rst b/docs/index.rst index fa039a954..ba2cfb8a8 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -84,9 +84,9 @@ Welcome to Robusta .. grid-item:: - Robusta enriches Prometheus alerts with Kubernetes context. + Robusta OSS enriches Prometheus alerts with Kubernetes context. - When Prometheus alerts fire, Robusta automatically adds pod logs, events, and metrics. Connect to your existing Prometheus setup. + When Prometheus alerts fire, Robusta automatically adds pod logs, events, and metrics. Robusta Pro adds web UI, AI analysis, and support for non-Kubernetes monitoring systems. .. 
raw:: html From 6e568776eb8c1569227b17272e552d3704f33db4 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Mon, 21 Jul 2025 16:12:26 +0300 Subject: [PATCH 18/51] fixes --- docs/how-it-works/usage-faq.rst | 48 ++++----------------------- docs/index.rst | 50 ++++++----------------------- docs/notification-routing/index.rst | 49 ++++++++++++++++------------ 3 files changed, 43 insertions(+), 104 deletions(-) diff --git a/docs/how-it-works/usage-faq.rst b/docs/how-it-works/usage-faq.rst index 3db44585b..17ba1d2f6 100644 --- a/docs/how-it-works/usage-faq.rst +++ b/docs/how-it-works/usage-faq.rst @@ -4,49 +4,13 @@ Usage FAQ Does Robusta have Builtin Alerts? ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Yes. Robusta includes built-in alerts based on Prometheus and direct APIServer monitoring. -These alerts work out of the box without any configuration. +Yes. You can install the all-in-one bundle to get Robusta along with well-tested, enriched alerts. -You can also :ref:`write your own Prometheus alerts `. +If you already have alerts, you can skip the bundle and send your existing alerts to Robusta instead. They'll be enriched with extra context. -What Events Can Robusta Listen to? -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Robusta listens to: - -* Prometheus alerts -* CrashLoopBackOffs -* OOMKills -* Job Failures -* Other APIServer errors -* Updates to Kubernetes Deployments and other resources - -See the full list in :ref:`All Triggers `. - -Want Robusta to respond to a custom event? Just send your event to Robusta by webhook. - -What Actions Can Robusta Take? -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Actions typically do one of the following: - -* Correlate existing observability data -* Perform high-fidelity data collection (e.g. fetch heap dumps) -* Remediate problems -* Silence false alarms - -See the full list in :ref:`All Actions `. - -For examples, see :ref:`What are Playbooks?`. - -Where Can Robusta Send Notifications? 
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -In Robusta, destinations are called *sinks*. Here are some built-in sinks: - -* Chat apps: *Slack, MSTeams, Discord, and Telegram* -* Incident management tools: *PagerDuty and OpsGenie* -* Monitoring Platforms: *DataDog and the Robusta SaaS* +Can Robusta monitor alerts from outside Kubernetes? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Robusta OSS is designed for Kubernetes environments and requires both Kubernetes and Prometheus to provide the most value. -See the full list in :ref:`Sinks Reference`. +Robusta Pro extends beyond Kubernetes and can ingest alerts from any monitoring system via webhook, including Nagios, SolarWinds, and custom sources. These alerts get the same AI analysis and correlation features. diff --git a/docs/index.rst b/docs/index.rst index ba2cfb8a8..e5c0029c8 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -8,7 +8,6 @@ self how-it-works/architecture how-it-works/oss-vs-saas - how-it-works/coverage how-it-works/usage-faq how-it-works/alert-builtin-enrichment @@ -77,6 +76,8 @@ Welcome to Robusta ==================== +Robusta transforms basic Prometheus alerts into actionable insights with full Kubernetes context, and magical automation. + .. grid:: 1 1 1 2 :margin: 0 :padding: 0 @@ -84,53 +85,20 @@ Welcome to Robusta .. grid-item:: - Robusta OSS enriches Prometheus alerts with Kubernetes context. - - When Prometheus alerts fire, Robusta automatically adds pod logs, events, and metrics. Robusta Pro adds web UI, AI analysis, and support for non-Kubernetes monitoring systems. - - .. raw:: html - -
+ **How Robusta Improves Alerts:** - **What happens when an alert fires:** - - * **Auto-investigate** - Logs, events, and metrics attached - * **AI analysis** - Root causes and fixes suggested - * **Smart routing** - Right team, right channel + * **Auto-investigate** - Correlate alerts with logs and events + * **AI analysis** - Root causes and remediation steps + * **Smart routing** - Route based on context * **Self-healing** - Automatic remediation rules * **Change tracking** - Correlate alerts with deployments - .. raw:: html - -
- - **Works with your existing setup** - - Connect to your existing Prometheus or install our all-in-one bundle (based on kube-prometheus-stack). + Connect to your existing Prometheus or install our all-in-one bundle (based on kube-prometheus-stack). Need to go beyond Kubernetes? `Try Robusta Pro `_. .. grid-item:: - .. md-tab-set:: - - .. md-tab-item:: Alert Enrichment - - .. image:: /images/prometheus-alert-with-robusta.png - :width: 800px - - .. md-tab-item:: AI Investigation - - .. image:: /images/ai-analysis.png - :width: 800px - - .. md-tab-item:: Kubernetes Problems - - .. image:: /images/oomkillpod.png - :width: 800px - - .. md-tab-item:: JIRA Integration - - .. image:: /images/jira_example.png - :width: 800px + .. image:: /images/prometheus-alert-with-robusta.png + :width: 400px Ready to get started? --------------------- diff --git a/docs/notification-routing/index.rst b/docs/notification-routing/index.rst index db36c96b4..570c36897 100644 --- a/docs/notification-routing/index.rst +++ b/docs/notification-routing/index.rst @@ -8,13 +8,13 @@ Robusta can send notifications to various destinations and route them intelligen Key Concepts ^^^^^^^^^^^^ -**Sinks** - Destinations where notifications are sent (Slack, Teams, Email, etc.) +:doc:`Sinks <../configuration/sinks/index>` - Destinations where notifications are sent (Slack, Teams, Email, etc.) -**Routing** - Rules that determine which alerts go to which sinks +:doc:`Routing ` - Rules that determine which alerts go to which sinks -**Grouping** - Thread alerts together to reduce noise (especially in Slack) +:doc:`Grouping ` - Thread alerts together to reduce noise (especially in Slack) -**Silencing** - Temporarily disable specific notifications +:doc:`Silencing ` - Temporarily disable specific notifications Getting Started ^^^^^^^^^^^^^^^ @@ -22,49 +22,56 @@ Getting Started .. grid:: 1 1 2 2 :gutter: 3 - .. grid-item-card:: :octicon:`gear;1em;` Configure Sinks + .. 
grid-item-card:: Configure Sinks :class-card: sd-bg-light sd-bg-text-light :link: configuring-sinks :link-type: doc Start here - learn how to set up your first notification destination - .. grid-item-card:: :octicon:`mail;1em;` Popular Sinks + .. grid-item-card:: Popular Sinks :class-card: sd-bg-light sd-bg-text-light :link: ../configuration/sinks/slack :link-type: doc Slack, Teams, Email - quick setup for common destinations - .. grid-item-card:: :octicon:`workflow;1em;` Smart Routing + .. grid-item-card:: Smart Routing :class-card: sd-bg-light sd-bg-text-light :link: routing-with-scopes :link-type: doc Send different alerts to different teams automatically - .. grid-item-card:: :octicon:`comment-discussion;1em;` Reduce Noise + .. grid-item-card:: Reduce Noise :class-card: sd-bg-light sd-bg-text-light :link: notification-grouping :link-type: doc Group related alerts in Slack threads to avoid spam -Common Workflows -^^^^^^^^^^^^^^^^ +Popular Sinks +^^^^^^^^^^^^^ -**Basic Setup**: Configure a single sink (like Slack) to receive all notifications +.. grid:: 1 1 2 4 + :gutter: 2 -**Team Routing**: Route alerts to different channels based on namespace or labels - -**Noise Reduction**: Enable grouping and silencing for cleaner notifications + .. grid-item-card:: Slack + :class-card: sd-bg-light sd-bg-text-light + :link: ../configuration/sinks/slack + :link-type: doc -**Advanced Routing**: Use scopes and filters for complex routing scenarios + .. grid-item-card:: Teams + :class-card: sd-bg-light sd-bg-text-light + :link: ../configuration/sinks/ms-teams + :link-type: doc -Next Steps -^^^^^^^^^^ + .. grid-item-card:: PagerDuty + :class-card: sd-bg-light sd-bg-text-light + :link: ../configuration/sinks/pagerduty + :link-type: doc -1. **Configure Your First Sink** - Start with :doc:`configuring-sinks` -2. **Choose Your Destination** - Browse all available sinks in :doc:`../configuration/sinks/index` -3. 
**Set Up Routing** - Configure intelligent routing with :doc:`routing-with-scopes`
-4. **Reduce Noise** - Enable grouping with :doc:`notification-grouping`
\ No newline at end of file
+    .. grid-item-card:: View All Sinks
+        :class-card: sd-bg-light sd-bg-text-light
+        :link: ../configuration/sinks/index
+        :link-type: doc
\ No newline at end of file

From 5ef13d18f0a97e0405b7e9a552dbbf117850493b Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 5 Aug 2025 08:54:38 +0300
Subject: [PATCH 19/51] Update disable-oomkill-notifications.rst

---
 docs/notification-routing/disable-oomkill-notifications.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/notification-routing/disable-oomkill-notifications.rst b/docs/notification-routing/disable-oomkill-notifications.rst
index 4bfabc29c..b458c97bb 100644
--- a/docs/notification-routing/disable-oomkill-notifications.rst
+++ b/docs/notification-routing/disable-oomkill-notifications.rst
@@ -8,4 +8,4 @@ Configure Robusta to not send OOMKill notifications by disabling the built-in OO
     disabledPlaybooks:
     - PodOOMKill

-Similarly you can to disable any built-in notification using the name of the playbook. Find all the built-in playbooks `here `_ and `here `_
\ No newline at end of file
+Similarly you can to disable any built-in notification using the name of the playbook.
Find all the built-in playbooks `here (prometheus) `_ and `here (kubernetes) `_
\ No newline at end of file

From ce9512df06c8d9aa4e01b10969b417ab719d0931 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 5 Aug 2025 08:55:03 +0300
Subject: [PATCH 20/51] Update alert-builtin-enrichment.rst

---
 docs/how-it-works/alert-builtin-enrichment.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/how-it-works/alert-builtin-enrichment.rst b/docs/how-it-works/alert-builtin-enrichment.rst
index 3ff138582..32e0720aa 100644
--- a/docs/how-it-works/alert-builtin-enrichment.rst
+++ b/docs/how-it-works/alert-builtin-enrichment.rst
@@ -25,14 +25,14 @@ Testing out Prometheus alerts
 1. Deploy a broken pod that will be stuck in pending state:

 .. code-block:: bash
-    :name: cb-apply-crashpod
+    :name: cb-apply-pendingpod

     kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/pending_pods/pending_pod_resources.yaml

 2. Trigger a Prometheus alert immediately, skipping the normal delays:

 .. code-block:: bash
-    :name: cb-apply-crashpod
+    :name: cb-trigger-prometheus-alert

     robusta playbooks trigger prometheus_alert alert_name=KubePodCrashLooping namespace=default pod_name=example-pod

@@ -47,7 +47,7 @@ Testing out APIServer alerts
 Let's deploy a crashing pod:

 ..
code-block:: bash
-    :name: cb-apply-crashpod
+    :name: cb-apply-crashpod-apiserver

     kubectl apply -f https://gist.githubusercontent.com/robusta-lab/283609047306dc1f05cf59806ade30b6/raw

From d98ce3c7c4d5cbf6e7ab3d4591186605b99b5440 Mon Sep 17 00:00:00 2001
From: Robusta Runner
Date: Tue, 5 Aug 2025 17:39:56 +0300
Subject: [PATCH 21/51] revert changes

---
 docs/setup-robusta/installation/all-in-one-installation.rst | 1 +
 .../installation/extend-prometheus-installation.rst | 1 +
 docs/setup-robusta/installation/standalone-installation.rst | 1 +
 run_runner_locally.sh | 2 --
 4 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/setup-robusta/installation/all-in-one-installation.rst b/docs/setup-robusta/installation/all-in-one-installation.rst
index f17323dbd..345e7cb5c 100644
--- a/docs/setup-robusta/installation/all-in-one-installation.rst
+++ b/docs/setup-robusta/installation/all-in-one-installation.rst
@@ -1,4 +1,5 @@
 :tocdepth: 2
+:globaltoc_collapse: false

 .. _install-all-in-one:

diff --git a/docs/setup-robusta/installation/extend-prometheus-installation.rst b/docs/setup-robusta/installation/extend-prometheus-installation.rst
index fcfd7e49b..980c6625e 100644
--- a/docs/setup-robusta/installation/extend-prometheus-installation.rst
+++ b/docs/setup-robusta/installation/extend-prometheus-installation.rst
@@ -1,4 +1,5 @@
 :tocdepth: 2
+:globaltoc_collapse: false

 .. _install-existing-prometheus:

diff --git a/docs/setup-robusta/installation/standalone-installation.rst b/docs/setup-robusta/installation/standalone-installation.rst
index 05c15dc3b..87a68a686 100644
--- a/docs/setup-robusta/installation/standalone-installation.rst
+++ b/docs/setup-robusta/installation/standalone-installation.rst
@@ -1,4 +1,5 @@
 :tocdepth: 2
+:globaltoc_collapse: false

 ..
_install-barebones: diff --git a/run_runner_locally.sh b/run_runner_locally.sh index 3dd8880bd..b915e7b2a 100755 --- a/run_runner_locally.sh +++ b/run_runner_locally.sh @@ -89,5 +89,3 @@ export REPO_LOCAL_BASE_DIR=./deployment/git_playbooks export INSTALLATION_NAMESPACE=default mirrord exec -f mirrord.json -- poetry run python3 -m robusta.runner.main -#mirrord exec -f mirrord.json -- poetry run memray run -m robusta.runner.main -#poetry run python3 -m robusta.runner.main \ No newline at end of file From 561561a6c7c0662db72be693f7f3e47a70ee0978 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Tue, 5 Aug 2025 17:58:44 +0300 Subject: [PATCH 22/51] small sphinx fixes --- .../embedded-prometheus.rst | 2 +- .../google-managed-prometheus.rst | 8 ++++---- .../alertmanager-integration/nagios.rst | 2 +- docs/configuration/holmesgpt/custom_toolsets.rst | 1 + .../holmesgpt/toolsets/opensearch_logs.rst | 2 +- docs/configuration/holmesgpt/toolsets/rabbitmq.rst | 2 +- docs/configuration/sinks/PagerDuty.rst | 4 ++-- docs/configuration/sinks/RobustaUI.rst | 4 ++-- docs/configuration/sinks/slack.rst | 14 +++++++++----- docs/index.rst | 2 +- .../notification-routing/notification-grouping.rst | 5 ++--- .../actions/event-enrichment.rst | 2 +- .../kubernetes-examples/playbook-track-secrets.rst | 1 + .../installation/_see_robusta_in_action-2.rst | 1 + 14 files changed, 28 insertions(+), 22 deletions(-) diff --git a/docs/configuration/alertmanager-integration/embedded-prometheus.rst b/docs/configuration/alertmanager-integration/embedded-prometheus.rst index 219c9f5be..da46dad65 100644 --- a/docs/configuration/alertmanager-integration/embedded-prometheus.rst +++ b/docs/configuration/alertmanager-integration/embedded-prometheus.rst @@ -54,5 +54,5 @@ Apply the change by performing a :ref:`Helm Upgrade `. Troubleshooting --------------------- -Encountering issues with your Prometheus? Follow this guide to resolve some :ref:`common errors `. +Encountering issues with your Prometheus? 
Follow this guide to resolve some :doc:`common errors `. diff --git a/docs/configuration/alertmanager-integration/google-managed-prometheus.rst b/docs/configuration/alertmanager-integration/google-managed-prometheus.rst index 2e997fcfe..326056395 100644 --- a/docs/configuration/alertmanager-integration/google-managed-prometheus.rst +++ b/docs/configuration/alertmanager-integration/google-managed-prometheus.rst @@ -9,10 +9,10 @@ Prerequisites **************** An instance of Google Managed Prometheus with the following components configured: -* Prometheus Frontend (`Instructions `_) -* Node Exporter (`Instructions `_) -* Scraping configuration for Kubelet and cAdvisor (`Instructions `_) -* Kube State Metrics (`Instructions `_) +* Prometheus Frontend (`Frontend Instructions `_) +* Node Exporter (`Node Exporter Instructions `_) +* Scraping configuration for Kubelet and cAdvisor (`Kubelet/cAdvisor Instructions `_) +* Kube State Metrics (`Kube State Metrics Instructions `_) Send Alerts to Robusta ******************************************** diff --git a/docs/configuration/alertmanager-integration/nagios.rst b/docs/configuration/alertmanager-integration/nagios.rst index dd7db7940..31f48ab0d 100644 --- a/docs/configuration/alertmanager-integration/nagios.rst +++ b/docs/configuration/alertmanager-integration/nagios.rst @@ -74,7 +74,7 @@ Ensure Robusta is part of a contact group or explicitly included in your alert d } Step 4: Create the Bash Command Script -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Save this as `notify-robusta.sh`, ensure it's executable (`chmod +x notify-robusta.sh`), and Nagios can access it. 
diff --git a/docs/configuration/holmesgpt/custom_toolsets.rst b/docs/configuration/holmesgpt/custom_toolsets.rst index 5453a8e99..9b75b158f 100644 --- a/docs/configuration/holmesgpt/custom_toolsets.rst +++ b/docs/configuration/holmesgpt/custom_toolsets.rst @@ -281,6 +281,7 @@ First `create a GitHub Personal Access Token with fine-grained permissions `_. diff --git a/docs/configuration/holmesgpt/toolsets/rabbitmq.rst b/docs/configuration/holmesgpt/toolsets/rabbitmq.rst index d6cf1ff70..c8cbf1367 100644 --- a/docs/configuration/holmesgpt/toolsets/rabbitmq.rst +++ b/docs/configuration/holmesgpt/toolsets/rabbitmq.rst @@ -1,4 +1,4 @@ -.. _toolset_prometheus: +.. _toolset_rabbitmq: RabbitMQ ======== diff --git a/docs/configuration/sinks/PagerDuty.rst b/docs/configuration/sinks/PagerDuty.rst index 587527226..6b3bd991b 100644 --- a/docs/configuration/sinks/PagerDuty.rst +++ b/docs/configuration/sinks/PagerDuty.rst @@ -55,7 +55,7 @@ Add the following code to your generated_values.yaml. This will send all alerts Save the file and run .. code-block:: bash - :name: cb-add-pagerduty-sink + :name: cb-add-pagerduty-sink-alerts helm upgrade robusta robusta/robusta --values=generated_values.yaml @@ -91,7 +91,7 @@ Add the following code to your generated_values.yaml file. This will send all ch Save the file and run .. code-block:: bash - :name: cb-add-pagerduty-sink + :name: cb-add-pagerduty-sink-changes helm upgrade robusta robusta/robusta --values=generated_values.yaml diff --git a/docs/configuration/sinks/RobustaUI.rst b/docs/configuration/sinks/RobustaUI.rst index 793601cfd..440840712 100644 --- a/docs/configuration/sinks/RobustaUI.rst +++ b/docs/configuration/sinks/RobustaUI.rst @@ -32,7 +32,7 @@ Use the ``robusta`` CLI to generate a token: Add a new sink to your Helm values (``generated_values.yaml``), under ``sinksConfig``, with the token you generated: .. 
code-block:: bash - :name: cb-robusta-ui-sink-config + :name: cb-robusta-ui-sink-config-basic sinksConfig: - robusta_sink: @@ -51,7 +51,7 @@ If you have many short-lived clusters, you can remove them from the UI automatic To do so, configure a shorter retention period by setting the ``ttl_hours`` in the Robusta UI sink settings: .. code-block:: bash - :name: cb-robusta-ui-sink-config + :name: cb-robusta-ui-sink-config-ttl sinksConfig: - robusta_sink: diff --git a/docs/configuration/sinks/slack.rst b/docs/configuration/sinks/slack.rst index 3cd7c7641..4294f7ee6 100644 --- a/docs/configuration/sinks/slack.rst +++ b/docs/configuration/sinks/slack.rst @@ -7,7 +7,7 @@ Robusta can proxy Prometheus alerts to Slack, adding powerful features like :ref :width: 600px :align: center -Optionally, Robusta can monitor Kubernetes directly and send notifications on deployment changes or Kubernetes errors. +Robusta can send both Prometheus alerts and direct Kubernetes notifications (pod crashes, deployment changes, etc.) to Slack. .. warning:: @@ -15,15 +15,19 @@ Optionally, Robusta can monitor Kubernetes directly and send notifications on de Follow `these steps `_ to upgrade. -Connecting Slack +Quick Start ------------------------------------------------ -When installing Robusta, run ``robusta gen-config`` and follow the prompt to create a Slack API key. This will use our `official +**Option 1: Automatic Setup (Recommended)** + +When installing Robusta, run ``robusta gen-config`` and follow the prompts. This automatically configures Slack using our `official Slack app `_. -**Note: Robusta can only write messages and doesn't require read permissions.** +Note: Robusta can only write messages and doesn't require read permissions. 
+ +**Option 2: Manual Configuration** -You can also generate a Slack API key by running ``robusta integrations slack`` and setting the following Helm values in your ``generated_values.yaml``: +Generate a Slack API key by running ``robusta integrations slack``, then add to your ``generated_values.yaml``: .. code-block:: yaml diff --git a/docs/index.rst b/docs/index.rst index 10c952818..4bc576ea1 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -52,7 +52,7 @@ playbook-reference/index configuration/alertmanager-integration/alert-custom-prometheus Cost Savings - KRR - K8s Misconfigurations - Popeye + K8s Misconfigurations - Popeye .. toctree:: :maxdepth: 4 diff --git a/docs/notification-routing/notification-grouping.rst b/docs/notification-routing/notification-grouping.rst index 5d27af832..69953a65e 100644 --- a/docs/notification-routing/notification-grouping.rst +++ b/docs/notification-routing/notification-grouping.rst @@ -3,14 +3,13 @@ Notification Grouping (Slack Only) ========================================================= -You can consolidate alerts into Slack threads to reduce the number of notifications. -Each thread begins with a summary message that updates in real time as new alerts are received. +Reduce alert noise by grouping related notifications into Slack threads. Instead of flooding channels with individual alerts, Robusta creates summary messages with threaded details. .. 
image:: /images/notification-grouping.png :width: 600px :align: center -*Example: Alerts from a cluster are consolidated into a daily summary message, with individual alerts in the thread.* +*Example: Multiple alerts consolidated into a daily summary with individual alerts in the thread.* Configuring Notification Grouping ---------------------------------- diff --git a/docs/playbook-reference/actions/event-enrichment.rst b/docs/playbook-reference/actions/event-enrichment.rst index bbc46e723..756594cb5 100644 --- a/docs/playbook-reference/actions/event-enrichment.rst +++ b/docs/playbook-reference/actions/event-enrichment.rst @@ -1,7 +1,7 @@ Event Enrichment #################################### -The actions are used to gather extra data on errors, alerts, and other cluster events. +Enrichment actions automatically gather context when alerts fire. They fetch logs, metrics, events, and diagnostic data to help you understand and resolve issues faster. Use them as building blocks in your own automations, or write :ref:`your own enrichment actions in Python `. diff --git a/docs/playbook-reference/kubernetes-examples/playbook-track-secrets.rst b/docs/playbook-reference/kubernetes-examples/playbook-track-secrets.rst index c0492e8ec..d2463b220 100644 --- a/docs/playbook-reference/kubernetes-examples/playbook-track-secrets.rst +++ b/docs/playbook-reference/kubernetes-examples/playbook-track-secrets.rst @@ -1,4 +1,5 @@ .. _track-secrets-overview: + Track Kubernetes Secret Changes ############################################ diff --git a/docs/setup-robusta/installation/_see_robusta_in_action-2.rst b/docs/setup-robusta/installation/_see_robusta_in_action-2.rst index 76b0b9754..bf5e3fcf7 100644 --- a/docs/setup-robusta/installation/_see_robusta_in_action-2.rst +++ b/docs/setup-robusta/installation/_see_robusta_in_action-2.rst @@ -1,6 +1,7 @@ .. 
currently unused, I hope to clean this up and integrate it later :orphan: + See Robusta in action ------------------------------ From 4f1004f2c6098cfac0bda625f62c7fa381b88373 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Tue, 5 Aug 2025 18:07:32 +0300 Subject: [PATCH 23/51] unify prometheus terminology --- .../alertmanager-integration/alert-manager.rst | 2 +- .../alertmanager-integration/index.rst | 18 +++++++++--------- .../outofcluster-prometheus.rst | 4 ++-- 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/configuration/alertmanager-integration/alert-manager.rst b/docs/configuration/alertmanager-integration/alert-manager.rst index 06db76eaf..cc5ffa4cd 100644 --- a/docs/configuration/alertmanager-integration/alert-manager.rst +++ b/docs/configuration/alertmanager-integration/alert-manager.rst @@ -7,7 +7,7 @@ Here's how to integrate an existing Prometheus with Robusta in the same cluster: * Point Robusta at Prometheus so it can query metrics and silence alerts * Robusta will attempt auto-detection, so this is not always necessary! -If your Prometheus is in a different cluster, refer to :ref:`Centralized Prometheus`. +If your Prometheus is in a different cluster, refer to :ref:`External Prometheus`. Send Alerts to Robusta ============================ diff --git a/docs/configuration/alertmanager-integration/index.rst b/docs/configuration/alertmanager-integration/index.rst index 4412a9e69..3d8665958 100644 --- a/docs/configuration/alertmanager-integration/index.rst +++ b/docs/configuration/alertmanager-integration/index.rst @@ -28,29 +28,24 @@ Connect Robusta to your Prometheus setup to get enriched alerts with logs, event :gutter: 3 - .. grid-item-card:: Prometheus in same cluster + .. grid-item-card:: In-cluster Prometheus :class-card: sd-bg-light sd-bg-text-light :link: alert-manager :link-type: doc - .. grid-item-card:: Prometheus outside cluster + .. 
grid-item-card:: External Prometheus :class-card: sd-bg-light sd-bg-text-light :link: outofcluster-prometheus :link-type: doc - .. grid-item-card:: AWS Managed Prometheus - :class-card: sd-bg-light sd-bg-text-light - :link: eks-managed-prometheus - :link-type: doc - .. grid-item-card:: Azure Managed Prometheus :class-card: sd-bg-light sd-bg-text-light :link: azure-managed-prometheus :link-type: doc - .. grid-item-card:: Google Managed Prometheus + .. grid-item-card:: AWS Managed Prometheus :class-card: sd-bg-light sd-bg-text-light - :link: google-managed-prometheus + :link: eks-managed-prometheus :link-type: doc .. grid-item-card:: Coralogix @@ -58,6 +53,11 @@ Connect Robusta to your Prometheus setup to get enriched alerts with logs, event :link: coralogix_managed_prometheus :link-type: doc + .. grid-item-card:: Google Managed Prometheus + :class-card: sd-bg-light sd-bg-text-light + :link: google-managed-prometheus + :link-type: doc + .. grid-item-card:: VictoriaMetrics :class-card: sd-bg-light sd-bg-text-light :link: victoria-metrics diff --git a/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst b/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst index 16c3fcb35..bef4a5826 100644 --- a/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst +++ b/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst @@ -1,4 +1,4 @@ -Centralized Prometheus +External Prometheus ************************************** Follow this guide to connect Robusta to a central Prometheus (e.g. Thanos/Mimir), running outside the cluster monitored by Robusta. @@ -50,7 +50,7 @@ This integration lets your central Prometheus send alerts to Robusta, as if they Filtering Prometheus Queries by Cluster ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -If the same centralized Prometheus is used for many clusters, you will want to add a cluster name to all queries. 
+If the same external Prometheus is used for many clusters, you will want to add a cluster name to all queries. You can do so with the ``prometheus_url_query_string`` parameter, shown below: From a1db1bafe9dd80913571e6ab0e6cf87cb8dacbb0 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Tue, 5 Aug 2025 18:09:25 +0300 Subject: [PATCH 24/51] Update alert-custom-prometheus.rst --- .../alert-custom-prometheus.rst | 42 +++++++++---------- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/docs/configuration/alertmanager-integration/alert-custom-prometheus.rst b/docs/configuration/alertmanager-integration/alert-custom-prometheus.rst index 19d0e7d74..f97fb6182 100644 --- a/docs/configuration/alertmanager-integration/alert-custom-prometheus.rst +++ b/docs/configuration/alertmanager-integration/alert-custom-prometheus.rst @@ -23,37 +23,35 @@ Prerequisites * Kube-Prometheus-Stack, installed via Robusta or seperately. * Enable global rule selection for the Prometheus operator. Add the following config to your ``generated_values.yaml``. (By default Prometheus Operator picks up only certain new alerts, here we tell it to pick up all new alerts) - .. grid-item:: + .. md-tab-set:: - .. md-tab-set:: + .. md-tab-item:: Robusta Prometheus - .. md-tab-item:: Robusta Prometheus - - .. code-block:: yaml - - kube-prometheus-stack: - prometheus: - prometheusSpec: - ruleNamespaceSelector: {} # (1) - ruleSelector: {} # (2) - ruleSelectorNilUsesHelmValues: false # (3) - - .. code-annotations:: - 1. Add a namespace if you want Prometheus to identify rules created in specific namespaces. Leave ``{}`` to detect rules from any namespace. - 2. Add a label if you want Prometheus to detect rules with a specific selector. Leave ``{}`` to detect rules with any label. - 3. When set to `false`, Prometheus detects rules that are created directly, not just rules created using helm values file. - - .. md-tab-item:: Other Prometheus - - .. code-block:: yaml + .. 
code-block:: yaml + kube-prometheus-stack: prometheus: prometheusSpec: ruleNamespaceSelector: {} # (1) ruleSelector: {} # (2) ruleSelectorNilUsesHelmValues: false # (3) - .. code-annotations:: + .. code-annotations:: + 1. Add a namespace if you want Prometheus to identify rules created in specific namespaces. Leave ``{}`` to detect rules from any namespace. + 2. Add a label if you want Prometheus to detect rules with a specific selector. Leave ``{}`` to detect rules with any label. + 3. When set to `false`, Prometheus detects rules that are created directly, not just rules created using helm values file. + + .. md-tab-item:: Other Prometheus + + .. code-block:: yaml + + prometheus: + prometheusSpec: + ruleNamespaceSelector: {} # (1) + ruleSelector: {} # (2) + ruleSelectorNilUsesHelmValues: false # (3) + + .. code-annotations:: 1. Add a namespace if you want Prometheus to identify rules created in specific namespaces. Leave ``{}`` to detect rules from any namespace. 2. Add a label if you want Prometheus to detect rules with a specific selector. Leave ``{}`` to detect rules with any label. 3. When set to `false`, Prometheus detects rules that are created directly, not just rules created using helm values file. From 3f8e0309adc49d5e4373e9840bc7b82fd69e086c Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Wed, 6 Aug 2025 11:12:01 +0300 Subject: [PATCH 25/51] Update index.rst --- docs/configuration/holmesgpt/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/configuration/holmesgpt/index.rst b/docs/configuration/holmesgpt/index.rst index 722466c83..6b8e30a1f 100644 --- a/docs/configuration/holmesgpt/index.rst +++ b/docs/configuration/holmesgpt/index.rst @@ -15,7 +15,7 @@ AI Analysis Why use HolmesGPT? ^^^^^^^^^^^^^^^^^^^ -Robusta can integrate with `Holmes GPT `_ to analyze health issues on your cluster, and to run AI based root cause analysis for alerts. 
+Robusta integrates with `HolmesGPT `_ to provide AI-powered root cause analysis for your alerts. It automatically investigates issues by analyzing logs, metrics, and Kubernetes state. This requires a Robusta SaaS account, and for the Robusta UI sink to be enabled. (We have plans to support HolmesGPT in a pure OSS mode in the near future. Stay tuned!) From 9a5019ce0c0fe049b8019c9309d42064b96466d4 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Wed, 6 Aug 2025 11:13:00 +0300 Subject: [PATCH 26/51] Update index.rst --- docs/setup-robusta/installation/index.rst | 35 ++++++----------------- 1 file changed, 9 insertions(+), 26 deletions(-) diff --git a/docs/setup-robusta/installation/index.rst b/docs/setup-robusta/installation/index.rst index 47a906b42..90e708426 100644 --- a/docs/setup-robusta/installation/index.rst +++ b/docs/setup-robusta/installation/index.rst @@ -15,8 +15,6 @@ Installation Guides extend-prometheus-installation standalone-installation dev-setup - - .. grid:: 1 1 2 2 :gutter: 2 @@ -26,33 +24,18 @@ Installation Guides :link: all-in-one-installation :link-type: doc - .. grid-item-card:: Integrate with Existing Prometheus + Five minute setup. Great default alerts. Powered by Prometheus and Robusta. + + .. grid-item-card:: Add Robusta to Existing Prometheus :class-card: sd-bg-light sd-bg-text-light :link: extend-prometheus-installation :link-type: doc - .. grid-item:: - .. raw:: html - - Five minute setup. Great default alerts. Powered by Prometheus and Robusta. - - .. grid-item:: - .. raw:: html - - Make your existing alerts better. Attach pod logs. Automatic alert insights. - -Don't want Prometheus? Use :ref:`Robusta without Prometheus `. - + Make your existing alerts better. Attach pod logs. Automatic alert insights. -Already installed Robusta? See what you can do with it. 
-------------------------------------------------------------- - -`Route alerts to different teams based on namespace, alertname, and more `_ - -`Enhance Prometheus alerts with Robusta `_ - -`Define new Prometheus alerts `_ - -`Configure auto-remediate for Prometheus alerts `_ + .. grid-item-card:: Use Robusta's AI Agent with other monitoring tools + :class-card: sd-bg-light sd-bg-text-light + :link: install-barebones + :link-type: doc -`Track Kubernetes errors and changes using simple YAML `_ + Use Robusta's AI Agent alongside DataDog, NewRelic, SolarWinds, and more. From 84f0e2289d0a3c2ba034b65194becd7aaf4485a8 Mon Sep 17 00:00:00 2001 From: Robusta Runner Date: Mon, 11 Aug 2025 10:01:00 +0300 Subject: [PATCH 27/51] reorganize docs --- docs/conf.py | 30 +- .../_alertmanager-config.rst | 126 +-- .../_pull_integration.rst | 79 +- .../_testing_integration.rst | 2 +- .../alert-manager.rst | 10 +- .../azure-managed-prometheus.rst | 113 +-- .../coralogix_managed_prometheus.rst | 47 +- .../eks-managed-prometheus.rst | 66 +- .../google-managed-prometheus.rst | 29 +- .../grafana-alert-manager.rst | 39 +- .../outofcluster-prometheus.rst | 21 +- .../troubleshooting-alertmanager.rst | 91 +- .../victoria-metrics.rst | 45 +- .../exporting/alert-export-api.rst | 159 ++++ .../exporting/alert-statistics-api.rst | 132 +++ .../exporting/configuration-changes-api.rst | 190 +++++ .../exporting/custom-webhooks.rst | 4 +- .../exporting/exporting-data.rst | 787 +----------------- .../exporting/namespace-resources-api.rst | 135 +++ .../exporting/robusta-pro-features.rst | 20 +- .../exporting/send-alerts-api.rst | 199 +++++ .../holmesgpt/builtin_toolsets.rst | 149 ---- .../holmesgpt/custom_toolsets.rst | 557 ------------- .../holmesgpt/getting-started.rst | 179 ++++ docs/configuration/holmesgpt/index.rst | 419 ---------- .../configuration/holmesgpt/main-features.rst | 78 ++ docs/configuration/holmesgpt/permissions.rst | 41 - .../holmesgpt/remote_mcp_servers.rst | 205 ----- 
.../toolsets/_custom_toolset_appeal.inc.rst | 6 - .../_disable_default_logging_toolset.inc.rst | 31 - .../toolsets/_toolset_capabilities.inc.rst | 2 - .../toolsets/_toolset_configuration.inc.rst | 7 - .../_toolset_enabled_by_default.inc.rst | 5 - .../_toolsets_that_provide_logging.inc.rst | 7 - .../holmesgpt/toolsets/argocd.rst | 162 ---- docs/configuration/holmesgpt/toolsets/aws.rst | 124 --- .../holmesgpt/toolsets/confluence.rst | 62 -- .../holmesgpt/toolsets/coralogix_logs.rst | 143 ---- .../holmesgpt/toolsets/datadog_logs.rst | 200 ----- .../holmesgpt/toolsets/datetime.rst | 38 - .../holmesgpt/toolsets/docker.rst | 48 -- .../holmesgpt/toolsets/grafanaloki.rst | 241 ------ .../holmesgpt/toolsets/grafanatempo.rst | 219 ----- .../configuration/holmesgpt/toolsets/helm.rst | 59 -- .../holmesgpt/toolsets/internet.rst | 34 - .../holmesgpt/toolsets/kafka.rst | 82 -- .../holmesgpt/toolsets/kubernetes.rst | 235 ------ .../holmesgpt/toolsets/newrelic.rst | 49 -- .../holmesgpt/toolsets/notion.rst | 59 -- .../holmesgpt/toolsets/opensearch_logs.rst | 134 --- .../holmesgpt/toolsets/opensearch_status.rst | 80 -- .../holmesgpt/toolsets/prometheus.rst | 155 ---- .../holmesgpt/toolsets/rabbitmq.rst | 94 --- .../holmesgpt/toolsets/robusta.rst | 31 - .../configuration/holmesgpt/toolsets/slab.rst | 63 -- docs/configuration/index.rst | 10 +- docs/configuration/metric-providers-aws.rst | 159 ++++ docs/configuration/metric-providers-azure.rst | 129 +++ .../metric-providers-coralogix.rst | 138 +++ .../metric-providers-external.rst | 200 +++++ .../configuration/metric-providers-google.rst | 123 +++ .../metric-providers-in-cluster.rst | 126 +++ .../metric-providers-victoria.rst | 157 ++++ docs/configuration/metric-providers.rst | 181 ++++ docs/index.rst | 24 +- docs/setup-robusta/index.rst | 1 - .../installation/standalone-installation.rst | 11 +- 67 files changed, 2565 insertions(+), 5016 deletions(-) create mode 100644 docs/configuration/exporting/alert-export-api.rst create mode 100644 
docs/configuration/exporting/alert-statistics-api.rst create mode 100644 docs/configuration/exporting/configuration-changes-api.rst create mode 100644 docs/configuration/exporting/namespace-resources-api.rst create mode 100644 docs/configuration/exporting/send-alerts-api.rst delete mode 100644 docs/configuration/holmesgpt/builtin_toolsets.rst delete mode 100644 docs/configuration/holmesgpt/custom_toolsets.rst create mode 100644 docs/configuration/holmesgpt/getting-started.rst delete mode 100644 docs/configuration/holmesgpt/index.rst create mode 100644 docs/configuration/holmesgpt/main-features.rst delete mode 100644 docs/configuration/holmesgpt/permissions.rst delete mode 100644 docs/configuration/holmesgpt/remote_mcp_servers.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_custom_toolset_appeal.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_disable_default_logging_toolset.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_toolset_capabilities.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_toolset_configuration.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_toolset_enabled_by_default.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/_toolsets_that_provide_logging.inc.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/argocd.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/aws.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/confluence.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/coralogix_logs.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/datadog_logs.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/datetime.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/docker.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/grafanaloki.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/grafanatempo.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/helm.rst 
delete mode 100644 docs/configuration/holmesgpt/toolsets/internet.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/kafka.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/kubernetes.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/newrelic.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/notion.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/opensearch_logs.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/opensearch_status.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/prometheus.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/rabbitmq.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/robusta.rst delete mode 100644 docs/configuration/holmesgpt/toolsets/slab.rst create mode 100644 docs/configuration/metric-providers-aws.rst create mode 100644 docs/configuration/metric-providers-azure.rst create mode 100644 docs/configuration/metric-providers-coralogix.rst create mode 100644 docs/configuration/metric-providers-external.rst create mode 100644 docs/configuration/metric-providers-google.rst create mode 100644 docs/configuration/metric-providers-in-cluster.rst create mode 100644 docs/configuration/metric-providers-victoria.rst create mode 100644 docs/configuration/metric-providers.rst diff --git a/docs/conf.py b/docs/conf.py index e05cbeb91..1b6fb7abb 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -122,7 +122,35 @@ "tutorials/alert-custom-prometheus.html": "/master/configuration/alertmanager-integration/alert-custom-prometheus.html", "catalog/triggers/prometheus.html": "/master/configuration/alertmanager-integration/index.html", "playbook-reference/prometheus-examples/alert-remediation.html": "/master/playbook-reference/automatic-remediation-examples/index.html", - "configuration/ai-analysis.html": "/master/configuration/holmesgpt/index.html", + "configuration/ai-analysis.html": "/master/configuration/holmesgpt/main-features.html", + 
"configuration/holmesgpt/index.html": "/master/configuration/holmesgpt/main-features.html", + # AI Analysis pages redirects to holmesgpt.dev (docs have moved there) + "configuration/holmesgpt/builtin_toolsets.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/", + "configuration/holmesgpt/permissions.html": "https://holmesgpt.dev/data-sources/permissions/", + "configuration/holmesgpt/custom_toolsets.html": "https://holmesgpt.dev/data-sources/custom-toolsets/", + "configuration/holmesgpt/remote_mcp_servers.html": "https://holmesgpt.dev/data-sources/remote-mcp-servers/", + # Individual toolset page redirects + "configuration/holmesgpt/toolsets/argocd.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/argocd/", + "configuration/holmesgpt/toolsets/aws.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/aws/", + "configuration/holmesgpt/toolsets/confluence.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/confluence/", + "configuration/holmesgpt/toolsets/coralogix_logs.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/coralogix-logs/", + "configuration/holmesgpt/toolsets/datadog_logs.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/datadog/", + "configuration/holmesgpt/toolsets/datetime.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/datetime/", + "configuration/holmesgpt/toolsets/docker.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/docker/", + "configuration/holmesgpt/toolsets/grafanaloki.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/grafanaloki/", + "configuration/holmesgpt/toolsets/grafanatempo.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/grafanatempo/", + "configuration/holmesgpt/toolsets/helm.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/helm/", + "configuration/holmesgpt/toolsets/internet.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/internet/", + "configuration/holmesgpt/toolsets/kafka.html": 
"https://holmesgpt.dev/data-sources/builtin-toolsets/kafka/", + "configuration/holmesgpt/toolsets/kubernetes.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/kubernetes/", + "configuration/holmesgpt/toolsets/newrelic.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/newrelic/", + "configuration/holmesgpt/toolsets/notion.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/notion/", + "configuration/holmesgpt/toolsets/opensearch_logs.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/opensearch-logs/", + "configuration/holmesgpt/toolsets/opensearch_status.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/opensearch-status/", + "configuration/holmesgpt/toolsets/prometheus.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/prometheus/", + "configuration/holmesgpt/toolsets/rabbitmq.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/rabbitmq/", + "configuration/holmesgpt/toolsets/robusta.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/robusta/", + "configuration/holmesgpt/toolsets/slab.html": "https://holmesgpt.dev/data-sources/builtin-toolsets/slab/", "coverage.html": "/master/how-it-works/coverage.html", "tutorials/python-profiling.html": "/master/playbook-reference/actions/python-troubleshooting.html#python-profiler", "tutorials/more-tutorials.html": "/master/community-tutorials.html", diff --git a/docs/configuration/alertmanager-integration/_alertmanager-config.rst b/docs/configuration/alertmanager-integration/_alertmanager-config.rst index 0aabfcc55..59f92578d 100644 --- a/docs/configuration/alertmanager-integration/_alertmanager-config.rst +++ b/docs/configuration/alertmanager-integration/_alertmanager-config.rst @@ -1,63 +1,63 @@ -.. admonition:: AlertManager config for sending alerts to Robusta - - .. tab-set:: - - .. 
tab-item:: kube-prometheus-stack (Prometheus Operator) - - Add the following to your `AlertManager's config Secret `_ - - Do not apply in other ways, they all `have limitations `_ and won't forward all alerts. - - .. code-block:: yaml - - receivers: - - name: 'robusta' - webhook_configs: - - url: 'http://-runner..svc.cluster.local/api/alerts' - send_resolved: true # (3) - - name: 'default-receiver' - - route: # (1) - routes: - - receiver: 'robusta' - group_by: [ '...' ] - group_wait: 1s - group_interval: 1s - matchers: - - severity =~ ".*" - repeat_interval: 4h - continue: true # (2) - receiver: 'default-receiver' - - .. code-annotations:: - 1. Put Robusta's route as the first route, to guarantee it receives alerts. If you can't do so, you must guarantee all previous routes set ``continue: true`` set. - 2. Keep sending alerts to receivers defined after Robusta. - 3. Important, so Robusta knows when alerts are resolved. - - .. tab-item:: Other Prometheus Installations - - Add the following to your AlertManager configuration, wherever it is defined. - - .. code-block:: yaml - - receivers: - - name: 'robusta' - webhook_configs: - - url: 'http://-runner..svc.cluster.local/api/alerts' - send_resolved: true # (3) - - route: # (1) - routes: - - receiver: 'robusta' - group_by: [ '...' ] - group_wait: 1s - group_interval: 1s - matchers: - - severity =~ ".*" - repeat_interval: 4h - continue: true # (2) - - .. code-annotations:: - 1. Put Robusta's route as the first route, to guarantee it receives alerts. If you can't do so, you must guarantee all previous routes set ``continue: true`` set. - 2. Keep sending alerts to receivers defined after Robusta. - 3. Important, so Robusta knows when alerts are resolved. +Configure your AlertManager to send alerts to Robusta: + +.. tab-set:: + + .. tab-item:: kube-prometheus-stack (Prometheus Operator) + + Add the following to your `AlertManager's config Secret `_. 
+ + Do not apply in other ways, they all `have limitations `_ and won't forward all alerts. + + .. code-block:: yaml + + receivers: + - name: 'robusta' + webhook_configs: + - url: 'http://-runner..svc.cluster.local/api/alerts' + send_resolved: true # (3) + - name: 'default-receiver' + + route: # (1) + routes: + - receiver: 'robusta' + group_by: [ '...' ] + group_wait: 1s + group_interval: 1s + matchers: + - severity =~ ".*" + repeat_interval: 4h + continue: true # (2) + receiver: 'default-receiver' + + .. code-annotations:: + 1. Put Robusta's route as the first route, to guarantee it receives alerts. If you can't do so, you must guarantee all previous routes set ``continue: true``. + 2. Keep sending alerts to receivers defined after Robusta. + 3. Important, so Robusta knows when alerts are resolved. + + .. tab-item:: Other Prometheus Installations + + Add the following to your AlertManager configuration, wherever it is defined. + + .. code-block:: yaml + + receivers: + - name: 'robusta' + webhook_configs: + - url: 'http://-runner..svc.cluster.local/api/alerts' + send_resolved: true # (3) + + route: # (1) + routes: + - receiver: 'robusta' + group_by: [ '...' ] + group_wait: 1s + group_interval: 1s + matchers: + - severity =~ ".*" + repeat_interval: 4h + continue: true # (2) + + .. code-annotations:: + 1. Put Robusta's route as the first route, to guarantee it receives alerts. If you can't do so, you must guarantee all previous routes set ``continue: true``. + 2. Keep sending alerts to receivers defined after Robusta. + 3. Important, so Robusta knows when alerts are resolved. 
\ No newline at end of file diff --git a/docs/configuration/alertmanager-integration/_pull_integration.rst b/docs/configuration/alertmanager-integration/_pull_integration.rst index 35aec4e0a..d23deb08d 100644 --- a/docs/configuration/alertmanager-integration/_pull_integration.rst +++ b/docs/configuration/alertmanager-integration/_pull_integration.rst @@ -1,81 +1,10 @@ Configure Metric Querying ==================================== -Metrics querying lets Robusta pull metrics and create silences. +To enable Robusta to pull metrics and create silences, you need to configure Prometheus and AlertManager URLs. -If Robusta fails to auto-detect the Prometheus and Alertmanager urls - and you see related connection errors in the logs - configure the ``prometheus_url`` and ``alertmanager_url`` in your Helm values and :ref:`update Robusta ` +See :doc:`/configuration/metric-providers-in-cluster` for detailed configuration instructions. -.. code-block:: yaml +.. note:: - globalConfig: # This line should already exist - # Add the lines below - alertmanager_url: "http://ALERT_MANAGER_SERVICE_NAME.NAMESPACE.svc.cluster.local:9093" # (1) - prometheus_url: "http://PROMETHEUS_SERVICE_NAME.NAMESPACE.svc.cluster.local:9090" # (2) - - # If Prometheus has data for multiple clusters, tell Robusta how to query data for this cluster only - # prometheus_additional_labels: - # cluster: 'CLUSTER_NAME_HERE' - - # If using Grafana alerts, add this too - # grafana_api_key: # (3) - # alertmanager_flavor: grafana - - # If necessary, see docs below - # prometheus_auth: ... - # alertmanager_auth: ... - - # If using a multi-tenant prometheus or alertmanager, pass the org id to all queries - # prometheus_additional_headers: - # X-Scope-OrgID: - # alertmanager_additional_headers: - # X-Scope-OrgID: - -.. code-annotations:: - 1. Example: http://alertmanager-Helm_release_name-kube-prometheus-alertmanager.default.svc.cluster.local:9093. - 2. 
Example: http://Helm_Release_Name-kube-prometheus-prometheus.default.svc.cluster.local:9090 - 3. This is necessary for Robusta to create silences when using Grafana Alerts, because of minor API differences in the AlertManager embedded in Grafana. - -You can optionally setup authentication, SSL verification, and other parameters described below. - -Verify it Works -^^^^^^^^^^^^^^^^^ -Open any application in the Robusta UI. If CPU and memory graphs are shown, everything is working. - -If you don't use the Robusta UI, trigger a `demo OOMKill alert `_, -and verify that Robusta sends a Slack/Teams message with a memory graph included. If so, everything is configured properly. - -Optional Settings -============================= - -Authentication Headers -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If Prometheus and/or AlertManager require authentication, add the following to ``generated_values.yaml``: - -.. code-block:: yaml - - globalConfig: - prometheus_auth: Bearer # Replace with your actual token or use any other auth header as needed - alertmanager_auth: Basic # Replace with your actual credentials, base64-encoded, or use any other auth header as needed - -These settings may be configured independently. - -SSL Verification -^^^^^^^^^^^^^^^^^^^^ -By default, Robusta does not verify the SSL certificate of the Prometheus server. - -To enable SSL verification, add the following to Robusta's ``generated_values.yaml``: - -.. code-block:: yaml - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - -If you have a custom Certificate Authority (CA) certificate, add one more setting: - -.. code-block:: yaml - - runner: - certificate: "" # base64-encoded certificate value + Robusta will attempt to auto-detect Prometheus and AlertManager URLs in your cluster. Manual configuration is only needed if auto-detection fails. 
\ No newline at end of file
diff --git a/docs/configuration/alertmanager-integration/_testing_integration.rst b/docs/configuration/alertmanager-integration/_testing_integration.rst
index 41bfda27b..20b1f831f 100644
--- a/docs/configuration/alertmanager-integration/_testing_integration.rst
+++ b/docs/configuration/alertmanager-integration/_testing_integration.rst
@@ -56,4 +56,4 @@ If everything is setup properly, this alert will reach Robusta. It will show up
 Robusta enriches alerts with Kubernetes and log data using Prometheus labels for mapping.
 Standard label names are used by default. If your setup differs, you can
- :ref:`customize this mapping ` to fit your environment.
+ `customize this mapping `_ to fit your environment.
diff --git a/docs/configuration/alertmanager-integration/alert-manager.rst b/docs/configuration/alertmanager-integration/alert-manager.rst
index cc5ffa4cd..a76b9f7d5 100644
--- a/docs/configuration/alertmanager-integration/alert-manager.rst
+++ b/docs/configuration/alertmanager-integration/alert-manager.rst
@@ -1,13 +1,11 @@
-In-cluster Prometheus
+In-cluster AlertManager Integration
 ****************************************
 
-Here's how to integrate an existing Prometheus with Robusta in the same cluster:
+This guide shows how to send alerts from an existing AlertManager to Robusta in the same cluster.
 
-* Send alerts to Robusta by adding a receiver to AlertManager
-* Point Robusta at Prometheus so it can query metrics and silence alerts
- * Robusta will attempt auto-detection, so this is not always necessary!
+If your AlertManager is in a different cluster, refer to :ref:`External Prometheus`.
 
-If your Prometheus is in a different cluster, refer to :ref:`External Prometheus`.
+For configuring metric querying and advanced Prometheus settings, see :doc:`/configuration/metric-providers-in-cluster`.
Send Alerts to Robusta ============================ diff --git a/docs/configuration/alertmanager-integration/azure-managed-prometheus.rst b/docs/configuration/alertmanager-integration/azure-managed-prometheus.rst index 7bee765d7..ce6af2d32 100644 --- a/docs/configuration/alertmanager-integration/azure-managed-prometheus.rst +++ b/docs/configuration/alertmanager-integration/azure-managed-prometheus.rst @@ -1,7 +1,9 @@ -Azure managed Prometheus -************************* +Azure Managed Prometheus Alerts +********************************* -This guide walks you through integrating your Azure managed Prometheus with Robusta. You will need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. +This guide shows how to send alerts from Azure Managed Prometheus to Robusta. + +For configuring metric querying and advanced settings, see :doc:`/configuration/metric-providers-azure`. Send Alerts to Robusta =============================== @@ -22,107 +24,4 @@ This integration sends Azure Managed Prometheus alerts to Robusta. To configure Configure Metric Querying =============================== -Metrics querying lets Robusta pull metrics from Azure Managed Prometheus. - -This can be configured either of two ways: - -.. details:: Option #1: Create an Azure Active Directory authentication app - - **Pros:** - - Quick setup. Just need to create an app, get the credentials and add them to the manifests - - Other pods can't use the Service Principal without having the secret - **Cons:** - - Requires a service principal (Azure AD permission) - - Need the client secret in the kubernetes manifests - - Client secret expires, you need to manage its rotation - -.. details:: Option #2: Use kubelet Managed Identity - - **Pros:** - * Quick setup. Get the Managed Identity Client ID and add them to the manifests - * No need to manage secrets. 
Removing the password element decreases the risk of the credentials being compromised - **Cons:** - * Managed Identity is bound to the entire VMSS, which means that other pods can use it if they have the client ID - -Retrieve the Azure Prometheus query endpoint -============================================== - -Whichever method you choose, you will need an Azure Prometheus query endpoint: - -1. Go to `Azure Monitor workspaces `_ and choose your monitored workspace. -2. In your monitored workspace, `overview`, find the ``Query endpoint`` and copy it. -3. In your `generated_values.yaml` file add the query endpoint URL under ``globalConfig`` with a 443 port: - -.. code-block:: yaml - - globalConfig: # this line should already exist - prometheus_url: ":443" - -Option #1: Create an Azure authentication app -============================================== - -Create an Azure authentication app and get credentials for Robusta to access Prometheus data: - -1. Follow the Azure guide to `register an app with Azure Active Directory `_ - -2. In your generated_values.yaml file add environment variables from the previous step. - -.. code-block:: yaml - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - - name: AZURE_CLIENT_ID - value: "" - - name: AZURE_TENANT_ID - value: "" - - name: AZURE_CLIENT_SECRET - value: "" - -3. Complete the step `allow your app access to your workspace `_, so your app can query data from your Azure Monitor workspace. - -Option #2: Use Kubelet's Managed Identity -============================================== - -Instead of creating an Azure authentication app, you can use kubelet's Managed Identity to access Prometheus. -(As a variation on this, you can also create a new User Assigned Managed Identity and bind it to the underlying VMSS.) - -1. Get the AKS kubelet's Managed Identity Client ID: - -.. code-block:: bash - - az aks show -g -n --query identityProfile.kubeletidentity.clientId -o tsv - -2. 
In your generated_values.yaml file add the following environment variables from the previous step. - -.. code-block:: yaml - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - - name: AZURE_USE_MANAGED_ID - value: "true" - - name: AZURE_CLIENT_ID - value: "" - - name: AZURE_TENANT_ID - value: "" - -3. Give access to your Managed Identity on your workspace: - - a. Open the Access Control (IAM) page for your Azure Monitor workspace in the Azure portal. - b. Select Add role assignment. - c. Select Monitoring Data Reader and select Next. - d. For Assign access to, select Managed identity. - e. Select + Select members. - f. Select the Managed Identity you got from step 1. - g. Select Review + assign to save the configuration. - - -Optional Settings -================== - -**Prometheus flags checks** - -.. include:: ./_prometheus_flags_check.rst +To enable Robusta to pull metrics from Azure Managed Prometheus, see :doc:`/configuration/metric-providers-azure`. diff --git a/docs/configuration/alertmanager-integration/coralogix_managed_prometheus.rst b/docs/configuration/alertmanager-integration/coralogix_managed_prometheus.rst index 593349b77..a1c17d463 100644 --- a/docs/configuration/alertmanager-integration/coralogix_managed_prometheus.rst +++ b/docs/configuration/alertmanager-integration/coralogix_managed_prometheus.rst @@ -1,7 +1,9 @@ -Coralogix Managed Prometheus -******************************** +Coralogix Alerts +***************** -This guide walks you through integrating your Coralogix managed Prometheus with Robusta. You will need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. +This guide shows how to send alerts from Coralogix to Robusta. + +For configuring metric querying from Coralogix Prometheus, see :doc:`/configuration/metric-providers-coralogix`. 
Send Alerts to Robusta =============================== @@ -64,41 +66,4 @@ To configure it: Configure Metric Querying ============================== -Metrics querying lets Robusta pull metrics from Coralogix Managed Prometheus. - -1. Go to `Coralogix Documentation `_ and choose the relevant 'PromQL Endpoint' from their table. -2. In your `generated_values.yaml` file add the endpoint url: - -.. code-block:: yaml - - # this line should already exist - globalConfig: - prometheus_url: "" #for example https://prom-api.coralogix.com - # To add any labels that are relevant to the specific cluster uncomment and change the lines below (optional) - # prometheus_additional_labels: - # cluster: 'CLUSTER_NAME_HERE' - - -.. code-annotations:: - 1. This is necessary for Robusta to create silences when using Grafana Alerts, because of minor API differences in the AlertManager embedded in Grafana. - - -3. On the Coralogix site, go to Data Flow -> Api Keys and copy the 'Logs Query Key' - -.. note:: If one does not exist you will have to generate a new one by clicking 'GENERATE NEW API KEY' - -4. Create a secret in your cluster with your key logs_query_key and the value as the key you just copied - -5. In your generated_values.yaml file add the following environment variables from the previous step replacing MY_CORLOGIX_SECRET with your secret name. - -.. code-block:: yaml - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - - name: CORALOGIX_PROMETHEUS_TOKEN - valueFrom: - secretKeyRef: - name: MY_CORALOGIX_SECRET - key: logs_query_key +To enable Robusta to pull metrics from Coralogix Prometheus, see :doc:`/configuration/metric-providers-coralogix`. 
diff --git a/docs/configuration/alertmanager-integration/eks-managed-prometheus.rst b/docs/configuration/alertmanager-integration/eks-managed-prometheus.rst index 4329a5485..4d145da31 100644 --- a/docs/configuration/alertmanager-integration/eks-managed-prometheus.rst +++ b/docs/configuration/alertmanager-integration/eks-managed-prometheus.rst @@ -1,60 +1,20 @@ -AWS Managed Prometheus -************************* +AWS Managed Prometheus Alerts +****************************** -This guide walks you through integrating your AWS Managed Prometheus with Robusta. +AWS Managed Prometheus uses Amazon Managed Grafana for alerting. To send alerts to Robusta, configure your Grafana instance to forward alerts. -You'll need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. This guide only covers the integration to query metrics. +For configuring metric querying from AWS Managed Prometheus, see :doc:`/configuration/metric-providers-aws`. -Configure Metric Querying -=============================== - -Metrics querying lets Robusta pull metrics from AWS Managed Prometheus. - -1. Create an AWS access key, `See guide here `_. - -2. In your cluster, create a secret with your access key and secret access key, named `aws-secret-key`. - -3. Collect the URL for your AWS Managed Prometheus workspace. - -4. Append the following to your `generated_values.yaml` file. +Send Alerts to Robusta +====================== -.. code-block:: yaml +Since AWS Managed Prometheus doesn't have a built-in AlertManager, you'll need to: - globalConfig: - ... - prometheus_url: AWS_PROMETHEUS_URL +1. Set up Amazon Managed Grafana with your AMP workspace +2. Configure Grafana alerts to send to Robusta +3. 
See :doc:`grafana-alert-manager` for detailed Grafana alerting setup - # Create silences when using Grafana alerts (optional) - # grafana_api_key: # (1) - # alertmanager_flavor: grafana - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - - name: AWS_ACCESS_KEY - value: - - name: AWS_ACCESS_KEY - valueFrom: - secretKeyRef: - name: aws-secret-key - key: - - name: AWS_SECRET_ACCESS_KEY - valueFrom: - secretKeyRef: - name: aws-secret-key - key: - - name: AWS_SERVICE_NAME - value: "aps" # , it is usually aps - - name: AWS_REGION - value: - -.. code-annotations:: - 1. This is necessary for Robusta to create silences when using Grafana Alerts, because of minor API differences in the AlertManager embedded in Grafana. - -Optional Settings -================== - -**Prometheus flags checks** +Configure Metric Querying +========================= -.. include:: ./_prometheus_flags_check.rst +To enable Robusta to pull metrics from AWS Managed Prometheus, see :doc:`/configuration/metric-providers-aws`. diff --git a/docs/configuration/alertmanager-integration/google-managed-prometheus.rst b/docs/configuration/alertmanager-integration/google-managed-prometheus.rst index 326056395..da72c3352 100644 --- a/docs/configuration/alertmanager-integration/google-managed-prometheus.rst +++ b/docs/configuration/alertmanager-integration/google-managed-prometheus.rst @@ -1,9 +1,9 @@ -Google Managed Prometheus -========================== +Google Managed Prometheus Alerts +================================= -This guide walks you through integrating your `Google Managed Prometheus `_ with Robusta. +This guide shows how to send alerts from `Google Managed Prometheus `_ to Robusta. -You will need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. +For configuring metric querying from Google Managed Prometheus, see :doc:`/configuration/metric-providers-google`. 
Prerequisites **************** @@ -62,23 +62,4 @@ You know it works if you receive an alert from Robusta. Configure Metric Querying ****************************** -A pull integration lets Robusta pull metrics and create silences. - -Add the following to Robusta's configuration(``generated_values.yaml``) and :ref:`update Robusta `. - -.. code-block:: yaml - - globalConfig: # this line should already exist - prometheus_url: "http://frontend.default.svc.cluster.local:9090" - alertmanager_url: "http://alertmanager.gmp-system.svc.cluster.local:9093" - - -Verify it Works ---------------------- -Run the following command to create a Pod that triggers an OOMKilled alert - -.. code-block:: yaml - - kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/oomkill/oomkill_job.yaml - -You know it works if you receive an alert from Robusta with a graph. +To enable Robusta to pull metrics from Google Managed Prometheus, see :doc:`/configuration/metric-providers-google`. diff --git a/docs/configuration/alertmanager-integration/grafana-alert-manager.rst b/docs/configuration/alertmanager-integration/grafana-alert-manager.rst index 007694ac0..a6ec11884 100644 --- a/docs/configuration/alertmanager-integration/grafana-alert-manager.rst +++ b/docs/configuration/alertmanager-integration/grafana-alert-manager.rst @@ -1,5 +1,5 @@ -Grafana AlertManager -**************************************** +Grafana Alerts +************** Grafana can send alerts to the Robusta timeline for visualization and AI investigation. @@ -185,35 +185,8 @@ Alternatively, trigger a `demo OOMKill alert # Replace with your actual token or use any other auth header as needed - alertmanager_auth: Basic # Replace with your actual credentials, base64-encoded, or use any other auth header as needed - -These settings may be configured independently. - -SSL Verification -^^^^^^^^^^^^^^^^^^^^ -By default, Robusta does not verify the SSL certificate of the Prometheus server. 
- -To enable SSL verification, add the following to Robusta's ``generated_values.yaml``: - -.. code-block:: yaml - - runner: - additional_env_vars: - - name: PROMETHEUS_SSL_ENABLED - value: "true" - -If you have a custom Certificate Authority (CA) certificate, add one more setting: - -.. code-block:: yaml - - runner: - certificate: "" # base64-encoded certificate value +- :doc:`/configuration/metric-providers-in-cluster` for in-cluster Prometheus +- :doc:`/configuration/metric-providers-external` for external Prometheus +- Or the appropriate cloud provider metric documentation diff --git a/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst b/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst index bef4a5826..e4ba99738 100644 --- a/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst +++ b/docs/configuration/alertmanager-integration/outofcluster-prometheus.rst @@ -1,9 +1,9 @@ -External Prometheus +External AlertManager Integration ************************************** -Follow this guide to connect Robusta to a central Prometheus (e.g. Thanos/Mimir), running outside the cluster monitored by Robusta. +This guide shows how to send alerts from a central AlertManager (e.g. Thanos/Mimir) running outside the cluster to Robusta. -You will need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. +For configuring metric querying and advanced Prometheus settings, see :doc:`/configuration/metric-providers-external`. Send Alerts to Robusta ============================== @@ -44,18 +44,3 @@ This integration lets your central Prometheus send alerts to Robusta, as if they 3. Enables sending resolved alerts to Robusta .. include:: ./_testing_integration.rst - -.. 
include:: ./_pull_integration.rst - -Filtering Prometheus Queries by Cluster -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If the same external Prometheus is used for many clusters, you will want to add a cluster name to all queries. - -You can do so with the ``prometheus_url_query_string`` parameter, shown below: - -.. code-block:: yaml - - globalConfig: - # Additional query string parameters to be appended to the Prometheus connection URL (optional) - prometheus_url_query_string: "cluster=prod1&x=y" diff --git a/docs/configuration/alertmanager-integration/troubleshooting-alertmanager.rst b/docs/configuration/alertmanager-integration/troubleshooting-alertmanager.rst index ce098eacd..76ea56fd9 100644 --- a/docs/configuration/alertmanager-integration/troubleshooting-alertmanager.rst +++ b/docs/configuration/alertmanager-integration/troubleshooting-alertmanager.rst @@ -1,13 +1,13 @@ -Integrating AlertManager with the UI -************************************************* +Sending Alerts to the Robusta UI +================================= Why Send Your Alerts to Robusta? ---------------------------------------- +--------------------------------- Benefits include: * Persistent alert history on a filterable timeline -* Centralized view of alerts from all sources and AlertManager instances +* Centralized view of alerts from all your monitoring systems (multiple Prometheus instances, cloud services, custom tools) * AI investigation of alerts * Correlations between alerts and Kubernetes deploys * and more! @@ -15,26 +15,75 @@ Benefits include: .. image:: /images/robusta-ui-timeline.png :alt: Prometheus Alert History -How to Send Your Alerts To Robusta ---------------------------------------- +Setting Up Alert Integration +----------------------------- -Choose one of the following options: +To configure alert integration with your monitoring system, see :doc:`Alert Sources `. -1. :ref:`Enable Robusta's embedded kube-prometheus-stack stack ` -2. 
:ref:`Add a webhook to your existing AlertManager (or equivalent integration) `. +Common Troubleshooting Scenarios +--------------------------------- -Troubleshooting the embedded kube-prometheus-stack ------------------------------------------------------ +.. tab-set:: -1. Did you install Robusta in the last 10 minutes? If so, wait 10 minutes and see if the problem resolves on its own. -2. Check if all Prometheus and AlertManager related pods are running and healthy -3. If you see OOMKills, increase the memory limits for the relevant pods. -4. If you are still having trouble, please reach out on our `Slack community `_. + .. tab-item:: General Issues -Troubleshooting an external AlertManager webhook -------------------------------------------------------- + **Not receiving alerts in Robusta UI?** -1. Are there errors in your AlertManager logs? -2. Are there errors in the Prometheus Operator logs (if relevant)? -3. Is Robusta the first receiver in your AlertManager configuration? If not, are all previous receivers configured with ``continue: true``? -4. If you are still having trouble, please reach out on our `Slack community `_. + 1. **Just installed?** Wait 10 minutes after installation for all components to initialize + 2. **Check your specific integration:** Each alert source has its own troubleshooting guide on its documentation page + 3. **Verify authentication:** Ensure API keys and webhook URLs are correctly configured + + **Need to test your integration?** + + Refer to your specific alert source documentation for testing procedures. + + .. tab-item:: AlertManager + + **Not receiving alerts?** + + 1. **Verify routing configuration:** + + - Ensure Robusta is the first receiver in your AlertManager configuration, or + - All previous receivers have ``continue: true`` set + - See configuration examples in your specific alert source documentation + + 2. 
**Check logs for errors:** + + - Review AlertManager logs for webhook errors + - Check Prometheus Operator logs (if using kube-prometheus-stack) + - Look for errors in Robusta runner logs + + 3. **Check pod health (embedded Prometheus stack):** + + - Verify all Prometheus and AlertManager pods are running + - Look for OOMKills and increase memory limits if needed + - See :doc:`Embedded Prometheus troubleshooting ` + + 4. **Verify network connectivity (external AlertManager):** + + - Test connectivity to Robusta webhook endpoint + - Check firewall rules and network policies + - Ensure AlertManager can resolve DNS names + + **Alerts arriving but missing Kubernetes context?** + + Check :doc:`Alert Label Mapping ` to customize how Prometheus labels map to Kubernetes resources. + + +Testing Your Integration +------------------------ + +Each alert source has specific testing methods: + +* **Standard AlertManager**: Use ``robusta demo-alert`` command +* **Cloud Services**: Check the specific service's documentation for test procedures +* **Custom Systems**: Use the test features built into your monitoring platform + +Refer to your specific integration documentation for detailed testing steps. + +Need More Help? +--------------- + +* Check your specific alert source documentation for detailed troubleshooting +* Review logs in AlertManager, Prometheus Operator (if applicable), and Robusta runner +* Join our `Slack community `_ for direct support \ No newline at end of file diff --git a/docs/configuration/alertmanager-integration/victoria-metrics.rst b/docs/configuration/alertmanager-integration/victoria-metrics.rst index 2d8f9bd48..a9a4a3d07 100644 --- a/docs/configuration/alertmanager-integration/victoria-metrics.rst +++ b/docs/configuration/alertmanager-integration/victoria-metrics.rst @@ -1,9 +1,9 @@ -Victoria Metrics -******************** +VictoriaMetrics Alerts +********************** -This guide walks you through configuring `Victoria Metrics `_ with Robusta. 
+This guide shows how to send alerts from `VictoriaMetrics `_ to Robusta. -You will need to configure two integrations: one to send alerts to Robusta and another to let Robusta query metrics and create silences. +For configuring metric querying from VictoriaMetrics, see :doc:`/configuration/metric-providers-victoria`. Send Alerts to Robusta ============================ @@ -40,39 +40,4 @@ Add the following to your Victoria Metrics Alertmanager configuration (e.g., Hel Configure Metrics Querying ==================================== -Robusta can query metrics and create silences using Victoria Metrics. If both are in the same Kubernetes cluster, Robusta can auto-detect the Victoria Metrics service. To verify, go to the "Apps" tab in Robusta, select an application, and check for usage graphs. - -If auto-detection fails you must add the ``prometheus_url`` parameter and :ref:`update Robusta `. - -.. code-block:: yaml - - globalConfig: # this line should already exist - # add the lines below - alertmanager_url: "http://..svc.cluster.local:9093" # Example:"http://vmalertmanager-victoria-metrics-vm.default.svc.cluster.local:9093/" - prometheus_url: "http://VM_Metrics_SERVICE_NAME.NAMESPACE.svc.cluster.local:8429" # Example:"http://vmsingle-vmks-victoria-metrics-k8s-stack.default.svc.cluster.local:8429" - # Add any labels that are relevant to the specific cluster (optional) - # prometheus_additional_labels: - # cluster: 'CLUSTER_NAME_HERE' - - # Additional query string parameters to be appended to the Prometheus connection URL (optional) - # prometheus_url_query_string: "demo-query=example-data&another-query=value" - - # Create alert silencing when using Grafana alerts (optional) - # grafana_api_key: # (1) - # alertmanager_flavor: grafana - - # If using a multi-tenant prometheus or alertmanager, pass the org id to all queries - # prometheus_additional_headers: - # X-Scope-OrgID: - # alertmanager_additional_headers: - # X-Scope-OrgID: - -.. code-annotations:: - 1. 
This is necessary for Robusta to create silences when using Grafana Alerts, because of minor API differences in the AlertManager embedded in Grafana. - -Optional Settings -================== - -**Prometheus flags checks** - -.. include:: ./_prometheus_flags_check.rst +To enable Robusta to query metrics from VictoriaMetrics, see :doc:`/configuration/metric-providers-victoria`. diff --git a/docs/configuration/exporting/alert-export-api.rst b/docs/configuration/exporting/alert-export-api.rst new file mode 100644 index 000000000..7ea4bbc19 --- /dev/null +++ b/docs/configuration/exporting/alert-export-api.rst @@ -0,0 +1,159 @@ +Alert Export API +============================================== + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Use this endpoint to export alert history data. You can filter the results based on specific criteria using query parameters such as ``alert_name``, ``account_id``, and time range. + +.. _alert-export-api: + +GET https://api.robusta.dev/api/query/alerts +------------------------------------------------------ + +Query Parameters +^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 20 10 70 10 + :header-rows: 1 + + * - Parameter + - Type + - Description + - Required + * - ``account_id`` + - string + - The unique account identifier (found in your ``generated_values.yaml`` file). + - Yes + * - ``start_ts`` + - string + - Start timestamp for the alert history query (in ISO 8601 format, e.g., ``2024-09-02T04:02:05.032Z``). + - Yes + * - ``end_ts`` + - string + - End timestamp for the alert history query (in ISO 8601 format, e.g., ``2024-09-17T05:02:05.032Z``). + - Yes + * - ``alert_name`` + - string + - The name of the alert to filter by (e.g., ``CrashLoopBackoff``). 
+ - No + +Example Request +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following ``curl`` command demonstrates how to export alert history data for the ``CrashLoopBackoff`` alert: + +.. code-block:: bash + + curl --location 'https://api.robusta.dev/api/query/alerts?alert_name=CrashLoopBackoff&account_id=ACCOUNT_ID&start_ts=2024-09-02T04%3A02%3A05.032Z&end_ts=2024-09-17T05%3A02%3A05.032Z' \ + --header 'Authorization: Bearer API-KEY' + +In the command, make sure to replace the following placeholders: + +- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. +- ``API-KEY``: Your API Key for authentication. You can generate this token in the platform by navigating to **Settings** -> **API Keys** -> **New API Key**, and creating a key with the "Read Alerts" permission. + +Request Headers +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 30 70 + :header-rows: 1 + + * - Header + - Description + * - ``Authorization`` + - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have "Read Alerts" permission. + +Response Format +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The API will return a list of alerts in JSON format. Each alert object contains detailed information about the alert, including the name, priority, source, and related resource information. + +Example Response +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: json + + [ + { + "alert_name": "CrashLoopBackoff", + "title": "Crashing pod api-gateway-123abc in namespace prod", + "description": null, + "source": "kubernetes_api_server", + "priority": "HIGH", + "started_at": "2024-09-03T04:09:31.342818+00:00", + "resolved_at": null, + "cluster": "prod-cluster-1", + "namespace": "prod", + "app": "api-gateway", + "kind": null, + "resource_name": "api-gateway-123abc", + "resource_node": "gke-prod-cluster-1-node-1" + }, + { + "alert_name": "CrashLoopBackoff", + "title": "Crashing pod billing-service-xyz789 in namespace billing", + "description": null, + "source": "kubernetes_api_server", + "priority": "HIGH", + "started_at": "2024-09-03T04:09:31.496713+00:00", + "resolved_at": null, + "cluster": "prod-cluster-2", + "namespace": "billing", + "app": "billing-service", + "kind": null, + "resource_name": "billing-service-xyz789", + "resource_node": "gke-prod-cluster-2-node-3" + } + ] + +Response Fields +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 25 10 70 + :header-rows: 1 + + * - Field + - Type + - Description + * - ``alert_name`` + - string + - Name of the alert (e.g., ``CrashLoopBackoff``). + * - ``title`` + - string + - A brief description of the alert event. + * - ``source`` + - string + - Source of the alert (e.g., ``kubernetes_api_server``). + * - ``priority`` + - string + - Priority level of the alert (e.g., ``HIGH``). + * - ``started_at`` + - string + - Timestamp when the alert was triggered, in ISO 8601 format. + * - ``resolved_at`` + - string + - Timestamp when the alert was resolved, or ``null`` if still unresolved. + * - ``cluster`` + - string + - The cluster where the alert originated. + * - ``namespace`` + - string + - Namespace where the alert occurred. + * - ``app`` + - string + - The application that triggered the alert. + * - ``resource_name`` + - string + - Name of the resource that caused the alert. + * - ``resource_node`` + - string + - The node where the resource is located. 
+ +Quick Start Example +^^^^^^^^^^^^^^^^^^^ + +There is a quick-start `Prometheus report-generator `_ on GitHub that demonstrates how to use the export APIs. \ No newline at end of file diff --git a/docs/configuration/exporting/alert-statistics-api.rst b/docs/configuration/exporting/alert-statistics-api.rst new file mode 100644 index 000000000..9e79fd7b1 --- /dev/null +++ b/docs/configuration/exporting/alert-statistics-api.rst @@ -0,0 +1,132 @@ +Alert Statistics API +============================================== + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Use this endpoint to retrieve aggregated alert data, including the count of each type of alert during a specified time range. Filters can be applied using query parameters such as ``account_id`` and the time range. + +.. _alert-reporting-api: + +GET https://api.robusta.dev/api/query/report +------------------------------------------------------------ + +Query Parameters +^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 20 10 70 10 + :header-rows: 1 + + * - Parameter + - Type + - Description + - Required + * - ``account_id`` + - string + - The unique account identifier (found in your ``generated_values.yaml`` file). + - Yes + * - ``start_ts`` + - string + - Start timestamp for the query (in ISO 8601 format, e.g., ``2024-10-27T04:02:05.032Z``). + - Yes + * - ``end_ts`` + - string + - End timestamp for the query (in ISO 8601 format, e.g., ``2024-11-27T05:02:05.032Z``). + - Yes + + +Example Request +^^^^^^^^^^^^^^^^^^^^^^^ + +The following ``curl`` command demonstrates how to query aggregated alert data for a specified time range: + +..
code-block:: bash + + curl --location 'https://api.robusta.dev/api/query/report?account_id=XXXXXX-XXXX_XXXX_XXXXX7&start_ts=2024-10-27T04:02:05.032Z&end_ts=2024-11-27T05:02:05.032Z' \ + --header 'Authorization: Bearer API-KEY' + + +In the command, make sure to replace the following placeholders: + +- ``account_id``: Your account ID, which can be found in your ``generated_values.yaml`` file. +- ``API-KEY``: Your API Key for authentication. Generate this token in the platform by navigating to **Settings** -> **API Keys** -> **New API Key**, and creating a key with the "Read Alerts" permission. + + + +Request Headers +^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 30 70 + :header-rows: 1 + + * - Header + - Description + * - ``Authorization`` + - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have "Read Alerts" permission. + +Response Format +^^^^^^^^^^^^^^^^^^^^ + +The API will return a JSON array of aggregated alerts, with each object containing: + +- ``aggregation_key``: The unique identifier of the alert type (e.g., ``KubeJobFailed``). +- ``alert_count``: The total count of occurrences of this alert type within the specified time range. + +Example Response +^^^^^^^^^^^^^^^^^^^^^^^^^ +..
code-block:: json + + [ + {"aggregation_key": "KubeJobFailed", "alert_count": 17413}, + {"aggregation_key": "KubePodNotReady", "alert_count": 11893}, + {"aggregation_key": "KubeDeploymentReplicasMismatch", "alert_count": 2410}, + {"aggregation_key": "KubeDeploymentRolloutStuck", "alert_count": 923}, + {"aggregation_key": "KubePodCrashLooping", "alert_count": 921}, + {"aggregation_key": "KubeContainerWaiting", "alert_count": 752}, + {"aggregation_key": "PrometheusRuleFailures", "alert_count": 188}, + {"aggregation_key": "KubeMemoryOvercommit", "alert_count": 187}, + {"aggregation_key": "PrometheusOperatorRejectedResources", "alert_count": 102}, + {"aggregation_key": "KubeletTooManyPods", "alert_count": 94}, + {"aggregation_key": "NodeMemoryHighUtilization", "alert_count": 23}, + {"aggregation_key": "TargetDown", "alert_count": 19}, + {"aggregation_key": "test123", "alert_count": 7}, + {"aggregation_key": "KubeAggregatedAPIDown", "alert_count": 4}, + {"aggregation_key": "KubeAggregatedAPIErrors", "alert_count": 4}, + {"aggregation_key": "KubeMemoryOvercommitTEST2", "alert_count": 1}, + {"aggregation_key": "TestAlert", "alert_count": 1}, + {"aggregation_key": "TestAlert2", "alert_count": 1}, + {"aggregation_key": "dsafd", "alert_count": 1}, + {"aggregation_key": "KubeMemoryOvercommitTEST", "alert_count": 1}, + {"aggregation_key": "vfd", "alert_count": 1} + ] + + + +Response Fields +^^^^^^^^^^^^^^^^^^^^ +.. list-table:: + :widths: 25 10 70 + :header-rows: 1 + + * - Field + - Type + - Description + * - ``aggregation_key`` + - string + - The unique key representing the type of alert (e.g., ``KubeJobFailed``). + * - ``alert_count`` + - integer + - The number of times this alert occurred within the specified time range. + +Notes +^^^^^^^^^^^^^^^ + +- Ensure that the ``start_ts`` and ``end_ts`` parameters are in ISO 8601 format and are correctly set to cover the desired time range. +- Use the correct ``Authorization`` token with sufficient permissions to access the alert data.
+ +Quick Start Example +^^^^^^^^^^^^^^^^^^^ + +There is a quick-start `Prometheus report-generator `_ on GitHub that demonstrates how to use the export APIs. \ No newline at end of file diff --git a/docs/configuration/exporting/configuration-changes-api.rst b/docs/configuration/exporting/configuration-changes-api.rst new file mode 100644 index 000000000..b8259a502 --- /dev/null +++ b/docs/configuration/exporting/configuration-changes-api.rst @@ -0,0 +1,190 @@ +Configuration Changes API +============================================== + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Use this endpoint to send configuration changes to Robusta. You can send up to 1000 configuration changes in a single request. + +.. _send-configuration-changes-api: + +POST https://api.robusta.dev/api/config-changes +-------------------------------------------------------------------- + +Request Body Schema +^^^^^^^^^^^^^^^^^^^ + +The request body must include the following fields: + +.. list-table:: + :widths: 25 10 70 10 + :header-rows: 1 + + * - Field + - Type + - Description + - Required + * - ``account_id`` + - string + - The unique account identifier. + - Yes + * - ``config_changes`` + - list + - A list of configuration changes. + - Yes + +Configuration Change Schema +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Each configuration change in the ``config_changes`` list must follow the specific schema, which includes the following fields: + +.. list-table:: + :widths: 25 10 70 10 + :header-rows: 1 + + * - Field + - Type + - Description + - Required + * - ``title`` + - string + - A short description of the configuration change. + - Yes + * - ``old_config`` + - string + - The previous configuration value. + - Yes + * - ``new_config`` + - string + - The new configuration value. + - Yes + * - ``resource_name`` + - string + - The name of the resource affected by the configuration change. 
+ - Yes + * - ``description`` + - string + - A detailed description of the configuration change (optional). + - No + * - ``source`` + - string + - The source of the configuration change (default: ``external``). + - No + * - ``cluster`` + - string + - The cluster where the configuration change occurred (default: ``external``). + - No + * - ``labels`` + - dict + - Extra labels for the configuration change (optional). + - No + * - ``annotations`` + - dict + - Extra annotations for the configuration change (optional). + - No + * - ``subject_name`` + - string + - The name of the subject related to the configuration change (optional). + - No + * - ``subject_namespace`` + - string + - The namespace of the subject related to the configuration change (optional). + - No + * - ``subject_node`` + - string + - The node where the subject related to the configuration change is located (optional). + - No + * - ``subject_type`` + - string + - The type of subject related to the configuration change (optional). + - No + * - ``service_key`` + - string + - A key identifying the service related to the configuration change (optional). + - No + * - ``fingerprint`` + - string + - A unique identifier for the configuration change (optional). + - No + +Example Request +^^^^^^^^^^^^^^^^^^^^ + +Here is an example of a ``POST`` request to send a list of configuration changes: + +..
code-block:: bash + + curl --location --request POST 'https://api.robusta.dev/api/config-changes' \ + --header 'Authorization: Bearer API-KEY' \ + --header 'Content-Type: application/json' \ + --data-raw '{ + "account_id": "ACCOUNT_ID", + "config_changes": [ + { + "title": "Updated test-service deployment", + "old_config": "apiVersion: apps/v1\nkind: Deployment\n....", + "new_config": "apiVersion: apps/v1...", + "resource_name": "test-service", + "description": "Changed deployment", + "source": "test-service", + "cluster": "prod-cluster-1", + "labels": { + "environment": "production" + }, + "annotations": { + "env1": "true" + }, + "subject_namespace": "prod", + "subject_node": "gke-prod-cluster-1-node-1" + } + ] + }' + +In this request, replace the following placeholders: + +- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. +- ``API-KEY``: Your API Key for authentication. You can generate this token by navigating to **Settings** -> **API Keys** -> **New API Key**. + +Request Headers +^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 30 70 + :header-rows: 1 + + * - Header + - Description + * - ``Authorization`` + - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have the necessary permissions to submit configuration changes. + * - ``Content-Type`` + - Must be set to ``application/json``. + +Response Format +^^^^^^^^^^^^^^^^^^^^ + +Success Response +"""""""""""""""" + +If the request is successful, the API will return the following response: + +.. code-block:: json + + { + "success": true + } + +- **Status Code**: ``200 OK`` + +Error Response +"""""""""""""" + +If there is an error in processing the request, the API will return the following format: + +.. code-block:: json + + { + "msg": "Error message here", + "error_code": 123 + } + +- **Status Code**: Varies based on the error (e.g., ``400 Bad Request``, ``500 Internal Server Error``).
\ No newline at end of file diff --git a/docs/configuration/exporting/custom-webhooks.rst b/docs/configuration/exporting/custom-webhooks.rst index 265c76570..4a87695f4 100644 --- a/docs/configuration/exporting/custom-webhooks.rst +++ b/docs/configuration/exporting/custom-webhooks.rst @@ -28,7 +28,7 @@ You'll need your API key and account ID: 1. **Account ID**: Found in your ``generated_values.yaml`` file 2. **API Key**: Generate this in the Robusta platform under **Settings** → **API Keys** → **New API Key** -For detailed API documentation including request format, authentication, and examples, see :doc:`Alert History Import and Export API `. +For detailed API documentation including request format, authentication, and examples, see :doc:`Send Alerts API `. Quick Example ------------- @@ -56,4 +56,4 @@ Here's a simple example of sending a custom alert: Next Steps ---------- -For complete API documentation including all available fields and response formats, see :doc:`Alert History Import and Export API `. \ No newline at end of file +For complete API documentation including all available fields and response formats, see :doc:`Send Alerts API `. \ No newline at end of file diff --git a/docs/configuration/exporting/exporting-data.rst b/docs/configuration/exporting/exporting-data.rst index 296686caf..72b6a2f3c 100644 --- a/docs/configuration/exporting/exporting-data.rst +++ b/docs/configuration/exporting/exporting-data.rst @@ -1,778 +1,31 @@ -Alert History Import and Export API +Robusta API Reference ============================================== .. note:: - This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + These features are available with the Robusta SaaS platform and self-hosted commercial plans. They are not available in the open-source version.
-The Robusta SaaS platform exposes several HTTP APIs for exporting data and sending alerts: +The Robusta platform exposes HTTP APIs for exporting data, sending alerts, and managing resources. -* :ref:`API to export alerts ` - Export historical alert data -* :ref:`API to fetch aggregate alert statistics ` - Get aggregated alert statistics -* :ref:`API to send alerts ` - Send custom alerts programmatically -* :ref:`API to send configuration changes ` - Track configuration changes +.. toctree:: + :maxdepth: 1 + + send-alerts-api + configuration-changes-api + alert-export-api + alert-statistics-api + namespace-resources-api -For a simpler webhook integration guide, see :doc:`Custom Webhooks `. +Getting Started +--------------- -There is an quick-start `Prometheus report-generator `_ on GitHub that demonstrates how to use the export APIs. +All APIs require authentication using an API key. Generate API keys in the Robusta UI: -.. _alert-export-api: +**Settings** → **API Keys** → **New API Key** -GET https://api.robusta.dev/api/query/alerts ------------------------------------------------------- +Assign appropriate permissions to your API key based on the APIs you plan to use. -Use this endpoint to export alert history data. You can filter the results based on specific criteria using query parameters such as ``alert_name``, ``account_id``, and time range. +Related Resources +----------------- -Query Parameters -^^^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 20 10 70 10 - :header-rows: 1 - - * - Parameter - - Type - - Description - - Required - * - ``account_id`` - - string - - The unique account identifier (found in your ``generated_values.yaml`` file). - - Yes - * - ``start_ts`` - - string - - Start timestamp for the alert history query (in ISO 8601 format, e.g., ``2024-09-02T04:02:05.032Z``). - - Yes - * - ``end_ts`` - - string - - End timestamp for the alert history query (in ISO 8601 format, e.g., ``2024-09-17T05:02:05.032Z``).
- - Yes - * - ``alert_name`` - - string - - The name of the alert to filter by (e.g., ``CrashLoopBackoff``). - - No - -Example Request -^^^^^^^^^^^^^^^^^^^^^^^^^ - -The following ``curl`` command demonstrates how to export alert history data for the ``CrashLoopBackoff`` alert: - -.. code-block:: bash - - curl --location 'https://api.robusta.dev/api/query/alerts?alert_name=CrashLoopBackoff&account_id=ACCOUNT_ID&start_ts=2024-09-02T04%3A02%3A05.032Z&end_ts=2024-09-17T05%3A02%3A05.032Z' \ - --header 'Authorization: Bearer API-KEY' - -In the command, make sure to replace the following placeholders: - -- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. -- ``API-KEY``: Your API Key for authentication. You can generate this token in the platform by navigating to **Settings** -> **API Keys** -> **New API Key**, and creating a key with the "Read Alerts" permission. - -Request Headers -^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 30 70 - :header-rows: 1 - - * - Header - - Description - * - ``Authorization`` - - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have "Read Alerts" permission. - -Response Format -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The API will return a list of alerts in JSON format. Each alert object contains detailed information about the alert, including the name, priority, source, and related resource information. - -Example Response -^^^^^^^^^^^^^^^^^^^^^^^^ - -.. 
code-block:: json - - [ - { - "alert_name": "CrashLoopBackoff", - "title": "Crashing pod api-gateway-123abc in namespace prod", - "description": null, - "source": "kubernetes_api_server", - "priority": "HIGH", - "started_at": "2024-09-03T04:09:31.342818+00:00", - "resolved_at": null, - "cluster": "prod-cluster-1", - "namespace": "prod", - "app": "api-gateway", - "kind": null, - "resource_name": "api-gateway-123abc", - "resource_node": "gke-prod-cluster-1-node-1" - }, - { - "alert_name": "CrashLoopBackoff", - "title": "Crashing pod billing-service-xyz789 in namespace billing", - "description": null, - "source": "kubernetes_api_server", - "priority": "HIGH", - "started_at": "2024-09-03T04:09:31.496713+00:00", - "resolved_at": null, - "cluster": "prod-cluster-2", - "namespace": "billing", - "app": "billing-service", - "kind": null, - "resource_name": "billing-service-xyz789", - "resource_node": "gke-prod-cluster-2-node-3" - } - ] - -Response Fields -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 25 10 70 - :header-rows: 1 - - * - Field - - Type - - Description - * - ``alert_name`` - - string - - Name of the alert (e.g., ``CrashLoopBackoff``). - * - ``title`` - - string - - A brief description of the alert event. - * - ``source`` - - string - - Source of the alert (e.g., ``kubernetes_api_server``). - * - ``priority`` - - string - - Priority level of the alert (e.g., ``HIGH``). - * - ``started_at`` - - string - - Timestamp when the alert was triggered, in ISO 8601 format. - * - ``resolved_at`` - - string - - Timestamp when the alert was resolved, or ``null`` if still unresolved. - * - ``cluster`` - - string - - The cluster where the alert originated. - * - ``namespace`` - - string - - Namespace where the alert occurred. - * - ``app`` - - string - - The application that triggered the alert. - * - ``resource_name`` - - string - - Name of the resource that caused the alert. - * - ``resource_node`` - - string - - The node where the resource is located. - -.. 
_alert-reporting-api: - -GET `https://api.robusta.dev/api/query/report` ------------------------------------------------------------- - -Use this endpoint to retrieve aggregated alert data, including the count of each type of alert during a specified time range. Filters can be applied using query parameters such as `account_id` and the time range. - - -Query Parameters -^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 20 10 70 10 - :header-rows: 1 - - * - Parameter - - Type - - Description - - Required - * - ``account_id`` - - string - - The unique account identifier (found in your ``generated_values.yaml`` file). - - Yes - * - ``start_ts`` - - string - - Start timestamp for the query (in ISO 8601 format, e.g., ``2024-10-27T04:02:05.032Z``). - - Yes - * - ``end_ts`` - - string - - End timestamp for the query (in ISO 8601 format, e.g., ``2024-11-27T05:02:05.032Z``). - - Yes - - -Example Request -^^^^^^^^^^^^^^^^^^^^^^^ - -The following `curl` command demonstrates how to query aggregated alert data for a specified time range: - -.. code-block:: bash - - curl --location 'https://api.robusta.dev/api/query/report?account_id=XXXXXX-XXXX_XXXX_XXXXX7&start_ts=2024-10-27T04:02:05.032Z&end_ts=2024-11-27T05:02:05.032Z' \ - --header 'Authorization: Bearer API-KEY' - - -In the command, make sure to replace the following placeholders: - -- `account_id`: Your account ID, which can be found in your `generated_values.yaml` file. -- `API-KEY`: Your API Key for authentication. Generate this token in the platform by navigating to **Settings** -> **API Keys** -> **New API Key**, and creating a key with the "Read Alerts" permission. - - - -Request Headers -^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 30 70 - :header-rows: 1 - - * - Header - - Description - * - ``Authorization`` - - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have "Read Alerts" permission. 
- -Response Format -^^^^^^^^^^^^^^^^^^^^ - -The API will return a JSON array of aggregated alerts, with each object containing: - -- **`aggregation_key`**: The unique identifier of the alert type (e.g., `KubeJobFailed`). -- **`alert_count`**: The total count of occurrences of this alert type within the specified time range. - -Example Response -^^^^^^^^^^^^^^^^^^^^^^^^^ -.. code-block:: json - - [ - {"aggregation_key": "KubeJobFailed", "alert_count": 17413}, - {"aggregation_key": "KubePodNotReady", "alert_count": 11893}, - {"aggregation_key": "KubeDeploymentReplicasMismatch", "alert_count": 2410}, - {"aggregation_key": "KubeDeploymentRolloutStuck", "alert_count": 923}, - {"aggregation_key": "KubePodCrashLooping", "alert_count": 921}, - {"aggregation_key": "KubeContainerWaiting", "alert_count": 752}, - {"aggregation_key": "PrometheusRuleFailures", "alert_count": 188}, - {"aggregation_key": "KubeMemoryOvercommit", "alert_count": 187}, - {"aggregation_key": "PrometheusOperatorRejectedResources", "alert_count": 102}, - {"aggregation_key": "KubeletTooManyPods", "alert_count": 94}, - {"aggregation_key": "NodeMemoryHighUtilization", "alert_count": 23}, - {"aggregation_key": "TargetDown", "alert_count": 19}, - {"aggregation_key": "test123", "alert_count": 7}, - {"aggregation_key": "KubeAggregatedAPIDown", "alert_count": 4}, - {"aggregation_key": "KubeAggregatedAPIErrors", "alert_count": 4}, - {"aggregation_key": "KubeMemoryOvercommitTEST2", "alert_count": 1}, - {"aggregation_key": "TestAlert", "alert_count": 1}, - {"aggregation_key": "TestAlert2", "alert_count": 1}, - {"aggregation_key": "dsafd", "alert_count": 1}, - {"aggregation_key": "KubeMemoryOvercommitTEST", "alert_count": 1}, - {"aggregation_key": "vfd", "alert_count": 1} - ] - - - -Response Fields -^^^^^^^^^^^^^^^^^^^^ -.. 
list-table:: - :widths: 25 10 70 - :header-rows: 1 - - * - Field - - Type - - Description - * - ``aggregation_key`` - - string - - The unique key representing the type of alert (e.g., ``KubeJobFailed``). - * - ``alert_count`` - - integer - - The number of times this alert occurred within the specified time range. - -Notes -^^^^^^^^^^^^^^^ - -- Ensure that the `start_ts` and `end_ts` parameters are in ISO 8601 format and are correctly set to cover the desired time range. -- Use the correct `Authorization` token with sufficient permissions to access the alert data. - -.. _send-alerts-api: - -POST https://api.robusta.dev/api/alerts ----------------------------------------------------- -Use this endpoint to send alert data to Robusta. You can send up to 1000 alerts in a single request. - -Request Body Schema -^^^^^^^^^^^^^^^^^^^^^^^^ - -The request body must include the following fields: - -.. list-table:: - :widths: 25 10 70 10 - :header-rows: 1 - - * - Field - - Type - - Description - - Required - * - ``account_id`` - - string - - The unique account identifier. - - Yes - * - ``alerts`` - - list - - A list of alerts to be sent. - - Yes - -Each alert in the ``alerts`` list must follow the specific schema, which includes the following fields: - -.. list-table:: - :widths: 20 10 70 10 - :header-rows: 1 - - * - Field - - Type - - Description - - Required - * - ``title`` - - string - - A short description of the alert. - - Yes - * - ``description`` - - string - - A detailed description of the alert - - Yes - * - ``source`` - - string - - The source of the alert. - - Yes - * - ``priority`` - - string (one of: ``critical``, ``high``, ``medium``, ``error``, ``warning``, ``info``, ``low``, ``debug``) - - The priority level of the alert. - - Yes - * - ``aggregation_key`` - - string - - A key to group alerts that are related. - - Yes - * - ``failure`` - - boolean - - Indicates whether the alert represents a failure (default: ``false``). 
- - No - * - ``starts_at`` - - string (ISO 8601 timestamp) - - The timestamp when the alert started (optional). - - No - * - ``ends_at`` - - string (ISO 8601 timestamp) - - The timestamp when the alert ended (optional). - - No - * - ``labels`` - - dict - - Extra labels for the alert (optional). - - No - * - ``annotations`` - - dict - - Extra annotations for the alert (optional). - - No - * - ``cluster`` - - string - - Alert's cluster (default: ``external``) - - No - * - ``service_key`` - - string - - A key identifying the service related to the alert (optional). - - No - * - ``subject_type`` - - string - - The type of subject related to the alert (optional). - - No - * - ``subject_name`` - - string - - The name of the subject related to the alert (optional) - - No - * - ``subject_namespace`` - - string - - The namespace of the subject related to the alert (optional). - - No - * - ``subject_node`` - - string - - The node where the subject related to the alert is located (optional). - - No - * - ``fingerprint`` - - string - - A unique identifier for the alert (optional). - - No - -Example Request -^^^^^^^^^^^^^^^ - -Here is an example of a ``POST`` request to send a list of alerts: - -.. 
code-block:: bash - - curl --location --request POST 'https://api.robusta.dev/api/alerts' \ - --header 'Authorization: Bearer API-KEY' \ - --header 'Content-Type: application/json' \ - --data-raw '{ - "account_id": "ACCOUNT_ID", - "alerts": [ - { - "title": "Test Service Down", - "description": "The Test Service is not responding.", - "source": "monitoring-system", - "priority": "high", - "aggregation_key": "test-service-issues", - "failure": true, - "starts_at": "2024-10-07T10:00:00Z", - "labels": { - "environment": "production" - }, - "annotations": { - "env1": "true" - }, - "cluster": "prod-cluster-1", - "subject_namespace": "prod", - "subject_node": "gke-prod-cluster-1-node-1" - } - ] - }' - -In this request, replace the following placeholders: - -- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. -- ``API-KEY``: Your API Key for authentication. You can generate this token by navigating to **Settings** -> **API Keys** -> **New API Key**. - -Request Headers -^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 30 70 - :header-rows: 1 - - * - Header - - Description - * - ``Authorization`` - - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have the necessary permissions to submit alerts. - * - ``Content-Type`` - - Must be set to ``application/json``. - -Response Format -^^^^^^^^^^^^^^^^^^^^ - -*Success Response* - -If the request is successful, the API will return the following response: - -.. code-block:: json - - { - "success": true - } - -- **Status Code**: `200 OK` - -*Error Response* - -If there is an error in processing the request, the API will return the following format: - -.. code-block:: json - - { - "msg": "Error message here", - "error_code": 123 - } - -- **Status Code**: Varies based on the error (e.g., `400 Bad Request`, `500 Internal Server Error`). - -.. 
_send-configuration-changes-api: - -POST https://api.robusta.dev/api/config-changes --------------------------------------------------------------------- - -Use this endpoint to send configuration changes to Robusta. You can send up to 1000 configuration changes in a single request. - -Request Body Schema -^^^^^^^^^^^^^^^^^^^ - -The request body must include the following fields: - -.. list-table:: - :widths: 25 10 70 10 - :header-rows: 1 - - * - Field - - Type - - Description - - Required - * - ``account_id`` - - string - - The unique account identifier. - - Yes - * - ``config_changes`` - - list - - A list of configuration changes. - - Yes - -Each configuration change in the ``config_changes`` list must follow the specific schema, which includes the following fields: - -.. list-table:: - :widths: 25 10 70 10 - :header-rows: 1 - - * - Field - - Type - - Description - - Required - * - ``title`` - - string - - A short description of the configuration change. - - Yes - * - ``old_config`` - - string - - The previous configuration value. - - Yes - * - ``new_config`` - - string - - The new configuration value. - - Yes - * - ``resource_name`` - - string - - The name of the resource affected by the configuration change. - - Yes - * - ``description`` - - string - - A detailed description of the configuration change (optional). - - No - * - ``source`` - - string - - The source of the configuration change (default: ``external``). - - No - * - ``cluster`` - - string - - The cluster where the configuration change occurred (default: ``external``). - - No - * - ``labels`` - - dict - - Extra labels for the alert (optional). - - No - * - ``annotations`` - - dict - - Extra annotations for the configuration change (optional). - - No - * - ``subject_name`` - - string - - The name of the subject related to the configuration change (optional). - - No - * - ``subject_namespace`` - - string - - The namespace of the subject related to the configuration change (optional). 
- - No - * - ``subject_node`` - - string - - The node where the subject related to the configuration change is located (optional). - - No - * - ``subject_type`` - - string - - The type of subject related to the configuration change (optional). - - No - * - ``service_key`` - - string - - A key identifying the service related to the configuration change (optional). - - No - * - ``fingerprint`` - - string - - A unique identifier for the configuration change (optional). - - No - -Example Request -^^^^^^^^^^^^^^^^^^^^ - -Here is an example of a ``POST`` request to send a list of configuration changes: - -.. code-block:: bash - - curl --location --request POST 'https://api.robusta.dev/api/config-changes' \ - --header 'Authorization: Bearer API-KEY' \ - --header 'Content-Type: application/json' \ - --data-raw '{ - "account_id": "ACCOUNT_ID", - "config_changes": [ - { - "title": "Updated test-service deployment", - "old_config": "apiVersion: apps/v1\nkind: Deployment\n....", - "new_config": "apiVersion: apps/v1...", - "resource_name": "test sercvice", - "description": "Changed deployemnt", - "source": "test-service", - "cluster": "prod-cluster-1", - "labels": { - "environment": "production" - }, - "annotations": { - "env1": "true" - }, - "subject_namespace": "prod", - "subject_node": "gke-prod-cluster-1-node-1" - } - ] - }' - -In this request, replace the following placeholders: - -- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. -- ``API-KEY``: Your API Key for authentication. You can generate this token by navigating to **Settings** -> **API Keys** -> **New API Key**. - -Request Headers -^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: 30 70 - :header-rows: 1 - - * - Header - - Description - * - ``Authorization`` - - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have the necessary permissions to submit configuration changes. - * - ``Content-Type`` - - Must be set to ``application/json``. 
- -Response Format -^^^^^^^^^^^^^^^^^^^^ - -*Success Response* - -If the request is successful, the API will return the following response: - -.. code-block:: json - - { - "success": true - } - -- **Status Code**: `200 OK` - -*Error Response* - -If there is an error in processing the request, the API will return the following format: - -.. code-block:: json - - { - "msg": "Error message here", - "error_code": 123 - } - -- **Status Code**: Varies based on the error (e.g., `400 Bad Request`, `500 Internal Server Error`). - -.. _namespaces-resources-api: - -POST https://api.robusta.dev/api/namespaces/resources ------------------------------------------------------- - -Use this endpoint to retrieve an **active count of specific Kubernetes resources** within a namespace. This is the same data displayed in the **Namespaces** tab of the Robusta UI. - -You can specify exactly which resource kinds you want to query in the request. - -This API relies on resource types configured in the Robusta UI sink. -Make sure to configure them as described in :ref:`cb-robusta-ui-sink-namespace-config`. - -Request Body Schema -^^^^^^^^^^^^^^^^^^^ - -The request body must include the following fields: - -.. list-table:: - :widths: 25 10 70 10 - :header-rows: 1 - - * - Field - - Type - - Description - - Required - * - ``namespace`` - - string - - The name of the namespace you want to inspect. - - Yes - * - ``account_id`` - - string - - The unique account identifier. - - Yes - * - ``cluster_name`` - - string - - The name of the cluster where the namespace resides. - - Yes - * - ``resources`` - - list - - A list of resource types to count, each including ``kind``, ``apiGroup``, and ``apiVersion``. 
- - Yes - -Each item in the ``resources`` list must include: - -* ``kind`` (e.g., `Deployments`) -* ``apiGroup`` (e.g., `apps`, or empty string for core group) -* ``apiVersion`` (e.g., `v1`, `v2`) - -Example Request -^^^^^^^^^^^^^^^^^^^^ - -Here is an example of a ``POST`` request to query the resource count in a namespace: - -.. code-block:: bash - - curl --location 'https://api.robusta.dev/api/namespaces/resources' \ - --header 'Authorization: Bearer API-KEY-HERE' \ - --header 'Content-Type: application/json' \ - --data '{ - "namespace": "your-namespace", - "account_id": "your-account-id", - "cluster_name": "your-cluster-name", - "resources": [ - {"kind": "Deployments", "apiGroup": "apps", "apiVersion": "v1"}, - {"kind": "Ingresses", "apiGroup": "networking.k8s.io", "apiVersion": "v1"}, - {"kind": "Services", "apiGroup": "", "apiVersion": "v1"}, - {"kind": "HorizontalPodAutoscalers", "apiGroup": "autoscaling", "apiVersion": "v2"}, - {"kind": "ReplicationControllers", "apiGroup": "", "apiVersion": "v1"} - ] - }' - -Replace: - -- ``API-KEY-HERE`` with your API Key from **Settings → API Keys → New API Key**. - Make sure the key has **Clusters → Read** permissions to access namespace resource data. -- ``your-account-id`` with the ID found in ``generated_values.yaml`` -- ``your-cluster-name`` and ``your-namespace`` accordingly - -Response Format -^^^^^^^^^^^^^^^^^^^^ - -*Success Response* - -If the request is successful, the API returns the following structure: - -.. code-block:: json - - { - "cluster": "your-cluster-name", - "namespace": "your-namespace", - "resources": [ - { - "apiGroup": "apps", - "apiVersion": "v1", - "count": 2, - "kind": "Deployments" - }, - { - "apiGroup": "", - "apiVersion": "v1", - "count": 5, - "kind": "Pods" - }, - ... - ] - } - -- **Status Code**: `200 OK` - -*Error Response* - -If an error occurs, you will receive a response in the following format: - -.. 
code-block:: json - - { - "msg": "Error message here", - "error_code": 456 - } - -- **Status Code**: Varies depending on the error (e.g., `400`, `403`, `500`) +* For webhook integration, see :doc:`Custom Webhooks ` +* Example implementation: `Prometheus report-generator `_ \ No newline at end of file diff --git a/docs/configuration/exporting/namespace-resources-api.rst b/docs/configuration/exporting/namespace-resources-api.rst new file mode 100644 index 000000000..b887fe9ef --- /dev/null +++ b/docs/configuration/exporting/namespace-resources-api.rst @@ -0,0 +1,135 @@ +Namespace Resources API +============================================== + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Use this endpoint to retrieve an **active count of specific Kubernetes resources** within a namespace. This is the same data displayed in the **Namespaces** tab of the Robusta UI. + +You can specify exactly which resource kinds you want to query in the request. + +.. _namespaces-resources-api: + +POST https://api.robusta.dev/api/namespaces/resources +------------------------------------------------------ + +Prerequisites +^^^^^^^^^^^^^ + +This API relies on resource types configured in the Robusta UI sink. +Make sure to configure them as described in :ref:`cb-robusta-ui-sink-namespace-config`. + +Request Body Schema +^^^^^^^^^^^^^^^^^^^ + +The request body must include the following fields: + +.. list-table:: + :widths: 25 10 70 10 + :header-rows: 1 + + * - Field + - Type + - Description + - Required + * - ``namespace`` + - string + - The name of the namespace you want to inspect. + - Yes + * - ``account_id`` + - string + - The unique account identifier. + - Yes + * - ``cluster_name`` + - string + - The name of the cluster where the namespace resides. 
+ - Yes + * - ``resources`` + - list + - A list of resource types to count, each including ``kind``, ``apiGroup``, and ``apiVersion``. + - Yes + +Resource Schema +^^^^^^^^^^^^^^^ + +Each item in the ``resources`` list must include: + +* ``kind`` (e.g., `Deployments`) +* ``apiGroup`` (e.g., `apps`, or empty string for core group) +* ``apiVersion`` (e.g., `v1`, `v2`) + +Example Request +^^^^^^^^^^^^^^^^^^^^ + +Here is an example of a ``POST`` request to query the resource count in a namespace: + +.. code-block:: bash + + curl --location 'https://api.robusta.dev/api/namespaces/resources' \ + --header 'Authorization: Bearer API-KEY-HERE' \ + --header 'Content-Type: application/json' \ + --data '{ + "namespace": "your-namespace", + "account_id": "your-account-id", + "cluster_name": "your-cluster-name", + "resources": [ + {"kind": "Deployments", "apiGroup": "apps", "apiVersion": "v1"}, + {"kind": "Ingresses", "apiGroup": "networking.k8s.io", "apiVersion": "v1"}, + {"kind": "Services", "apiGroup": "", "apiVersion": "v1"}, + {"kind": "HorizontalPodAutoscalers", "apiGroup": "autoscaling", "apiVersion": "v2"}, + {"kind": "ReplicationControllers", "apiGroup": "", "apiVersion": "v1"} + ] + }' + +Replace: + +- ``API-KEY-HERE`` with your API Key from **Settings → API Keys → New API Key**. + Make sure the key has **Clusters → Read** permissions to access namespace resource data. +- ``your-account-id`` with the ID found in ``generated_values.yaml`` +- ``your-cluster-name`` and ``your-namespace`` accordingly + +Response Format +^^^^^^^^^^^^^^^^^^^^ + +Success Response +"""""""""""""""" + +If the request is successful, the API returns the following structure: + +.. code-block:: json + + { + "cluster": "your-cluster-name", + "namespace": "your-namespace", + "resources": [ + { + "apiGroup": "apps", + "apiVersion": "v1", + "count": 2, + "kind": "Deployments" + }, + { + "apiGroup": "", + "apiVersion": "v1", + "count": 5, + "kind": "Pods" + }, + ... 
+ ] + } + +- **Status Code**: `200 OK` + +Error Response +"""""""""""""" + +If an error occurs, you will receive a response in the following format: + +.. code-block:: json + + { + "msg": "Error message here", + "error_code": 456 + } + +- **Status Code**: Varies depending on the error (e.g., `400`, `403`, `500`) \ No newline at end of file diff --git a/docs/configuration/exporting/robusta-pro-features.rst b/docs/configuration/exporting/robusta-pro-features.rst index 6764daf54..797a98a0c 100644 --- a/docs/configuration/exporting/robusta-pro-features.rst +++ b/docs/configuration/exporting/robusta-pro-features.rst @@ -1,11 +1,19 @@ -Robusta Pro Features -==================== +Overview +======== .. note:: These features are available with the Robusta SaaS platform and self-hosted commercial plans. They are not available in the open-source version. Robusta Pro adds a web UI, additional integrations, and enterprise APIs to the open-source engine. Available as SaaS (we handle hosting) or self-hosted on-premise. +AI Analysis +----------- + +Automatically investigate and resolve issues with AI-powered analysis. + +:doc:`AI Analysis (HolmesGPT) <../holmesgpt/index>` + Automatically analyze Kubernetes alerts, logs, and metrics. Get potential root causes and remediation suggestions. + Custom Alert Ingestion ----------------------- @@ -35,14 +43,6 @@ Features include: * **Custom Alert API**: Send alerts programmatically from external systems * **Configuration Changes API**: Track configuration changes in your environment -AI Analysis ------------ - -Optional AI-powered alert investigation using HolmesGPT. - -:doc:`AI Analysis (HolmesGPT) <../holmesgpt/index>` - Automatically analyze Kubernetes alerts, logs, and metrics. Get potential root causes and remediation suggestions. 
- Additional Pro Features ----------------------- diff --git a/docs/configuration/exporting/send-alerts-api.rst b/docs/configuration/exporting/send-alerts-api.rst new file mode 100644 index 000000000..ef6e4db0a --- /dev/null +++ b/docs/configuration/exporting/send-alerts-api.rst @@ -0,0 +1,199 @@ +Send Alerts API +============================================== + +.. note:: + This feature is available with the Robusta SaaS platform and self-hosted commercial plans. It is not available in the open-source version. + +Use this endpoint to send alert data to Robusta. You can send up to 1000 alerts in a single request. + +.. _send-alerts-api: + +POST https://api.robusta.dev/api/alerts +---------------------------------------------------- + +Request Body Schema +^^^^^^^^^^^^^^^^^^^^^^^^ + +The request body must include the following fields: + +.. list-table:: + :widths: 25 10 70 10 + :header-rows: 1 + + * - Field + - Type + - Description + - Required + * - ``account_id`` + - string + - The unique account identifier. + - Yes + * - ``alerts`` + - list + - A list of alerts to be sent. + - Yes + +Alert Schema +^^^^^^^^^^^^ + +Each alert in the ``alerts`` list must follow the specific schema, which includes the following fields: + +.. list-table:: + :widths: 20 10 70 10 + :header-rows: 1 + + * - Field + - Type + - Description + - Required + * - ``title`` + - string + - A short description of the alert. + - Yes + * - ``description`` + - string + - A detailed description of the alert + - Yes + * - ``source`` + - string + - The source of the alert. + - Yes + * - ``priority`` + - string (one of: ``critical``, ``high``, ``medium``, ``error``, ``warning``, ``info``, ``low``, ``debug``) + - The priority level of the alert. + - Yes + * - ``aggregation_key`` + - string + - A key to group alerts that are related. + - Yes + * - ``failure`` + - boolean + - Indicates whether the alert represents a failure (default: ``false``). 
+ - No + * - ``starts_at`` + - string (ISO 8601 timestamp) + - The timestamp when the alert started (optional). + - No + * - ``ends_at`` + - string (ISO 8601 timestamp) + - The timestamp when the alert ended (optional). + - No + * - ``labels`` + - dict + - Extra labels for the alert (optional). + - No + * - ``annotations`` + - dict + - Extra annotations for the alert (optional). + - No + * - ``cluster`` + - string + - Alert's cluster (default: ``external``) + - No + * - ``service_key`` + - string + - A key identifying the service related to the alert (optional). + - No + * - ``subject_type`` + - string + - The type of subject related to the alert (optional). + - No + * - ``subject_name`` + - string + - The name of the subject related to the alert (optional) + - No + * - ``subject_namespace`` + - string + - The namespace of the subject related to the alert (optional). + - No + * - ``subject_node`` + - string + - The node where the subject related to the alert is located (optional). + - No + * - ``fingerprint`` + - string + - A unique identifier for the alert (optional). + - No + +Example Request +^^^^^^^^^^^^^^^ + +Here is an example of a ``POST`` request to send a list of alerts: + +.. 
code-block:: bash + + curl --location --request POST 'https://api.robusta.dev/api/alerts' \ + --header 'Authorization: Bearer API-KEY' \ + --header 'Content-Type: application/json' \ + --data-raw '{ + "account_id": "ACCOUNT_ID", + "alerts": [ + { + "title": "Test Service Down", + "description": "The Test Service is not responding.", + "source": "monitoring-system", + "priority": "high", + "aggregation_key": "test-service-issues", + "failure": true, + "starts_at": "2024-10-07T10:00:00Z", + "labels": { + "environment": "production" + }, + "annotations": { + "env1": "true" + }, + "cluster": "prod-cluster-1", + "subject_namespace": "prod", + "subject_node": "gke-prod-cluster-1-node-1" + } + ] + }' + +In this request, replace the following placeholders: + +- ``ACCOUNT_ID``: Your account ID, which can be found in your ``generated_values.yaml`` file. +- ``API-KEY``: Your API Key for authentication. You can generate this token by navigating to **Settings** -> **API Keys** -> **New API Key**. + +Request Headers +^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: 30 70 + :header-rows: 1 + + * - Header + - Description + * - ``Authorization`` + - Bearer token for authentication (e.g., ``Bearer TOKEN_HERE``). The token must have the necessary permissions to submit alerts. + * - ``Content-Type`` + - Must be set to ``application/json``. + +Response Format +^^^^^^^^^^^^^^^^^^^^ + +Success Response +"""""""""""""""" + +If the request is successful, the API will return the following response: + +.. code-block:: json + + { + "success": true + } + +- **Status Code**: `200 OK` + +Error Response +"""""""""""""" + +If there is an error in processing the request, the API will return the following format: + +.. code-block:: json + + { + "msg": "Error message here", + "error_code": 123 + } + +- **Status Code**: Varies based on the error (e.g., `400 Bad Request`, `500 Internal Server Error`). 
\ No newline at end of file diff --git a/docs/configuration/holmesgpt/builtin_toolsets.rst b/docs/configuration/holmesgpt/builtin_toolsets.rst deleted file mode 100644 index 70472d035..000000000 --- a/docs/configuration/holmesgpt/builtin_toolsets.rst +++ /dev/null @@ -1,149 +0,0 @@ - -Builtin Toolsets -================ - -.. toctree:: - :hidden: - :maxdepth: 1 - - toolsets/argocd - toolsets/aws - toolsets/confluence - toolsets/coralogix_logs - toolsets/datadog_logs - toolsets/datetime - toolsets/docker - toolsets/grafanaloki - toolsets/grafanatempo - toolsets/helm - toolsets/internet - toolsets/kafka - toolsets/kubernetes - toolsets/newrelic - toolsets/notion - toolsets/opensearch_logs - toolsets/opensearch_status - toolsets/prometheus - toolsets/rabbitmq - toolsets/robusta - toolsets/slab - -Holmes allows you to define and configure integrations (toolsets) that fetch data from external sources. This data -will be automatically used in investigations when relevant. - -You can :doc:`write your own toolset ` or use the default Holmes toolsets listed below. - - -Builtin toolsets -^^^^^^^^^^^^^^^^ -Holmes comes with a set of builtin toolsets. Some of these toolsets are enabled by default, such as toolsets -to read Kubernetes resources and fetch logs. Some builtin toolsets are disabled by default and can be enabled -by the user by providing credentials or API keys to external systems. - -.. grid:: 1 1 2 3 - :gutter: 3 - - .. grid-item-card:: :octicon:`cpu;1em;` ArgoCD - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/argocd - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` AWS - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/aws - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Confluence - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/confluence - :link-type: doc - - .. 
grid-item-card:: :octicon:`cpu;1em;` Coralogix logs - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/coralogix_logs - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Datadog logs - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/datadog_logs - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Datetime - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/datetime - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Docker - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/docker - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Grafana Loki - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/grafanaloki - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Grafana Tempo - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/grafanatempo - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Helm - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/helm - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Internet - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/internet - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Kafka - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/kafka - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Kubernetes - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/kubernetes - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` New Relic - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/newrelic - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Notion - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/notion - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` OpenSearch logs - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/opensearch_logs - :link-type: doc - - .. 
grid-item-card:: :octicon:`cpu;1em;` OpenSearch status - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/opensearch_status - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Prometheus - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/prometheus - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` RabbitMQ - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/rabbitmq - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Robusta - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/robusta - :link-type: doc - - .. grid-item-card:: :octicon:`cpu;1em;` Slab - :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/slab - :link-type: doc diff --git a/docs/configuration/holmesgpt/custom_toolsets.rst b/docs/configuration/holmesgpt/custom_toolsets.rst deleted file mode 100644 index 9b75b158f..000000000 --- a/docs/configuration/holmesgpt/custom_toolsets.rst +++ /dev/null @@ -1,557 +0,0 @@ - -Custom toolsets -=============== - -.. include:: ./toolsets/_custom_toolset_appeal.inc.rst - -Examples --------- - -Below are examples of custom toolsets and how to add them to Holmes: - - -Example 1: Grafana Toolset -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This toolset lets Holmes view Grafana dashboards and suggest relevant dashboards to the user: - -**Prerequisites:** - -- Grafana URL (e.g. http://localhost:3000 or https://grafana.example.com) -- Grafana service account token with **Basic role -> Viewer** and **Data sources -> Reader** permissions. Check out this `video `_ on creating a Grafana service account token. - -**Configuration:** - -.. md-tab-set:: - - .. md-tab-item:: Robusta Helm Chart - - **Helm Values:** - - .. 
code-block:: yaml - - holmes: - # provide environment variables the toolset needs - can be pulled from secrets or provided in plaintext - additionalEnvVars: - - name: GRAFANA_API_KEY - value: - - name: GRAFANA_URL - value: - - # define the toolset - toolsets: - grafana: - # this tool can only be enabled if these prerequisites are met - prerequisites: - # we need the GRAFANA_URL and GRAFANA_API_KEY environment variables to be set - - env: - - "GRAFANA_URL" - - "GRAFANA_API_KEY" - # curl must be installed - we check by running `curl --version` (if it's not installed, the command will fail) - - command: "curl --version" - - # human-readable description of the toolset (this is not seen by the AI model - its just for users) - description: "Grafana tools" - - # tools (capabilities) that will be provided to HolmesGPT when this toolset is enabled - tools: - - name: "grafana_get_dashboard" - # the LLM sees this description and uses it to decide when to use this tool - description: "Get list of grafana dashboards" - # the command that will be executed when this tool is used - # environment variables like GRAFANA_URL and GRAFANA_API_KEY can be used in the command - # they will not be exposed to the AI model, as the AI model doesn't see the command that was run - command: "curl \"${GRAFANA_URL}/api/search\" -H \"Authorization: Bearer ${GRAFANA_API_KEY}\"" - - - name: "grafana_get_url" - description: "Get the URL of a Grafana dashboard by UID, including the real URL of Grafana" - # in this command we use a variable called `{{ dashboard_uid }}` - # unlike enviroment variables that were provided by the user, variables like `{{ dashboard_uid }}` are provided by the AI model - # the AI model sees the tool description, decides to use this tool, and then provides a value for all {{ template_variables }} to invoke the tool - command: "echo \"${GRAFANA_URL}/d/{{ dashboard_uid }}\"" - - Update your Helm values with the provided YAML configuration, then apply the changes with Helm 
upgrade: - - .. code-block:: bash - - helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName= - - After the deployment is complete, you can open the HolmesGPT chat in the Robusta SaaS UI and ask questions like *what grafana dashboard should I look at to investigate high pod cpu?*. - - **Suggesting relevant dashboards during alert investigations:** Add runbook instructions to your alert in the Robusta UI, instructing Holmes to search for related Grafana dashboards. - - .. image:: /images/custom-grafana-toolset.png - :width: 600 - :align: center - - .. md-tab-item:: Holmes CLI - - **grafana_toolset.yaml:** - - .. code-block:: yaml - - toolsets: - grafana: - # this tool can only be enabled if these prerequisites are met - prerequisites: - # we need the GRAFANA_URL and GRAFANA_API_KEY environment variables to be set - - env: - - "GRAFANA_URL" - - "GRAFANA_API_KEY" - # curl must be installed - we check by running `curl --version` (if it's not installed, the command will fail) - - command: "curl --version" - - # human-readable description of the toolset (this is not seen by the AI model - its just for users) - description: "Grafana tools" - - # tools (capabilities) that will be provided to HolmesGPT when this toolset is enabled - tools: - - name: "grafana_get_dashboard" - # the LLM sees this description and uses it to decide when to use this tool - description: "Get list of grafana dashboards" - # the command that will be executed when this tool is used - # environment variables like GRAFANA_URL and GRAFANA_API_KEY can be used in the command - # they will not be exposed to the AI model, as the AI model doesn't see the command that was run - command: "curl \"${GRAFANA_URL}/api/search\" -H \"Authorization: Bearer ${GRAFANA_API_KEY}\"" - - - name: "grafana_get_url" - description: "Get the URL of a Grafana dashboard by UID, including the real URL of Grafana" - # in this command we use a variable called `{{ dashboard_uid }}` - # unlike 
environment variables that were provided by the user, variables like `{{ dashboard_uid }}` are provided by the AI model - # the AI model sees the tool description, decides to use this tool, and then provides a value for all {{ template_variables }} to invoke the tool - command: "echo \"${GRAFANA_URL}/d/{{ dashboard_uid }}\"" - - Set the appropriate environment variables and run Holmes: - - .. code-block:: bash - - export GRAFANA_API_KEY="" - export GRAFANA_URL="" - - To test, run: - - .. code-block:: bash - - holmes ask -t grafana_toolset.yaml "what grafana dashboard should I look at to investigate high pod cpu?" - -Example 2: Kubernetes Diagnostics Toolset -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This toolset provides diagnostics for Kubernetes clusters, helping developers identify and resolve issues. - -.. code-block:: yaml - - holmes: - toolsets: - kubernetes/diagnostics: - description: "Advanced diagnostics and troubleshooting tools for Kubernetes clusters" - docs_url: "https://kubernetes.io/docs/home/" - icon_url: "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRPKA-U9m5BxYQDF1O7atMfj9EMMXEoGu4t0Q&s" - tags: - - core - - cluster - prerequisites: - - command: "kubectl version --client" - tools: - - - name: "kubectl_node_health" - description: "Check the health status of all nodes in the cluster." - command: "kubectl get nodes -o wide" - - - name: "kubectl_check_resource_quota" - description: "Fetch the resource quota for a specific namespace." - command: "kubectl get resourcequota -n {{ namespace }} -o yaml" - - - name: "kubectl_find_evicted_pods" - description: "List all evicted pods in a specific namespace." - command: "kubectl get pods -n {{ namespace }} --field-selector=status.phase=Failed | grep Evicted" - -Update the ``generated_values.yaml`` file with the provided YAML configuration, then apply the changes by executing the Helm upgrade command: - -.. 
code-block:: bash - - helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName= - -Once deployed, Holmes will have access to advanced diagnostic tools for Kubernetes clusters. For example, you can ask Holmes, ``"Can you do a node health check?"`` and it will automatically use the newly added tools to answer. - - -Example 3: GitHub Toolset -^^^^^^^^^^^^^^^^^^^^^^^^^ - -This toolset enables Holmes to fetch information from GitHub repositories. - -First `create a GitHub Personal Access Token with fine-grained permissions `_. For this example, you can leave the default permissions. - -.. md-tab-set:: - - .. md-tab-item:: Robusta Helm Chart - - **Helm Values:** - - .. code-block:: yaml - - holmes: - # provide environment variables the toolset needs - additionalEnvVars: - - name: GITHUB_TOKEN - value: - - # define the toolset itself - toolsets: - github_tools: - description: "Tools for managing GitHub repositories" - tags: - - cli - prerequisites: - - env: - - "GITHUB_TOKEN" - - command: "curl --version" - tools: - - name: "get_recent_commits" - description: "Fetches the most recent commits for a repository" - command: "curl -H 'Authorization: token ${GITHUB_TOKEN}' https://api.github.com/repos/{{ owner }}/{{ repo }}/commits?per_page={{ limit }}" - - - name: "get_repo_details" - description: "Fetches details of a specific repository" - command: "curl -H 'Authorization: token ${GITHUB_TOKEN}' https://api.github.com/repos/{{ owner }}/{{ repo }}" - - # In the above examples, LLM-provided parameters like {{ owner }} are inferred automatically from the command - # you can also define them explicitly - this is useful if: - # - You want to enforce parameter requirements (e.g., `owner` and `repo` are required). - # - You want to provide a default value for optional parameters. - parameters: - owner: - type: "string" - description: "Owner of the repository."
- required: true - repo: - type: "string" - description: "Name of the repository." - required: true - - Update your Helm values with the provided YAML configuration, then apply the changes with Helm upgrade: - - .. code-block:: bash - - helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName= - - After the deployment is complete, the GitHub toolset will be available. HolmesGPT will be able to use it to interact with GitHub repositories. - For example, you can now open the HolmesGPT chat in the Robusta SaaS UI and ask, *who made the last commit to the robusta-dev/holmesgpt repo on github?*. - - .. image:: /images/custom-github-toolset.png - :width: 600 - :align: center - - .. md-tab-item:: Holmes CLI - - First, set the following environment variable: - - .. code-block:: bash - - export GITHUB_TOKEN="" - - Then, add the following to **~/.holmes/config.yaml**, creating the file if it doesn't exist: - - .. code-block:: yaml - - toolsets: - github_tools: - description: "Tools for managing GitHub repositories" - tags: - - cli - prerequisites: - - env: - - "GITHUB_TOKEN" - - command: "curl --version" - tools: - - name: "get_recent_commits" - description: "Fetches the most recent commits for a repository" - command: "curl -H 'Authorization: token ${GITHUB_TOKEN}' https://api.github.com/repos/{{ owner }}/{{ repo }}/commits?per_page={{ limit }}" - - # In the above examples, LLM-provided parameters like {{ owner }} are inferred automatically from the command - # you can also define them explicitly - this is useful if: - # - You want to enforce parameter requirements (e.g., `owner` and `repo` are required). - # - You want to provide a default value for optional parameters. - parameters: - owner: - type: "string" - description: "Owner of the repository." - required: true - repo: - type: "string" - description: "Name of the repository." - required: true - - To test, run: - - .. 
code-block:: bash - - holmes ask -t github_toolset.yaml "who made the last commit to the robusta-dev/holmesgpt repo on github?" - - -Reference ---------- - -A toolset is defined in your Helm values (``generated_values.yaml``). Each toolset has a unique name and must contain at least one tool. - - -.. code-block:: yaml - - toolsets: - : - enabled: - name: "" - description: "" - docs_url: "" - icon_url: "" - tags: - - - installation_instructions: "" - prerequisites: - - command: "" - expected_output: "" - - env: - - "" - additional_instructions: "" - tools: - - name: "" - description: "" - command: "" - script: "