@@ -0,0 +1,22 @@
apiVersion: runwhen.com/v1
kind: GenerationRules
spec:
  platform: azure
  generationRules:
    - resourceTypes:
        - azure_resources_resource_groups
      matchRules:
        - type: pattern
          pattern: ".+"
          properties: [name]
          mode: substring
      slxs:
        - baseName: az-devops-triage
          qualifiers: ["resource"]
          baseTemplateName: azure-devops-triage
          levelOfDetail: basic
          outputItems:
            - type: slx
            #- type: sli
            - type: runbook
              templateName: azure-devops-triage-taskset.yaml
@@ -0,0 +1,56 @@
apiVersion: runwhen.com/v1
kind: ServiceLevelIndicator
metadata:
  name: {{slx_name}}
  labels:
    {% include "common-labels.yaml" %}
  annotations:
    {% include "common-annotations.yaml" %}
spec:
  displayUnitsLong: OK
  displayUnitsShort: ok
  locations:
    - {{default_location}}
  description: Checks Azure DevOps health by examining pipeline status, agent pools, repository policies, and service connections in project {{ custom.devops_project }} of organization {{ custom.devops_org }}
  codeBundle:
    {% if repo_url %}
    repoUrl: {{repo_url}}
    {% else %}
    repoUrl: https://github.com/runwhen-contrib/rw-cli-codecollection.git
    {% endif %}
    {% if ref %}
    ref: {{ref}}
    {% else %}
    ref: main
    {% endif %}
    pathToRobot: codebundles/azure-devops-triage/sli.robot
  intervalStrategy: intermezzo
  intervalSeconds: 600
  configProvided:
    - name: AZURE_RESOURCE_GROUP
      value: "{{ match_resource.resource.name }}"
    - name: AZURE_DEVOPS_ORG
      value: "{{ custom.devops_org }}"
    - name: AZURE_DEVOPS_PROJECT
      value: "{{ custom.devops_project }}"
  secretsProvided:
  {% if wb_version %}
    {% include "azure-auth.yaml" ignore missing %}
  {% else %}
    - name: azure_credentials
      workspaceKey: AUTH DETAILS NOT FOUND
  {% endif %}

  alerts:
    warning:
      operator: <
      threshold: '1'
      for: '20m'
    ticket:
      operator: <
      threshold: '1'
      for: '40m'
    page:
      operator: '=='
      threshold: '0'
      for: ''
@@ -0,0 +1,33 @@
apiVersion: runwhen.com/v1
kind: ServiceLevelX
metadata:
  name: {{ slx_name }}
  labels:
    {% include "common-labels.yaml" %}
  annotations:
    {% include "common-annotations.yaml" %}
spec:
  imageURL: https://storage.googleapis.com/runwhen-nonprod-shared-images/icons/azure/security/10245-icon-service-Key-Vaults.svg
  alias: >-
    {{ match_resource.resource_group.name }} Azure DevOps Health
  asMeasuredBy: Composite health score of Azure DevOps resources & activities.
  configProvided:
    - name: SLX_PLACEHOLDER
      value: SLX_PLACEHOLDER
  owners:
    - {{ workspace.owner_email }}
  statement: >-
    Measure Azure DevOps health by checking agent pools, pipeline status, repository policies,
    and service connections in project {{ custom.devops_project }} of organization {{ custom.devops_org }}.
  additionalContext:
    name: "{{ match_resource.resource.name }}"
    # {% include "azure-hierarchy.yaml" ignore missing %}
    # qualified_name: "{{ match_resource.qualified_name }}"
  tags:
    #{% include "azure-tags.yaml" ignore missing %}
    - name: cloud
      value: azure
    - name: service
      value: devops
    - name: access
      value: read-only
@@ -0,0 +1,37 @@
apiVersion: runwhen.com/v1
kind: Runbook
metadata:
  name: {{slx_name}}
  labels:
    {% include "common-labels.yaml" %}
  annotations:
    {% include "common-annotations.yaml" %}
spec:
  location: {{default_location}}
  description: Check Azure DevOps health by examining pipeline status, agent pools, repository policies, and service connections in project {{ custom.devops_project }} of organization {{ custom.devops_org }}
  codeBundle:
    {% if repo_url %}
    repoUrl: {{repo_url}}
    {% else %}
    repoUrl: https://github.com/runwhen-contrib/rw-cli-codecollection.git
    {% endif %}
    {% if ref %}
    ref: {{ref}}
    {% else %}
    ref: main
    {% endif %}
    pathToRobot: codebundles/azure-devops-triage/runbook.robot
  configProvided:
    - name: AZURE_RESOURCE_GROUP
      value: "{{ resource_group.name }}"
    - name: AZURE_DEVOPS_ORG
      value: "{{ custom.devops_org }}"
    - name: AZURE_DEVOPS_PROJECT
      value: "{{ custom.devops_project }}"
  secretsProvided:
  {% if wb_version %}
    {% include "azure-auth.yaml" ignore missing %}
  {% else %}
    - name: azure_credentials
      workspaceKey: AUTH DETAILS NOT FOUND
  {% endif %}
115 changes: 115 additions & 0 deletions codebundles/azure-devops-triage/.test/README.md
@@ -0,0 +1,115 @@
## Testing

The `.test` directory contains infrastructure test code using Terraform to set up a test environment.

### Prerequisites for Testing

1. An existing Azure subscription
2. An existing Azure DevOps organization
3. Permissions to create resources in Azure and Azure DevOps
4. Azure CLI installed and configured
5. Terraform installed (v1.0.0+)
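
The tool prerequisites above can be checked from a shell before running Terraform. The following is a minimal sketch, assuming a POSIX-like shell and a `sort` that supports `-V` (GNU coreutils); the helper names `missing_tools` and `version_ge` are illustrative and are not part of the codebundle.

```bash
# Hypothetical preflight helpers; the names are illustrative, not from the codebundle.

# Print each command name from the arguments that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` for version ordering (an assumption about the local sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `missing_tools az terraform` prints nothing when both tools are on `PATH`, and `version_ge "$installed" 1.0.0` could be used with the version string reported by `terraform version`.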

### Azure DevOps Organization Setup (Before Running Terraform)

Before running Terraform, you need to configure your Azure DevOps organization with the necessary permissions:

#### 1. Organization Settings Configuration

1. Navigate to your Azure DevOps organization settings to add the identity that will run Terraform to the organization.
2. Go to Users and add the service principal as a user with the Basic access level.

#### 2. Agent Pool Permissions

1. Go to Organization Settings > Agent Pools > Security.
2. Add your user (service principal) account with Administrator permissions.
3. Click Save.

#### 3. Organization-Level Security Permissions

1. Go to Organization Settings > Security > Permissions.
2. Go to Users and find your user (service principal).
3. Click the user and ensure that the "Create new projects" permission is set to "Allow".

These permissions are required for Terraform to successfully create and configure resources in your Azure DevOps organization.

### Test Environment Setup

The test environment creates:
- A new Azure DevOps project
- A new agent pool
- Git repositories with sample pipeline definitions
- Variable groups for testing

#### Step 1: Configure Terraform Variables

Create a `terraform.tfvars` file in the `.test/terraform` directory:

```hcl
azure_devops_org = "your-org-name"
azure_devops_org_url = "https://dev.azure.com/your-org-name"
resource_group = "your-resource-group"
location = "eastus"
tags = "your-tags"
```

#### Step 2: Initialize and Apply Terraform

```bash
cd .test/terraform
terraform init
terraform apply
```

#### Step 3: Set Up Self-Hosted Agent (Manual Step)

After Terraform creates the agent pool, you need to manually set up at least one self-hosted agent:

1. In Azure DevOps, navigate to Project Settings > Agent pools > [Your Pool Name]
2. Click "New agent"
3. Follow the instructions to download and configure the agent on your machine
4. Start the agent and verify it's online

Or follow these steps from a terminal:

1. Create a folder on your machine, e.g. `mkdir ~/azagent && cd ~/azagent`
2. Download the agent: `curl -O https://vstsagentpackage.azureedge.net/agent/2.214.1/vsts-agent-linux-x64-2.214.1.tar.gz`
3. Extract: `tar zxvf vsts-agent-linux-x64-2.214.1.tar.gz`
4. Configure: `./config.sh`, answering the prompts as follows:
   - Server URL: `https://dev.azure.com/${var.azure_devops_org}` (the `azure_devops_org` value from `terraform.tfvars`)
   - PAT: your personal access token (generate one in your Azure DevOps organization)
   - Agent pool: `${azuredevops_agent_pool.test_pool.name}` (the name of the agent pool created by Terraform)
5. Run as a service: `./svc.sh install && ./svc.sh start`

#### Step 4: Trigger Test Pipelines (Manual Step)

The test environment includes several pipeline definitions:
- Success Pipeline: A pipeline that completes successfully
- Failed Pipeline: A pipeline that intentionally fails
- Long-Running Pipeline: A pipeline that runs for longer than the threshold

To trigger these pipelines:
1. Navigate to Pipelines in your Azure DevOps project
2. Select each pipeline and click "Run pipeline"
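
The pipelines can also be queued from the command line. The following is a minimal sketch, assuming the Azure CLI with the Azure DevOps extension is installed and already authenticated; `run_pipelines` and the `DRY_RUN` switch are hypothetical helpers, and the pipeline names passed to it must be replaced with the names Terraform actually created.

```bash
# Hypothetical helper: queue each named pipeline with `az pipelines run`.
# Set DRY_RUN=1 to print the commands instead of executing them.
# Usage: run_pipelines ORG_URL PROJECT PIPELINE_NAME...
run_pipelines() {
  org_url=$1
  project=$2
  shift 2
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'az pipelines run --organization %s --project %s --name %s\n' \
        "$org_url" "$project" "$name"
    else
      az pipelines run --organization "$org_url" --project "$project" --name "$name"
    fi
  done
}
```

Running with `DRY_RUN=1` first shows exactly which commands would be issued, which is useful for confirming the organization URL and project before any run is queued.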

#### Step 5: Run the Triage Runbook

Once the test environment is set up and pipelines are running, you can execute the triage runbook to verify it correctly identifies issues.

### Cleaning Up

To remove the test environment:

```bash
cd .test/terraform
terraform destroy
```

Note: This will not remove the Azure DevOps organization, as it was a prerequisite.

## Notes

- The codebundle uses the Azure CLI with the Azure DevOps extension to interact with Azure DevOps.
- Service principal authentication is used for Azure resources.
- The runbook focuses on identifying issues rather than fixing them.
- For queued pipelines, the threshold is measured from when the pipeline was created to the current time.
- For long-running pipelines, the threshold is measured from start time to finish time (or current time if still running).
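
The two threshold rules above can be sketched as a small duration check. This is an illustration only, assuming GNU `date` (the `-d` option) and ISO 8601 timestamps; `elapsed_seconds` and `exceeds_threshold` are hypothetical names and do not reflect how the codebundle itself computes durations.

```bash
# Hypothetical helper: seconds elapsed between two ISO 8601 timestamps.
# When END is omitted, the current time is used (pipeline still queued or running).
# Assumes GNU `date`; BSD/macOS `date` needs different flags.
# Usage: elapsed_seconds START [END]
elapsed_seconds() {
  start_epoch=$(date -u -d "$1" +%s) || return 1
  if [ -n "${2:-}" ]; then
    end_epoch=$(date -u -d "$2" +%s) || return 1
  else
    end_epoch=$(date -u +%s)
  fi
  echo $((end_epoch - start_epoch))
}

# Succeed when the elapsed time is strictly greater than THRESHOLD_SECONDS.
# Usage: exceeds_threshold THRESHOLD_SECONDS START [END]
exceeds_threshold() {
  [ "$(elapsed_seconds "$2" "${3:-}")" -gt "$1" ]
}
```

For a queued pipeline, `START` would be the creation time with `END` omitted; for a long-running pipeline, `START` would be the start time and `END` the finish time, or omitted if the run is still in progress.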