Monitoring - metrics #117
jschoedl
left a comment
I think we should also set up the Prometheus Operator, right? But this could be done in a later PR.
| namespace: monitoring | ||
| spec: | ||
| accessModes: | ||
| - ReadWriteOnce |
By default, Kubernetes does a rolling update, so two instances are up at the same time during an upgrade. ReadWriteOnce means that the first instance holds a lock on the volume, so the second one waits for it, while the first one is only stopped once the second is ready -> deadlock. To disable rolling updates, set
spec:
  strategy:
    type: Recreate
in the Deployment spec. The same applies to Prometheus.
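For context, a minimal sketch of where the field sits in a full Deployment. The Deployment name, labels, image, and PVC name below are assumptions for illustration and are not taken from this PR:

```yaml
# Sketch only: name, labels, image, and claimName are assumed, not from this PR.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana                  # assumed Deployment name
  namespace: monitoring
spec:
  replicas: 1
  strategy:
    type: Recreate               # stop the old pod before starting the new one
  selector:
    matchLabels:
      app: grafana               # assumed label
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana # pin a specific version in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/grafana
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: grafana-pvc   # assumed name of the ReadWriteOnce claim
```

The trade-off is a short downtime on every rollout, which is probably acceptable for a single-replica monitoring stack.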
| - name: Upsert Grafana secret | ||
| run: | | ||
| kubectl create secret generic grafana-secret -n monitoring \ | ||
| --from-literal=admin-password="${{ secrets.GRAFANA_ADMIN_PASSWORD }}" \ |
GRAFANA_ADMIN_PASSWORD is missing from the list of required secrets in the comment at the top of this file.
| name: grafana-ingress | ||
| namespace: monitoring | ||
| annotations: | ||
| cert-manager.io/cluster-issuer: letsencrypt-prod |
Suggested change: replace
| cert-manager.io/cluster-issuer: letsencrypt-prod |
with
| # no TLS certificate, ingress.yaml already requests one |
and drop spec.tls. Otherwise we make additional Let's Encrypt requests, which are rate limited.
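Roughly, the resulting Ingress might look like the sketch below. This assumes Grafana is served on the same host that ingress.yaml already covers; the host, path, Service name, and port are placeholders, not values from this PR:

```yaml
# Sketch only: host, path, service name, and port are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  # no cert-manager annotation: ingress.yaml already requests the certificate
spec:
  # no tls section here, so no additional certificate is requested
  rules:
    - host: example.com          # placeholder; assumed to match ingress.yaml
      http:
        paths:
          - path: /grafana       # placeholder path
            pathType: Prefix
            backend:
              service:
                name: grafana    # assumed Service name
                port:
                  number: 3000   # Grafana's default HTTP port
```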
| - name: Deploy monitoring | ||
| run: kubectl apply -f infra/k8s/monitoring/ |
You could also add a status check here to make sure that the deployment worked (see "wait for rollouts" a few lines below).
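A possible follow-up step, as a sketch. The Deployment names grafana and prometheus and the timeout are assumptions; adjust them to whatever the manifests in infra/k8s/monitoring/ actually define:

```yaml
# Sketch only: deployment names and timeout are assumed, not from this PR.
- name: Wait for monitoring rollouts
  run: |
    # Each command exits non-zero if the rollout does not finish in time,
    # which fails the workflow step.
    kubectl rollout status deployment/grafana -n monitoring --timeout=120s
    kubectl rollout status deployment/prometheus -n monitoring --timeout=120s
```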
| metadata: | ||
| name: grafana-datasources | ||
| namespace: monitoring | ||
| data: | ||
| prometheus.yaml: | | ||
| apiVersion: 1 | ||
| datasources: | ||
| - name: Prometheus | ||
| type: prometheus | ||
| url: http://prometheus.monitoring.svc.cluster.local:9090 | ||
| isDefault: true | ||
| access: proxy |
Do I understand correctly that we have a persistent datasource, but no graphs for it (unless they are created manually in the UI)? Or do you plan to add them later? (The course requirement is to have persistent dashboards.)
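One way to get persistent dashboards is Grafana's file-based dashboard provisioning, next to the datasource provisioning above. A minimal sketch, where the ConfigMap name, provider name, and dashboard path are assumptions:

```yaml
# Sketch only: ConfigMap name, provider name, and path are assumed.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-provider
  namespace: monitoring
data:
  dashboards.yaml: |
    apiVersion: 1
    providers:
      - name: default
        type: file
        disableDeletion: true
        options:
          # Grafana loads every dashboard JSON file found in this directory
          path: /etc/grafana/dashboards
```

This ConfigMap would be mounted at /etc/grafana/provisioning/dashboards in the Grafana container, and a second ConfigMap holding the exported dashboard JSON files would be mounted at the path given in options.path.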
Summary
Adds metrics monitoring for the Python and Spring services: Prometheus collects the metrics and Grafana displays them.
Closes #119
Type of change
API changes
api/openapi.yaml updated and api/scripts/gen-all.sh re-run
Definition of Done