Two Python microservices deployed on Minikube: Service A exposes an HTTP endpoint that reports Service B's CPU usage, memory usage, and Kubernetes node name.
All CI checks pass, unit tests cover edge cases, and the full end-to-end pipeline runs successfully on Minikube inside GitHub Actions.
```shell
(⎈|minikube:mlops)➜ wsc-mlops-project git:(main) ✗ curl http://127.0.0.1:56457/service-b-info
{"nodeName": "minikube", "cpu": "168", "memory": "59"}
```

The implementation was delivered via PR #1 following trunk-based development: all work on a feature branch, merged to main after CI passes.
The CI pipeline runs 4 jobs on every push and pull request:
| Job | Description | Status |
|---|---|---|
| lint | Ruff linting + format check | Passed |
| test | 15 pytest unit tests with JUnit report | Passed |
| build | Docker image builds for both services | Passed |
| e2e | Full Minikube deploy + endpoint validation | Passed |
The e2e job deploys both services into a real Minikube cluster inside GitHub Actions, waits for metrics-server, and validates the actual HTTP responses.
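The final validation step could look roughly like the sketch below. This is an illustration only, not the actual workflow code; the field names come from the example response shown in this README.

```python
import json

# Hypothetical check of the /service-b-info response, similar in spirit
# to what the e2e job validates; the real workflow steps may differ.
def validate_response(body: str) -> dict:
    data = json.loads(body)
    for key in ("nodeName", "cpu", "memory"):
        assert key in data, f"missing field: {key}"
    # cpu (millicores) and memory (MB) are reported as numeric strings
    assert int(data["cpu"]) >= 0
    assert int(data["memory"]) >= 0
    return data

print(validate_response('{"nodeName": "minikube", "cpu": "168", "memory": "59"}'))
```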
```mermaid
flowchart LR
    User(["User / curl"]) -->|"GET /service-b-info"| SA
    subgraph cluster["Minikube Cluster"]
        subgraph mlops["namespace: mlops"]
            SA["Service A (HTTP API :8080)"]
            SB["Service B (Resource Consumer)"]
        end
        subgraph system["namespace: kube-system"]
            MS["Metrics Server"]
        end
    end
    SA -- "list pods" --> API["K8s API Server"]
    SA -- "get pod metrics" --> MS
    MS -. "scrapes" .-> SB
```
- Service A queries the Kubernetes Metrics API to retrieve Service B's resource usage and pod metadata.
- Service B is a Python process that allocates memory and burns CPU in a loop.
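Service B's work loop can be sketched as follows. This is a minimal illustration, not the actual `services/service_b/app.py`; the memory size and burn duration are made-up parameters.

```python
import signal
import time

# Hypothetical sketch of a resource consumer: hold a memory ballast and
# busy-loop so CPU and memory usage are visible to metrics-server, and
# stop cleanly on SIGTERM (graceful shutdown).
STOP = False

def handle_sigterm(signum, frame):
    global STOP
    STOP = True  # let the work loop exit instead of dying mid-iteration

def consume(mem_mb: int, seconds: float) -> int:
    """Allocate roughly mem_mb of memory, then burn CPU for `seconds`."""
    ballast = bytearray(mem_mb * 1024 * 1024)  # held alive for the duration
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline and not STOP:
        pass  # busy loop pins one core
    return len(ballast)

if __name__ == "__main__":
    signal.signal(signal.SIGTERM, handle_sigterm)
    consume(mem_mb=50, seconds=1.0)  # the real service repeats this until SIGTERM
```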
```mermaid
sequenceDiagram
    actor User
    participant SA as Service A
    participant K8s as K8s API Server
    participant MS as Metrics Server
    User->>SA: GET /service-b-info
    SA->>K8s: list pods (label=app:service-b)
    K8s-->>SA: pod name, node name
    SA->>MS: get pod metrics
    MS-->>SA: CPU (nanocores), memory (bytes)
    SA-->>User: {"nodeName", "cpu" (mCores), "memory" (MB)}
```
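The unit conversion in the last step can be sketched with plain Python. The suffix handling below is an assumption about the quantities metrics-server returns (it typically reports CPU as nanocores, e.g. `168123456n`, and memory as kibibytes, e.g. `61234Ki`); the real Service A code may parse differently.

```python
def cpu_millicores(quantity: str) -> int:
    # CPU quantities: "168123456n" (nanocores), "250m" (millicores), or plain cores.
    if quantity.endswith("n"):
        return int(quantity[:-1]) // 1_000_000
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(quantity) * 1000

def memory_mb(quantity: str) -> int:
    # Memory quantities: binary-suffixed ("61234Ki", "59Mi") or plain bytes.
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor // (1024 * 1024)
    return int(quantity) // (1024 * 1024)

print(cpu_millicores("168123456n"))  # → 168
print(memory_mb("61234Ki"))          # → 59
```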
| Tool | Minimum Version | Install (macOS) |
|---|---|---|
| Minikube | v1.30+ | brew install minikube |
| kubectl | v1.30+ | brew install kubectl |
| Docker | v27+ | brew install --cask docker |
| uv | v0.5+ | brew install uv |
```shell
make setup
make deploy
```

This builds Docker images inside Minikube, applies all Kubernetes manifests, and waits for the deployments to become ready.

Expose the NodePort URL and keep that terminal open:

```shell
make url
```

Then call the printed URL from another terminal:

```shell
curl <printed-url>/service-b-info
```

Expected response:

```json
{
  "nodeName": "minikube",
  "cpu": "150",
  "memory": "50"
}
```

Note: the metrics-server needs ~60 seconds after startup to begin reporting metrics. If you get a 503 response, wait a minute and retry.
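If you want to script the wait-and-retry instead of doing it by hand, a small polling helper might look like this. It is a sketch, not project code; `fetch` is any callable returning `(status_code, body)`, so you can plug in `requests`, `urllib`, or a test stub.

```python
import time

# Hypothetical retry helper: poll the endpoint until it stops
# returning 503 (metrics-server not ready yet) or retries run out.
def wait_for_metrics(fetch, retries=12, delay=5.0):
    for _ in range(retries):
        status, body = fetch()
        if status != 503:
            return body
        time.sleep(delay)
    raise TimeoutError("metrics-server never became ready")

# Example with a fake fetcher that succeeds on the third call:
calls = iter([(503, ""), (503, ""), (200, '{"cpu": "150"}')])
print(wait_for_metrics(lambda: next(calls), delay=0.01))  # → {"cpu": "150"}
```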
On macOS with the Minikube Docker driver, `make url` runs `minikube service service-a -n mlops --url`, which creates a tunnel for the NodePort service. Leave that command running and use the URL it prints from a second terminal.
Remove Kubernetes resources:

```shell
make delete
```

Full cleanup (also deletes the Minikube cluster):

```shell
make clean
```

Install dependencies:

```shell
uv sync
```

Run tests:

```shell
make test
```

Run linting:

```shell
make lint
```

```
├── services/
│   ├── service_a/
│   │   ├── app.py              # HTTP API server
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   └── service_b/
│       ├── app.py              # Resource consumer
│       └── Dockerfile
├── k8s/
│   ├── namespace.yaml
│   ├── rbac.yaml               # ServiceAccount, Role, RoleBinding
│   ├── service-a.yaml          # Deployment + NodePort Service
│   ├── service-b.yaml          # Deployment
│   ├── network-policy.yaml     # Default deny + allow Service A ingress
│   └── hpa.yaml                # HorizontalPodAutoscaler for Service A
├── tests/
│   ├── conftest.py
│   └── test_service_a.py
├── .github/workflows/ci.yaml   # Lint, test, build checks
├── Makefile
├── pyproject.toml
└── README.md
```
- Health probes - HTTP liveness/readiness on Service A; exec liveness on Service B
- Resource limits - CPU and memory requests/limits on all pods
- RBAC - Least-privilege ServiceAccount for Service A (read-only pods + metrics)
- Non-root containers - Both services run as non-root users
- Network policies - Default deny-all ingress with explicit allow for Service A
- Autoscaling - HPA on Service A (scales replicas at 70% CPU)
- Graceful shutdown - Signal handling in Service B
- Structured logging - Timestamped log output
- CI pipeline - GitHub Actions for lint, test, and Docker build checks


