feat(logging): migrate fluent-bit+adapter → Logging Operator (Fluentd CFAPI) #152
Merged
… CFAPI)

Replaces the previous standalone fluent-bit DaemonSet + vcflogs-cfapi-adapter sidecar with the Logging Operator pattern.

Architectural reasons:
- The vmware-loginsight CFAPI plugin (1.4.2, bundled in the operator's ghcr.io/kube-logging/fluentd:v1.17-5.0-full image) replaces our 80-LOC homemade adapter. Maintenance burden moves off us.
- Operator-managed Fluent Bit (per-node collector) + Fluentd (HA ×2 aggregator) is the canonical k8s logging topology used at enterprise scale, giving direct knowledge transfer for the work-side vcflogs project.
- CRD-driven config (Logging / ClusterFlow / ClusterOutput) separates infrastructure from routing policy, enabling future per-namespace selectivity without ops involvement.

What lands:
- New ns "logging" (replaces "tanzu-system-logging")
- HelmRelease for logging-operator chart 6.5.2 from oci://ghcr.io/kube-logging/helm-charts
- Logging CR: Fluent Bit DS + Fluentd STS×2 specs
- ClusterOutput "vcflogs": vmwareLogInsight HTTPS CFAPI
- ClusterFlow "all-to-vcflogs": match *, single drop logtag
- New HelmRepository "kube-logging" (oci); removes "fluent" repo

What goes:
- "tanzu-system-logging" ns + HelmRelease + ConfigMap deleted
- "flux-repositories/fluent.yaml" HelmRepository deleted
- ghcr.io/varashi/vcflogs-cfapi-adapter:* image + Varashi/vcflogs-cfapi-adapter repo to be deleted manually after Flux reconciles successfully (one-way action; image referenced in #151 commit message remains discoverable via git history)

Hard-swap cutover: brief log gap (~2-5 min) while Flux deletes the old DS, installs the operator, and reconciles the Logging CR.

Schema verified against logging-operator 6.5.2 CRDs:
- Logging: spec.fluentbit.inputTail uses native fluent-bit casing (Skip_Long_Lines / Mem_Buf_Limit / Refresh_Interval)
- ClusterOutput.spec.vmwareLogInsight: all fields valid
- ClusterFlow.spec.match[].select + filters[].record_modifier: valid

Plex configmaps + READMEs updated to reference the new ns/topology.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
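
The "future per-namespace selectivity" mentioned in the commit message could look roughly like the namespaced `Flow` below. This is a sketch and not part of this PR: the `plex` namespace, the resource name, and the label selector are hypothetical placeholders, while `vcflogs` is the ClusterOutput this PR adds.

```yaml
---
# Hypothetical per-namespace Flow; not shipped by this PR.
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: plex-to-vcflogs        # assumed name
  namespace: plex              # assumed namespace
spec:
  match:
    - select:
        labels:
          app.kubernetes.io/name: plex   # assumed pod label
  globalOutputRefs:
    - vcflogs                  # ClusterOutput defined in the "logging" namespace
```

A namespaced `Flow` only sees logs from its own namespace, so routing for one app can be added or tuned as a single CR. While the cluster-wide `all-to-vcflogs` ClusterFlow is also active, records matched by both flows would likely be delivered twice, so a real rollout would need to narrow one of them.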
--- cluster-talos/kubernetes/infrastructure/flux-system/flux-repositories Kustomization: flux-system/flux-repositories HelmRepository: flux-system/fluent
+++ cluster-talos/kubernetes/infrastructure/flux-system/flux-repositories Kustomization: flux-system/flux-repositories HelmRepository: flux-system/fluent
@@ -1,13 +0,0 @@
----
-apiVersion: source.toolkit.fluxcd.io/v1
-kind: HelmRepository
-metadata:
- labels:
- kustomize.toolkit.fluxcd.io/name: flux-repositories
- kustomize.toolkit.fluxcd.io/namespace: flux-system
- name: fluent
- namespace: flux-system
-spec:
- interval: 1h
- url: https://fluent.github.io/helm-charts
-
--- cluster-talos/kubernetes/infrastructure/flux-system/flux-repositories Kustomization: flux-system/flux-repositories HelmRepository: flux-system/kube-logging
+++ cluster-talos/kubernetes/infrastructure/flux-system/flux-repositories Kustomization: flux-system/flux-repositories HelmRepository: flux-system/kube-logging
@@ -0,0 +1,14 @@
+---
+apiVersion: source.toolkit.fluxcd.io/v1
+kind: HelmRepository
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: flux-repositories
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: kube-logging
+ namespace: flux-system
+spec:
+ interval: 1h
+ type: oci
+ url: oci://ghcr.io/kube-logging/helm-charts
+
--- cluster-talos/kubernetes/infrastructure/platform Kustomization: flux-system/infrastructure-platform Kustomization: flux-system/tanzu-system-logging
+++ cluster-talos/kubernetes/infrastructure/platform Kustomization: flux-system/infrastructure-platform Kustomization: flux-system/tanzu-system-logging
@@ -1,25 +0,0 @@
----
-apiVersion: kustomize.toolkit.fluxcd.io/v1
-kind: Kustomization
-metadata:
- labels:
- kustomize.toolkit.fluxcd.io/name: infrastructure-platform
- kustomize.toolkit.fluxcd.io/namespace: flux-system
- name: tanzu-system-logging
- namespace: flux-system
-spec:
- dependsOn:
- - name: configs
- interval: 1h
- path: ./cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app
- postBuild:
- substituteFrom:
- - kind: Secret
- name: cluster-vars
- prune: true
- sourceRef:
- kind: GitRepository
- name: flux-system
- timeout: 5m
- wait: true
-
--- cluster-talos/kubernetes/infrastructure/platform Kustomization: flux-system/infrastructure-platform Kustomization: flux-system/logging
+++ cluster-talos/kubernetes/infrastructure/platform Kustomization: flux-system/infrastructure-platform Kustomization: flux-system/logging
@@ -0,0 +1,25 @@
+---
+apiVersion: kustomize.toolkit.fluxcd.io/v1
+kind: Kustomization
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: infrastructure-platform
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: logging
+ namespace: flux-system
+spec:
+ dependsOn:
+ - name: configs
+ interval: 1h
+ path: ./cluster-talos/kubernetes/infrastructure/platform/logging/app
+ postBuild:
+ substituteFrom:
+ - kind: Secret
+ name: cluster-vars
+ prune: true
+ sourceRef:
+ kind: GitRepository
+ name: flux-system
+ timeout: 5m
+ wait: true
+
--- cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging Namespace: flux-system/tanzu-system-logging
+++ cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging Namespace: flux-system/tanzu-system-logging
@@ -1,12 +0,0 @@
----
-apiVersion: v1
-kind: Namespace
-metadata:
- labels:
- kustomize.toolkit.fluxcd.io/name: tanzu-system-logging
- kustomize.toolkit.fluxcd.io/namespace: flux-system
- pod-security.kubernetes.io/audit: privileged
- pod-security.kubernetes.io/enforce: privileged
- pod-security.kubernetes.io/warn: privileged
- name: tanzu-system-logging
-
--- cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging HelmRelease: tanzu-system-logging/fluent-bit
+++ cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging HelmRelease: tanzu-system-logging/fluent-bit
@@ -1,153 +0,0 @@
----
-apiVersion: helm.toolkit.fluxcd.io/v2
-kind: HelmRelease
-metadata:
- labels:
- kustomize.toolkit.fluxcd.io/name: tanzu-system-logging
- kustomize.toolkit.fluxcd.io/namespace: flux-system
- name: fluent-bit
- namespace: tanzu-system-logging
-spec:
- chart:
- spec:
- chart: fluent-bit
- sourceRef:
- kind: HelmRepository
- name: fluent
- namespace: flux-system
- version: 0.57.5
- install:
- remediation:
- retries: 3
- interval: 30m
- upgrade:
- cleanupOnFail: true
- remediation:
- retries: 3
- strategy: rollback
- values:
- config:
- customParsers: |
- [PARSER]
- Name cri
- Format regex
- Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
- Time_Key time
- Time_Format %Y-%m-%dT%H:%M:%S.%L%z
- filters: |
- [FILTER]
- Name kubernetes
- Match kube.*
- Kube_URL https://kubernetes.default.svc:443
- Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
- Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
- Kube_Tag_Prefix kube.var.log.containers.
- Merge_Log On
- Merge_Log_Key log_processed
- K8S-Logging.Parser On
- K8S-Logging.Exclude On
-
- [FILTER]
- Name record_modifier
- Match *
- Record tkg_instance k8s-talos
- Record tkg_cluster k8s-talos
-
- [FILTER]
- Name modify
- Match kube.*
- Copy kubernetes k8s
-
- [FILTER]
- Name record_modifier
- Match kube.*
- Record node_name $${HOSTNAME}
-
- [FILTER]
- Name nest
- Match kube.*
- Operation lift
- Nested_Under kubernetes
- inputs: |
- [INPUT]
- Name tail
- Path /var/log/containers/*.log
- Parser cri
- DB /var/log/flb_kube.db
- Tag kube.*
- Mem_Buf_Limit 5MB
- Skip_Long_Lines On
- Refresh_Interval 10
- outputs: "# CFAPI HTTPS ingest via the vcflogs-cfapi-adapter sidecar\n# (see\
- \ extraContainers above). Replaces the previous syslog\n# output, which hit\
- \ the RFC 5424 2048-byte per-message cap\n# and truncated long Plex Web Request:\
- \ lines mid-token \u2014\n# losing X-Plex-Product / X-Plex-Version / X-Plex-Token\
- \ from\n# the tail when X-Plex-Client-Profile-Extra (1.5KB typical)\n# appeared\
- \ earlier in the header list.\n#\n# Format json + json_date_format epoch_ms\
- \ produce the shape\n# the adapter expects: top-level JSON array of records,\
- \ each\n# with a numeric `timestamp` key in epoch ms. The adapter\n# reshapes\
- \ per-record into CFAPI's\n# {\"events\":[{\"text\":...,\"timestamp\":...,\"\
- fields\":[...]}]}\n# wrapper and POSTs to skw-vcflogs CFAPI.\n[OUTPUT]\n \
- \ Name http\n Match kube.*\n Host \
- \ 127.0.0.1\n Port 8080\n URI /\n Format \
- \ json\n json_date_key timestamp\n json_date_format epoch_ms\n\
- \ tls Off\n # Loopback to the sidecar \u2014 retries are cheap;\
- \ let\n # fluent-bit drive the back-pressure rather than blocking\n # the\
- \ input thread on a long upstream POST.\n Retry_Limit 5\n"
- service: |
- [Service]
- Flush 1
- Log_Level info
- Daemon off
- Parsers_File parsers.conf
- HTTP_Server On
- HTTP_Listen 0.0.0.0
- HTTP_Port 2020
- extraContainers:
- - env:
- - name: LISTEN_ADDR
- value: :8080
- - name: VCFLOGS_INGEST_URL
- value: https://skw-vcflogs.${SECRET_DOMAIN}:9543/api/v1/events/ingest/k8s-talos
- - name: VCFLOGS_TLS_INSECURE
- value: 'true'
- - name: FORWARD_TIMEOUT
- value: 15s
- image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
- imagePullPolicy: IfNotPresent
- livenessProbe:
- httpGet:
- path: /healthz
- port: ingest
- initialDelaySeconds: 5
- periodSeconds: 30
- name: vcflogs-cfapi-adapter
- ports:
- - containerPort: 8080
- name: ingest
- readinessProbe:
- httpGet:
- path: /healthz
- port: ingest
- initialDelaySeconds: 2
- periodSeconds: 10
- resources:
- limits:
- cpu: 100m
- memory: 128Mi
- requests:
- cpu: 10m
- memory: 32Mi
- securityContext:
- allowPrivilegeEscalation: false
- capabilities:
- drop:
- - ALL
- readOnlyRootFilesystem: true
- runAsNonRoot: true
- runAsUser: 65532
- fullnameOverride: fluent-bit
- kind: DaemonSet
- tolerations:
- - operator: Exists
-
--- cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging Namespace: flux-system/logging
+++ cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging Namespace: flux-system/logging
@@ -0,0 +1,12 @@
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: logging
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ pod-security.kubernetes.io/audit: privileged
+ pod-security.kubernetes.io/enforce: privileged
+ pod-security.kubernetes.io/warn: privileged
+ name: logging
+
--- cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging HelmRelease: logging/logging-operator
+++ cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging HelmRelease: logging/logging-operator
@@ -0,0 +1,41 @@
+---
+apiVersion: helm.toolkit.fluxcd.io/v2
+kind: HelmRelease
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: logging
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: logging-operator
+ namespace: logging
+spec:
+ chart:
+ spec:
+ chart: logging-operator
+ sourceRef:
+ kind: HelmRepository
+ name: kube-logging
+ namespace: flux-system
+ version: 6.5.2
+ install:
+ crds: CreateReplace
+ remediation:
+ retries: 3
+ interval: 30m
+ upgrade:
+ cleanupOnFail: true
+ crds: CreateReplace
+ remediation:
+ retries: 3
+ strategy: rollback
+ values:
+ enableLeaderElection: true
+ logging:
+ enabled: false
+ resources:
+ limits:
+ cpu: 200m
+ memory: 256Mi
+ requests:
+ cpu: 10m
+ memory: 64Mi
+
--- cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging Logging: logging/vcflogs
+++ cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging Logging: logging/vcflogs
@@ -0,0 +1,59 @@
+---
+apiVersion: logging.banzaicloud.io/v1beta1
+kind: Logging
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: logging
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: vcflogs
+ namespace: logging
+spec:
+ controlNamespace: logging
+ fluentbit:
+ bufferStorageVolume:
+ hostPath:
+ path: /var/lib/fluent-bit-buffer
+ image:
+ pullPolicy: IfNotPresent
+ repository: fluent/fluent-bit
+ tag: 3.1.10
+ inputTail:
+ Mem_Buf_Limit: 5MB
+ Refresh_Interval: '10'
+ Skip_Long_Lines: 'On'
+ resources:
+ limits:
+ cpu: 500m
+ memory: 200Mi
+ requests:
+ cpu: 50m
+ memory: 100Mi
+ tolerations:
+ - operator: Exists
+ fluentd:
+ bufferStorageVolume:
+ pvc:
+ spec:
+ accessModes:
+ - ReadWriteOnce
+ resources:
+ requests:
+ storage: 5Gi
+ storageClassName: longhorn
+ image:
+ pullPolicy: IfNotPresent
+ repository: ghcr.io/kube-logging/fluentd
+ tag: v1.17-5.0-full
+ resources:
+ limits:
+ cpu: 1000m
+ memory: 1Gi
+ requests:
+ cpu: 100m
+ memory: 256Mi
+ scaling:
+ replicas: 2
+ terminationGracePeriodSeconds: 60
+ tolerations:
+ - operator: Exists
+
--- cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging ClusterOutput: logging/vcflogs
+++ cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging ClusterOutput: logging/vcflogs
@@ -0,0 +1,26 @@
+---
+apiVersion: logging.banzaicloud.io/v1beta1
+kind: ClusterOutput
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: logging
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: vcflogs
+ namespace: logging
+spec:
+ vmwareLogInsight:
+ agent_id: k8s-talos
+ buffer:
+ chunk_limit_size: 8MB
+ flush_interval: 5s
+ retry_max_interval: 30s
+ total_limit_size: 1GB
+ host: skw-vcflogs.${SECRET_DOMAIN}
+ log_text_keys:
+ - log
+ - msg
+ - message
+ port: 9543
+ scheme: https
+ ssl_verify: false
+
--- cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging ClusterFlow: logging/all-to-vcflogs
+++ cluster-talos/kubernetes/infrastructure/platform/logging/app Kustomization: flux-system/logging ClusterFlow: logging/all-to-vcflogs
@@ -0,0 +1,18 @@
+---
+apiVersion: logging.banzaicloud.io/v1beta1
+kind: ClusterFlow
+metadata:
+ labels:
+ kustomize.toolkit.fluxcd.io/name: logging
+ kustomize.toolkit.fluxcd.io/namespace: flux-system
+ name: all-to-vcflogs
+ namespace: logging
+spec:
+ filters:
+ - record_modifier:
+ remove_keys: logtag
+ globalOutputRefs:
+ - vcflogs
+ match:
+ - select: {}
+
--- HelmRelease: tanzu-system-logging/fluent-bit ServiceAccount: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit ServiceAccount: tanzu-system-logging/fluent-bit
@@ -1,11 +0,0 @@
----
-apiVersion: v1
-kind: ServiceAccount
-metadata:
- name: fluent-bit
- namespace: tanzu-system-logging
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-
--- HelmRelease: tanzu-system-logging/fluent-bit ConfigMap: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit ConfigMap: tanzu-system-logging/fluent-bit
@@ -1,52 +0,0 @@
----
-apiVersion: v1
-kind: ConfigMap
-metadata:
- name: fluent-bit
- namespace: tanzu-system-logging
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-data:
- custom_parsers.conf: |
- [PARSER]
- Name cri
- Format regex
- Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
- Time_Key time
- Time_Format %Y-%m-%dT%H:%M:%S.%L%z
- fluent-bit.conf: "[Service]\n Flush 1\n Log_Level info\n Daemon \
- \ off\n Parsers_File parsers.conf\n HTTP_Server On\n HTTP_Listen \
- \ 0.0.0.0\n HTTP_Port 2020\n\n[INPUT]\n Name tail\n Path\
- \ /var/log/containers/*.log\n Parser cri\n DB \
- \ /var/log/flb_kube.db\n Tag kube.*\n Mem_Buf_Limit\
- \ 5MB\n Skip_Long_Lines On\n Refresh_Interval 10\n\n[FILTER]\n\
- \ Name kubernetes\n Match kube.*\n Kube_URL \
- \ https://kubernetes.default.svc:443\n Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\
- \ Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token\n \
- \ Kube_Tag_Prefix kube.var.log.containers.\n Merge_Log On\n Merge_Log_Key\
- \ log_processed\n K8S-Logging.Parser On\n K8S-Logging.Exclude On\n\n\
- [FILTER]\n Name record_modifier\n Match *\n Record\
- \ tkg_instance k8s-talos\n Record tkg_cluster k8s-talos\n\n[FILTER]\n Name\
- \ modify\n Match kube.*\n Copy kubernetes\
- \ k8s\n\n[FILTER]\n Name record_modifier\n Match \
- \ kube.*\n Record node_name $${HOSTNAME}\n\n[FILTER]\n Name \
- \ nest\n Match kube.*\n Operation lift\n\
- \ Nested_Under kubernetes\n\n# CFAPI HTTPS ingest via the vcflogs-cfapi-adapter\
- \ sidecar\n# (see extraContainers above). Replaces the previous syslog\n# output,\
- \ which hit the RFC 5424 2048-byte per-message cap\n# and truncated long Plex\
- \ Web Request: lines mid-token \u2014\n# losing X-Plex-Product / X-Plex-Version\
- \ / X-Plex-Token from\n# the tail when X-Plex-Client-Profile-Extra (1.5KB typical)\n\
- # appeared earlier in the header list.\n#\n# Format json + json_date_format epoch_ms\
- \ produce the shape\n# the adapter expects: top-level JSON array of records, each\n\
- # with a numeric `timestamp` key in epoch ms. The adapter\n# reshapes per-record\
- \ into CFAPI's\n# {\"events\":[{\"text\":...,\"timestamp\":...,\"fields\":[...]}]}\n\
- # wrapper and POSTs to skw-vcflogs CFAPI.\n[OUTPUT]\n Name http\n\
- \ Match kube.*\n Host 127.0.0.1\n Port \
- \ 8080\n URI /\n Format json\n json_date_key \
- \ timestamp\n json_date_format epoch_ms\n tls Off\n # Loopback\
- \ to the sidecar \u2014 retries are cheap; let\n # fluent-bit drive the back-pressure\
- \ rather than blocking\n # the input thread on a long upstream POST.\n Retry_Limit\
- \ 5\n"
-
--- HelmRelease: tanzu-system-logging/fluent-bit ClusterRole: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit ClusterRole: tanzu-system-logging/fluent-bit
@@ -1,20 +0,0 @@
----
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRole
-metadata:
- name: fluent-bit
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-rules:
-- apiGroups:
- - ''
- resources:
- - namespaces
- - pods
- verbs:
- - get
- - list
- - watch
-
--- HelmRelease: tanzu-system-logging/fluent-bit ClusterRoleBinding: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit ClusterRoleBinding: tanzu-system-logging/fluent-bit
@@ -1,18 +0,0 @@
----
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRoleBinding
-metadata:
- name: fluent-bit
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-roleRef:
- apiGroup: rbac.authorization.k8s.io
- kind: ClusterRole
- name: fluent-bit
-subjects:
-- kind: ServiceAccount
- name: fluent-bit
- namespace: tanzu-system-logging
-
--- HelmRelease: tanzu-system-logging/fluent-bit Service: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit Service: tanzu-system-logging/fluent-bit
@@ -1,21 +0,0 @@
----
-apiVersion: v1
-kind: Service
-metadata:
- name: fluent-bit
- namespace: tanzu-system-logging
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-spec:
- type: ClusterIP
- ports:
- - port: 2020
- targetPort: http
- protocol: TCP
- name: http
- selector:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
-
--- HelmRelease: tanzu-system-logging/fluent-bit DaemonSet: tanzu-system-logging/fluent-bit
+++ HelmRelease: tanzu-system-logging/fluent-bit DaemonSet: tanzu-system-logging/fluent-bit
@@ -1,115 +0,0 @@
----
-apiVersion: apps/v1
-kind: DaemonSet
-metadata:
- name: fluent-bit
- namespace: tanzu-system-logging
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- app.kubernetes.io/managed-by: Helm
-spec:
- selector:
- matchLabels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- template:
- metadata:
- labels:
- app.kubernetes.io/name: fluent-bit
- app.kubernetes.io/instance: fluent-bit
- spec:
- serviceAccountName: fluent-bit
- hostNetwork: false
- dnsPolicy: ClusterFirst
- containers:
- - name: fluent-bit
- image: cr.fluentbit.io/fluent/fluent-bit:5.0.5
- imagePullPolicy: IfNotPresent
- command:
- - /fluent-bit/bin/fluent-bit
- args:
- - --workdir=/fluent-bit/etc
- - --config=/fluent-bit/etc/conf/fluent-bit.conf
- ports:
- - name: http
- containerPort: 2020
- protocol: TCP
- livenessProbe:
- httpGet:
- path: /
- port: http
- readinessProbe:
- httpGet:
- path: /api/v2/health
- port: http
- volumeMounts:
- - name: config
- mountPath: /fluent-bit/etc/conf
- - mountPath: /var/log
- name: varlog
- - mountPath: /var/lib/docker/containers
- name: varlibdockercontainers
- readOnly: true
- - mountPath: /etc/machine-id
- name: etcmachineid
- readOnly: true
- - env:
- - name: LISTEN_ADDR
- value: :8080
- - name: VCFLOGS_INGEST_URL
- value: https://skw-vcflogs.${SECRET_DOMAIN}:9543/api/v1/events/ingest/k8s-talos
- - name: VCFLOGS_TLS_INSECURE
- value: 'true'
- - name: FORWARD_TIMEOUT
- value: 15s
- image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
- imagePullPolicy: IfNotPresent
- livenessProbe:
- httpGet:
- path: /healthz
- port: ingest
- initialDelaySeconds: 5
- periodSeconds: 30
- name: vcflogs-cfapi-adapter
- ports:
- - containerPort: 8080
- name: ingest
- readinessProbe:
- httpGet:
- path: /healthz
- port: ingest
- initialDelaySeconds: 2
- periodSeconds: 10
- resources:
- limits:
- cpu: 100m
- memory: 128Mi
- requests:
- cpu: 10m
- memory: 32Mi
- securityContext:
- allowPrivilegeEscalation: false
- capabilities:
- drop:
- - ALL
- readOnlyRootFilesystem: true
- runAsNonRoot: true
- runAsUser: 65532
- volumes:
- - name: config
- configMap:
- name: fluent-bit
- - hostPath:
- path: /var/log
- name: varlog
- - hostPath:
- path: /var/lib/docker/containers
- name: varlibdockercontainers
- - hostPath:
- path: /etc/machine-id
- type: File
- name: etcmachineid
- tolerations:
- - operator: Exists
-
--- HelmRelease: logging/logging-operator ServiceAccount: logging/logging-operator
+++ HelmRelease: logging/logging-operator ServiceAccount: logging/logging-operator
@@ -0,0 +1,11 @@
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+ name: logging-operator
+ namespace: logging
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+
--- HelmRelease: logging/logging-operator ClusterRole: logging/logging-operator
+++ HelmRelease: logging/logging-operator ClusterRole: logging/logging-operator
@@ -0,0 +1,270 @@
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+ name: logging-operator
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+rules:
+- apiGroups:
+ - ''
+ resources:
+ - configmaps
+ - persistentvolumeclaims
+ - pods
+ - secrets
+ - serviceaccounts
+ - services
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - ''
+ resources:
+ - endpoints
+ - namespaces
+ - nodes
+ - nodes/proxy
+ verbs:
+ - get
+ - list
+ - watch
+- apiGroups:
+ - ''
+ - events.k8s.io
+ resources:
+ - events
+ verbs:
+ - create
+ - get
+ - list
+ - watch
+- apiGroups:
+ - apps
+ resources:
+ - daemonsets
+ - replicasets
+ - statefulsets
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - apps
+ - extensions
+ resources:
+ - daemonsets
+ - deployments
+ - statefulsets
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - batch
+ resources:
+ - jobs
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - coordination.k8s.io
+ resources:
+ - leases
+ verbs:
+ - '*'
+- apiGroups:
+ - events.k8s.io
+ resources:
+ - events
+ verbs:
+ - get
+ - list
+ - watch
+- apiGroups:
+ - extensions
+ - networking.k8s.io
+ resources:
+ - ingresses
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - extensions
+ - policy
+ resources:
+ - podsecuritypolicies
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - use
+ - watch
+- apiGroups:
+ - logging-extensions.banzaicloud.io
+ resources:
+ - eventtailers
+ - hosttailers
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - logging-extensions.banzaicloud.io
+ resources:
+ - eventtailers/status
+ - hosttailers/status
+ verbs:
+ - get
+ - patch
+ - update
+- apiGroups:
+ - logging.banzaicloud.io
+ resources:
+ - axosyslogs
+ - clusterflows
+ - clusteroutputs
+ - flows
+ - fluentbitagents
+ - fluentdconfigs
+ - loggingroutes
+ - loggings
+ - outputs
+ - syslogngclusterflows
+ - syslogngclusteroutputs
+ - syslogngconfigs
+ - syslogngflows
+ - syslogngoutputs
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - logging.banzaicloud.io
+ resources:
+ - axosyslogs/status
+ - clusterflows/status
+ - clusteroutputs/status
+ - flows/status
+ - fluentbitagents/status
+ - fluentdconfigs/status
+ - loggingroutes/status
+ - loggings/status
+ - outputs/status
+ - syslogngclusterflows/status
+ - syslogngclusteroutputs/status
+ - syslogngconfigs/status
+ - syslogngflows/status
+ - syslogngoutputs/status
+ verbs:
+ - get
+ - patch
+ - update
+- apiGroups:
+ - logging.banzaicloud.io
+ resources:
+ - loggings/finalizers
+ verbs:
+ - update
+- apiGroups:
+ - monitoring.coreos.com
+ resources:
+ - prometheusrules
+ - servicemonitors
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - policy
+ resources:
+ - poddisruptionbudgets
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - rbac.authorization.k8s.io
+ resources:
+ - clusterrolebindings
+ - clusterroles
+ - rolebindings
+ - roles
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - security.openshift.io
+ resourceNames:
+ - anyuid
+ - privileged
+ resources:
+ - securitycontextconstraints
+ verbs:
+ - use
+- apiGroups:
+ - telemetry.kube-logging.dev
+ resources:
+ - bridges
+ - collectors
+ - outputs
+ - subscriptions
+ - tenants
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - watch
+
--- HelmRelease: logging/logging-operator ClusterRole: logging/logging-operator-edit
+++ HelmRelease: logging/logging-operator ClusterRole: logging/logging-operator-edit
@@ -0,0 +1,41 @@
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+ name: logging-operator-edit
+ labels:
+ rbac.authorization.k8s.io/aggregate-to-edit: 'true'
+ rbac.authorization.k8s.io/aggregate-to-admin: 'true'
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+rules:
+- apiGroups:
+ - logging.banzaicloud.io
+ resources:
+ - flows
+ - outputs
+ verbs:
+ - create
+ - delete
+ - deletecollection
+ - get
+ - list
+ - patch
+ - update
+ - watch
+- apiGroups:
+ - logging.banzaicloud.io
+ resources:
+ - syslogngflows
+ - syslogngoutputs
+ verbs:
+ - create
+ - delete
+ - deletecollection
+ - get
+ - list
+ - patch
+ - update
+ - watch
+
--- HelmRelease: logging/logging-operator ClusterRoleBinding: logging/logging-operator
+++ HelmRelease: logging/logging-operator ClusterRoleBinding: logging/logging-operator
@@ -0,0 +1,18 @@
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+ name: logging-operator
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+subjects:
+- kind: ServiceAccount
+ name: logging-operator
+ namespace: logging
+roleRef:
+ apiGroup: rbac.authorization.k8s.io
+ kind: ClusterRole
+ name: logging-operator
+
--- HelmRelease: logging/logging-operator Service: logging/logging-operator
+++ HelmRelease: logging/logging-operator Service: logging/logging-operator
@@ -0,0 +1,22 @@
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: logging-operator
+ namespace: logging
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ clusterIP: None
+ ports:
+ - port: 8080
+ targetPort: http
+ protocol: TCP
+ name: http
+ selector:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+
--- HelmRelease: logging/logging-operator Deployment: logging/logging-operator
+++ HelmRelease: logging/logging-operator Deployment: logging/logging-operator
@@ -0,0 +1,40 @@
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: logging-operator
+ namespace: logging
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: logging-operator
+ app.kubernetes.io/instance: logging-operator
+ spec:
+ containers:
+ - name: logging-operator
+ image: ghcr.io/kube-logging/logging-operator:6.5.2
+ args:
+ - -enable-leader-election=true
+ imagePullPolicy: IfNotPresent
+ resources:
+ limits:
+ cpu: 200m
+ memory: 256Mi
+ requests:
+ cpu: 10m
+ memory: 64Mi
+ ports:
+ - name: http
+ containerPort: 8080
+ serviceAccountName: logging-operator
+
Caution: Review failed. Pull request was closed or merged during review.

📝 Walkthrough

Migrate cluster logging from Tanzu-based fluent-bit (with vcflogs-cfapi-adapter sidecar) to kube-logging operator stack. Update Helm repositories, deploy logging-operator with Logging/ClusterFlow/ClusterOutput CRs, route all logs to vRealize Log Insight via CFAPI, and update platform kustomization and app documentation.

Changes: Logging Operator Migration

Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks: ✅ 5 passed
Varashi added a commit that referenced this pull request on May 27, 2026
…stomizations (#153)

PR #152 landed but the single Flux Kustomization failed at first apply: the Logging/ClusterFlow/ClusterOutput CRs reference CRDs that don't exist until the logging-operator HelmRelease completes, so the whole kustomization stalled on "no matches for kind Logging in version logging.banzaicloud.io/v1beta1". Cluster currently has NO log forwarding (old fluent-bit gone, new operator never installed).

Split mirrors the cert-manager / certs pattern already in this repo:

    platform/logging/
    ├── logging-operator/   (ks: no CRD deps; ships HR + CRDs)
    │   └── app/
    │       ├── namespace.yaml
    │       ├── helmrelease-operator.yaml
    │       └── kustomization.yaml
    └── logging-config/     (ks: dependsOn logging-operator; ships CRs)
        └── app/
            ├── logging.yaml
            ├── clusteroutput-vcflogs.yaml
            ├── clusterflow-all.yaml
            ├── README.md
            └── kustomization.yaml

Flux's dependsOn + `wait: true` makes logging-config sit out until logging-operator reports Ready, which only happens after the HR finishes (CRDs registered, controller pod healthy). Parent platform/kustomization.yaml lists both ks paths.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
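
For illustration, the dependent half of the split described in that commit might be declared roughly as follows. This is a sketch modelled on the `logging` Kustomization from #152, not the manifest from #153: the Kustomization names `logging-config` and `logging-operator` and the `path` are assumptions inferred from the directory names in the commit message.

```yaml
---
# Sketch only: names and path are assumed, not copied from #153.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: logging-config
  namespace: flux-system
spec:
  dependsOn:
    - name: configs
    # Hold the CRs back until the operator Kustomization (HelmRelease + CRDs) is Ready.
    - name: logging-operator
  interval: 1h
  # Assumed path, following the platform/logging/logging-config/app layout.
  path: ./cluster-talos/kubernetes/infrastructure/platform/logging/logging-config/app
  postBuild:
    substituteFrom:
      - kind: Secret
        name: cluster-vars
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  timeout: 5m
  wait: true
```

The ordering depends on the operator Kustomization also setting `wait: true`. Without it, that Kustomization could report Ready as soon as the HelmRelease object is applied, before the chart has installed the CRDs.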
Summary
Migrates cluster log forwarding from the standalone `fluent-bit` DaemonSet + custom `vcflogs-cfapi-adapter` sidecar (PR #151, last week) to the Logging Operator pattern. The operator's bundled `ghcr.io/kube-logging/fluentd:v1.17-5.0-full` image ships `fluent-plugin-vmware-loginsight` 1.4.2 (the official VMware CFAPI plugin), so no homemade adapter is required.

Drivers:
- `ClusterFlow` / `ClusterOutput` separate infrastructure from policy. Future per-namespace routing is a CR add, not a config patch.

What lands
- `flux-repositories/kube-logging.yaml`
- `platform/logging/ks.yaml`: `tanzu-system-logging` ks
- `platform/logging/app/namespace.yaml`: `logging` with privileged PSA labels
- `platform/logging/app/helmrelease-operator.yaml`: `logging-operator` chart 6.5.2
- `platform/logging/app/logging.yaml`
- `platform/logging/app/clusteroutput-vcflogs.yaml`: `vmwareLogInsight` CFAPI target
- `platform/logging/app/clusterflow-all.yaml`: `*` → vcflogs
- `platform/logging/app/README.md`

What goes
- `tanzu-system-logging` ns + HelmRelease (`fluent-bit` chart, syslog OUTPUT) + adapter sidecar config
- `flux-repositories/fluent.yaml` (no more fluent-bit chart)
- `ghcr.io/varashi/vcflogs-cfapi-adapter` image deletion
- `Varashi/vcflogs-cfapi-adapter` GitHub repo deletion

Schema validation
CRDs pulled from chart 6.5.2 + verified all field names with a Python AST walk:
- `Logging.spec.fluentbit.inputTail` uses fluent-bit native casing (`Skip_Long_Lines` / `Mem_Buf_Limit` / `Refresh_Interval`); the operator passes these through verbatim
- `ClusterOutput.spec.vmwareLogInsight`: all fields valid (`scheme`, `ssl_verify`, `host`, `port`, `agent_id`, `log_text_keys`, `buffer.*`)
- `ClusterFlow.spec.match[].select` + `filters[].record_modifier`: valid

`kubectl kustomize` renders all 5 resources clean.

Hard-swap cutover
Per discussion: one PR, no dual-run. On merge:
- `tanzu-system-logging/fluent-bit` HelmRelease → fluent-bit DS scales to 0 → log shipping stops cluster-wide
- `logging/` directory → operator installs + CRDs ready
- `/var/log/containers/*.log`, forward to Fluentd

Expected log gap: ~2-5 minutes between step 1 and step 5 on this cluster (image pulls + StatefulSet ordered start).
Verification plan (post-merge)
Same approach we ran for PR #151:
Rollback
`git revert <merge-commit>` → flux reconcile → ~2 min back to fluent-bit. The adapter image will be deleted by then, so a true rollback requires republishing the image from source. The Varashi/vcflogs-cfapi-adapter repo is also being deleted; the code remains in the git history of #151 if needed.

References
- `fluent-plugin-vmware-loginsight` upstream (archived)