
feat(tanzu-system-logging): swap fluent-bit OUTPUT syslog→CFAPI via vcflogs-cfapi-adapter sidecar #151

Merged: Varashi merged 1 commit into main from feat/vcflogs-cfapi-adapter-sidecar on May 27, 2026


@Varashi Varashi commented May 27, 2026

Summary

Replaces the fluent-bit [OUTPUT] syslog output with [OUTPUT] http, posting to a per-pod sidecar adapter that wraps records into the {"events":[...]} ingest shape expected by vcflogs CFAPI. This removes the 2048-byte per-message cap that has been clipping Plex Web Request: lines mid-token since LogVerbose=1 was enabled on 2026-05-26.

Why

vcflogs ingests via two paths:

| Path | Port | Size cap |
|---|---|---|
| syslog (TCP) | 514 | RFC 5424: 2048 bytes/message |
| CFAPI (HTTPS) | 9543 | no documented limit |

Plex Web Request: lines run to ~3887 bytes when the client carries a large X-Plex-Client-Profile-Extra header (1500-1900 bytes of codec capability negotiation). The syslog path clips them at byte 2040, which drops X-Plex-Product, X-Plex-Version, and X-Plex-Token from the tail. Confirmed live on 2026-05-27 during the T3 client-profile-matrix harvest: cid resolution failed for 8 of 33 transcode-start sessions because of the truncation.
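
To make the failure mode concrete, here is a small Go sketch of what a hard byte cap does to a line whose large header sits in front of the tail headers. Only the 2040-byte cut and the header names come from this PR; the request path, the header values, and the padding sizes are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// syslogCut is the byte offset at which this PR observed long lines being clipped.
const syslogCut = 2040

// tailHeaders are the headers the PR reports losing from the end of the line.
var tailHeaders = []string{"X-Plex-Product", "X-Plex-Version", "X-Plex-Token"}

// truncate mimics a hard byte cap: everything past limit bytes is dropped.
// It slices bytes, not runes, so it can cut mid-token just like the real cap.
func truncate(line string, limit int) string {
	if len(line) <= limit {
		return line
	}
	return line[:limit]
}

// missingHeaders returns the wanted header names that no longer appear in line.
func missingHeaders(line string, wanted []string) []string {
	var missing []string
	for _, h := range wanted {
		if !strings.Contains(line, h+"=") {
			missing = append(missing, h)
		}
	}
	return missing
}

// sampleLine builds a hypothetical request line: an 1800-byte profile blob
// followed by the three tail headers. Path and values are invented.
func sampleLine() string {
	return "Request: GET /transcode/start?" +
		"path=" + strings.Repeat("p", 300) +
		"&X-Plex-Client-Profile-Extra=" + strings.Repeat("c", 1800) +
		"&X-Plex-Product=Plex%20Web&X-Plex-Version=4.0&X-Plex-Token=REDACTED"
}

func main() {
	line := sampleLine()
	cut := truncate(line, syslogCut)
	fmt.Println("full length:", len(line), "missing:", missingHeaders(line, tailHeaders))
	fmt.Println("cut length:", len(cut), "missing:", missingHeaders(cut, tailHeaders))
}
```

Because the cut lands inside the profile blob, every header that follows it disappears, which matches the symptom described above.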

What

  • New repo + image: Varashi/vcflogs-cfapi-adapter — 80-line Go HTTP service that reshapes fluent-bit's Format json records into CFAPI {events:[{text,timestamp,fields:[{name,content}]}]} shape and forwards.
  • Sidecar in each fluent-bit DaemonSet pod (no new Service, no SPOF beyond per-node):
    • Image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
    • 10m CPU req / 100m limit; 32Mi req / 128Mi limit
    • runAsNonRoot uid 65532, readOnlyRootFilesystem, dropped caps
  • fluent-bit OUTPUT swap: syslog → http posting to 127.0.0.1:8080 (the sidecar).
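
The reshaping step described in the list above can be sketched in Go as follows. This is not the adapter's actual source, which is not shown in this PR. It assumes the log text arrives under the `message` key (the key the old syslog output used for Syslog_Message_key) and the time under `timestamp` (per json_date_key), and it flattens every other key into a name/content pair. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Field is one name/content pair in a CFAPI event.
type Field struct {
	Name    string `json:"name"`
	Content string `json:"content"`
}

// Event is one entry of the CFAPI "events" array.
type Event struct {
	Text      string  `json:"text"`
	Timestamp int64   `json:"timestamp"`
	Fields    []Field `json:"fields"`
}

// Envelope is the {"events":[...]} wrapper.
type Envelope struct {
	Events []Event `json:"events"`
}

// Reshape converts a fluent-bit "Format json" body (a top-level JSON array
// of records) into a CFAPI envelope.
func Reshape(body []byte) (Envelope, error) {
	var records []map[string]any
	if err := json.Unmarshal(body, &records); err != nil {
		return Envelope{}, fmt.Errorf("decode fluent-bit batch: %w", err)
	}
	env := Envelope{Events: make([]Event, 0, len(records))}
	for _, rec := range records {
		ev := Event{Fields: []Field{}}
		for k, v := range rec {
			switch k {
			case "message": // assumed key for the log text
				ev.Text = toString(v)
			case "timestamp": // epoch ms; JSON numbers decode as float64
				if f, ok := v.(float64); ok {
					ev.Timestamp = int64(f)
				}
			default:
				ev.Fields = append(ev.Fields, Field{Name: k, Content: toString(v)})
			}
		}
		// Map iteration order is random; sort for stable output.
		sort.Slice(ev.Fields, func(i, j int) bool { return ev.Fields[i].Name < ev.Fields[j].Name })
		env.Events = append(env.Events, ev)
	}
	return env, nil
}

// toString keeps strings as-is and JSON-encodes nested values such as labels.
func toString(v any) string {
	if s, ok := v.(string); ok {
		return s
	}
	b, _ := json.Marshal(v)
	return string(b)
}

func main() {
	in := []byte(`[{"timestamp":1716789600000,"message":"hello","pod_name":"plex-0"}]`)
	env, err := Reshape(in)
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(env)
	fmt.Println(string(out))
}
```

Encoding nested maps (k8s, labels, annotations) as JSON strings is one possible choice; the real adapter may flatten them differently.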

Pre-merge step

The ghcr.io image is currently private (default for new packages). Flip to public before merging so k8s nodes can pull anonymously, matching the existing scaleplex_worker / scaleplex_orchestrator pattern:

https://github.com/users/Varashi/packages/container/vcflogs-cfapi-adapter/settings → "Change visibility" → Public.

(Alternative: add an imagePullSecret on the fluent-bit DS. The sidecar adapter doesn't need it for any other reason.)

Live smoke test (pre-merge)

Posted a synthetic 3KB Plex Web Request: line through the locally-running adapter → vcflogs ingested it → searchable within ~30s → text length 2408 bytes (368 bytes over the previous syslog ceiling) with all X-Plex-* tail headers preserved. Adapter unit tests: 7/7 PASS.

Blast radius

The fluent-bit DaemonSet (8 pods, one per node) rolls when this lands. During the roll, each node briefly stops shipping logs to vcflogs — same as any DS image bump. If the adapter sidecar misbehaves at scale (sustained 5xx upstream, container crash-looping), the symptom is "no logs reaching vcflogs" until rollback.

Rollback: git revert <this-commit> + flux reconcile helmrelease -n tanzu-system-logging fluent-bit → ~2 min back to syslog. The fluent-bit chart's upgrade.remediation.strategy: rollback setting also covers Helm-level chart-render failures.

What stays the same

  • Same vcflogs host (skw-vcflogs.boeye.net), no new endpoint.
  • Same record-level fields (pod_name, container_name, namespace_name, k8s, labels, annotations, tkg, node_name) — they land as CFAPI fields[] entries instead of syslog SD entries, but searchable the same way.
  • Same record_modifier / kubernetes / modify filter chain.

Related

Summary by CodeRabbit

  • Improvements
    • Updated logging infrastructure with enhanced security controls for the log processing pipeline
    • Modified log forwarding mechanism to improve reliability and performance in log transmission


…via vcflogs-cfapi-adapter sidecar

vcflogs (vRealize Log Insight) ingests via syslog (TCP/514, RFC
5424 — 2048-byte per-message cap) OR CFAPI HTTPS (port 9543, no
size limit). The syslog path clipped long Plex Web Request: lines
at byte 2040 — losing X-Plex-Product / X-Plex-Version /
X-Plex-Token from the tail when the 1.5KB X-Plex-Client-Profile-
Extra header appeared earlier in the request. Surfaced 2026-05-27
during the T3 client-profile-matrix harvest: 8 of 33 transcode-
start sessions had cid headers truncated away.

fluent-bit's [OUTPUT] http produces top-level JSON arrays or
NDJSON; CFAPI requires a {"events":[...]} wrapper with per-event
{text,timestamp,fields[]} shape. The new vcflogs-cfapi-adapter
sidecar (github.com/Varashi/vcflogs-cfapi-adapter) bridges that gap
— runs alongside fluent-bit in each DaemonSet pod, listens on
localhost:8080, reshapes records, batches, forwards.

Topology:
  fluent-bit OUTPUT http → 127.0.0.1:8080 (adapter sidecar) →
  https://skw-vcflogs.boeye.net:9543/api/v1/events/ingest/k8s-talos

Sidecar pattern (no new Service, no SPOF beyond per-node):
  - image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
  - 10m CPU req / 100m limit; 32Mi req / 128Mi limit
  - runAsNonRoot uid 65532, readOnlyRootFilesystem, dropped caps
  - liveness+readiness on /healthz

Live smoke test pre-merge:
  Synthetic 3KB Plex Web Request line through adapter →
  vcflogs landed at 2408 bytes (368 over the previous syslog
  ceiling), all X-Plex-* tail headers preserved.

Rollback:
  git revert + flux reconcile → ~2 min back to syslog. Adapter
  image stays available on ghcr.io if forward-pulled.

Closes the v1.7.0 follow-up 'vcflogs transcode-start line
truncation' investigation item.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

kustomization diff:

--- cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging HelmRelease: tanzu-system-logging/fluent-bit

+++ cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app Kustomization: flux-system/tanzu-system-logging HelmRelease: tanzu-system-logging/fluent-bit

@@ -75,37 +75,78 @@

           Parser              cri
           DB                  /var/log/flb_kube.db
           Tag                 kube.*
           Mem_Buf_Limit       5MB
           Skip_Long_Lines     On
           Refresh_Interval    10
-      outputs: |
-        [OUTPUT]
-          Name                 syslog
-          Match                kube.*
-          Host                 skw-vcflogs.${SECRET_DOMAIN}
-          Port                 514
-          Mode                 tcp
-          Syslog_Format        rfc5424
-          Syslog_Hostname_key  tkg_cluster
-          Syslog_Appname_key   pod_name
-          Syslog_Procid_key    container_name
-          Syslog_Message_key   message
-          Syslog_Msgid_key     namespace_name
-          Syslog_SD_key        k8s
-          Syslog_SD_key        labels
-          Syslog_SD_key        annotations
-          Syslog_SD_key        tkg
+      outputs: "# CFAPI HTTPS ingest via the vcflogs-cfapi-adapter sidecar\n# (see\
+        \ extraContainers above). Replaces the previous syslog\n# output, which hit\
+        \ the RFC 5424 2048-byte per-message cap\n# and truncated long Plex Web Request:\
+        \ lines mid-token \u2014\n# losing X-Plex-Product / X-Plex-Version / X-Plex-Token\
+        \ from\n# the tail when X-Plex-Client-Profile-Extra (1.5KB typical)\n# appeared\
+        \ earlier in the header list.\n#\n# Format json + json_date_format epoch_ms\
+        \ produce the shape\n# the adapter expects: top-level JSON array of records,\
+        \ each\n# with a numeric `timestamp` key in epoch ms. The adapter\n# reshapes\
+        \ per-record into CFAPI's\n# {\"events\":[{\"text\":...,\"timestamp\":...,\"\
+        fields\":[...]}]}\n# wrapper and POSTs to skw-vcflogs CFAPI.\n[OUTPUT]\n \
+        \ Name              http\n  Match             kube.*\n  Host             \
+        \ 127.0.0.1\n  Port              8080\n  URI               /\n  Format   \
+        \         json\n  json_date_key     timestamp\n  json_date_format  epoch_ms\n\
+        \  tls               Off\n  # Loopback to the sidecar \u2014 retries are cheap;\
+        \ let\n  # fluent-bit drive the back-pressure rather than blocking\n  # the\
+        \ input thread on a long upstream POST.\n  Retry_Limit       5\n"
       service: |
         [Service]
           Flush         1
           Log_Level     info
           Daemon        off
           Parsers_File  parsers.conf
           HTTP_Server   On
           HTTP_Listen   0.0.0.0
           HTTP_Port     2020
+    extraContainers:
+    - env:
+      - name: LISTEN_ADDR
+        value: 127.0.0.1:8080
+      - name: VCFLOGS_INGEST_URL
+        value: https://skw-vcflogs.${SECRET_DOMAIN}:9543/api/v1/events/ingest/k8s-talos
+      - name: VCFLOGS_TLS_INSECURE
+        value: 'true'
+      - name: FORWARD_TIMEOUT
+        value: 15s
+      image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
+      imagePullPolicy: IfNotPresent
+      livenessProbe:
+        httpGet:
+          host: 127.0.0.1
+          path: /healthz
+          port: 8080
+        initialDelaySeconds: 5
+        periodSeconds: 30
+      name: vcflogs-cfapi-adapter
+      readinessProbe:
+        httpGet:
+          host: 127.0.0.1
+          path: /healthz
+          port: 8080
+        initialDelaySeconds: 2
+        periodSeconds: 10
+      resources:
+        limits:
+          cpu: 100m
+          memory: 128Mi
+        requests:
+          cpu: 10m
+          memory: 32Mi
+      securityContext:
+        allowPrivilegeEscalation: false
+        capabilities:
+          drop:
+          - ALL
+        readOnlyRootFilesystem: true
+        runAsNonRoot: true
+        runAsUser: 65532
     fullnameOverride: fluent-bit
     kind: DaemonSet
     tolerations:
     - operator: Exists
 

@github-actions

helmrelease diff:

--- HelmRelease: tanzu-system-logging/fluent-bit ConfigMap: tanzu-system-logging/fluent-bit

+++ HelmRelease: tanzu-system-logging/fluent-bit ConfigMap: tanzu-system-logging/fluent-bit

@@ -13,77 +13,40 @@

     [PARSER]
       Name                 cri
       Format               regex
       Regex                ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
       Time_Key             time
       Time_Format          %Y-%m-%dT%H:%M:%S.%L%z
-  fluent-bit.conf: |
-    [Service]
-      Flush         1
-      Log_Level     info
-      Daemon        off
-      Parsers_File  parsers.conf
-      HTTP_Server   On
-      HTTP_Listen   0.0.0.0
-      HTTP_Port     2020
+  fluent-bit.conf: "[Service]\n  Flush         1\n  Log_Level     info\n  Daemon \
+    \       off\n  Parsers_File  parsers.conf\n  HTTP_Server   On\n  HTTP_Listen \
+    \  0.0.0.0\n  HTTP_Port     2020\n\n[INPUT]\n  Name                tail\n  Path\
+    \                /var/log/containers/*.log\n  Parser              cri\n  DB  \
+    \                /var/log/flb_kube.db\n  Tag                 kube.*\n  Mem_Buf_Limit\
+    \       5MB\n  Skip_Long_Lines     On\n  Refresh_Interval    10\n\n[FILTER]\n\
+    \  Name                kubernetes\n  Match               kube.*\n  Kube_URL  \
+    \          https://kubernetes.default.svc:443\n  Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\
+    \  Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token\n \
+    \ Kube_Tag_Prefix     kube.var.log.containers.\n  Merge_Log           On\n  Merge_Log_Key\
+    \       log_processed\n  K8S-Logging.Parser  On\n  K8S-Logging.Exclude On\n\n\
+    [FILTER]\n  Name                record_modifier\n  Match               *\n  Record\
+    \ tkg_instance k8s-talos\n  Record tkg_cluster  k8s-talos\n\n[FILTER]\n  Name\
+    \                modify\n  Match               kube.*\n  Copy                kubernetes\
+    \ k8s\n\n[FILTER]\n  Name                record_modifier\n  Match            \
+    \   kube.*\n  Record              node_name $${HOSTNAME}\n\n[FILTER]\n  Name \
+    \               nest\n  Match               kube.*\n  Operation           lift\n\
+    \  Nested_Under        kubernetes\n\n# CFAPI HTTPS ingest via the vcflogs-cfapi-adapter\
+    \ sidecar\n# (see extraContainers above). Replaces the previous syslog\n# output,\
+    \ which hit the RFC 5424 2048-byte per-message cap\n# and truncated long Plex\
+    \ Web Request: lines mid-token \u2014\n# losing X-Plex-Product / X-Plex-Version\
+    \ / X-Plex-Token from\n# the tail when X-Plex-Client-Profile-Extra (1.5KB typical)\n\
+    # appeared earlier in the header list.\n#\n# Format json + json_date_format epoch_ms\
+    \ produce the shape\n# the adapter expects: top-level JSON array of records, each\n\
+    # with a numeric `timestamp` key in epoch ms. The adapter\n# reshapes per-record\
+    \ into CFAPI's\n# {\"events\":[{\"text\":...,\"timestamp\":...,\"fields\":[...]}]}\n\
+    # wrapper and POSTs to skw-vcflogs CFAPI.\n[OUTPUT]\n  Name              http\n\
+    \  Match             kube.*\n  Host              127.0.0.1\n  Port           \
+    \   8080\n  URI               /\n  Format            json\n  json_date_key   \
+    \  timestamp\n  json_date_format  epoch_ms\n  tls               Off\n  # Loopback\
+    \ to the sidecar \u2014 retries are cheap; let\n  # fluent-bit drive the back-pressure\
+    \ rather than blocking\n  # the input thread on a long upstream POST.\n  Retry_Limit\
+    \       5\n"
 
-    [INPUT]
-      Name                tail
-      Path                /var/log/containers/*.log
-      Parser              cri
-      DB                  /var/log/flb_kube.db
-      Tag                 kube.*
-      Mem_Buf_Limit       5MB
-      Skip_Long_Lines     On
-      Refresh_Interval    10
-
-    [FILTER]
-      Name                kubernetes
-      Match               kube.*
-      Kube_URL            https://kubernetes.default.svc:443
-      Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
-      Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
-      Kube_Tag_Prefix     kube.var.log.containers.
-      Merge_Log           On
-      Merge_Log_Key       log_processed
-      K8S-Logging.Parser  On
-      K8S-Logging.Exclude On
-
-    [FILTER]
-      Name                record_modifier
-      Match               *
-      Record tkg_instance k8s-talos
-      Record tkg_cluster  k8s-talos
-
-    [FILTER]
-      Name                modify
-      Match               kube.*
-      Copy                kubernetes k8s
-
-    [FILTER]
-      Name                record_modifier
-      Match               kube.*
-      Record              node_name $${HOSTNAME}
-
-    [FILTER]
-      Name                nest
-      Match               kube.*
-      Operation           lift
-      Nested_Under        kubernetes
-
-    [OUTPUT]
-      Name                 syslog
-      Match                kube.*
-      Host                 skw-vcflogs.${SECRET_DOMAIN}
-      Port                 514
-      Mode                 tcp
-      Syslog_Format        rfc5424
-      Syslog_Hostname_key  tkg_cluster
-      Syslog_Appname_key   pod_name
-      Syslog_Procid_key    container_name
-      Syslog_Message_key   message
-      Syslog_Msgid_key     namespace_name
-      Syslog_SD_key        k8s
-      Syslog_SD_key        labels
-      Syslog_SD_key        annotations
-      Syslog_SD_key        tkg
-
--- HelmRelease: tanzu-system-logging/fluent-bit DaemonSet: tanzu-system-logging/fluent-bit

+++ HelmRelease: tanzu-system-logging/fluent-bit DaemonSet: tanzu-system-logging/fluent-bit

@@ -51,12 +51,53 @@

         - mountPath: /var/lib/docker/containers
           name: varlibdockercontainers
           readOnly: true
         - mountPath: /etc/machine-id
           name: etcmachineid
           readOnly: true
+      - env:
+        - name: LISTEN_ADDR
+          value: 127.0.0.1:8080
+        - name: VCFLOGS_INGEST_URL
+          value: https://skw-vcflogs.${SECRET_DOMAIN}:9543/api/v1/events/ingest/k8s-talos
+        - name: VCFLOGS_TLS_INSECURE
+          value: 'true'
+        - name: FORWARD_TIMEOUT
+          value: 15s
+        image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
+        imagePullPolicy: IfNotPresent
+        livenessProbe:
+          httpGet:
+            host: 127.0.0.1
+            path: /healthz
+            port: 8080
+          initialDelaySeconds: 5
+          periodSeconds: 30
+        name: vcflogs-cfapi-adapter
+        readinessProbe:
+          httpGet:
+            host: 127.0.0.1
+            path: /healthz
+            port: 8080
+          initialDelaySeconds: 2
+          periodSeconds: 10
+        resources:
+          limits:
+            cpu: 100m
+            memory: 128Mi
+          requests:
+            cpu: 10m
+            memory: 32Mi
+        securityContext:
+          allowPrivilegeEscalation: false
+          capabilities:
+            drop:
+            - ALL
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
+          runAsUser: 65532
       volumes:
       - name: config
         configMap:
           name: fluent-bit
       - hostPath:
           path: /var/log

@coderabbitai

coderabbitai Bot commented May 27, 2026

📝 Walkthrough

Walkthrough

Fluent-bit logging pipeline switches from direct syslog output to HTTP adapter sidecar forwarding. A new vcflogs-cfapi-adapter sidecar is added with environment-based configuration, health probes, and hardened security posture. Output routing changes from RFC5424 syslog to JSON-formatted HTTP requests to the adapter at loopback.

Changes

Fluent-bit CFAPI adapter logging pipeline

| Layer | File(s) | Summary |
|---|---|---|
| vcflogs-cfapi adapter sidecar definition | cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml | Sidecar container added with listen address and CFAPI ingest URL environment variables, HTTP liveness and readiness probes, CPU/memory requests and limits (10m/32Mi requests, 100m/128Mi limits), and security context enforcing dropped capabilities, read-only root filesystem, and non-root user. |
| Fluent-bit output routing to HTTP adapter | cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml | Syslog RFC5424 TCP output to remote host removed; replaced with HTTP output forwarding kube.* logs to adapter at 127.0.0.1:8080 using JSON format with epoch_ms timestamps and configured retry limit. |

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | Title directly matches main changeset objective: replacing fluent-bit syslog OUTPUT with HTTP OUTPUT via vcflogs-cfapi-adapter sidecar. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml`:
- Around line 38-67: The HelmRelease sets only container-level securityContext
for the extraContainer named vcflogs-cfapi-adapter but lacks pod-level
podSecurityContext/seccompProfile; add values.podSecurityContext to this
helmrelease.yaml (e.g., set podSecurityContext: { seccompProfile: { type:
RuntimeDefault } } or equivalent supported by the chart) so the chart applies a
pod-level seccompProfile and other podSecurityContext settings to the fluent-bit
DaemonSet; ensure the pod-level key is added alongside extraContainers in the
values structure so the chart picks it up.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6e38e055-8614-453e-a556-7ee62fe917f4

📥 Commits

Reviewing files that changed from the base of the PR and between 76178f6 and 639595e.

📒 Files selected for processing (1)
  • cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml

Comment on lines +38 to +67
extraContainers:
  - name: vcflogs-cfapi-adapter
    image: ghcr.io/varashi/vcflogs-cfapi-adapter:v0.1.0
    imagePullPolicy: IfNotPresent
    env:
      - name: LISTEN_ADDR
        value: "127.0.0.1:8080"
      - name: VCFLOGS_INGEST_URL
        value: "https://skw-vcflogs.${SECRET_DOMAIN}:9543/api/v1/events/ingest/k8s-talos"
      - name: VCFLOGS_TLS_INSECURE
        value: "true"
      - name: FORWARD_TIMEOUT
        value: "15s"
    resources:
      requests: {cpu: 10m, memory: 32Mi}
      limits: {cpu: 100m, memory: 128Mi}
    livenessProbe:
      httpGet: {path: /healthz, port: 8080, host: 127.0.0.1}
      initialDelaySeconds: 5
      periodSeconds: 30
    readinessProbe:
      httpGet: {path: /healthz, port: 8080, host: 127.0.0.1}
      initialDelaySeconds: 2
      periodSeconds: 10
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 65532
      capabilities: {drop: [ALL]}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

For fluent/fluent-bit Helm chart version 0.57.5, which values key configures pod-level security context (including seccompProfile)?

💡 Result:

In the fluent/fluent-bit Helm chart version 0.57.5, the pod-level security context is configured using the podSecurityContext values key [1][2][3]. While the chart provides a podSecurityContext object in its values.yaml, it is typically empty by default [1][2][3]. To add a seccompProfile at the pod level, you must manually define it within the podSecurityContext key in your values file, as follows:

    podSecurityContext:
      seccompProfile:
        type: RuntimeDefault

Please note that this applies to the official Fluent Bit Helm chart maintained by the Fluent project [1][4]. If you are using a different Helm chart for Fluent Bit (such as the one provided by Bitnami), the configuration keys and structures may differ [5][6].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml"

echo "== File exists? =="
ls -l "$FILE"

echo
echo "== Lines 1-120 (to capture chart values wiring and pod security context keys) =="
nl -ba "$FILE" | sed -n '1,200p'

echo
echo "== Search for pod security context wiring keys =="
rg -n --hidden --no-ignore-vcs -S "podSecurityContext|seccompProfile|securityContext:" "$FILE"

echo
echo "== Show segment around provided lines 38-67 =="
nl -ba "$FILE" | sed -n '30,90'

Repository: Varashi/k8s

Length of output: 358


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml"

echo "== Lines 1-200 (with line numbers) =="
awk '{printf "%6d\t%s\n", NR, $0}' "$FILE" | sed -n '1,200p'

echo
echo "== Search for pod/security context keys in this HelmRelease =="
rg -n "podSecurityContext|seccompProfile|securityContext" "$FILE" || true

echo
echo "== Segment lines 30-90 (with line numbers) =="
awk '{printf "%6d\t%s\n", NR, $0}' "$FILE" | sed -n '30,90p'

Repository: Varashi/k8s

Length of output: 9752


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="cluster-talos/kubernetes/infrastructure/platform/tanzu-system-logging/app/helmrelease.yaml"

echo "== podSecurityContext occurrences =="
rg -n "podSecurityContext" "$FILE" || true

echo
echo "== seccompProfile occurrences =="
rg -n "seccompProfile" "$FILE" || true

echo
echo "== pod-level securityContext hint (no extraContainers context) =="
rg -n --context=2 "securityContext:" "$FILE" || true

Repository: Varashi/k8s

Length of output: 375


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n --hidden --no-ignore-vcs "podSecurityContext:" cluster-talos/kubernetes || true

Repository: Varashi/k8s

Length of output: 37


Add pod-level securityContext for the fluent-bit DaemonSet

This HelmRelease sets only a container-level securityContext (extraContainers[].securityContext on vcflogs-cfapi-adapter) and defines no pod-level podSecurityContext or seccompProfile. Add values.podSecurityContext (e.g., seccompProfile.type: RuntimeDefault) in this helmrelease.yaml so the chart applies it to the pod.


@Varashi Varashi merged commit 5928b3e into main May 27, 2026
4 checks passed
@Varashi Varashi deleted the feat/vcflogs-cfapi-adapter-sidecar branch May 27, 2026 06:00
Varashi added a commit that referenced this pull request May 27, 2026
…let probes

httpGet liveness/readiness probes run from the node kubelet, NOT
from inside the container netns. With LISTEN_ADDR=127.0.0.1:8080
the adapter only listened on the container's loopback, which the
host-side probe can't reach (kubelet's 127.0.0.1 = node loopback,
not container). Pod crash-looped every ~60s on liveness fail
during the initial rollout of #151.

Fix: bind :8080 (all interfaces) inside the container netns. The
netns is pod-scoped — no Service/hostPort means the port is still
unreachable from outside the pod. fluent-bit (same pod, same
netns) continues to reach the sidecar via 127.0.0.1:8080.

Also: declare containerPort: 8080 (lets the probe reference it by
name, future-proofs against port renames).
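
One way to catch this class of mistake early is a startup check on the listen address. The Go sketch below is a hypothetical guard, not something this commit adds: it reports whether an address such as the LISTEN_ADDR value binds to loopback only, in which case a probe that does not originate inside the pod's network namespace cannot reach it.

```go
package main

import (
	"fmt"
	"net"
)

// loopbackOnly reports whether a listen address binds only to a loopback
// interface. An empty host (":8080") means all interfaces in the netns.
func loopbackOnly(addr string) (bool, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false, err
	}
	if host == "" {
		return false, nil
	}
	if host == "localhost" {
		return true, nil
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false, fmt.Errorf("not an IP literal: %q", host)
	}
	return ip.IsLoopback(), nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:8080", ":8080", "0.0.0.0:8080", "[::1]:8080"} {
		only, err := loopbackOnly(addr)
		fmt.Printf("%-16s loopbackOnly=%v err=%v\n", addr, only, err)
	}
}
```

A service could log a warning when this returns true and HTTP probes are configured, since fluent-bit in the same pod can still reach 127.0.0.1:8080 either way.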

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Varashi added a commit that referenced this pull request May 27, 2026
… CFAPI) (#152)

Replaces the previous standalone fluent-bit DaemonSet +
vcflogs-cfapi-adapter sidecar with the Logging Operator pattern.

Architectural reasons:

- The vmware-loginsight CFAPI plugin (1.4.2, bundled in the
  operator's ghcr.io/kube-logging/fluentd:v1.17-5.0-full image)
  replaces our 80-LOC homemade adapter. Maintenance burden moves
  off us.
- Operator-managed Fluent Bit (per-node collector) + Fluentd
  (HA ×2 aggregator) is the canonical k8s logging topology used at
  enterprise scale — direct knowledge transfer for the work-side
  vcflogs project.
- CRD-driven config (Logging / ClusterFlow / ClusterOutput)
  separates infrastructure from routing policy, enabling future
  per-namespace selectivity without ops involvement.

What lands:

- New ns "logging" (replaces "tanzu-system-logging")
- HelmRelease for logging-operator chart 6.5.2 from
  oci://ghcr.io/kube-logging/helm-charts
- Logging CR — Fluent Bit DS + Fluentd STS×2 specs
- ClusterOutput "vcflogs" — vmwareLogInsight HTTPS CFAPI
- ClusterFlow "all-to-vcflogs" — match *, single drop logtag
- New HelmRepository "kube-logging" (oci); removes "fluent" repo

What goes:

- "tanzu-system-logging" ns + HelmRelease + ConfigMap deleted
- "flux-repositories/fluent.yaml" HelmRepository deleted
- ghcr.io/varashi/vcflogs-cfapi-adapter:* image + Varashi/vcflogs-
  cfapi-adapter repo to be deleted manually after Flux reconciles
  successfully (one-way action; image referenced in #151 commit
  message remains discoverable via git history)

Hard-swap cutover: brief log gap (~2-5 min) while Flux deletes the
old DS, installs the operator, and reconciles the Logging CR.

Schema verified against logging-operator 6.5.2 CRDs:
- Logging: spec.fluentbit.inputTail uses native fluent-bit casing
  (Skip_Long_Lines / Mem_Buf_Limit / Refresh_Interval)
- ClusterOutput.spec.vmwareLogInsight: all fields valid
- ClusterFlow.spec.match[].select + filters[].record_modifier: valid

Plex configmaps + READMEs updated to reference the new ns/topology.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>