Playbook: Kubernetes Single Node & Confluent Manager for Apache Flink Setup

This document is a step-by-step playbook for installing, configuring, upgrading, and operating a single-node Kubernetes cluster that runs Confluent Manager for Apache Flink (CMF Cluster).


📌 Document Metadata

Attribute: Information
Created by: System Administrator / Data Stream Engineer (Manual Setup)
Name: Mohammad Wildan Nuryulda
Email: nuryulda@gmail.com
Creation date: 4 June 2026
Environment: Staging (staging-env)

📋 Table of Contents

  1. Operating System Preparation (OS Preparation)
  2. Container Runtime Installation (Containerd)
  3. Kubernetes Component Installation (v1.33)
  4. Cluster Initialization & CNI Configuration (Cilium)
  5. Kubernetes Upgrade Procedure (v1.33 to v1.35)
  6. Deployment of Confluent Manager for Apache Flink (CMF)
  7. Flink Application Management (FlinkApplication & Catalogs)
  8. Operations, Query, & Troubleshooting Guide (FAQ)

1. Operating System Preparation (OS Preparation)

These first steps make sure the operating system can run a Kubernetes node without interference from OS memory management or overly strict security policies.

1.1 Disable Swap Memory

Kubernetes requires swap to be disabled so that pod resource allocation is accounted for accurately.

Turn off swap (the second command comments out the swap entry in /etc/fstab so the change survives a reboot)

sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
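
After the commands above, it can help to confirm that no swap device is still active. The following is a minimal sketch, not part of the original playbook: the function name swap_is_off is hypothetical, and it assumes the usual Linux /proc/swaps layout (one header line, then one line per active swap device).

```shell
#!/bin/sh
# swap_is_off [FILE]
# Succeeds when FILE (default: /proc/swaps) lists no active swap device.
# Assumption: the file has one header line, then one line per device.
swap_is_off() {
  awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0 ? 1 : 0) }' "${1:-/proc/swaps}"
}

# Report the result for the running system, if the file is readable.
if [ -r /proc/swaps ]; then
  if swap_is_off; then
    echo "swap is off"
  else
    echo "swap is still active; re-run swapoff -a"
  fi
fi
```

The optional FILE argument exists only so the check can be tried against a sample file instead of the live system.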

Set SELinux to permissive (so that it does not block pod communication)

sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Load the kernel modules

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

Sysctl settings for Kubernetes networking

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

Apply the sysctl changes

sudo sysctl --system
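
A quick way to catch a typo in the sysctl file is to check that all three settings are really present. The sketch below is an illustration under assumptions, not part of the original playbook: the function names sysctl_conf_has and check_k8s_sysctl_conf are hypothetical, and the check reads the configuration file only, not the live kernel values.

```shell
#!/bin/sh
# sysctl_conf_has FILE KEY VALUE
# Succeeds when FILE contains a "KEY = VALUE" line (spaces around "=" optional).
sysctl_conf_has() {
  awk -F= -v key="$2" -v val="$3" '
    { k = $1; v = $2; gsub(/[ \t]/, "", k); gsub(/[ \t]/, "", v) }
    k == key && v == val { found = 1 }
    END { exit !found }
  ' "$1"
}

# check_k8s_sysctl_conf [FILE]
# Verifies the three networking settings written by this playbook.
check_k8s_sysctl_conf() {
  conf=${1:-/etc/sysctl.d/k8s.conf}
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    sysctl_conf_has "$conf" "$key" 1 || { echo "missing: $key = 1"; return 1; }
  done
  echo "all Kubernetes sysctl settings present in $conf"
}
```

Calling check_k8s_sysctl_conf without arguments checks /etc/sysctl.d/k8s.conf. For the values the kernel is actually using, sysctl -n followed by a key name (for example net.ipv4.ip_forward) prints the live setting.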

2. Container Runtime Installation (Containerd)

Add the Docker repository

sudo dnf -y install dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Install containerd

sudo dnf install -y containerd.io

Generate the default containerd configuration

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null

Configure SystemdCgroup and the sandbox image for v1.33

(On RHEL/Rocky, Kubernetes needs SystemdCgroup = true so that containerd uses the same systemd cgroup driver as the kubelet)

sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
sudo sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.k8s.io/pause:3.10"|g' /etc/containerd/config.toml
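
Because both sed commands silently do nothing when their pattern does not match, it is worth confirming that the file really changed. The sketch below shows one way to do that; the two function names are hypothetical and the default path is the one used in this playbook.

```shell
#!/bin/sh
# containerd_uses_systemd_cgroup [FILE]
# Succeeds when FILE has a line that reads "SystemdCgroup = true".
containerd_uses_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true[[:space:]]*$' \
    "${1:-/etc/containerd/config.toml}"
}

# containerd_sandbox_image [FILE]
# Prints the value of every sandbox_image = "..." line found in FILE.
containerd_sandbox_image() {
  sed -n 's/^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' \
    "${1:-/etc/containerd/config.toml}"
}
```

If containerd_sandbox_image prints nothing, the installed containerd version may use a different key for the pause image, and the second sed command above would have had no effect.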

Start and enable containerd

sudo systemctl daemon-reload
sudo systemctl enable --now containerd

3. Kubernetes Component Installation (v1.33)

Add the Kubernetes v1.33 repository

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl
EOF

Install the tools

sudo dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

Enable kubelet (it stays on standby until kubeadm init is run)

sudo systemctl enable --now kubelet

Download and extract Helm v3.17.3

curl -LO https://get.helm.sh/helm-v3.17.3-linux-amd64.tar.gz
tar -zxvf helm-v3.17.3-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
rm -rf helm-v3.17.3-linux-amd64.tar.gz linux-amd64

Verify the installation

helm version

Add the Prometheus repo

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

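4. Cluster Initialization & CNI Configuration (Cilium)

Initialize the cluster

Run only one of the following two variants. The first pins the API server advertise address and uses 10.244.0.0/16 as the pod network CIDR; the second uses the default advertise address and 10.0.0.0/16.

sudo kubeadm init \
  --apiserver-advertise-address=10.10.10.106 \
  --pod-network-cidr=10.244.0.0/16

sudo kubeadm init --pod-network-cidr=10.0.0.0/16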
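
Whichever pod network CIDR is chosen, it must not overlap the host network. The sketch below tests the node address from this playbook (10.10.10.106) against both candidate CIDRs. It is a plain-shell illustration for IPv4 only, and the function names ip_to_int and ip_in_cidr are hypothetical.

```shell
#!/bin/sh
# ip_to_int A.B.C.D
# Prints the IPv4 address as a single integer. The body runs in a subshell
# so the temporary IFS and "set -f" do not leak into the caller.
ip_to_int() (
  IFS=.
  set -f
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# ip_in_cidr ADDRESS NETWORK/BITS
# Succeeds when ADDRESS lies inside the given IPv4 CIDR block.
ip_in_cidr() {
  net=${2%/*}
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    # 4294967295 is 0xFFFFFFFF; shift in zeros for the host part of the address.
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  fi
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Check the node address against both pod CIDRs used in this playbook.
for cidr in 10.244.0.0/16 10.0.0.0/16; do
  if ip_in_cidr 10.10.10.106 "$cidr"; then
    echo "10.10.10.106 overlaps $cidr"
  else
    echo "10.10.10.106 is outside $cidr"
  fi
done
```

This covers only the single node address. A fuller check would compare the whole host subnet and the service CIDR as well.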

Set up kubeconfig so that a regular user can use kubectl

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Check the node status (it will still be NotReady because no CNI is installed yet)

kubectl get nodes

Download and install the Cilium CLI

curl -L --remote-name https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvf cilium-linux-amd64.tar.gz -C /usr/local/bin
rm -f cilium-linux-amd64.tar.gz

Install Cilium into the Kubernetes cluster

cilium install

Verify the status

kubectl get nodes
cilium status

Check the pod status in detail

kubectl get pods -n kube-system

Watch the pods until all of them are Running

kubectl get pods -n kube-system -w

5. Kubernetes Upgrade Procedure (v1.33 to v1.35)

kubeadm upgrades one minor version at a time, so the path from v1.33 to v1.35 goes through v1.34: run the same steps with the v1.34 repository first, then repeat them for v1.35 as shown here.

Switch the repository to v1.35

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.35/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.35/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl
EOF

sudo dnf clean all

Upgrade kubeadm again, this time to 1.35

sudo dnf upgrade -y kubeadm --disableexcludes=kubernetes

Apply the 1.35 upgrade (replace the patch version with the one reported by the new kubeadm upgrade plan, e.g. v1.35.5)

sudo kubeadm upgrade apply v1.35.5 -y

Finish by upgrading kubelet and kubectl to 1.35

sudo dnf upgrade -y kubelet kubectl --disableexcludes=kubernetes
sudo systemctl daemon-reload && sudo systemctl restart kubelet
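
Since kubeadm does not support skipping a minor version, a small guard can be run before each kubeadm upgrade apply to confirm the planned jump is a single step. The sketch below is illustrative only: the function names are hypothetical, the version strings are examples, and both versions are assumed to share the same major number.

```shell
#!/bin/sh
# minor_of VERSION
# Prints the minor number of a version such as "v1.33.2" or "1.35".
minor_of() {
  v=${1#v}          # strip an optional leading "v"
  v=${v#*.}         # drop the major number and its dot
  echo "${v%%.*}"   # keep everything before the next dot
}

# upgrade_step_ok FROM TO
# Succeeds when TO has the same minor version as FROM or exactly one higher.
# Assumption: FROM and TO share the same major version.
upgrade_step_ok() {
  from=$(minor_of "$1")
  to=$(minor_of "$2")
  step=$(( $to - $from ))
  [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

# Example: starting from v1.33.0, compare two possible targets.
for target in v1.34.0 v1.35.5; do
  if upgrade_step_ok v1.33.0 "$target"; then
    echo "v1.33.0 -> $target: single step"
  else
    echo "v1.33.0 -> $target: upgrade through the intermediate minor first"
  fi
done
```

On a running cluster, the current version could be taken from kubeadm version -o short instead of a hard-coded string.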

6. Deployment of Confluent Manager for Apache Flink (CMF Cluster)

Add the Confluent repository.

helm repo add confluentinc https://packages.confluent.io/helm
helm repo update

Install the certificate manager.

kubectl create -f https://github.com/jetstack/cert-manager/releases/download/v1.18.2/cert-manager.yaml

Install the Flink Kubernetes Operator (the confluent-flink namespace must already exist; create it with kubectl create namespace confluent-flink, or add --create-namespace to the command)

helm upgrade --install cp-flink-kubernetes-operator confluentinc/flink-kubernetes-operator \
  --version "~1.140.0" \
  --namespace confluent-flink \
  --set watchNamespaces="{confluent-flink}"

Install CMF

Create the license secret and the PostgreSQL password secret, then install the chart.

kubectl create secret generic flink-license-secret \
  --from-file=license.txt=/data/flink/license.txt \
  -n confluent-flink

kubectl create secret generic cmf-postgres-secret \
  --from-literal=password='cmf_mwn' \
  -n confluent-flink

helm upgrade --install cmf confluentinc/confluent-manager-for-apache-flink \
  --namespace confluent-flink \
  --set license.secretRef="flink-license-secret" \
  --set resources.requests.cpu="500m" \
  --set resources.requests.memory="1024Mi" \
  --set resources.limits.cpu="1" \
  --set resources.limits.memory="2048Mi" \
  --set persistence.create=false \
  --set cmf.database.type="jdbc" \
  --set cmf.database.jdbc.engine="postgresql" \
  --set cmf.database.jdbc.url="10.10.10.105" \
  --set cmf.database.jdbc.port=5432 \
  --set cmf.database.jdbc.database="cmf_mwn" \
  --set cmf.database.jdbc.user="cmf_mwn" \
  --set cmf.database.jdbc.password.kubernetesSecretName="cmf-postgres-secret" \
  --set cmf.database.jdbc.password.kubernetesSecretProperty="password"

Follow the CMF logs

kubectl logs -n confluent-flink -l app.kubernetes.io/name=confluent-manager-for-apache-flink --tail=100 -f

Open port forwarding for CMF (the trailing & runs it in the background)

nohup kubectl port-forward -n confluent-flink deployment/confluent-manager-for-apache-flink 8080:8080 --address 0.0.0.0 > /dev/null 2>&1 &

7. Flink Application Management (FlinkApplication & Catalogs)

Set up the FlinkApplication YAML

apiVersion: cmf.confluent.io/v1
kind: FlinkApplication
metadata:
  name: staging-state-machine
  namespace: confluent-flink
spec:
  flinkVersion: "v2_0"
  image: confluentinc/cp-flink:2.0.1-cp1
  serviceAccount: flink
  flinkConfiguration:
    metrics.reporter.prom.factory.class: "org.apache.flink.metrics.prometheus.PrometheusReporterFactory"
    metrics.reporter.prom.port: "9249-9250"
    taskmanager.numberOfTaskSlots: "4"
    execution.checkpointing.interval: "1min"
    execution.checkpointing.mode: "EXACTLY_ONCE"
    execution.checkpointing.min-pause: "30s"
    execution.checkpointing.max-concurrent-checkpoints: "1"
    execution.checkpointing.externalized-checkpoint-retention: "RETAIN_ON_CANCELLATION"
    state.backend.type: "rocksdb"
    state.backend.incremental: "true"
    state.checkpoints.dir: "file:///opt/flink/volume/checkpoints"
    state.savepoints.dir: "file:///opt/flink/volume/savepoints"
    high-availability.type: "kubernetes"
    high-availability.storageDir: "file:///opt/flink/volume/ha"
    restart-strategy.type: "exponential-delay"
    restart-strategy.exponential-delay.initial-backoff: "10 s"
    restart-strategy.exponential-delay.max-backoff: "2 min"
  job:
    jarURI: "local:///opt/flink/examples/streaming/StateMachineExample.jar"
    parallelism: 2
    state: "running"
    upgradeMode: "last-state"
  jobManager:
    resource:
      cpu: 0.5
      memory: "1024m"
  taskManager:
    resource:
      cpu: 0.5
      memory: "6Gi"
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: flink-host-storage
              mountPath: /opt/flink/volume
      volumes:
        - name: flink-host-storage
          hostPath:
            path: /data/flink/cmf/storage
            type: Directory

Create or update the FlinkApplication

confluent flink application apply staging-state-machine.yaml --environment staging-env --url http://10.10.10.106:8080
confluent flink application create staging-state-machine.yaml --environment staging-env --url http://10.10.10.106:8080
confluent flink application update staging-state-machine.yaml --environment staging-env --url http://10.10.10.106:8080

Create the compute pool

confluent flink compute-pool create /data/flink/cmf/apps/staging-shared-pool.yaml --environment staging-env --url http://10.10.10.106:8080

Catalog JSON

{
  "apiVersion": "cmf.confluent.io/v1",
  "kind": "KafkaCatalog",
  "metadata": {
    "name": "staging-catalog"
  },
  "spec": {
    "srInstance": {
      "connectionConfig": {
        "schema.registry.url": "http://10.10.10.106:8081"
      }
    }
  }
}

Create the catalog with a curl POST

curl -v -H "Content-Type: application/json" -X POST http://10.10.10.106:8080/cmf/api/v1/catalogs/kafka -d@/data/flink/cmf/apps/staging-catalog.json

Database JSON

{
  "apiVersion": "cmf.confluent.io/v1",
  "kind": "KafkaDatabase",
  "metadata": {
    "name": "staging-database"
  },
  "spec": {
    "kafkaCluster": {
      "connectionConfig": {
        "bootstrap.servers": "10.10.10.106:9092"
      }
    },
    "ddlEnvironments": [
      "staging-env"
    ]
  }
}

Create the database with a curl POST

curl -v -H "Content-Type: application/json" -X POST http://10.10.10.106:8080/cmf/api/v1/catalogs/kafka/staging-catalog/databases -d@/data/flink/cmf/apps/staging-database.json

8. Operations, Query, & Troubleshooting Guide (FAQ)

Test a table query

DESCRIBE account

SELECT
  JSON_VALUE(CAST(val AS STRING), '$.headers.operation') AS operation,
  JSON_VALUE(CAST(val AS STRING), '$.headers.timestamp') AS attunity_timestamp,
  JSON_VALUE(CAST(val AS STRING), '$.data.RECID["com.cdc.RecidRecord"].string') AS recid,
  JSON_VALUE(CAST(val AS STRING), '$.data.XMLRECORD["com.cdc.XmlrecordRecord"].string') AS xmlrecord
FROM account

Inspect a statement through the CMF REST API

curl -s http://10.10.10.106:8080/cmf/api/v1/environments/staging-env/statements/[INSERT-STATEMENT-ID] | python3 -m json.tool

Forward the Flink web UI of a statement

confluent flink statement web-ui-forward [INSERT-STATEMENT-ID] --environment staging-env --url http://10.10.10.106:8080 --port 8081

Check for Flink SQL statements that are stuck/pending

confluent flink statement list --environment staging-env --compute-pool staging-shared-pool --url http://10.10.10.106:8080
confluent flink statement stop [STATEMENT-ID] --environment staging-env --url http://10.10.10.106:8080

Notes:

HOW DO I CREATE A TABLE?

Tables are created automatically when the corresponding topic is created in Kafka.

WHAT IF THE SQL STATEMENT IS PENDING / TIMES OUT?

Increase the pod resources of the FlinkEnvironment.

WHAT IS THE SIZING BASED ON?

Size TaskManager memory relative to taskmanager.numberOfTaskSlots (memory divided by the number of slots gives the memory per slot). For CPU, it is better to start at 0.5 or 1.
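
The sizing note above (TaskManager memory relative to taskmanager.numberOfTaskSlots) can be worked through with the values from the FlinkApplication in this playbook: 6Gi of TaskManager memory and 4 task slots. The sketch below is a rough estimate only. The function name mem_per_slot_mib is hypothetical, and the even split ignores the JVM overhead and metaspace that Flink reserves out of the total, so the memory actually usable per slot is lower.

```shell
#!/bin/sh
# mem_per_slot_mib TM_MEMORY_MIB SLOTS
# Prints the TaskManager memory divided evenly across its task slots, in MiB
# (integer division; any remainder is dropped).
mem_per_slot_mib() {
  echo $(( $1 / $2 ))
}

# Values from this playbook: 6Gi = 6144 MiB, taskmanager.numberOfTaskSlots = 4.
echo "memory per slot: $(mem_per_slot_mib 6144 4) MiB"
```

With these inputs the script reports 1536 MiB per slot. If statements stay pending, raising the TaskManager memory or lowering the slot count both increase this figure.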

Check a TaskManager pod that is stuck/pending

kubectl describe pod [TASKMANAGER-POD-NAME] -n confluent-flink

Rollout restart and force-delete a stuck/unresponsive pod (refresh)

kubectl rollout restart deployment/[deployment-name] -n confluent-flink
kubectl delete pod [POD-NAME] -n confluent-flink --force --grace-period=0
