feat: Add MinIO S3-compatible object storage #1
Merged
Implement MinIO as the data lake storage layer for KLDP with:

**Infrastructure:**
- Helm values configuration (core/storage/minio-values.yaml)
- Standalone mode deployment optimized for local development
- NodePort service for easy local access
- 10Gi persistent volume for data storage
- Resource limits: 256Mi/512Mi memory, 100m/500m CPU
- Default buckets: datalake, raw, processed, curated

**Installation:**
- Installation script (scripts/install-minio.sh)
- Uses Bitnami MinIO Helm chart v17.0.21 via OCI registry
- Default credentials: minioadmin/minioadmin
- Automatic bucket provisioning on startup

**Developer Experience:**
- Makefile targets: install-minio, minio-console, minio-api, logs-minio
- Enhanced status command to include MinIO pods and release
- Port forwarding helpers for console (9001) and API (9000)

**Documentation:**
- Comprehensive setup guide (docs/MINIO_SETUP.md)
- Data lake architecture patterns (bronze/silver/gold)
- Integration examples with Airflow, Spark, Pandas
- Troubleshooting and monitoring guide
- Updated CLAUDE.md with MinIO installation instructions

**Examples:**
- Example DAG (examples/dags/example_minio_s3.py)
- Demonstrates S3 operations: connect, upload, list, download
- Uses KubernetesPodOperator with boto3
- Shows proper configuration for in-cluster connectivity

**Integration:**
- S3-compatible API accessible to all components
- Uses Kubernetes DNS: minio.storage.svc.cluster.local:9000
- Ready for Spark, Airflow, and other data processing tools

MinIO provides the foundational storage layer for:
- Raw data ingestion and landing zones
- Processed data transformation pipelines
- Curated analytics-ready datasets
- Multi-stage data lake architecture

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
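For reviewers who want to see the in-cluster connectivity in isolation, here is a minimal sketch of the connect, upload, list, download flow using boto3. It is not the code from examples/dags/example_minio_s3.py. The endpoint, default credentials, and the `raw` bucket come from the description above; the names `MinioSettings`, `client_kwargs`, `make_client`, and `roundtrip`, the `us-east-1` region placeholder, and the path-style addressing setting are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinioSettings:
    # Endpoint and default credentials as listed in this PR description.
    endpoint_url: str = "http://minio.storage.svc.cluster.local:9000"
    access_key: str = "minioadmin"
    secret_key: str = "minioadmin"
    # Assumption: a placeholder region so that request signing has a value.
    region_name: str = "us-east-1"


def client_kwargs(settings: MinioSettings) -> dict:
    """Build the keyword arguments passed to boto3.client()."""
    return {
        "service_name": "s3",
        "endpoint_url": settings.endpoint_url,
        "aws_access_key_id": settings.access_key,
        "aws_secret_access_key": settings.secret_key,
        "region_name": settings.region_name,
    }


def make_client(settings: MinioSettings):
    """Create an S3 client pointed at MinIO (boto3 is imported lazily)."""
    import boto3
    from botocore.config import Config

    # Assumption: path-style URLs, so the bucket name is not used as a hostname.
    config = Config(s3={"addressing_style": "path"})
    return boto3.client(config=config, **client_kwargs(settings))


def roundtrip(client, bucket: str, key: str, payload: bytes) -> bytes:
    """Upload an object, confirm it is listed, then download it again."""
    client.put_object(Bucket=bucket, Key=key, Body=payload)
    listing = client.list_objects_v2(Bucket=bucket, Prefix=key)
    # "Contents" is absent from the response when nothing matches the prefix.
    keys = [obj["Key"] for obj in listing.get("Contents", [])]
    if key not in keys:
        raise RuntimeError(f"{key} not found in bucket {bucket}")
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()
```

From inside the cluster, `roundtrip(make_client(MinioSettings()), "raw", "demo/hello.txt", b"hello")` would exercise all four operations against the `raw` bucket. From a workstation, the same code should work with `endpoint_url` pointed at the forwarded API port (9000).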
- Replace deprecated schedule_interval with schedule parameter
- Fixes CI validation error: "unexpected keyword argument 'schedule_interval'"
- Compatible with Airflow 3.1.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>