This issue tracks the Week 1 scope for the PyDebeziumAI GSoC 2026 project.
The goal is to implement the core ingestion and data representation foundation of the Python library:
- Ingestion handlers:
  - `ConnectIngestionHandler`: Reads raw Kafka Connect `SourceRecord` objects directly from the memory-bridge JVM context (using JPype).
  - `JsonIngestionHandler`: Consumes Debezium events serialized as standard JSON strings.
- Canonical event models:
  - Normalizes different CDC structures into validated Pydantic v2 models (`DebeziumEventModel` and `DebeziumPayloadModel`).
- Logical Type Conversion:
  - Decodes Debezium's custom formats (VariableScaleDecimals, decimals, zoned times, micro/nano durations, and geometries) into native Python types.
- Validation:
  - Initial unit tests validating Pydantic model serialization and primary key extraction.
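
To make the logical type conversion item more concrete, here is a minimal sketch of how a few of these formats could be decoded using only the standard library. It assumes JSON-serialized events in which byte fields arrive base64 encoded, decimals carry a big-endian two's-complement unscaled integer plus a scale, durations are integer micro/nanoseconds, and zoned timestamps are ISO-8601 strings. All function names are hypothetical and are not part of any existing PyDebeziumAI API; geometries are left out.

```python
import base64
from datetime import datetime, timedelta
from decimal import Decimal


def decode_decimal(encoded: str, scale: int) -> Decimal:
    """Decode a base64 string holding a big-endian two's-complement integer."""
    unscaled = int.from_bytes(base64.b64decode(encoded), byteorder="big", signed=True)
    # Build the Decimal from a (sign, digits, exponent) tuple so the result is
    # exact and independent of the active decimal context precision.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))


def decode_variable_scale_decimal(struct: dict) -> Decimal:
    """Decode a struct assumed to look like {"scale": int, "value": base64 str}."""
    return decode_decimal(struct["value"], struct["scale"])


def decode_micro_duration(micros: int) -> timedelta:
    """Convert a microsecond count into a timedelta."""
    return timedelta(microseconds=micros)


def decode_nano_duration(nanos: int) -> timedelta:
    """Convert a nanosecond count into a timedelta.

    timedelta has microsecond resolution, so sub-microsecond precision is
    dropped here (floor division).
    """
    return timedelta(microseconds=nanos // 1000)


def decode_zoned_timestamp(text: str) -> datetime:
    """Parse an ISO-8601 timestamp with offset into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so it is normalized to an explicit UTC offset first.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


if __name__ == "__main__":
    raw = base64.b64encode((12345).to_bytes(2, "big", signed=True)).decode()
    print(decode_decimal(raw, 2))
    print(decode_variable_scale_decimal({"scale": 3, "value": raw}))
    print(decode_micro_duration(1_500_000))
    print(decode_zoned_timestamp("2024-01-15T10:30:00Z"))
```

In the actual library these decoders would presumably be selected from the logical type name in the event schema and wired into the Pydantic models as validators; that dispatch is not shown here.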