Implement Week 1: Project skeleton, event models, and ingestion handlers for PyDebeziumAI #2003

@KMohnishM

This issue tracks the Week 1 scope for the PyDebeziumAI GSoC 2026 project.

The goal is to implement the core ingestion and data representation foundation of the Python library:

  1. Ingestion handlers:
    • ConnectIngestionHandler: Reads raw Kafka Connect SourceRecord objects directly from the JVM through JPype's in-memory bridge.
    • JsonIngestionHandler: Consumes Debezium events serialized as standard JSON strings.
  2. Canonical event models:
    • Normalizes different CDC structures into validated Pydantic v2 models (DebeziumEventModel and DebeziumPayloadModel).
  3. Logical type conversion:
    • Decodes Debezium's logical types (VariableScaleDecimal, Decimal, zoned times, MicroDuration/NanoDuration, and geometries) into native Python types.
  4. Validation:
    • Initial unit tests validating Pydantic model serialization and primary key extraction.
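
To make items 1, 2, and 4 concrete, here is a minimal sketch of how a JSON event might be parsed into the two Pydantic v2 models and how a primary key might be extracted. Only the model names come from this issue. The field layout (taken from Debezium's JSON envelope), the `primary_key` method, and the `parse_json_event` and `_unwrap` helpers are assumptions for illustration, not the project's actual API.

```python
import json
from typing import Any, Dict, Literal, Optional, Tuple

from pydantic import BaseModel, ConfigDict, Field


class DebeziumPayloadModel(BaseModel):
    """Change-event envelope (assumed field layout, not the project's final schema)."""

    # Tolerate envelope fields this sketch does not model (e.g. "transaction").
    model_config = ConfigDict(extra="allow")

    op: Literal["c", "u", "d", "r", "t", "m"]  # Debezium operation codes
    before: Optional[Dict[str, Any]] = None
    after: Optional[Dict[str, Any]] = None
    source: Dict[str, Any] = Field(default_factory=dict)
    ts_ms: Optional[int] = None


class DebeziumEventModel(BaseModel):
    """Record key plus validated payload (hypothetical shape)."""

    key: Dict[str, Any] = Field(default_factory=dict)
    payload: DebeziumPayloadModel

    def primary_key(self) -> Tuple[Any, ...]:
        # Key columns in the order they appear in the record key.
        return tuple(self.key.values())


def _unwrap(document: Any) -> Any:
    # With schemas enabled, the JSON converter wraps data as
    # {"schema": ..., "payload": ...}; otherwise the data is top-level.
    if isinstance(document, dict) and "payload" in document:
        return document["payload"]
    return document


def parse_json_event(key_json: str, value_json: str) -> DebeziumEventModel:
    """Hypothetical helper: build a validated event from key and value JSON strings."""
    key = _unwrap(json.loads(key_json)) or {}
    payload = _unwrap(json.loads(value_json))
    return DebeziumEventModel(key=key, payload=payload)
```

In this sketch, an event whose `op` is not one of the listed codes raises `pydantic.ValidationError`, which is the kind of behavior the Week 1 unit tests could assert on.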

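The logical type conversion in item 3 could look roughly like the following, using only the standard library. It assumes the JSON wire forms Debezium documents: decimals as base64-encoded big-endian two's-complement unscaled integers, VariableScaleDecimal as a struct with `scale` and `value`, MicroDuration as an integer count of microseconds, and zoned timestamps as ISO-8601 strings. The function names are hypothetical.

```python
import base64
from datetime import datetime, timedelta
from decimal import Decimal
from typing import Any, Dict


def decode_decimal(encoded: str, scale: int) -> Decimal:
    """Decode a base64 unscaled integer and apply the given scale exactly."""
    raw = base64.b64decode(encoded)
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Build from (sign, digits, exponent) so no context rounding is applied.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))


def decode_variable_scale_decimal(struct: Dict[str, Any]) -> Decimal:
    """The scale travels with each value instead of living in the schema."""
    return decode_decimal(struct["value"], struct["scale"])


def decode_micro_duration(micros: int) -> timedelta:
    return timedelta(microseconds=micros)


def decode_zoned_timestamp(text: str) -> datetime:
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)
```

For example, `decode_decimal("MDk=", 2)` decodes the bytes `0x30 0x39` (unscaled value 12345) with scale 2 and returns `Decimal("123.45")`.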
Status: In progress
