Refactor jobs pipeline to hydrate data from snapshots #80

@mhd-hi

Description

Current

Data hydration works as follows:

  1. Fetch data from source inputs (PDFs, cheminements.txt, APIs)
  2. Parse the data
  3. Save it to database

Goal

Change this pipeline so hydration happens from snapshots instead of directly from source inputs:

  1. Save data as a snapshot
  2. Hydrate the backend from that snapshot only

This should make the data aggregation pipeline easier to maintain.
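
A minimal sketch of what the two-step flow could look like, assuming a SQLite store and JSON payloads. Everything here is hypothetical and for illustration only: the `snapshots` and `courses` tables, their columns, and the function names `init_db`, `save_snapshot`, and `hydrate_from_snapshot` are not taken from the codebase.

```python
import hashlib
import json
import sqlite3
from typing import Any


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical snapshot table and one example target table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               source TEXT NOT NULL,      -- e.g. which PDF, text file, or API
               payload TEXT NOT NULL,     -- parsed data serialized as JSON
               checksum TEXT NOT NULL,    -- SHA-256 of the payload
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS courses (
               code TEXT PRIMARY KEY,
               title TEXT NOT NULL
           )"""
    )


def save_snapshot(conn: sqlite3.Connection, source: str, parsed: list[dict[str, Any]]) -> int:
    """Step 1: persist already-parsed data as a snapshot and return its id."""
    payload = json.dumps(parsed, sort_keys=True)
    checksum = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    cur = conn.execute(
        "INSERT INTO snapshots (source, payload, checksum) VALUES (?, ?, ?)",
        (source, payload, checksum),
    )
    conn.commit()
    return cur.lastrowid


def hydrate_from_snapshot(conn: sqlite3.Connection, snapshot_id: int) -> int:
    """Step 2: hydrate the target table from a stored snapshot only."""
    row = conn.execute(
        "SELECT payload FROM snapshots WHERE id = ?", (snapshot_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"snapshot {snapshot_id} not found")
    courses = json.loads(row[0])
    # INSERT OR REPLACE keeps re-running the same snapshot idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO courses (code, title) VALUES (:code, :title)",
        courses,
    )
    conn.commit()
    return len(courses)
```

With this split, the hydration step never touches the source inputs, so it can be re-run or debugged against a stored snapshot without re-fetching or re-parsing anything.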

(The planificationPdfJson and horaireCoursPdfJson attributes were a good start but were never fully implemented. Remove them and create one or more tables to hold the snapshots.)
