Create Configuration DAG #2653

@eveleighoj

Overview
To shift to assembling via Spark, we also need to tackle how the old_entity table in Postgres is updated. Currently this happens by extracting from the SQLite files, but the data can be updated directly from the config files themselves and does not need to happen at the same time.

Pull Request(PR):

Tech Approach

  • Create a new DAG for configuration.
  • Focus on just the old_entity table for now; additional files can easily be extracted in the future.
  • Apply cleaning and review expectations on the old_entity config files, so that the cleaning is covered by tests.
  • Load into parquet datasets.
  • Load into Postgres.
  • Remove the old_entity extraction from digital-land-postgres.
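
As an illustration of the cleaning and expectations step above, here is a minimal sketch using only the Python standard library. The column names (`old-entity`, `status`, `entity`), the allowed status codes (`301`, `410`), and the function names are assumptions made for the example; the issue does not specify them, and the real DAG may use different tooling.

```python
import csv
import io

# Assumption: old_entity config rows carry a redirect (301) or removal (410) status.
VALID_STATUSES = {"301", "410"}


def clean_row(row):
    """Strip whitespace from keys and values; treat missing values as empty."""
    return {
        key.strip(): (value or "").strip()
        for key, value in row.items()
        if key is not None  # ignore overflow columns from malformed rows
    }


def check_rows(rows):
    """Return a list of human-readable expectation failures."""
    problems = []
    seen = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        old = row.get("old-entity", "")
        status = row.get("status", "")
        target = row.get("entity", "")
        if not old.isdigit():
            problems.append(f"line {line_no}: old-entity {old!r} is not an integer")
        if status not in VALID_STATUSES:
            problems.append(f"line {line_no}: unexpected status {status!r}")
        elif status == "301" and not target.isdigit():
            problems.append(f"line {line_no}: 301 row needs a target entity")
        elif status == "410" and target:
            problems.append(f"line {line_no}: 410 row should not have a target entity")
        if old in seen:
            problems.append(f"line {line_no}: duplicate old-entity {old!r}")
        seen.add(old)
    return problems


def load_old_entity(text):
    """Parse CSV text, clean every row, and report expectation failures."""
    rows = [clean_row(row) for row in csv.DictReader(io.StringIO(text))]
    return rows, check_rows(rows)
```

Calling `load_old_entity` on the text of one config file returns the cleaned rows together with any failures, so a DAG task could fail fast before anything is written to parquet or Postgres.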

Acceptance Criteria/Tests

  • old_entity should be updated on the platform for all datasets/collections during a single DAG run.
  • The DAG run should take place after the config update action but before midnight, when the pipelines kick off.
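
The timing criterion above can be expressed as a small check. This is only a sketch: the function names are hypothetical, the example times are invented, and it assumes the pipelines start at the first midnight after the config update finishes.

```python
from datetime import datetime, timedelta


def next_midnight(moment):
    """Return the first midnight strictly after the given moment's calendar day starts."""
    following_day = moment + timedelta(days=1)
    return following_day.replace(hour=0, minute=0, second=0, microsecond=0)


def in_run_window(run_start, config_update_finished):
    """True if the run starts after the config update and before the next midnight."""
    deadline = next_midnight(config_update_finished)
    return config_update_finished <= run_start < deadline


# Hypothetical times: config update finishes at 21:00, DAG starts at 22:30.
update_done = datetime(2024, 1, 1, 21, 0)
print(in_run_window(datetime(2024, 1, 1, 22, 30), update_done))
```

A check like this only covers when the run starts; whether the run also finishes before midnight would need to be monitored separately.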

Projects

Status
Done - Consider for Weeknotes
