
Best practices for recursive builds #182

@tcholewik


I'm working on a project where I have to compile a series of reports.
I intend to have one report for each day, but I need incremental intraday reports as well.
To build the intraday reports, I can either execute queries that collect data from midnight until now, or check when the last report ran and have my query pull just the additional data.

For now I can just query the whole day, but the second solution poses a puzzle, and I wonder whether it is possible to do in remake.

If I generated report_today.html and it used data_today.csv, then to query just the new data I would have a step that checks data_today.csv for the most recent record timestamp and uses that as an input for a query that would say something like SELECT * FROM SOMETABLE WHERE TIME > {{MOST_RECENT_RECORD}}. At this point I can query the database and append the results to data_today.csv.
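To make the incremental step concrete, here is a minimal Python sketch of it, outside of remake. It assumes timestamps are stored as ISO-8601 strings (so string comparison matches chronological order), and the VALUE column, the function names, and the use of SQLite are all placeholders for illustration rather than anything from my actual setup.

```python
import csv
import sqlite3
from pathlib import Path


def most_recent_timestamp(csv_path, default=""):
    """Return the largest TIME value already in the CSV, or `default` if none."""
    path = Path(csv_path)
    if not path.exists():
        return default
    with path.open(newline="") as f:
        times = [row["TIME"] for row in csv.DictReader(f)]
    return max(times, default=default)


def append_new_rows(conn, csv_path):
    """Query only rows newer than the CSV's latest record and append them.

    Returns the number of rows appended.
    """
    since = most_recent_timestamp(csv_path)
    # Parameterized form of: SELECT * FROM SOMETABLE WHERE TIME > {{MOST_RECENT_RECORD}}
    rows = conn.execute(
        "SELECT TIME, VALUE FROM SOMETABLE WHERE TIME > ? ORDER BY TIME",
        (since,),
    ).fetchall()

    path = Path(csv_path)
    # Write the header only when starting a new (or empty) file.
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["TIME", "VALUE"])
        writer.writerows(rows)
    return len(rows)
```

Calling append_new_rows(conn, "data_today.csv") repeatedly during the day would then only pull records added since the previous call.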

What worries me is that in the setup described above data_today.csv is both a dependency for the first step and a target file for the last step, so before we finish running this workflow we have already invalidated the dependency of step 1.
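The circularity can be shown with a generic make-style dependency graph. This is only a sketch using Python's standard graphlib module; the node name most_recent_record is made up for illustration, and it says nothing about how remake itself resolves targets.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a target; its value is the set of things it depends on.
deps = {
    "most_recent_record": {"data_today.csv"},   # step 1 reads the CSV
    "data_today.csv": {"most_recent_record"},   # last step writes the CSV
    "report_today.html": {"data_today.csv"},
}


def find_cycle(graph):
    """Return the nodes forming a dependency cycle, or None if the graph is acyclic."""
    try:
        # Consuming static_order() forces the sorter to validate the graph.
        list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        return err.args[1]
    return None


cycle = find_cycle(deps)
```

Here find_cycle reports a cycle that passes through data_today.csv, which is exactly the target/dependency coupling I am asking about.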

So my questions are:

  1. Is remake prepared to handle this situation?
  2. Is there a way to decouple this target/dependency relationship?
  3. What are the best practices for handling this?
