Scaling concern: backdated effective balance recalculation in deep account set trees #693

@nicolasburtey

Problem

When posting a transaction with a past effective date, the effective balance recalculation happens synchronously within the same Postgres transaction as the entry creation. The work scales as:

rows deleted + reinserted ≈ (1 + tree_depth) × distinct_active_dates_after_backdated_date

For a moderately deep account set tree with significant history, this can become a heavy transaction:

| Scenario | Tree depth | Days of history | Rows touched |
|---|---|---|---|
| Modest | 3 | 30 | ~120 |
| Moderate | 5 | 90 | ~540 |
| Concerning | 10 | 365 (1000+ entries) | ~4,015 |
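
The estimate can be checked with a few lines of Rust. This is a direct transcription of the formula above and nothing more; `rows_touched` is a hypothetical helper, not a function in the codebase, and it assumes every day in the range has at least one entry (the worst case).

```rust
/// Upper bound on the rows deleted and reinserted by one backdated posting.
/// `tree_depth` counts the ancestor account sets; the `1 +` is the account itself.
/// `distinct_active_dates` is the number of distinct effective dates after the
/// backdated date that already have a balance row.
fn rows_touched(tree_depth: u64, distinct_active_dates: u64) -> u64 {
    (1 + tree_depth) * distinct_active_dates
}
```

For depth 10 and 365 active dates this evaluates to 11 × 365 = 4,015 rows.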

How it works today

In post_transaction → update_balances_in_op → update_cumulative_balances_in_op:

  1. fetch_mappings_in_op returns all ancestors (transitive memberships are denormalized in cala_account_set_member_accounts)
  2. find_for_update issues a single SQL query that deletes all cala_cumulative_effective_balances rows with effective > backdated_date for every (ancestor, currency) pair, and returns the deleted snapshots
  3. re_calculate_snapshots recalculates in-memory by applying a diff to each deleted snapshot
  4. insert_new_snapshots bulk-inserts all recalculated rows
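
To make the cost visible, the four steps can be modelled in memory. The following is a rough sketch, not the actual implementation: the `Table` type, the integer ids, and the day-number dates are all assumptions made for illustration, and the row for the backdated date itself is not modelled.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included};

/// Hypothetical key: (account or account set id, currency, effective date as a day number).
type Key = (u32, &'static str, u32);

/// Toy stand-in for cala_cumulative_effective_balances: key -> cumulative balance.
type Table = BTreeMap<Key, i64>;

/// Applies a backdated `diff` to every row dated after `backdated` for each id in `ids`
/// (the account plus its ancestor sets, i.e. the result of step 1).
/// Returns the number of rows deleted and reinserted.
fn recalc_backdated(
    table: &mut Table,
    ids: &[u32],
    currency: &'static str,
    backdated: u32,
    diff: i64,
) -> usize {
    // Step 2: delete rows with effective > backdated and keep the deleted snapshots.
    let mut deleted: Vec<(Key, i64)> = Vec::new();
    for &id in ids {
        let keys: Vec<Key> = table
            .range((
                Excluded((id, currency, backdated)),
                Included((id, currency, u32::MAX)),
            ))
            .map(|(k, _)| *k)
            .collect();
        for k in keys {
            if let Some(v) = table.remove(&k) {
                deleted.push((k, v));
            }
        }
    }
    // Step 3: recalculate in memory by applying the diff to each deleted snapshot.
    let recalculated: Vec<(Key, i64)> =
        deleted.into_iter().map(|(k, v)| (k, v + diff)).collect();
    // Step 4: bulk-insert the recalculated rows.
    let touched = recalculated.len();
    table.extend(recalculated);
    touched
}
```

The returned count grows with both the number of ids and the number of later dates, which is the product described in the Problem section.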

All of this runs inside the same DB transaction that creates the transaction/entries and updates current balances. This means:

  • Lock duration grows with tree depth × date range
  • Transaction size grows similarly
  • Concurrent postings to accounts sharing ancestor sets will contend on advisory locks
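
The contention point can be illustrated with a small sketch. It assumes, purely for illustration, that one advisory lock is taken per ancestor account set; the actual lock keys may differ. `contended_sets` is a hypothetical helper that lists the sets two concurrent postings would both need.

```rust
use std::collections::BTreeSet;

/// Returns the ancestor sets that two postings would both have to lock,
/// given the ancestor ids of each posting's account.
fn contended_sets(ancestors_a: &[u32], ancestors_b: &[u32]) -> Vec<u32> {
    let a: BTreeSet<u32> = ancestors_a.iter().copied().collect();
    let b: BTreeSet<u32> = ancestors_b.iter().copied().collect();
    // A non-empty intersection means one posting waits for the other.
    a.intersection(&b).copied().collect()
}
```

In a deep tree most accounts share the sets near the root, so the intersection is rarely empty and backdated postings tend to serialize behind each other.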

Concrete concern

An account 10 levels deep in the account set tree, with ~1000 entries spread over a year, receiving a backdated posting 12 months in the past, would need to delete and reinsert up to ~4,015 rows (11 × 365) across itself and all ancestors, all within a single transaction while holding advisory locks.

Suggestion

Consider making the effective balance recalculation asynchronous for scalability:

  • The current balance update (which is just an append/version bump) can remain synchronous
  • The effective balance recalculation could be offloaded to an async job, allowing the posting transaction to complete quickly
  • The effective balances would become eventually consistent. The eventually_consistent flag on accounts already exists in the schema, but today it is used to skip effective balance updates entirely; this behavior could be extended.
  • The job could batch recalculations for the same account set tree to reduce redundant work

This would decouple the posting latency from the tree depth and historical date range.
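
One possible shape for the batching step, as a sketch only: the posting transaction records a small work item per affected set, and the job coalesces the queue before doing any recalculation. `PendingRecalc` and `coalesce` are hypothetical names, dates are simplified to day numbers, and how the job then rebuilds the rows is left open.

```rust
use std::collections::BTreeMap;

/// Hypothetical work item recorded by the posting transaction.
#[derive(Debug, Clone, PartialEq)]
struct PendingRecalc {
    account_set: u32,
    currency: &'static str,
    /// Backdated effective date of the posting, as a day number.
    from_date: u32,
}

/// Coalesces queued items so that each (account_set, currency) pair is
/// recalculated once, starting from the earliest backdated date seen.
fn coalesce(queue: &[PendingRecalc]) -> Vec<PendingRecalc> {
    let mut earliest: BTreeMap<(u32, &'static str), u32> = BTreeMap::new();
    for item in queue {
        earliest
            .entry((item.account_set, item.currency))
            .and_modify(|d| *d = (*d).min(item.from_date))
            .or_insert(item.from_date);
    }
    earliest
        .into_iter()
        .map(|((account_set, currency), from_date)| PendingRecalc {
            account_set,
            currency,
            from_date,
        })
        .collect()
}
```

With this shape, several backdated postings that hit the same ancestor set between job runs cost one pass over that set's rows instead of one pass each.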
