fix(brain-repo): harden backup integrity — 4 silent data-loss fixes#98

Open: mt-alarcon wants to merge 1 commit into evolution-foundation:main from mt-alarcon:fix/brain-repo-backup-integrity

@mt-alarcon mt-alarcon commented Jun 4, 2026

Problem

The Brain Repo mirror could silently drop files from the backup while the sync-age healthcheck stayed green — the failure is invisible because an excluded/deleted file doesn't dirty git status. A disk-vs-backup sweep found ~13% of files missing across four independent root causes.
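A disk-vs-backup sweep of the kind mentioned above can be sketched as a set difference over relative file paths. This is only an illustration: the function name `missing_from_backup` and the directory layout are assumptions, not the tool the author actually ran.

```python
from pathlib import Path


def missing_from_backup(source: Path, backup: Path) -> list[str]:
    """Return relative paths that exist under source but not under backup."""
    # Hypothetical helper: compares file names only, not file contents.
    on_disk = {p.relative_to(source).as_posix() for p in source.rglob("*") if p.is_file()}
    in_backup = {p.relative_to(backup).as_posix() for p in backup.rglob("*") if p.is_file()}
    return sorted(on_disk - in_backup)
```

A check like this catches what `git status` cannot, because a file that never reached the mirror leaves the working tree clean.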

Fixes (each with regression tests — 69 tests total)

1. secrets_scanner false positives deleted legitimate files

The mirror removes any file matching a secret pattern before commit. Over-broad patterns matched non-secrets: variable names (password), recaptcha base64 blobs, image URLs, ALL_CAPS binding-name references (SERVICE_API_TOKEN_PROD), ${VAR} interpolations and doc placeholders (YOUR_PASSWORD). Added lookbehind/lookahead constraints + a central _is_false_positive filter. Regression both ways: FPs cleared, real secrets (DATABASE_URL with password, mixed-case passwords, Fernet keys, JWTs) still flagged.

2. Nested .gitignore mirrored into the brain repo hid content

A source .gitignore with a wildcard rule (e.g. a scratch state dir) was copied into the mirror, which then excluded that content from the backup. The mirror now skips .gitignore files from watched paths (the brain repo's own root .gitignore is unaffected).
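A minimal sketch of the skip, assuming the mirror copies with `shutil.copytree` (the reviewer's guide mentions a simulated `shutil.copytree` run). The names `ignore_gitignore`, `mirror` and `demo` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def ignore_gitignore(directory: str, names: list[str]) -> set[str]:
    """copytree ignore callback: skip only files literally named .gitignore."""
    return {name for name in names if name == ".gitignore"}


def mirror(src: Path, dst: Path) -> None:
    """Copy src into dst, leaving every nested .gitignore behind."""
    shutil.copytree(src, dst, ignore=ignore_gitignore, dirs_exist_ok=True)


def demo() -> list[str]:
    """Build a scratch tree with a wildcard .gitignore and list what gets mirrored."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src"), Path(tmp, "dst")
        state = src / "_state"
        state.mkdir(parents=True)
        (state / ".gitignore").write_text("*\n")   # would hide everything in the mirror
        (state / ".gitkeep").write_text("")
        (state / "data.json").write_text("{}")
        mirror(src, dst)
        return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())
```

Matching on the exact filename keeps other dotfiles such as `.gitkeep` in the backup.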

3. Dropped sync tick

When a sync job was already running, enqueue_sync returned False and the tick was discarded with no guaranteed retry — batch writes during a busy window never got mirrored. Added trailing-run coalescing: N enqueues during one job → exactly one trailing sync after it finishes (lock-protected, no infinite loop).

4. Orphaned DB lock after restart-during-sync

A process killed mid-sync leaves BrainRepoConfig.sync_in_progress=True. Since _acquire_db_lock does UPDATE WHERE sync_in_progress=0, every future enqueue fails and auto-sync stays dead for up to JOB_STALE_SECONDS (20 min) until the janitor reclaims. Added reclaim_orphaned_locks_on_startup() (no age gate — at boot no sync can legitimately be running). Sibling of the existing git_ops._clear_stale_lock for the .git/index.lock case.
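The failure mode and the startup reclaim can be illustrated with plain `sqlite3` rather than the project's ORM. The table name `brain_repo_config`, the `user_id` column, the `last_error` text and the `acquire_db_lock` helper are assumptions for this sketch; only `sync_in_progress`, `last_error` and `reclaim_orphaned_locks_on_startup` come from the PR.

```python
import sqlite3

# Assumed schema for illustration; the real model is BrainRepoConfig.
SCHEMA = """
CREATE TABLE brain_repo_config (
    user_id TEXT PRIMARY KEY,
    sync_in_progress INTEGER NOT NULL DEFAULT 0,
    last_error TEXT
)
"""


def acquire_db_lock(conn: sqlite3.Connection, user_id: str) -> bool:
    """Compare-and-set: succeeds only if no sync is marked as in progress."""
    cur = conn.execute(
        "UPDATE brain_repo_config SET sync_in_progress = 1 "
        "WHERE user_id = ? AND sync_in_progress = 0",
        (user_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def reclaim_orphaned_locks_on_startup(conn: sqlite3.Connection) -> int:
    """Clear every lock at boot, with no age gate, and return how many were cleared."""
    cur = conn.execute(
        "UPDATE brain_repo_config SET sync_in_progress = 0, last_error = ? "
        "WHERE sync_in_progress = 1",
        ("orphaned sync lock reclaimed at startup",),
    )
    conn.commit()
    return cur.rowcount
```

A row left at `sync_in_progress = 1` makes every `acquire_db_lock` call fail until the reclaim runs, which is the stuck state described above.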

Tests

Running pytest dashboard/backend/brain_repo/tests/ gives 69 passed. Covers: secrets_scanner FP/TP both directions, nested-.gitignore skip, trailing-run coalescing (exactly-one, no-loop, idle-unchanged), and the startup orphan-lock reclaim path.

🤖 Generated with Claude Code

Summary by Sourcery

Harden brain repo backups against silent data loss by improving secret scanning accuracy, avoiding nested .gitignore interference, ensuring queued syncs are not dropped, and reclaiming orphaned sync locks on startup.

Bug Fixes:

  • Prevent secret scanner from deleting legitimate files by tightening several regex patterns and filtering out known non-secret placeholders and variable references.
  • Ensure source .gitignore files are not copied into the brain repo so they cannot hide backed-up content from git.
  • Guarantee that sync requests received while a job is already running are coalesced into a single trailing run instead of being silently dropped.
  • Clear orphaned BrainRepoConfig sync_in_progress locks at process startup so auto-sync cannot remain stuck after a mid-sync restart.

Tests:

  • Add extensive unit and integration tests for the secrets scanner to cover new false-positive exclusions and true-positive detections.
  • Add tests for trailing-run coalescing to verify exactly-one trailing sync, absence of infinite loops, and unchanged idle enqueue behaviour.
  • Add tests to validate that the ignore callback suppresses copying nested .gitignore files while still mirroring legitimate state files.

The Brain Repo mirror could silently drop files from the backup while the
sync-age healthcheck stayed green. Four independent root causes, each fixed
with regression tests (69 tests total):

1. secrets_scanner false positives deleted legitimate files. The mirror
   removes any file matching a secret pattern before commit; over-broad
   patterns matched variable names (password), recaptcha base64, image URLs,
   ALL_CAPS binding-name references (SERVICE_API_TOKEN_PROD), ${VAR} refs and
   doc placeholders (YOUR_PASSWORD). Added lookbehind/lookahead constraints +
   a central _is_false_positive filter. Regression both ways: FPs cleared,
   real secrets (DATABASE_URL, mixed-case passwords, Fernet, JWT) still flagged.

2. Nested .gitignore mirrored into the brain repo hid content. A source
   .gitignore with a wildcard (e.g. a scratch _state/ dir) was copied into the
   mirror, which then excluded that content from the backup. The mirror now
   skips .gitignore files from watched paths.

3. Dropped sync tick. When a sync job was already running, enqueue_sync
   returned False and the tick was discarded with no guaranteed retry — batch
   writes during a busy window never got mirrored. Added trailing-run
   coalescing: N enqueues during one job => exactly one trailing sync after it
   finishes (lock-protected, no infinite loop).

4. Orphaned DB lock after restart-during-sync. A process killed mid-sync
   leaves BrainRepoConfig.sync_in_progress=True; since _acquire_db_lock does
   UPDATE WHERE sync_in_progress=0, every future enqueue fails and auto-sync
   stays dead for up to JOB_STALE_SECONDS (20 min) until the janitor reclaims.
   Added reclaim_orphaned_locks_on_startup() (no age gate — at boot no sync can
   legitimately be running). Sibling of git_ops._clear_stale_lock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sourcery-ai Bot commented Jun 4, 2026

Reviewer's Guide

Hardens the brain repo backup pipeline against silent data loss by coalescing dropped sync ticks into a guaranteed trailing run, reclaiming orphaned DB locks at startup, preventing nested .gitignore files from hiding content, and tightening the secrets scanner to avoid false positives while preserving true-positive detection, all backed by new regression tests.

Sequence diagram for trailing-run coalescing in enqueue_sync and run_sync_pipeline

sequenceDiagram
    actor Caller
    participant enqueue_sync
    participant _acquire_db_lock
    participant run_sync_pipeline
    participant _release_db_lock

    Caller->>enqueue_sync: enqueue_sync(flask_app, user_id, workspace, kind)
    enqueue_sync->>_acquire_db_lock: _acquire_db_lock(flask_app, user_id, kind)
    alt lock acquired
        _acquire_db_lock-->>enqueue_sync: True
        enqueue_sync->>run_sync_pipeline: start thread run_sync_pipeline(...)
        enqueue_sync-->>Caller: True
    else job already running
        _acquire_db_lock-->>enqueue_sync: False
        enqueue_sync->>enqueue_sync: set _rerun_requested[user_id] = True
        enqueue_sync-->>Caller: False
    end

    Note over run_sync_pipeline: Existing job finishes
    run_sync_pipeline->>_release_db_lock: _release_db_lock(flask_app, user_id, success, error)
    _release_db_lock-->>run_sync_pipeline: lock cleared
    run_sync_pipeline->>run_sync_pipeline: rerun = _rerun_requested.pop(user_id, False)
    alt rerun is True
        run_sync_pipeline->>run_sync_pipeline: start trailing run thread
    else no rerun
        run_sync_pipeline->>run_sync_pipeline: exit
    end

File-Level Changes

Change: Guarantee trailing sync runs when enqueue_sync is called during an in-progress job, without allowing infinite loops or changing idle behavior.
Details:
  • Introduce a module-level _rerun_requested map keyed by user_id, protected by _job_lock, to record trailing-run requests when enqueue_sync finds an existing sync in progress.
  • Update enqueue_sync to set _rerun_requested[user_id] and return False when _acquire_db_lock fails, logging that a trailing run was requested instead of silently dropping the tick.
  • Extend run_sync_pipeline to, after releasing the DB lock and still under _job_lock, pop the rerun flag and, if set, spawn exactly one trailing watcher-thread sync for that user with a fixed commit message.
  • Add unit tests covering that the rerun flag is present, set only when busy, results in exactly one trailing run, does not loop, and does not affect idle enqueue behavior.
Files:
  • dashboard/backend/brain_repo/job_runner.py
  • dashboard/backend/brain_repo/tests/test_job_runner_trailing_run.py

Change: Prevent nested .gitignore files from source workspaces from being mirrored into the brain repo, ensuring they cannot hide backup content.
Details:
  • Modify the _ignore/build_ignore_callback implementation to always add '.gitignore' to the ignore list for any directory being copied from the workspace into the brain repo.
  • Ensure that only the literal '.gitignore' filename is filtered (not other .git* files) so versioning metadata like .gitkeep is preserved.
  • Add tests that exercise the ignore callback directly and via a simulated shutil.copytree run, verifying that nested .gitignore files are not copied while normal files under directories like workspace/marketing/_state are copied.
Files:
  • dashboard/backend/brain_repo/job_runner.py
  • dashboard/backend/brain_repo/tests/test_ignore_gitignore_files.py

Change: Automatically clear orphaned BrainRepoConfig sync_in_progress locks at process startup so auto-sync cannot be stuck for up to JOB_STALE_SECONDS after a restart during sync.
Details:
  • Add reclaim_orphaned_locks_on_startup(flask_app) to scan BrainRepoConfig rows with sync_in_progress=True, reset lock-related fields, set an explanatory last_error, and commit the changes, returning the number cleared.
  • Call reclaim_orphaned_locks_on_startup(app) during app startup, before starting the brain watcher, wrapped in a best-effort try/except so janitor-based stale lock cleanup still works if this fails.
  • Log a warning when any orphaned sync locks are cleared so operators can correlate with restarts during sync.
Files:
  • dashboard/backend/brain_repo/job_runner.py
  • dashboard/backend/app.py

Change: Tighten secrets_scanner patterns and add centralized false-positive filtering to avoid deleting legitimate files while maintaining coverage of real secrets.
Details:
  • Refine several regex patterns (AWS_SECRET_KEY, GENERIC_SECRET, JWT_TOKEN, FERNET_KEY, GENERIC_PASSWORD) with lookbehind/lookahead and entropy-based constraints to avoid matching substrings in base64 blobs, filenames, or ALL_CAPS binding names while still matching real secrets.
  • Introduce _VAR_REF_RE and _PLACEHOLDER_RE along with an _is_false_positive helper that identifies ${VAR} interpolations and documentation-style placeholders as non-secrets.
  • Update scan_files to call _is_false_positive on each regex match and skip reporting when it returns True, centralizing false-positive handling instead of encoding it into each pattern.
  • Add an extensive test suite that validates false negatives and false positives for each pattern, plus integration-style tests that scan fixture directories for both known FPs (which must yield zero findings) and known TPs (which must still be detected).
Files:
  • dashboard/backend/brain_repo/secrets_scanner.py
  • dashboard/backend/brain_repo/tests/test_secrets_scanner.py

Change: Update dependency lockfile to reflect new or adjusted Python tooling used for tests or runtime.
Details:
  • Modify uv.lock to capture updated dependency graph required by the new tests and code paths.
Files:
  • uv.lock

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 3 issues, and left some high level feedback:

  • In app.py, the try/except Exception: pass around reclaim_orphaned_locks_on_startup(app) silently swallows startup errors; consider at least logging the exception so failures in the lock-reclaim path are visible in logs.
  • In reclaim_orphaned_locks_on_startup, you currently load all BrainRepoConfig rows with sync_in_progress=True and loop to update them; if this table can grow, consider a single bulk update (e.g. update with synchronize_session=False) to avoid holding all rows in memory.
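The first point can be sketched as a small wrapper that keeps startup best-effort while making failures visible. The wrapper name `reclaim_best_effort`, the logger name and the message text are hypothetical; the PR's actual call site in app.py is not reproduced here.

```python
import logging

logger = logging.getLogger("brain_repo.startup")  # hypothetical logger name


def reclaim_best_effort(reclaim, app) -> int:
    """Run the startup reclaim without blocking boot, but log any failure."""
    try:
        return reclaim(app)
    except Exception:
        # logger.exception records the traceback, unlike a bare `pass`.
        logger.exception("orphaned-lock reclaim failed at startup; janitor remains the fallback")
        return 0
```

Boot still proceeds when the reclaim raises, and the traceback ends up in the logs where an operator can find it.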
## Individual Comments

### Comment 1
<location path="dashboard/backend/brain_repo/secrets_scanner.py" line_range="134-135" />
<code_context>
             for name, regex in compiled:
                 m = regex.search(line)
                 if m:
+                    if _is_false_positive(m.group(0)):
</code_context>
<issue_to_address>
**🚨 issue (security):** Single `search` per line can drop real secrets when the first match is a false positive.

Because we only do a single `regex.search(line)` per pattern, if that first match is treated as a false positive we skip the rest of the line and may miss a real secret later in the same line.

Consider iterating all matches with `for m in regex.finditer(line):` and running `_is_false_positive` inside that loop, adding findings for each non–false-positive match. This ensures multiple secrets on one line (or secrets after placeholders) are all detected.
</issue_to_address>
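The difference between the two approaches can be shown on a toy pattern. Everything here is illustrative: `TOKEN_RE`, the simplified `_is_false_positive` and both scan functions are assumptions, not the scanner's real code.

```python
import re

TOKEN_RE = re.compile(r"token=(\S+)")  # hypothetical pattern for the demo


def _is_false_positive(matched: str) -> bool:
    """Simplified stand-in: treat ${VAR} interpolations as non-secrets."""
    return "${" in matched


def scan_with_search(line: str) -> list[str]:
    """Only inspects the first match, so a leading false positive hides the rest."""
    m = TOKEN_RE.search(line)
    if m and not _is_false_positive(m.group(0)):
        return [m.group(1)]
    return []


def scan_with_finditer(line: str) -> list[str]:
    """Inspects every match and filters each one independently."""
    return [
        m.group(1)
        for m in TOKEN_RE.finditer(line)
        if not _is_false_positive(m.group(0))
    ]


LINE = "token=${API_TOKEN} token=abc123realvalue"
```

On `LINE`, `scan_with_search` returns nothing because the first match is filtered out, while `scan_with_finditer` still reports the second token.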

### Comment 2
<location path="dashboard/backend/brain_repo/tests/test_job_runner_trailing_run.py" line_range="16" />
<code_context>
+from unittest.mock import MagicMock, call, patch
+
+
+class TestTrailingRunCoalescing(unittest.TestCase):
+    """Tests the rerun_requested flag and the automatic trailing run."""
+
</code_context>
<issue_to_address>
**issue (testing):** No tests cover the new `reclaim_orphaned_locks_on_startup` behavior described in the PR

The PR calls out `reclaim_orphaned_locks_on_startup()` as a key fix, but its behavior isn’t tested. To validate the regression and prevent it from returning, please add a test that:

- Inserts a `BrainRepoConfig` row with `sync_in_progress=True` (and fields like `sync_started_at`, `sync_job_kind`, `cancel_requested` set).
- Calls `reclaim_orphaned_locks_on_startup(flask_app)` and asserts:
  - The returned count matches the number of locked rows.
  - The row is updated to `sync_in_progress=False`, `sync_started_at=None`, `sync_job_kind=None`, `cancel_requested=False`.
  - `last_error` is set to the expected marker string.
- Optionally, also assert that with no locked rows, it’s a no‑op that returns 0.
</issue_to_address>

### Comment 3
<location path="dashboard/backend/brain_repo/tests/test_job_runner_trailing_run.py" line_range="31-40" />
<code_context>
+    # Helpers
+    # ─────────────────────────────────────────────────────────
+
+    def _make_pipeline_that_blocks(self, event_start, event_unblock):
+        """Returns a run_sync_pipeline patch that blocks until event_unblock."""
+        def fake_pipeline(flask_app, user_id, workspace, *, kind, tag_name=None, commit_message=None):
+            # Signal that it started and wait for the test to release it.
+            event_start.set()
+            event_unblock.wait(timeout=5)
+            # Release the DB lock as the real pipeline would.
+            self.jr._release_db_lock(flask_app, user_id, success=True, error=None)
+
+        return fake_pipeline
+
+    def _dummy_flask_app(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Helper `_make_pipeline_that_blocks` is currently unused and can be removed or used to strengthen integration-style coverage

Right now it’s dead code in this test class, which can be confusing and suggest a missing test. I’d suggest either removing it, or using it to add an integration-style test that:
- runs `run_sync_pipeline` in a background thread with this blocking helper,
- calls `enqueue_sync` while it’s running, and
- asserts that exactly one trailing run is created when the first job finishes.
Clarifying this will keep the tests clearer and easier to maintain.

Suggested implementation:

```python
    def _dummy_flask_app(self):
        """Minimal Flask app that satisfies _acquire_db_lock / _release_db_lock."""
        app = MagicMock()
        # Simulates sync_in_progress starting as False, so the first acquire returns True.
        locked = {"value": False}

        def fake_app_context():
            class Ctx:
                def __enter__(self): return self
                def __exit__(self, *a): pass
            return Ctx()

        app.app_context.side_effect = fake_app_context

        def fake_query_first():
            class Row:
                sync_in_progress = locked["value"]
            return Row()

        def fake_update_sync(in_progress):
            locked["value"] = in_progress

        app.db.session.execute.side_effect = lambda *a, **k: None
        app.db.session.commit.side_effect = lambda: None
        app.db.session.query.return_value.filter_by.return_value.first.side_effect = fake_query_first
        app.db.session.query.return_value.filter_by.return_value.update.side_effect = (
            lambda values: fake_update_sync(values["sync_in_progress"])
        )

        return app

    def test_enqueue_sync_cria_apenas_um_trailing_run_com_pipeline_bloqueante(self):
        """
        Uses _make_pipeline_that_blocks to simulate a long-running run_sync_pipeline.

        While the first run is in progress, we call enqueue_sync again and
        verify that exactly one extra trailing run fires after the first one finishes.
        """
        app = self._dummy_flask_app()
        user_id = "user-id"
        workspace = "workspace"

        event_start = threading.Event()
        event_unblock = threading.Event()
        fake_pipeline = self._make_pipeline_that_blocks(event_start, event_unblock)

        # Patch run_sync_pipeline to use the blocking pipeline.
        with patch.object(self.jr, "run_sync_pipeline", side_effect=fake_pipeline) as mocked_pipeline:
            # Start the first run in the background.
            def _run_first():
                self.jr.enqueue_sync(app, user_id, workspace, kind="manual")

            t = threading.Thread(target=_run_first)
            t.start()

            # Wait for the first run to start.
            assert event_start.wait(timeout=2), "First pipeline run did not start in time"

            # While the first run is in progress, schedule a new sync.
            self.jr.enqueue_sync(app, user_id, workspace, kind="manual")

            # Unblock the blocking run and wait for it to finish.
            event_unblock.set()
            t.join(timeout=5)

            # Leave a small margin for any trailing run to fire.
            time.sleep(0.1)

            # There must be exactly two pipeline runs:
            # - the original run
            # - exactly one trailing run.
            assert mocked_pipeline.call_count == 2

```

For the new test to work, two more small updates to the file are needed:

1. Make sure the imports include the dependencies the test uses:
   - Add `import threading` and `import time`.
   - Make sure `patch` is imported: `from unittest.mock import MagicMock, patch` (or add `patch` to the existing line where `MagicMock` is already imported).

2. If the current implementation of `_dummy_flask_app` is already complete elsewhere in the file, remove any duplication of the internal logic rewritten here and keep a single consistent version of the helper (the logic that simulates `sync_in_progress` and the app context should remain equivalent to what the other tests already use).
</issue_to_address>
