Skip to content

feat: add provider-neutral storage backends#316

Open
sweetylearner-max wants to merge 21 commits into
Abhash-Chakraborty:mainfrom
sweetylearner-max:feat/provider-neutral-storage-abstraction
Open

feat: add provider-neutral storage backends#316
sweetylearner-max wants to merge 21 commits into
Abhash-Chakraborty:mainfrom
sweetylearner-max:feat/provider-neutral-storage-abstraction

Conversation

@sweetylearner-max

@sweetylearner-max sweetylearner-max commented Jun 18, 2026

Copy link
Copy Markdown

Description

Implements #254 by introducing a provider-neutral backend storage layer while keeping the current MinIO behavior as the default.

Changes

  • Added a StorageBackend interface for storage implementations.
  • Split the existing MinIO logic into MinIOStorageBackend.
  • Added LocalStorageBackend for filesystem-backed desktop/local storage with path traversal protection.
  • Added a storage factory and startup initialization for the configured backend.
  • Kept the existing synchronous find_api.core.storage helper surface stable for routes and workers.
  • Moved thumbnail generation/upload helpers into a provider-neutral module.
  • Added neutral STORAGE_* config names while preserving existing MINIO_* compatibility aliases.
  • Mounted local storage under /files when STORAGE_BACKEND=local.
  • Removed out-of-scope Tauri sidecar changes from this PR.

Acceptance Criteria Covered

  • MinIO remains the default backend.
  • Existing upload, thumbnail, gallery, search, worker, and delete paths can keep using the same storage helpers.
  • MINIO_* settings remain supported.
  • New STORAGE_* settings are documented in .env.example and take precedence when provided.
  • Local storage validates paths and rejects traversal attempts.
  • Backend tests cover factory selection, local storage behavior, neutral alias precedence, sync facade compatibility, thumbnails, upload, gallery, and search paths.

Testing

  • uv run ruff check src/find_api/core/config.py src/find_api/core/storage.py src/find_api/core/storage_abstract.py src/find_api/core/storage_factory.py src/find_api/core/storage_local.py src/find_api/core/storage_minio.py src/find_api/core/storage_thumbnails.py src/find_api/main.py tests/test_storage_factory.py tests/test_storage_local.py tests/test_storage_facade.py tests/test_thumbnails.py tests/test_upload.py
  • uv run ruff format --check src/find_api/core/config.py src/find_api/core/storage.py src/find_api/core/storage_abstract.py src/find_api/core/storage_factory.py src/find_api/core/storage_local.py src/find_api/core/storage_minio.py src/find_api/core/storage_thumbnails.py src/find_api/main.py tests/test_storage_factory.py tests/test_storage_local.py tests/test_storage_facade.py tests/test_thumbnails.py tests/test_upload.py
  • uv run pytest tests/test_storage_factory.py tests/test_storage_local.py tests/test_storage_facade.py tests/test_thumbnails.py tests/test_upload.py tests/test_gallery.py tests/test_search.py -q
  • After final rebase: uv run pytest tests/test_storage_factory.py tests/test_storage_local.py tests/test_storage_facade.py tests/test_thumbnails.py tests/test_upload.py -q

Closes #254

- Add tauri-plugin-shell, tokio, log dependencies to Cargo.toml
- Implement sidecar process supervision in lib.rs with auto-restart
- Retry crashed sidecar up to 5 times with 2s delay
- Emit backend-ready and backend-failed events to frontend
- Register find-backend externalBin in tauri.conf.json
- Add shell:allow-execute and shell:allow-spawn capabilities

Closes Abhash-Chakraborty#258
…rty#254)

- Add StorageBackend abstract base class with unified interface
- Implement MinIOStorageBackend refactoring existing MinIO code
- Implement LocalStorageBackend for desktop/offline mode with path traversal protection
- Add storage factory for backend selection based on configuration
- Update config.py with STORAGE_BACKEND and LOCAL_STORAGE_PATH settings
- Refactor storage.py to use factory pattern for backend selection
- Preserve existing MinIO functionality while enabling local filesystem storage

Files created:
- storage_abstract.py: Abstract base class interface
- storage_minio.py: MinIO backend implementation
- storage_local.py: Local filesystem backend implementation
- storage_factory.py: Factory for backend instantiation
- storage.py: Refactored to use factory

Acceptance criteria:
✓ Provider-neutral abstraction layer
✓ MinIO backend fully functional
✓ Local filesystem backend with safe path handling
✓ Configurable backend selection
✓ Thumbnail generation support for both backends
@github-actions github-actions Bot added the needs linked issue Pull request needs to link a valid issue before review. label Jun 18, 2026
@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR refactors backend storage from synchronous MinIO-specific code into an async, provider-neutral architecture using an abstract interface, factory, MinIO and local filesystem backends. It also adds a Tauri sidecar supervisor in Rust that spawns, health-checks, retries, and manages the FastAPI backend process on desktop.

Changes

Backend Storage Abstraction

Layer / File(s) Summary
StorageBackend abstract interface and configuration
backend/src/find_api/core/storage_abstract.py, backend/src/find_api/core/config.py, .env.example
Defines StorageBackend ABC with seven async abstract methods and StorageException; adds STORAGE_BACKEND (minio/local, defaults to minio) and LOCAL_STORAGE_PATH (defaults to ./storage/uploads) to Settings Pydantic model; documents new configuration in .env.example.
Storage factory and singleton lifecycle
backend/src/find_api/core/storage_factory.py
Implements create_storage_backend() for config-driven backend selection with defaults, get_storage() for async initialization, initialize_storage() for startup singleton population with logging and error propagation, and get_storage_instance() for request-time retrieval.
MinIOStorageBackend implementation and thumbnail utilities
backend/src/find_api/core/storage_minio.py
Encapsulates MinIO/S3 client initialization with optional public-endpoint support; implements bucket creation, public-read policy management, and all async CRUD operations; adds PIL-based thumbnail generation with EXIF normalization and WEBP encoding.
LocalStorageBackend filesystem implementation
backend/src/find_api/core/storage_local.py
Implements async local filesystem backend with configurable base path, _validate_path() for path-traversal protection, and CRUD operations using asyncio.to_thread with 1MB streaming for large downloads.
Unified storage.py async wrapper facade
backend/src/find_api/core/storage.py
Converts all sync MinIO-specific functions to async wrappers delegating through get_storage_instance(); adds file_exists() helper and upload_file_with_thumbnail() combining upload with optional thumbnail generation.
Storage factory and local backend tests
backend/tests/test_storage_factory.py, backend/tests/test_storage_local.py
Comprehensive test suites covering backend selection, async initialization lifecycle, singleton caching, CRUD operations, nested directories, error handling, and path-traversal protection (dot-dot rejection, normalization, boundary checks).

Tauri Backend Sidecar Supervisor

Layer / File(s) Summary
Tauri configuration, permissions, and Rust dependencies
frontend/src-tauri/tauri.conf.json, frontend/src-tauri/capabilities/default.json, frontend/src-tauri/Cargo.toml
Registers binaries/find-backend as external binary, reformats default permissions to multi-line, and adds tauri-plugin-shell, tokio (full), log, and reqwest (rustls-tls) dependencies.
Sidecar supervision in lib.rs
frontend/src-tauri/src/lib.rs
Defines BackendState/SharedState for process tracking; rewrites run() to spawn async supervise_backend task; implements retry loop with wait_for_health() polling and full sidecar lifecycle including event-driven output handling and app-exit cleanup.

Sequence Diagram(s)

sequenceDiagram
  participant App as Tauri run()
  participant Supervisor as supervise_backend
  participant Sidecar as start_sidecar
  participant Health as wait_for_health
  participant Frontend as Frontend UI

  App->>Supervisor: tokio::spawn(supervise_backend)
  loop up to MAX_RETRIES
    Supervisor->>Sidecar: start_sidecar(app, state)
    Sidecar->>Sidecar: spawn find-backend sidecar
    Sidecar->>Health: poll http://127.0.0.1:8000/health
    alt health passes
      Health-->>Sidecar: Ok(())
      Sidecar->>Frontend: emit backend-ready
      Sidecar->>Sidecar: listen Stdout/Stderr/Terminated events
      Sidecar-->>Supervisor: Ok(()) or Err(exit_code)
    else health timeout
      Health-->>Sidecar: Err(msg)
      Sidecar->>Sidecar: kill child process
      Sidecar-->>Supervisor: Err(msg)
    end
    alt Err and retries remain
      Supervisor->>Supervisor: sleep RETRY_DELAY_SECS
    end
  end
  Supervisor->>Frontend: emit backend-failed (retries exhausted)
Loading
sequenceDiagram
  participant Caller as API Endpoint
  participant Facade as storage.py facade
  participant Factory as storage_factory
  participant Backend as StorageBackend impl

  Caller->>Facade: upload_file_with_thumbnail(file_data, object_name, file_hash)
  Facade->>Factory: get_storage_instance()
  Factory-->>Facade: active StorageBackend
  Facade->>Backend: upload_file(file_data, object_name, content_type)
  Backend-->>Facade: object_name (str)
  Facade->>Backend: upload_thumbnail(backend, file_data, file_hash)
  Backend-->>Facade: thumbnail_meta or None
  Facade-->>Caller: (upload_result, thumbnail_meta)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Abhash-Chakraborty/Find#221: Refactors storage layer with async backends and moves thumbnail helpers (generate_thumbnail, upload_thumbnail) from core/storage.py into storage_minio, impacting thumbnail-generation workflows that depend on these utilities.

Suggested labels

backend, architecture, local-first, type:feature, desktop-app

Suggested reviewers

  • Abhash-Chakraborty

Poem

🐇 Hippity-hop, the storage is free,
No more tied to MinIO, now it can be
Local or cloud, a factory decides—
While Tauri keeps watch as the backend resides.
Health checks and retries, the sidecar stays true,
This bunny approves of this refactor through! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 68.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add provider-neutral storage backends' accurately and specifically summarizes the main change - introduction of a provider-neutral storage abstraction layer supporting multiple backends.
Description check ✅ Passed The PR description is comprehensive and well-structured. It includes all required sections: clear description of changes, detailed changelog, acceptance criteria met, CodeRabbit fixes applied, testing approach, and explicit issue closure reference.
Linked Issues check ✅ Passed The PR fully addresses all technical requirements from Issue #254: provider-neutral abstraction implemented with StorageBackend ABC; MinIO and Local backends provided; factory pattern enables configurable backend selection; STORAGE_BACKEND and LOCAL_STORAGE_PATH added to config; public storage API remains stable (init_storage, upload_file, get_file, download_file_to_path, get_file_url, delete_file); path traversal protection included; backward compatibility preserved with MinIO as default; MINIO_* env vars remain supported; new neutral STORAGE_* vars documented in .env.example; comprehensive tests added for storage factory and local backend.
Out of Scope Changes check ✅ Passed All code changes align with Issue #254 scope. Core changes implement the abstraction layer (storage_abstract.py, storage_factory.py, storage_minio.py, storage_local.py), update configuration (config.py, .env.example), refactor storage.py, and add comprehensive tests. Frontend changes (Cargo.toml, lib.rs, tauri.conf.json, capabilities/default.json) support sidecar supervision for backend management, which is within the stated scope of maintaining the application architecture.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

PR Context Summary

Suggested issue links

  • No strong issue match found yet.

Use Fixes #123 or Closes #123 in the PR body when one of the suggestions is the intended issue.
Manual rerun: Actions > PR Context Triage > Run workflow > set pr_number and force_review=true.

@macroscopeapp

macroscopeapp Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Approvability

Verdict: Needs human review

Unable to check for correctness in 65d550f. This PR introduces a new storage backend capability (local filesystem) with new abstractions, factory pattern, and configuration-driven backend selection. The scope includes 5+ new files with significant new logic, representing a substantial new feature that warrants human review.

You can customize Macroscope's approvability policy. Learn more.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Nitpick comments (2)
backend/src/find_api/core/storage.py (1)

10-11: ⚡ Quick win

Move thumbnail helpers out of the MinIO backend module.

upload_file_with_thumbnail() is backend-neutral, but importing upload_thumbnail from storage_minio keeps the facade coupled to one provider. Put thumbnail utilities in a provider-neutral module, then import from there.

♻️ Proposed direction
 from find_api.core.storage_factory import get_storage_instance
-from find_api.core.storage_minio import generate_thumbnail, upload_thumbnail
+from find_api.core.storage_thumbnails import upload_thumbnail

Move generate_thumbnail() and upload_thumbnail() from storage_minio.py into the new neutral module.

Also applies to: 65-72

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/find_api/core/storage.py` around lines 10 - 11, The import
statement on line 11 imports thumbnail utilities directly from the
MinIO-specific module storage_minio, creating unnecessary coupling to a specific
storage provider. Create a new provider-neutral module (such as storage_utils.py
or thumbnail_utils.py) and move the generate_thumbnail() and upload_thumbnail()
functions from storage_minio.py into this new module. Then update the import
statement on line 11 to import these functions from the new neutral module
instead of from storage_minio. Check all other usages of these functions in the
file (around lines 65-72) and ensure they work correctly with the new import
structure.
backend/src/find_api/core/storage_factory.py (1)

9-11: ⚡ Quick win

Defer provider imports until the selected backend branch.

storage_factory currently imports every backend up front, so local mode still depends on MinIO/Pillow importability and future providers will have the same coupling. Move implementation imports into the matching branch.

♻️ Proposed refactor
 from typing import Literal
 
 from find_api.core.storage_abstract import StorageBackend
-from find_api.core.storage_minio import MinIOStorageBackend
-from find_api.core.storage_local import LocalStorageBackend
 from find_api.core.config import settings
@@
     if backend_type == "minio":
+        from find_api.core.storage_minio import MinIOStorageBackend
+
         logger.info("Creating MinIO storage backend")
         return MinIOStorageBackend()
 
     elif backend_type == "local":
+        from find_api.core.storage_local import LocalStorageBackend
+
         logger.info("Creating local filesystem storage backend")
         local_path = getattr(settings, "LOCAL_STORAGE_PATH", "./storage/uploads")
         return LocalStorageBackend(local_path)

Also applies to: 21-28

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/find_api/core/storage_factory.py` around lines 9 - 11, The
MinIOStorageBackend and LocalStorageBackend imports at the top of the
storage_factory.py file create unnecessary dependency coupling, as all backends
must be importable even if only one is used. Move these backend-specific imports
from the top-level import statements into the corresponding conditional branches
where they are actually instantiated (around lines 21-28 in the factory logic).
This way, MinIO and its dependencies will only be imported when the MinIO
backend is selected, and similar deferred imports will apply to any future
backends.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/find_api/core/config.py`:
- Around line 65-67: Add unit tests for the storage backend configuration and
factory logic that was introduced with the STORAGE_BACKEND and
LOCAL_STORAGE_PATH configuration options. Create focused tests that verify
backend selection correctly instantiates either "minio" or "local" storage
implementations based on the STORAGE_BACKEND configuration value, that invalid
STORAGE_BACKEND values are properly rejected with appropriate error handling,
that LocalStorageBackend properly initializes directories and validates paths on
initialization, and that path traversal attacks are prevented through validation
of the _validate_path method (including checks for ".." sequences and path
boundary violations). Ensure these tests cover the actual storage factory logic
and backend implementations rather than just mocking at the router level.

In `@backend/src/find_api/core/storage_local.py`:
- Around line 73-74: Replace the `exists()` method with `is_file()` in the
condition that checks whether a storage object exists in the LocalStorage class.
This ensures that only regular files are treated as valid storage objects, not
directories. The check `if not full_path.exists()` should become `if not
full_path.is_file()` in the StorageException raise statement. Apply this same
change to all other locations in the file where `exists()` is used for file
validation (including in the `file_exists()`, `get_file_url()`, `delete_file()`
methods and any other object read/delete operations) to maintain consistency.
- Around line 56-57: All blocking filesystem I/O operations (open, read, write,
unlink, mkdir) in the async methods throughout this file must be moved off the
event loop to prevent blocking concurrent requests. Wrap each blocking I/O
operation call with either anyio.to_thread.run_sync() or asyncio.to_thread.run()
to execute them in a thread pool. This applies to the file write operations
around lines 56-57, the file read operations around line 26 and 54, the unlink
operations around lines 76-77, and the mkdir operations around lines 99-105 and
138.

In `@backend/src/find_api/core/storage_minio.py`:
- Around line 120-130: The get_file method does not guarantee cleanup of the
MinIO response if an exception occurs during response.read(), causing potential
HTTP connection leaks. Refactor the get_file method to use a try-finally block
pattern similar to download_file_to_path, where response.close() and
response.release_conn() are called in the finally block to ensure they execute
regardless of whether response.read() succeeds or raises an exception.
- Around line 60-61: The MinIO Python SDK calls (bucket_exists, make_bucket,
put_object, get_object, remove_object, stat_object, presigned_get_object) are
synchronous and blocking, and when called directly in async methods they will
block the event loop. Wrap all MinIO SDK calls throughout the file using
asyncio.to_thread() to offload these blocking operations to a worker thread
pool. This applies to the bucket existence check and creation at lines 60-61,
the upload operations at lines 107-113, the download operations at lines
123-126, the delete operations at lines 136-140, the stat operations at lines
172-174, the URL signing at lines 183-184, and any other MinIO SDK calls at
lines 192-193.
- Around line 66-100: The methods _apply_public_read_policy() and
_remove_public_read_policy() currently only log warnings when S3Error exceptions
occur, allowing initialization to proceed even if the bucket policy update
fails. This creates a mismatch where get_file_url() may serve objects with
incorrect access levels. Modify both methods to raise StorageException instead
of logging warnings when policy operations fail, ensuring that initialization
fails fast if the bucket policy cannot be properly configured to match the
MINIO_PUBLIC_READ setting.
- Around line 45-55: The _public_client method currently only extracts and uses
the netloc from MINIO_PUBLIC_ENDPOINT, which drops any path prefix. Extract the
path component from the parsed URL using parsed.path and store it so it can be
used when generating presigned URLs. Review the public-read branch
implementation (lines 152-170) to see how it correctly handles the path prefix,
then apply the same approach to the presigned URL generation logic to ensure
paths like /minio are preserved in the final presigned URLs instead of being
dropped.

In `@frontend/src-tauri/capabilities/default.json`:
- Line 7: Remove the unused "shell:allow-spawn" capability string from the
capabilities array in the default.json file. This capability is not used by any
frontend code since the sidecar is started from Rust in lib.rs instead, and
removing it reduces the attack surface of the application.

In `@frontend/src-tauri/src/lib.rs`:
- Around line 90-95: The health check in the loop starting with `for attempt in
1..=MAX_ATTEMPTS` currently accepts any 2xx response without validating the
actual payload or adding a request timeout. Add a per-request timeout to the
reqwest::get(HEALTH_URL) call to prevent stalling, and after receiving a
successful response, parse the response body as JSON and validate that it
contains a status field equal to "healthy" before returning Ok(()). This ensures
only the actual backend service triggers the ready state and prevents hangs from
unresponsive listeners.
- Around line 130-133: The health check error handler at lines 130-133 returns
without killing the spawned child process, causing orphaned processes on retry.
Additionally, the backend-failed event is emitted here for every health check
failure instead of being reserved for when retries are exhausted. Fix this by
killing the child process (likely through a call on the app or child handle)
before returning the error from wait_for_health(), and remove the
app.emit("backend-failed") call from this error path so that the event is only
emitted at the retry-exhausted path around line 73.
- Around line 13-16: Add a `shutting_down: bool` field to the BackendState
struct alongside the existing `running` and `child` fields, then use this flag
to guard against retry logic when the supervisor receives a termination error
(check `shutting_down` before entering the retry path after observing the child
process exit). Set `shutting_down` to true when handling `RunEvent::Exit` and
capture any errors from killing the child process in that handler instead of
discarding them silently. Additionally, modify the health check at the
reqwest::get() call to include an explicit timeout using `.timeout()` and
validate the response by deserializing the JSON payload rather than just
checking for a 2xx status. Finally, when `wait_for_health()` fails in the
`start_sidecar` function, explicitly kill the child process and clear it from
`state.child` before returning the error to prevent orphaned processes.

---

Nitpick comments:
In `@backend/src/find_api/core/storage_factory.py`:
- Around line 9-11: The MinIOStorageBackend and LocalStorageBackend imports at
the top of the storage_factory.py file create unnecessary dependency coupling,
as all backends must be importable even if only one is used. Move these
backend-specific imports from the top-level import statements into the
corresponding conditional branches where they are actually instantiated (around
lines 21-28 in the factory logic). This way, MinIO and its dependencies will
only be imported when the MinIO backend is selected, and similar deferred
imports will apply to any future backends.

In `@backend/src/find_api/core/storage.py`:
- Around line 10-11: The import statement on line 11 imports thumbnail utilities
directly from the MinIO-specific module storage_minio, creating unnecessary
coupling to a specific storage provider. Create a new provider-neutral module
(such as storage_utils.py or thumbnail_utils.py) and move the
generate_thumbnail() and upload_thumbnail() functions from storage_minio.py into
this new module. Then update the import statement on line 11 to import these
functions from the new neutral module instead of from storage_minio. Check all
other usages of these functions in the file (around lines 65-72) and ensure they
work correctly with the new import structure.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb6517b9-8fe7-4ba1-adbb-f627b41acf13

📥 Commits

Reviewing files that changed from the base of the PR and between 858202e and 536dcf9.

⛔ Files ignored due to path filters (1)
  • frontend/src-tauri/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • backend/src/find_api/core/config.py
  • backend/src/find_api/core/storage.py
  • backend/src/find_api/core/storage_abstract.py
  • backend/src/find_api/core/storage_factory.py
  • backend/src/find_api/core/storage_local.py
  • backend/src/find_api/core/storage_minio.py
  • frontend/src-tauri/Cargo.toml
  • frontend/src-tauri/capabilities/default.json
  • frontend/src-tauri/src/lib.rs
  • frontend/src-tauri/tauri.conf.json

Comment on lines +65 to +67
# Storage
STORAGE_BACKEND: Literal["minio", "local"] = "minio"
LOCAL_STORAGE_PATH: str = "./storage/uploads"

@coderabbitai coderabbitai Bot Jun 18, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect focused tests for the new storage settings and backends.
# Expected: tests referencing backend selection, LOCAL_STORAGE_PATH, and local traversal behavior.
rg -n --glob 'tests/**/*.py' \
  'STORAGE_BACKEND|LOCAL_STORAGE_PATH|create_storage_backend|LocalStorageBackend|MinIOStorageBackend|path traversal|\.\.' || true

Repository: Abhash-Chakraborty/Find

Length of output: 216


🏁 Script executed:

#!/bin/bash
# Find test directories and files
echo "=== Test directory structure ==="
find . -type d -name 'test*' | head -20

echo "=== Looking for test files ==="
fd -e py -t f 'test' | head -30

echo "=== Repository structure overview ==="
ls -la | head -20

Repository: Abhash-Chakraborty/Find

Length of output: 2187


🏁 Script executed:

#!/bin/bash
# Examine the config file to verify the actual changes
echo "=== Content around lines 65-67 of config.py ==="
cat -n backend/src/find_api/core/config.py | sed -n '55,75p'

echo "=== Full config file length ==="
wc -l backend/src/find_api/core/config.py

Repository: Abhash-Chakraborty/Find

Length of output: 873


🏁 Script executed:

#!/bin/bash
# Search for storage backend implementation and related code
echo "=== Storage-related files and implementations ==="
find . -type f -name '*.py' | xargs grep -l "STORAGE_BACKEND\|LocalStorageBackend\|MinIOStorageBackend" 2>/dev/null || true

echo "=== Search for storage tests without path constraints ==="
rg -n "STORAGE_BACKEND|LOCAL_STORAGE_PATH" --type py || true

Repository: Abhash-Chakraborty/Find

Length of output: 891


🏁 Script executed:

#!/bin/bash
echo "=== Check test_upload.py for storage backend tests ==="
cat -n backend/tests/test_upload.py | head -80

echo "=== Search for storage_factory tests ==="
rg -n "storage_factory|STORAGE_BACKEND|LocalStorageBackend|MinIOStorageBackend" backend/tests/ --type py || true

echo "=== List all test files to see what's available ==="
ls -la backend/tests/

Repository: Abhash-Chakraborty/Find

Length of output: 5084


🏁 Script executed:

#!/bin/bash
echo "=== Check storage_factory.py implementation ==="
cat -n backend/src/find_api/core/storage_factory.py

echo "=== Check storage_local.py for path handling ==="
wc -l backend/src/find_api/core/storage_local.py && head -50 backend/src/find_api/core/storage_local.py

Repository: Abhash-Chakraborty/Find

Length of output: 4320


🏁 Script executed:

#!/bin/bash
echo "=== Check conftest.py for storage fixture setup ==="
cat backend/tests/conftest.py

echo "=== Search for any storage-related mocking or fixtures ==="
rg -n "storage|Storage" backend/tests/conftest.py || true

echo "=== Check if storage factory is used anywhere in tests ==="
rg -n "storage_factory\|create_storage_backend\|initialize_storage" backend/tests/ || true

Repository: Abhash-Chakraborty/Find

Length of output: 4300


🏁 Script executed:

#!/bin/bash
echo "=== Check storage_local.py for path traversal validation ==="
cat -n backend/src/find_api/core/storage_local.py | sed -n '30,50p'

echo "=== Check storage_minio.py implementation ==="
head -30 backend/src/find_api/core/storage_minio.py

Repository: Abhash-Chakraborty/Find

Length of output: 1852


Add focused tests for the new storage backend configuration and selection logic.

The changes introduce backend selection (STORAGE_BACKEND) and local storage path configuration (LOCAL_STORAGE_PATH), but there are no unit tests for the storage factory or backend implementations. Per coding guidelines, backend changes affecting storage must include focused tests.

Tests should cover:

  • Backend selection for both "minio" and "local" values
  • Error handling for invalid STORAGE_BACKEND values
  • LocalStorageBackend directory initialization and path validation
  • Path traversal prevention (the .. check and boundary validation in _validate_path)

Current test coverage only mocks storage at the router level, bypassing the actual storage factory and backend logic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/find_api/core/config.py` around lines 65 - 67, Add unit tests for
the storage backend configuration and factory logic that was introduced with the
STORAGE_BACKEND and LOCAL_STORAGE_PATH configuration options. Create focused
tests that verify backend selection correctly instantiates either "minio" or
"local" storage implementations based on the STORAGE_BACKEND configuration
value, that invalid STORAGE_BACKEND values are properly rejected with
appropriate error handling, that LocalStorageBackend properly initializes
directories and validates paths on initialization, and that path traversal
attacks are prevented through validation of the _validate_path method (including
checks for ".." sequences and path boundary violations). Ensure these tests
cover the actual storage factory logic and backend implementations rather than
just mocking at the router level.

Source: Coding guidelines

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 19 unit tests in backend/tests/test_storage_local.py covering backend initialization, upload/download/delete roundtrip, path traversal prevention, and file existence checks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== Check new test file ==="
cat -n backend/tests/test_storage_local.py

echo "=== Count test functions ==="
rg -n "^def test_|^async def test_" backend/tests/test_storage_local.py | wc -l

echo "=== Check if storage factory tests exist too ==="
rg -n "STORAGE_BACKEND|create_storage_backend|MinIOStorageBackend|storage_factory" backend/tests/ --type py || true

Length of output: 6262



@sweetylearner-max, great work — the 19 test cases are confirmed (16 function definitions, with the 4-parameter test_validate_path_rejects_dot_dot_traversal giving the remaining 3 instances). The LocalStorageBackend coverage is solid:

Area Covered
Directory initialization + idempotency
Upload/download/delete roundtrip
Error handling for missing files
Path traversal via .. sequences ✅ (4 parametrized cases)
Boundary escape detection
File existence checks

One gap remains from the original review request — the storage factory selection logic in storage_factory.py is not yet tested. Specifically:

  • create_storage_backend() returning the correct class when STORAGE_BACKEND="minio" vs "local"
  • Behavior when an invalid/unsupported STORAGE_BACKEND value is provided

A small addition to cover this would look like:

# backend/tests/test_storage_factory.py
import pytest
from unittest.mock import patch

from find_api.core.storage_factory import create_storage_backend
from find_api.core.storage_local import LocalStorageBackend
from find_api.core.storage_minio import MinIOStorageBackend


def test_create_storage_backend_returns_minio_by_default():
    with patch("find_api.core.storage_factory.settings") as mock_settings:
        mock_settings.STORAGE_BACKEND = "minio"
        mock_settings.MINIO_ENDPOINT = "localhost:9000"
        # ... set required MinIO fields
        backend = create_storage_backend()
    assert isinstance(backend, MinIOStorageBackend)


def test_create_storage_backend_returns_local():
    with patch("find_api.core.storage_factory.settings") as mock_settings:
        mock_settings.STORAGE_BACKEND = "local"
        mock_settings.LOCAL_STORAGE_PATH = "/tmp/test_storage"
        backend = create_storage_backend()
    assert isinstance(backend, LocalStorageBackend)


def test_create_storage_backend_rejects_invalid_value():
    with patch("find_api.core.storage_factory.settings") as mock_settings:
        mock_settings.STORAGE_BACKEND = "s3"
        with pytest.raises((ValueError, KeyError, RuntimeError)):
            create_storage_backend()

Would you like to add those factory-level tests, or should I open a follow-up issue to track this separately?

Comment thread backend/src/find_api/core/storage_local.py Outdated
Comment thread backend/src/find_api/core/storage_local.py Outdated
Comment thread backend/src/find_api/core/storage_minio.py
Comment thread backend/src/find_api/core/storage_minio.py Outdated
Comment thread backend/src/find_api/core/storage_minio.py
Comment thread frontend/src-tauri/capabilities/default.json Outdated
Comment thread frontend/src-tauri/src/lib.rs Outdated
Comment thread frontend/src-tauri/src/lib.rs Outdated
Comment thread frontend/src-tauri/src/lib.rs Outdated
@github-actions github-actions Bot removed the needs linked issue Pull request needs to link a valid issue before review. label Jun 18, 2026
@sweetylearner-max

Copy link
Copy Markdown
Author

@coderabbitai Could you please re-review the PR? I have addressed all the review comments from the previous review.

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

@sweetylearner-max Sure! I'll kick off a full re-review of the PR now to check all the changes you've made.

✅ Action performed

Full review finished.

@sweetylearner-max

Copy link
Copy Markdown
Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@sweetylearner-max

Copy link
Copy Markdown
Author

Added factory-level tests in backend/tests/test_storage_factory.py (9 tests) covering backend selection for "minio"/"local", case-insensitivity, invalid value rejection, default fallback, and the get_storage/initialize_storage lifecycle. All passing.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/tests/test_storage_factory.py`:
- Around line 50-61: The test_create_storage_backend_is_case_insensitive test
only verifies that the correct instance is returned but does not assert that the
LocalStorageBackend constructor (fake_cls) was called with the proper arguments.
Add an assertion after the backend is created to verify that fake_cls was called
with the LOCAL_STORAGE_PATH value from mock_settings. This will ensure the test
fails if the LOCAL_STORAGE_PATH parameter stops being forwarded to the
constructor during case-insensitive backend selection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7235e4a5-7a00-42be-8e31-8543266a22aa

📥 Commits

Reviewing files that changed from the base of the PR and between 0878e1a and d91c1c1.

📒 Files selected for processing (3)
  • backend/src/find_api/core/storage_minio.py
  • backend/tests/test_storage_factory.py
  • backend/tests/test_storage_local.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/src/find_api/core/storage_minio.py

Comment thread backend/tests/test_storage_factory.py
… flag

Previously, get_file_url() generated public URLs based solely on the
MINIO_PUBLIC_READ config flag, even if applying the bucket policy had
silently failed during init_storage(). This could result in broken
public URLs being served. Now the backend tracks whether the policy
was actually applied successfully and falls back to presigned URLs
when it wasn't.
@Abhash-Chakraborty Abhash-Chakraborty changed the title feat: implement provider-neutral storage abstraction (#254) feat: add provider-neutral storage backends Jun 20, 2026
@Abhash-Chakraborty Abhash-Chakraborty added gssoc26 Related to GirlScript Summer of Code 2026. level:critical GSSoC difficulty level: critical. Base contributor points: 80. backend FastAPI, database, storage, and API work infra Docker, compose, deployment, and runtime setup architecture High-level design decisions and technical direction desktop-app Windows, macOS, and Linux installed app work local-first Privacy-preserving local runtime and offline behavior type:feature Feature PR. GSSoC type bonus. under-review Maintainer needs to verify labels Jun 20, 2026
@github-actions

Copy link
Copy Markdown

@macroscope-app review

Please review this PR against its linked issue, local-first privacy rules, and the current Find repo instructions.
Linked issue(s): #254.
Trigger source: label-gated review (under-review).

…al-storage-abstraction

# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
@sweetylearner-max

Copy link
Copy Markdown
Author

Hey @Abhash-Chakraborty! Just a friendly ping — PR #316 is all set and waiting for your approval. All review feedback has been incorporated, tests are comprehensive, and the branch is fresh. Would love to get this merged soon. Thanks!

@Abhash-Chakraborty

Copy link
Copy Markdown
Owner

I will review it today.

@Abhash-Chakraborty Abhash-Chakraborty added level:advanced GSSoC difficulty level: advanced. Base contributor points: 55. quality:clean Clean and maintainable PR. GSSoC contributor multiplier: 1.2x. gssoc:approved Valid GSSoC contribution approved for scoring. ready-to-merge Fully approved, tested, and cleared for immediate merging. and removed desktop-app Windows, macOS, and Linux installed app work level:critical GSSoC difficulty level: critical. Base contributor points: 80. under-review Maintainer needs to verify labels Jun 22, 2026

@Abhash-Chakraborty Abhash-Chakraborty left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. I cleaned this up to stay focused on #254: removed the out-of-scope Tauri sidecar changes, kept the existing storage helper API compatible for current routes/workers, moved thumbnails into a provider-neutral helper, added deferred backend imports, added STORAGE_* aliases while preserving MINIO_* compatibility, and added regression tests for factory/local/facade/thumbnail behavior.\n\nTested locally with backend Ruff checks and focused pytest coverage for storage, thumbnails, upload, gallery, and search paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

architecture High-level design decisions and technical direction backend FastAPI, database, storage, and API work gssoc:approved Valid GSSoC contribution approved for scoring. gssoc26 Related to GirlScript Summer of Code 2026. infra Docker, compose, deployment, and runtime setup level:advanced GSSoC difficulty level: advanced. Base contributor points: 55. local-first Privacy-preserving local runtime and offline behavior quality:clean Clean and maintainable PR. GSSoC contributor multiplier: 1.2x. ready-to-merge Fully approved, tested, and cleared for immediate merging. type:feature Feature PR. GSSoC type bonus.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: introduce provider-neutral object storage abstraction

2 participants