feat: add metadata filters to gallery and search APIs #306
📝 Walkthrough
Gallery and search endpoints now accept optional metadata filter query parameters for camera make/model, date range, minimum image dimensions, orientation, and file type. New helpers validate and apply these filters at the database layer. Query cache service incorporates filter keys to segregate cached responses by active metadata filters. Comprehensive tests verify filtering behavior, input validation, and combined filter interactions.
Changes
Metadata filtering for gallery and search
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Approvability
Verdict: Needs human review
Unable to check for correctness in d192e22. CodeQL has flagged potential SQL injection vulnerabilities in the search API's metadata filter implementation. Combined with an unresolved bug report about date filtering behavior, this new feature addition warrants security and functional review by a human.
Actionable comments posted: 3
🧹 Nitpick comments (1)
backend/src/find_api/routers/gallery.py (1)
110-120: ⚡ Quick win
Update docstring to document the new filter parameters.
The function docstring's Args section does not include the five new query parameters (camera_make, camera_model, min_width, min_height, file_type), making the API documentation incomplete.
📝 Proposed docstring update
      """
      Get paginated list of images

      Args:
          skip: Number of records to skip
          limit: Max number of records to return
          status: Filter by status (pending, processing, indexed, failed)
    +     liked: Filter by liked status
    +     camera_make: Filter by EXIF camera make (case-insensitive partial match)
    +     camera_model: Filter by EXIF camera model (case-insensitive partial match)
    +     min_width: Filter images with width >= this value
    +     min_height: Filter images with height >= this value
    +     file_type: Filter by file type (e.g., "jpeg", "png")

      Returns:
          Paginated list of media records
      """
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/src/find_api/routers/gallery.py` around lines 110 - 120, The docstring for the gallery endpoint is missing entries for the five new query filters; update the function docstring in backend/src/find_api/routers/gallery.py (the gallery GET handler function where skip/limit/status are documented) to add Args descriptions for camera_make, camera_model, min_width, min_height, and file_type, specifying their types (e.g. str or int), purpose (filter by camera make/model, minimum width/height, and file MIME/extension), and whether they are optional; keep the existing format and ordering so generated API docs include these new query parameters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/src/find_api/routers/gallery.py`:
- Around line 128-142: Extract the filtering logic out of the router by creating
a service function (e.g., apply_media_filters or build_media_query) in a new or
existing backend service module (suggested name: media_query service), move the
camera_make/camera_model/min_width/min_height/file_type filter logic that
currently manipulates the local variable query and references the Media model
into that function, have it accept the SQLAlchemy Query and the filter
parameters and return the modified Query, then replace the inline filter block
in the gallery router with a single call to that service function and import the
service; ensure the service uses the same Media model names (Media, exif_json,
width, height, content_type) so references remain correct and add unit tests for
the service to keep the router thin and testable.
- Around line 101-107: The string query params camera_make, camera_model, and
file_type in the gallery router lack max length validation; update their Query
declarations (the parameters named camera_make, camera_model, file_type) to
include a safe max_length (e.g., 255) so extremely long inputs are rejected
before hitting the DB or consuming excessive memory; keep the descriptions and
Optional typing the same and ensure validation is enforced by FastAPI/Pydantic
via the Query(..., max_length=255) argument.
In `@backend/tests/test_gallery.py`:
- Around line 436-498: Add two tests to TestGalleryMetadataFilters to cover
missing EXIF and combined-filter behavior: implement
test_gallery_filters_exclude_missing_exif which seeds one record with exif_json
set to a dict and one with exif_json = None, calls GET /api/gallery with
camera_make and asserts only the record with EXIF appears; and implement
test_gallery_filters_combine_correctly which seeds canon_large, canon_small,
nikon_large, sets exif_json and widths appropriately, calls GET /api/gallery
with params camera_make="Canon" and min_width=1500 and asserts only canon_large
is returned; place these new tests alongside existing methods in the
TestGalleryMetadataFilters class to ensure API filtering excludes null exif_json
and combines filters with AND logic.
---
Nitpick comments:
In `@backend/src/find_api/routers/gallery.py`:
- Around line 110-120: The docstring for the gallery endpoint is missing entries
for the five new query filters; update the function docstring in
backend/src/find_api/routers/gallery.py (the gallery GET handler function where
skip/limit/status are documented) to add Args descriptions for camera_make,
camera_model, min_width, min_height, and file_type, specifying their types (e.g.
str or int), purpose (filter by camera make/model, minimum width/height, and
file MIME/extension), and whether they are optional; keep the existing format
and ordering so generated API docs include these new query parameters.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4af23dd6-925e-4420-8d65-3b062957b760
📒 Files selected for processing (2)
- backend/src/find_api/routers/gallery.py
- backend/tests/test_gallery.py
@macroscope-app review Please review this PR against its linked issue, local-first privacy rules, and the current Find repo instructions.
Abhash-Chakraborty left a comment
Looks good to merge now.
I reviewed this against #301. The original PR only added a subset of gallery filters, so I pushed a follow-up commit to complete the backend/API scope:
- Added `date_from`/`date_to` filtering with safe 422 validation.
- Added `orientation` filtering (`landscape`, `portrait`, `square`).
- Added the metadata filters to `/api/search` as well as `/api/gallery`.
- Updated the search cache key so filtered searches do not reuse unfiltered cached responses.
- Added gallery and search regression tests for the new filters and invalid input handling.
Checks run:
- uv run ruff format src/find_api/routers/gallery.py src/find_api/routers/search.py src/find_api/services/query_cache.py tests/test_gallery.py tests/test_search.py
- uv run ruff check src/find_api/routers/gallery.py src/find_api/routers/search.py src/find_api/services/query_cache.py tests/test_gallery.py tests/test_search.py
- uv run pytest tests/test_gallery.py tests/test_search.py -q
Result: 49 tests passed. Only the existing SQLAlchemy deprecation warning appeared.
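For readers following the review, the `orientation` filter listed above can be pictured as a comparison of an image's width and height. The sketch below is only an illustration under assumptions: `classify_orientation` and `matches_orientation` are hypothetical helper names, not functions from this PR, and the PR itself applies the filter at the database layer rather than in Python.

```python
from typing import Optional

# The three values named in the review comment above.
VALID_ORIENTATIONS = {"landscape", "portrait", "square"}


def classify_orientation(width: Optional[int], height: Optional[int]) -> Optional[str]:
    """Return the orientation implied by the dimensions, or None if unknown."""
    if not width or not height:
        return None  # missing or zero dimensions cannot be classified
    if width > height:
        return "landscape"
    if width < height:
        return "portrait"
    return "square"


def matches_orientation(
    width: Optional[int], height: Optional[int], orientation: Optional[str]
) -> bool:
    """Hypothetical predicate: does an image pass the orientation filter?"""
    if orientation is None:
        return True  # filter not supplied, so every image passes
    normalized = orientation.strip().lower()
    if normalized not in VALID_ORIENTATIONS:
        # An API layer would presumably translate this into a 422 response.
        raise ValueError(f"orientation must be one of {sorted(VALID_ORIENTATIONS)}")
    return classify_orientation(width, height) == normalized
```

In this sketch, images with unknown dimensions never match an explicit orientation, which mirrors how a SQL comparison against NULL columns would behave.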
    {metadata_filter_sql}
    AND 1 - (vector <=> CAST(:embedding AS vector)) > :threshold
    """
    """.format(metadata_filter_sql=metadata_filter_sql)
    )
    count_result = db.execute(
    count_query, {"embedding": embedding_str, "threshold": threshold}
    count_query,
    {
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/src/find_api/routers/gallery.py`:
- Around line 77-93: The parse_metadata_date function sets date-only values
(like YYYY-MM-DD) to midnight (00:00:00), which causes the date_to comparison to
exclude records created later that same day. To fix this, detect when the input
is date-only (check if the raw_value contains no time component by checking for
absence of 'T' or colon separators), and for the date_to field specifically,
adjust the parsed datetime to the end of that day (23:59:59.999999 UTC) instead
of the start. You can determine if this is date_to by checking the field_name
parameter and adjusting the tzinfo replacement logic accordingly to set the time
to end-of-day for date_to fields only.
In `@backend/src/find_api/routers/search.py`:
- Around line 58-66: The filter_parts list is being constructed with raw
unescaped k=v pairs that are vulnerable to cache key collisions when values
contain the delimiter character "&". Special characters in filter values (like
camera_make, min_width, etc.) must be properly escaped or URL-encoded when
appending to filter_parts to prevent different filter combinations from
generating identical cache keys. Apply URL encoding using urllib.parse.quote or
similar to the value portion before constructing the filter_parts strings in all
locations where filter_parts.append is called with f-string patterns like
camera_make={value.lower()}, min_width={value}, max_width={value}, and similar
filter parameter constructions throughout the file.
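The cache key collision described in the second comment can be reproduced and avoided with a few lines of standard-library code. This is a minimal sketch, assuming a hypothetical helper named `build_filter_key` that receives the active filters as a dict; the real code appends to `filter_parts` inline, and its exact shape is not shown in this conversation.

```python
from urllib.parse import quote


def build_filter_key(filters: dict) -> str:
    """Hypothetical helper: build a collision-safe cache key segment."""
    parts = []
    for name in sorted(filters):  # sort so key order does not affect the cache key
        value = filters[name]
        if value is None:
            continue  # inactive filters do not contribute to the key
        # safe="" makes quote() escape "&" and "=" as well, so a value can
        # never be mistaken for a delimiter or an extra k=v pair.
        parts.append(f"{quote(name, safe='')}={quote(str(value), safe='')}")
    return "&".join(parts)


# Without escaping, both inputs would yield "camera_make=a&file_type=png".
crafted = build_filter_key({"camera_make": "a&file_type=png"})
genuine = build_filter_key({"camera_make": "a", "file_type": "png"})
```

Here `crafted` becomes `camera_make=a%26file_type%3Dpng` while `genuine` stays `camera_make=a&file_type=png`, so the two requests no longer share a cache entry.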
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 24bbb12a-9358-473b-8ca1-80ea70b20289
📒 Files selected for processing (5)
- backend/src/find_api/routers/gallery.py
- backend/src/find_api/routers/search.py
- backend/src/find_api/services/query_cache.py
- backend/tests/test_gallery.py
- backend/tests/test_search.py
Abhash-Chakraborty left a comment
Reviewed and fixed the remaining metadata-filter edge cases. Date-only end filters now include the full selected day, metadata cache keys are escaped safely, camera fields have length bounds, and focused backend checks pass: ruff check, ruff format --check, and pytest for gallery/search.
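To make the date-only fix concrete, here is one way the end-of-day behavior could look. It is a sketch under assumptions, not the code merged in this PR: the function name and the `raw_value`/`field_name` parameters follow the review comment, but the body, the UTC default for naive values, and the use of `ValueError` (instead of whatever produces the 422 response) are illustrative guesses.

```python
from datetime import datetime, time, timezone


def parse_metadata_date(raw_value: str, field_name: str) -> datetime:
    """Parse an ISO 8601 date or datetime for a metadata date filter."""
    text = raw_value.strip()
    # A value with neither "T" nor ":" carries no time component.
    is_date_only = "T" not in text and ":" not in text
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError as exc:
        # The API layer would presumably map this to a 422 response.
        raise ValueError(f"{field_name} must be an ISO 8601 date or datetime") from exc
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    if is_date_only and field_name == "date_to":
        # Move the upper bound to 23:59:59.999999 so the whole day is included.
        parsed = datetime.combine(parsed.date(), time.max, tzinfo=timezone.utc)
    return parsed
```

With this shape, `date_to=2024-05-01` includes a record created at 18:00 on that day, while `date_from=2024-05-01` still starts at midnight.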
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/tests/test_search.py`:
- Line 10: The function `_metadata_filter_sql` contains substantial database
logic and should not reside in the router file per coding guidelines. Move
`_metadata_filter_sql` from `find_api/routers/search.py` to a backend module
such as `find_api/services/metadata_filters.py` (or similar), then update the
import statement in this test file to import from the new backend module
location instead of from the router. Also update the import in
`routers/search.py` to reflect the new location so the router can use the
function from the backend module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 78f32adb-4a90-4f79-a931-688a18b12c27
📒 Files selected for processing (4)
- backend/src/find_api/routers/gallery.py
- backend/src/find_api/routers/search.py
- backend/tests/test_gallery.py
- backend/tests/test_search.py
🚧 Files skipped from review as they are similar to previous changes (3)
- backend/src/find_api/routers/gallery.py
- backend/tests/test_gallery.py
- backend/src/find_api/routers/search.py
    from find_api.core.database import get_db
    from find_api.main import app
    from find_api.routers.search import _metadata_filter_sql
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Move _metadata_filter_sql to a backend module per coding guidelines.
The import reveals that _metadata_filter_sql contains substantial database logic (~80 lines of SQL building, parameter management, and validation) but lives in the router file. As per coding guidelines, FastAPI routers should be thin, with database logic placed in existing backend modules (e.g., backend/src/find_api/services/ or a new metadata_filters.py module).
Refactor by moving _metadata_filter_sql to a backend module and updating imports in both routers/search.py and this test file.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/tests/test_search.py` at line 10, The function `_metadata_filter_sql`
contains substantial database logic and should not reside in the router file per
coding guidelines. Move `_metadata_filter_sql` from `find_api/routers/search.py`
to a backend module such as `find_api/services/metadata_filters.py` (or
similar), then update the import statement in this test file to import from the
new backend module location instead of from the router. Also update the import
in `routers/search.py` to reflect the new location so the router can use the
function from the backend module.
Source: Coding guidelines
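As a rough picture of what the extracted module might contain, the sketch below builds a SQL fragment from fixed clause strings and passes every user value as a bound parameter, which also speaks to the SQL injection concern raised earlier in this conversation. Everything here is an assumption for illustration: `build_metadata_filter` is a hypothetical name, the `width`, `height`, and `content_type` columns are taken from the review comments, the partial match on `content_type` is a guess, and `sqlite3` stands in for the project's real database only to keep the example runnable.

```python
import sqlite3
from typing import Any, Dict, List, Optional, Tuple


def build_metadata_filter(
    min_width: Optional[int] = None,
    min_height: Optional[int] = None,
    file_type: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical service helper: return (sql_fragment, bound_params)."""
    clauses: List[str] = []
    params: Dict[str, Any] = {}
    if min_width is not None:
        clauses.append("AND width >= :min_width")
        params["min_width"] = min_width
    if min_height is not None:
        clauses.append("AND height >= :min_height")
        params["min_height"] = min_height
    if file_type is not None:
        # The value is bound, never interpolated; LIKE wildcards in the
        # input are not escaped in this sketch.
        clauses.append("AND lower(content_type) LIKE :file_type")
        params["file_type"] = f"%{file_type.lower()}%"
    return " ".join(clauses), params


def demo(file_type: str = "jpeg") -> list:
    """Run the fragment against an in-memory table with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media (id INTEGER, width INTEGER, height INTEGER, content_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO media VALUES (?, ?, ?, ?)",
        [(1, 4000, 3000, "image/jpeg"), (2, 800, 600, "image/png")],
    )
    fragment, params = build_metadata_filter(min_width=1000, file_type=file_type)
    # Only the fixed clause text is formatted into the query string.
    rows = conn.execute(f"SELECT id FROM media WHERE 1 = 1 {fragment}", params).fetchall()
    conn.close()
    return rows
```

Because the fragment is assembled only from constant strings, a router could keep formatting it into the query text, as the diff in this PR does with `{metadata_filter_sql}`, without user input ever reaching the SQL string itself.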
Summary
This PR adds EXIF-based metadata filtering support to the gallery endpoint, allowing users to refine gallery results using camera information and image attributes. This enhancement improves image discovery and organization, especially for users managing large local photo collections.
Fixes #301
Type of change
What changed
Added optional gallery filters for:
- camera_make
- camera_model
- min_width
- min_height
- file_type
Applied filtering logic at the database query layer while preserving existing gallery behavior when filters are not provided.
Added tests covering all newly introduced metadata filtering functionality, including camera make/model, dimension-based filtering, and file type filtering.
Ensured all existing gallery tests continue to pass.
Screenshots / recordings (for UI changes)
N/A – Backend/API enhancement with no UI changes.
How to test
Navigate to the backend directory:
cd backend
Install project dependencies:
Run the gallery test suite:
Verify linting and formatting:
Expected result:
Checklist
GSSoC'26 checklist
Summary by CodeRabbit