Skip to content

Fix 3 flaws in image_cache.py#549

Open
greensum wants to merge 1 commit into
helgeerbe:mainfrom
greensum:fix-image-cache-flaws
Open

Fix 3 flaws in image_cache.py#549
greensum wants to merge 1 commit into
helgeerbe:mainfrom
greensum:fix-image-cache-flaws

Conversation

@greensum
Copy link
Copy Markdown

@greensum greensum commented Oct 1, 2025

  1. Trap all OS calls on file/folder as it might be deleted between it was listed and before its turn to be processed
  2. Folders that don’t contain any files are now cached properly so they won’t be discovered as new repeatedly
  3. Orphaned folders in cache are now marked properly so it won’t try to mark them repeatedly

Summary by Sourcery

Improve the resilience and correctness of image cache updates by handling missing files and folders during scanning, caching empty directories properly, and marking orphaned items without repeated processing.

Bug Fixes:

  • Wrap OS file and directory operations in exception handling to skip items removed during processing
  • Cache folders without files so they aren’t rediscovered on each scan
  • Mark orphaned folders in the database instead of repeatedly attempting to purge them

Enhancements:

  • Add detailed logging for folder/file discovery, SQL queries, orphan purges, and shutdown completion
  • Skip folder info updates when no changes are detected and batch-insert new folders to the database

1. Trap all OS calls on file/folder as it might be deleted between it was listed and before its turn to be processed
2. Folders that don’t contain any files are now cached properly so they won’t be discovered as new repeatedly
3. Orphaned folders in cache are now marked properly so it won’t try to mark them repeatedly
@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented Oct 1, 2025

Reviewer's Guide

This PR enhances resilience and observability in image_cache.py by trapping filesystem races with try/except, enriching logging throughout the cache lifecycle, refining folder update logic to avoid redundant operations, and correcting orphaned folder handling to mark rather than repeatedly purge entries.

Sequence diagram for updated cache update and error handling

sequenceDiagram
participant Cache as ImageCache
participant FS as FileSystem
participant DB as Database
participant Logger as Logger
Cache->>FS: List modified folders
alt Folder missing during stat
    Cache->>Logger: Warn folder no longer exists
end
Cache->>FS: List files in folders
alt File missing during getmtime
    Cache->>Logger: Warn file no longer exists
end
Cache->>DB: Query for new/modified files
alt Files found
    Cache->>Logger: Info about new files
    loop For each file
        Cache->>DB: Insert file
        alt OSError during insert
            Cache->>Logger: Warn file inaccessible
        end
    end
end
Cache->>DB: Update folder info
Cache->>Logger: Info about cache update
Loading

Class diagram for updated ImageCache methods and error handling

classDiagram
class ImageCache {
    - __modified_files
    - __modified_folders
    - __db
    - __db_write_lock
    - __logger
    - __pause_looping
    - __shutdown_completed
    + update_cache()
    + __get_modified_folders()
    + __get_modified_files(modified_folders)
    + __insert_file(file, file_id=None)
    + __update_folder_info(folder_collection)
    + __purge_missing_files_and_folders()
    + query_cache(where_clause, sort_clause)
}
ImageCache --> "1" Database
ImageCache --> "1" Logger
Loading

File-Level Changes

Change Details Files
Enhanced logging across cache operations
  • Log shutdown completion at INFO level
  • Report counts of discovered folders and files in update_cache
  • Add debug logging for executed SQL queries
  • Log purge and marking of orphaned files and folders
src/picframe/image_cache.py
Trap OS race conditions on files and folders
  • Wrap __insert_file calls in try/except OSError
  • Guard os.stat(dir).st_mtime with try/except and skip missing dirs
  • Guard os.path.getmtime(file) with try/except and skip missing files
src/picframe/image_cache.py
Refine folder info update to avoid empty writes
  • Exit early if folder_collection is empty
  • Pre-insert folders with INSERT OR IGNORE before updating metadata
src/picframe/image_cache.py
Correct orphaned folder cleanup logic
  • Select only non-missing folders when purge_files is false
  • Log and mark missing folders instead of repeated purges
  • Conditionally purge or mark folders based on purge_files flag
src/picframe/image_cache.py

Possibly linked issues

  • Thankyou! & Wiki suggestion #123: The PR fixes the bug in __update_folder_info by adding an INSERT statement for folders, as suggested in the issue, preventing repeated scanning of empty folders.
  • #0: The PR adds error handling for FileNotFoundError when accessing files and folders, directly resolving the stalling issue caused by missing images.
  • Proof of concept #1: PR adds error handling for files that are not completely written or accessible during cache updates, which directly addresses the missing image meta info problem.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Please avoid using bare excepts in __get_modified_folders and __get_modified_files—catch specific exceptions like FileNotFoundError or OSError to prevent masking other errors.
  • There’s a typo in the log messages ('modifled'); please correct it to 'modified' for clarity.
  • The new info-level logs in update_cache may be too verbose if this runs frequently—consider using debug level or throttling these messages for consistency.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Please avoid using bare excepts in __get_modified_folders and __get_modified_files—catch specific exceptions like FileNotFoundError or OSError to prevent masking other errors.
- There’s a typo in the log messages ('modifled'); please correct it to 'modified' for clarity.
- The new info-level logs in update_cache may be too verbose if this runs frequently—consider using debug level or throttling these messages for consistency.

## Individual Comments

### Comment 1
<location> `src/picframe/image_cache.py:376` </location>
<code_context>
-            mod_tm = int(os.stat(dir).st_mtime)
+            try:
+                mod_tm = int(os.stat(dir).st_mtime)
+            except:
+                # This folder must have been removed since the walk
+                self.__logger.warning("Folder '%s' no longer exists", dir)
</code_context>

<issue_to_address>
**issue (bug_risk):** Bare except may mask unexpected errors when checking folder modification time.

Catch specific exceptions like FileNotFoundError or OSError instead of using a bare except to avoid suppressing unrelated errors.
</issue_to_address>

### Comment 2
<location> `src/picframe/image_cache.py:376` </location>
<code_context>
    def __get_modified_folders(self):
        out_of_date_folders = []
        sql_select = "SELECT * FROM folder WHERE name = ?"
        for dir in [d[0] for d in os.walk(self.__picture_dir, followlinks=self.__follow_links)]:
            if os.path.basename(dir):
                if os.path.basename(dir)[0] == '.':
                    continue  # ignore hidden folders
            try:
                mod_tm = int(os.stat(dir).st_mtime)
            except:
                # This folder must have been removed since the walk
                self.__logger.warning("Folder '%s' no longer exists", dir)
                continue
            found = self.__db.execute(sql_select, (dir,)).fetchone()
            if not found or found['last_modified'] < mod_tm or found['missing'] == 1:
                out_of_date_folders.append((dir, mod_tm))
        return out_of_date_folders

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use `except Exception:` rather than bare `except:` ([`do-not-use-bare-except`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/do-not-use-bare-except/))

```suggestion
            except Exception:
```
</issue_to_address>

### Comment 3
<location> `src/picframe/image_cache.py:404` </location>
<code_context>
    def __get_modified_files(self, modified_folders):
        out_of_date_files = []
        # sql_select = "SELECT fname, last_modified FROM all_data WHERE fname = ? and last_modified >= ?"
        sql_select = """
        SELECT file.basename, file.last_modified
            FROM file
                INNER JOIN folder
                    ON folder.folder_id = file.folder_id
            WHERE file.basename = ? AND file.extension = ? AND folder.name = ? AND file.last_modified >= ?
        """
        for dir, _date in modified_folders:
            for file in os.listdir(dir):
                base, extension = os.path.splitext(file)
                if (extension.lower() in (ImageCache.EXTENSIONS + VIDEO_EXTENSIONS)
                        # have to filter out all the Apple junk
                        and '.AppleDouble' not in dir and not file.startswith('.')):
                    full_file = os.path.join(dir, file)
                    try:
                        mod_tm = os.path.getmtime(full_file)
                    except:
                        # This file must have been removed since the listdir
                        self.__logger.warning("File '%s' no longer exists", full_file)
                        continue
                    found = self.__db.execute(sql_select, (base, extension.lstrip("."), dir, mod_tm)).fetchone()
                    if not found:
                        out_of_date_files.append(full_file)
        return out_of_date_files

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use `except Exception:` rather than bare `except:` ([`do-not-use-bare-except`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/do-not-use-bare-except/))

```suggestion
                    except Exception:
```
</issue_to_address>

### Comment 4
<location> `src/picframe/image_cache.py:470` </location>
<code_context>
    def __purge_missing_files_and_folders(self):
        # Find folders in the db that are no longer on disk
        self.__logger.debug('Checking for orphaned folders in cache')
        folder_id_list = []
        sql = 'SELECT folder_id, name from folder'
        if not self.__purge_files:
            sql += ' where missing = 0'
        for row in self.__db.execute(sql):
            if not os.path.exists(row['name']):
                folder_id_list.append([row['folder_id']])

        # Flag or delete any non-existent folders from the db. Note, deleting will automatically
        # remove orphaned records from the 'file' and 'meta' tables
        if len(folder_id_list):
            self.__db_write_lock.acquire()
            if self.__purge_files:
                self.__logger.info('Purging %d orphaned folders in cache', len(folder_id_list))
                self.__db.executemany('DELETE FROM folder WHERE folder_id = ?', folder_id_list)
            else:
                self.__logger.info('Marking %d orphaned folders in cache', len(folder_id_list))
                self.__db.executemany('UPDATE folder SET missing = 1 WHERE folder_id = ?', folder_id_list)
            self.__db_write_lock.release()

        # Find files in the db that are no longer on disk
        if self.__purge_files:
            file_id_list = []
            for row in self.__db.execute('SELECT file_id, fname from all_data'):
                if not os.path.exists(row['fname']):
                    file_id_list.append([row['file_id']])

            # Delete any non-existent files from the db. Note, this will automatically
            # remove matching records from the 'meta' table as well.
            self.__logger.info('Purging %d orphaned files in cache', len(file_id_list))
            if len(file_id_list):
                self.__db_write_lock.acquire()
                self.__db.executemany('DELETE FROM file WHERE file_id = ?', file_id_list)
                self.__db_write_lock.release()
            self.__purge_files = False

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Move assignment closer to its usage within a block ([`move-assign-in-block`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/move-assign-in-block/))
- Convert for loop into list comprehension [×2] ([`list-comprehension`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/list-comprehension/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

mod_tm = int(os.stat(dir).st_mtime)
try:
mod_tm = int(os.stat(dir).st_mtime)
except:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Bare except may mask unexpected errors when checking folder modification time.

Catch specific exceptions like FileNotFoundError or OSError instead of using a bare except to avoid suppressing unrelated errors.

mod_tm = int(os.stat(dir).st_mtime)
try:
mod_tm = int(os.stat(dir).st_mtime)
except:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Use except Exception: rather than bare except: (do-not-use-bare-except)

Suggested change
except:
except Exception:

mod_tm = os.path.getmtime(full_file)
try:
mod_tm = os.path.getmtime(full_file)
except:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Use except Exception: rather than bare except: (do-not-use-bare-except)

Suggested change
except:
except Exception:

@@ -442,8 +468,12 @@ def __get_meta_sql_from_dict(self, dict):

def __purge_missing_files_and_folders(self):
# Find folders in the db that are no longer on disk
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): We've found these issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant