
Update nsfw detection and improve tool latency #35

Open

Agent-Hellboy wants to merge 25 commits into main from improve_nfsw_detection_and_tool_latency

Conversation

@Agent-Hellboy (Owner) commented Jun 2, 2025

Fixes #33, #34

Summary by CodeRabbit

  • New Features

    • Introduced advanced NSFW detection using a TensorFlow Lite model for image scanning.
    • Added detailed scan results, including NSFW probability scores for each file.
    • Improved statistics reporting with counts and lists of NSFW and safe files.
  • Improvements

    • Enhanced performance by processing images in memory without temporary file storage.
    • Clarified and expanded build prerequisites and instructions in the documentation.
    • Unified naming conventions and improved dependency handling for easier setup.
    • Updated build workflow to include TensorFlow Lite compilation and streamlined configuration.
    • Refined usage instructions with options for local folder scanning and thread sanitizer builds.
    • Improved GitHub Actions workflow with detailed dependency setup and build verification.
    • Enhanced macOS configure script with checks for TensorFlow Lite and FlatBuffers dependencies.
    • Added an installation script to automate dependency setup and library builds.
    • Replaced naive skin detection with a robust class-based NSFW detection implementation.
    • Improved scanning process by integrating the NSFW detector throughout recursive and asynchronous tasks.
    • Updated project and executable naming to "iPurity" for consistency.
  • Bug Fixes

    • Improved error handling for missing dependencies and model initialization failures.

@coderabbitai
Contributor

coderabbitai bot commented Jun 2, 2025

Walkthrough

This update introduces a class-based NSFW detection system using TensorFlow Lite and OpenCV, refactors image scanning to operate in-memory without temporary files, and enhances dependency management in the build system. The project and executable are renamed to "iPurity," and new header files formalize public interfaces for scanning and detection.

Changes

File(s) | Change Summary
CMakeLists.txt | Project and target renamed to "iPurity"; improved dependency detection for OpenCV, TensorFlow Lite, and others; updated install and tarball naming; enhanced error handling.
README.md | Expanded and clarified prerequisites; added TensorFlow Lite, libplist, and CMake; improved build instructions.
src/main.cpp | NSFWDetector instance is initialized and used for scanning; threshold handling updated; integrates new detector logic.
src/nsfw_detector.cpp, src/nsfw_detector.h | Introduced NSFWDetector class for TensorFlow Lite-based image analysis; added model loading, preprocessing, and detection methods; removed naive skin detection.
src/scanner.cpp, src/scanner.h | Refactored to download files into memory buffers; image decoding and NSFW detection now use NSFWDetector; updated function signatures and statistics reporting.
.github/workflows/c-cpp.yml | Updated CI to install additional dependencies; added Bazel build step for TensorFlow Lite library; improved CMake configuration commands.
configure | Enhanced configure script to check for TensorFlow Lite and FlatBuffers dependencies with detailed error messages and path discovery.
install.sh | Added installation script automating dependency setup including Homebrew packages, FlatBuffers build, and TensorFlow Lite build via Bazel.
include/nsfw_detector.h | Removed declaration of naiveNSFWCheck function; added NSFWDetector class declaration in new header.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Main
    participant NSFWDetector
    participant Scanner

    User->>Main: Start iPurity
    Main->>NSFWDetector: initialize(modelPath)
    NSFWDetector-->>Main: Initialization status
    Main->>Scanner: scan_directory(pool, path, stats, threshold, detector)
    loop For each image file
        Scanner->>Scanner: download_file_to_buffer(...)
        Scanner->>NSFWDetector: detectNSFW(image)
        NSFWDetector-->>Scanner: NSFW probability
        Scanner->>Scanner: Update stats & output result
    end
    Scanner-->>Main: Scanning complete

Assessment against linked issues

Objective | Addressed | Explanation
Replace naive skin detection with TensorFlow Lite-based deep learning model (#33) | ✅ | NSFWDetector class implemented using TensorFlow Lite; replaces previous naive detection logic.

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected relative to the objectives in the linked issues.

Poem

In the warren where code bunnies dwell,
iPurity hops with a new name to tell.
With TensorFlow Lite and OpenCV in tow,
Images are scanned—no temp files below!
NSFWDetector sniffs with care,
All in memory, swift as air.
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2af655d and a9b097e.

📒 Files selected for processing (2)
  • CMakeLists.txt (1 hunks)
  • src/nsfw_detector.cpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CMakeLists.txt
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (3)
src/nsfw_detector.cpp (3)

10-16: LGTM: Proper resource management.

The constructor and destructor correctly manage the initialization state and properly clean up TensorFlow Lite resources using smart pointer reset().


18-42: LGTM: Robust model path resolution.

The function provides good fallback logic to locate the model file across different installation paths, including Homebrew locations and development directories.


44-82: LGTM: Comprehensive model initialization with proper error handling.

The initialization method properly loads the TensorFlow Lite model, builds the interpreter, allocates tensors, and provides detailed error messages for debugging.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (6)
src/nsfw_detector.cpp (3)

8-12: Redundant reset() calls in destructor

The explicit reset() calls are unnecessary since std::unique_ptr automatically releases resources in its destructor.

 NSFWDetector::~NSFWDetector() {
-    // Cleanup resources
-    interpreter.reset();
-    model.reset();
+    // Destructor - unique_ptrs automatically clean up
 }

88-91: Document the model output format assumption

The code assumes output[1] contains the NSFW probability, but this depends on the specific model being used.

Add a comment documenting this assumption:

     // Get output
     float* output = interpreter->typed_output_tensor<float>(0);
-    return output[1]; // Assuming output[1] is the NSFW probability
+    // Model output format: [safe_probability, nsfw_probability]
+    // Return the NSFW probability (index 1)
+    return output[1];

110-141: Consider refactoring for consistency with NSFWDetector

The naiveNSFWCheck function still uses file paths while NSFWDetector uses cv::Mat. Consider refactoring for consistency.

Refactor to accept cv::Mat instead of file path:

-bool naiveNSFWCheck(const std::string& imagePath, float skinThreshold) {
-    // 1. Load the image in BGR format
-    cv::Mat imgBGR = cv::imread(imagePath, cv::IMREAD_COLOR);
-    if (imgBGR.empty()) {
-        std::cerr << "Could not load image: " << imagePath << std::endl;
-        return false;
-    }
+bool naiveNSFWCheck(const cv::Mat& imgBGR, float skinThreshold) {
+    if (imgBGR.empty()) {
+        return false;
+    }
CMakeLists.txt (2)

8-12: Redundant check after REQUIRED flag

When using find_package(OpenCV REQUIRED), CMake will automatically fail if OpenCV is not found. The additional check is unnecessary.

Remove the redundant check:

 # Find OpenCV
 find_package(OpenCV REQUIRED)
-if(NOT OpenCV_FOUND)
-    message(FATAL_ERROR "OpenCV not found! Please install OpenCV and set OpenCV_DIR if needed.")
-endif()

38-38: Remove unused IMOBILEDEVICE_LIBS variable

The IMOBILEDEVICE_LIBS variable is defined but not used in linking. The libraries are linked directly instead.

Remove the unused variable:

-# Set additional libraries (iMobileDevice)
-set(IMOBILEDEVICE_LIBS "-limobiledevice -lplist")

Also applies to: 56-61

src/scanner.cpp (1)

24-41: Consider memory limits for large files

The in-memory approach improves performance but could cause issues with very large image files. Consider adding a size check or streaming approach for files above a threshold.

Add a size limit check:

// Add before the read loop (sketch: afc_get_file_info returns a
// NULL-terminated key/value string list, e.g. "st_size" followed by the size)
char **fileInfo = NULL;
if (afc_get_file_info(afc, remotePath, &fileInfo) == AFC_E_SUCCESS) {
    // Find the "st_size" key, parse its value, and warn or stream
    // if it exceeds a chosen threshold
    afc_dictionary_free(fileInfo);
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f227af and 2555085.

📒 Files selected for processing (7)
  • CMakeLists.txt (3 hunks)
  • README.md (4 hunks)
  • src/main.cpp (2 hunks)
  • src/nsfw_detector.cpp (1 hunks)
  • src/nsfw_detector.h (1 hunks)
  • src/scanner.cpp (4 hunks)
  • src/scanner.h (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/nsfw_detector.h (1)
src/nsfw_detector.cpp (2)
  • NSFWDetector (6-6)
  • NSFWDetector (8-12)
src/nsfw_detector.cpp (1)
src/nsfw_detector.h (5)
  • NSFWDetector (13-13)
  • NSFWDetector (14-14)
  • modelPath (16-16)
  • input (24-24)
  • image (17-17)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (13)
README.md (2)

21-26: LGTM! Clear documentation of new dependencies.

The updated prerequisites section accurately reflects the new dependencies introduced with the TensorFlow Lite-based NSFW detection system. The additions are well-organized and provide clear guidance to users.


49-49: Helpful user guidance for dependency issues.

This note provides practical guidance for users encountering build issues due to missing dependencies, improving the overall user experience.

src/main.cpp (3)

32-32: Good default threshold choice.

The default threshold of 0.5f aligns with the NSFW_THRESHOLD constant defined in the NSFWDetector class, ensuring consistency across the codebase.


62-62: Clean integration of detector into scanning workflow.

The updated function call properly passes the detector pointer, enabling the new TensorFlow Lite-based NSFW detection during directory scanning.


25-30: ❓ Verification inconclusive

Verify the hardcoded model path exists.

The initialization looks clean with proper error handling. However, the hardcoded model path "models/nsfw_model.tflite" should be verified to ensure the file exists in the expected location relative to the executable.


🏁 Script executed:

#!/bin/bash
# Description: Check if the NSFW model file exists in the expected location
# Expected: Find the model file in the models directory

fd "nsfw_model.tflite" --type f

Length of output: 31


Verify presence of NSFW model file

I ran a search for models/nsfw_model.tflite but didn’t locate it in the repo. Please confirm that:

  • The file models/nsfw_model.tflite exists at runtime (relative to your executable’s working directory).
  • It’s included in your repository or distribution package, or update the hardcoded path accordingly.

No code changes are needed if the file is already in place—just ensure it’s committed and deployed.

src/scanner.h (1)

1-18: Well-designed interface with clean separation of concerns.

The header file provides a clean interface for the scanning functionality:

  • Proper header guards
  • Well-structured ScanStats for tracking scan results
  • Clear function signature that accepts the NSFWDetector pointer
  • Minimal and appropriate includes

The design promotes good modularity and testability.

src/nsfw_detector.h (2)

11-27: Excellent class design following RAII principles.

The NSFWDetector class demonstrates solid design principles:

  • Smart pointers for automatic resource management of TensorFlow Lite objects
  • Clear public interface with initialize and detectNSFW methods
  • Proper encapsulation with private preprocessing method
  • Appropriate constants (INPUT_SIZE = 224 is standard for image classification)
  • The NSFW_THRESHOLD = 0.5f constant aligns with the default used in main.cpp

The interface supports both initialization error checking and per-image detection, enabling robust integration into the scanning workflow.


20-22: Smart pointer usage ensures proper resource cleanup.

Using std::unique_ptr for the TensorFlow Lite model and interpreter ensures automatic cleanup when the detector is destroyed. This aligns well with the destructor implementation seen in the relevant code snippets from nsfw_detector.cpp.

src/nsfw_detector.cpp (1)

14-40: Well-structured initialization with proper error handling

The initialization method properly loads the model, builds the interpreter, and allocates tensors with clear error messages at each step.

CMakeLists.txt (2)

6-6: Correct CMake boolean value

Good fix - ON is the standard CMake boolean value.


14-19: Good TensorFlow Lite detection logic

The detection handles multiple library name variants and provides clear error messages.

src/scanner.cpp (2)

43-88: Excellent refactoring to in-memory processing

The conversion from file-based to buffer-based processing improves performance, and the probability score display provides valuable feedback. Thread safety is properly maintained.


90-124: Correct propagation of NSFWDetector instance

The function properly passes the detector instance through recursive directory scanning and asynchronous processing.

Comment on lines +42 to +63
bool NSFWDetector::preprocessImage(const cv::Mat& input, float* output) {
cv::Mat resized;
cv::resize(input, resized, cv::Size(INPUT_SIZE, INPUT_SIZE));

// Convert to RGB and normalize
cv::Mat rgb;
cv::cvtColor(resized, rgb, cv::COLOR_BGR2RGB);
rgb.convertTo(rgb, CV_32FC3, 1.0/255.0);

// Copy data to input tensor
float* input_tensor = interpreter->typed_input_tensor<float>(0);
for (int y = 0; y < INPUT_SIZE; y++) {
for (int x = 0; x < INPUT_SIZE; x++) {
cv::Vec3f pixel = rgb.at<cv::Vec3f>(y, x);
*input_tensor++ = pixel[0];
*input_tensor++ = pixel[1];
*input_tensor++ = pixel[2];
}
}

return true;
}

🛠️ Refactor suggestion

⚠️ Potential issue

Critical: Unused parameter and inefficient implementation

The method has several issues:

  1. The output parameter is never used - the method writes directly to the interpreter's input tensor
  2. No error handling or validation
  3. Manual pixel copying is inefficient

Refactor to remove the unused parameter and improve efficiency:

-bool NSFWDetector::preprocessImage(const cv::Mat& input, float* output) {
+bool NSFWDetector::preprocessImage(const cv::Mat& input) {
+    if (!interpreter) {
+        return false;
+    }
+    
     cv::Mat resized;
     cv::resize(input, resized, cv::Size(INPUT_SIZE, INPUT_SIZE));
     
     // Convert to RGB and normalize
     cv::Mat rgb;
     cv::cvtColor(resized, rgb, cv::COLOR_BGR2RGB);
     rgb.convertTo(rgb, CV_32FC3, 1.0/255.0);
     
     // Copy data to input tensor
     float* input_tensor = interpreter->typed_input_tensor<float>(0);
-    for (int y = 0; y < INPUT_SIZE; y++) {
-        for (int x = 0; x < INPUT_SIZE; x++) {
-            cv::Vec3f pixel = rgb.at<cv::Vec3f>(y, x);
-            *input_tensor++ = pixel[0];
-            *input_tensor++ = pixel[1];
-            *input_tensor++ = pixel[2];
-        }
-    }
+    // More efficient memory copy
+    if (rgb.isContinuous()) {
+        memcpy(input_tensor, rgb.ptr<float>(), INPUT_SIZE * INPUT_SIZE * 3 * sizeof(float));
+    } else {
+        for (int y = 0; y < INPUT_SIZE; y++) {
+            memcpy(input_tensor + y * INPUT_SIZE * 3, 
+                   rgb.ptr<float>(y), 
+                   INPUT_SIZE * 3 * sizeof(float));
+        }
+    }
     
     return true;
 }

Also update the call in detectNSFW:

-    float* input = interpreter->typed_input_tensor<float>(0);
-    if (!preprocessImage(image, input)) {
+    if (!preprocessImage(image)) {

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
configure (1)

1-172: ⚠️ Potential issue

Critical: CMake configuration in wrong file.

This file is named configure but contains CMakeLists.txt content. This will break the build system as:

  1. CMake expects the configuration to be in a file named CMakeLists.txt
  2. The GitHub Actions workflow and README instructions reference CMake commands that require CMakeLists.txt

The content itself looks well-structured with proper dependency detection and error messages, but it needs to be in the correct file.

Please move this content to CMakeLists.txt and remove the configure file, or if you need a configure script for compatibility, create a simple shell script that invokes CMake.

🧹 Nitpick comments (6)
.github/workflows/c-cpp.yml (1)

21-27: Consider caching the TensorFlow Lite build artifacts.

Building TensorFlow Lite with Bazel can be time-consuming. Consider using GitHub Actions cache to store the built artifacts and speed up subsequent workflow runs.

Would you like me to provide a caching configuration for the TensorFlow Lite build artifacts?

README.md (4)

85-89: Add language specifier to the code block.

The code block should specify the language for proper syntax highlighting.

-   ```
+   ```text
    /Users/agent-hellboy/iPurity/src/tensorflow/bazel-bin/tensorflow/lite/libtensorflowlite.so
    ```
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

86-86: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


91-95: Add language specifier to the code block.

The code block should specify the language for proper syntax highlighting.

-   ```
+   ```text
    /Users/agent-hellboy/iPurity/src/tensorflow  (root containing `tensorflow/lite/interpreter.h`)
    ```
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

92-92: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


96-100: Add language specifier to the code block.

The code block should specify the language for proper syntax highlighting.

-   ```
+   ```text
    /usr/local/include/flatbuffers   (via Homebrew)
    ```
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

97-97: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


128-132: Address the TODO comment.

There's a TODO about configuring the build for macOS to avoid passing build flags during CMake configuration. This should be tracked or resolved.

Would you like me to help implement the macOS-specific configuration or create an issue to track this task?

configure (1)

26-43: Consider making find paths more flexible.

Using NO_DEFAULT_PATH in find commands might be too restrictive and could prevent CMake from finding TensorFlow Lite in non-standard installations.

Consider removing NO_DEFAULT_PATH or providing a CMake option to disable it:

 find_path(
     TFLITE_INCLUDE_DIR
     NAMES tensorflow/lite/interpreter.h
     PATHS
       ${PROJECT_SOURCE_DIR}/src/tensorflow
       /usr/local/include
       /opt/homebrew/include
-      NO_DEFAULT_PATH
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2555085 and 3002e0f.

📒 Files selected for processing (4)
  • .github/workflows/c-cpp.yml (1 hunks)
  • README.md (3 hunks)
  • configure (1 hunks)
  • src/scanner.cpp (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/scanner.cpp
🧰 Additional context used
🪛 LanguageTool
README.md

[grammar] ~90-~90: The verb ‘include’ does not usually follow articles like ‘the’. Check that ‘include’ is spelled correctly; using ‘include’ as a noun may be non-standard.
Context: .../libtensorflowlite.so 5. **Note the include directory**: /Users/agent-he...

(A_INFINITIVE)

🪛 markdownlint-cli2 (0.17.2)
README.md

86-86: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


92-92: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


97-97: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🔇 Additional comments (3)
.github/workflows/c-cpp.yml (2)

17-20: LGTM! Dependencies align with the new TensorFlow Lite integration.

The addition of flatbuffers, cmake, make, and bazel is necessary for building TensorFlow Lite from source.


31-36: Good improvement to the build directory handling.

The cd build || mkdir build && cd build pattern properly handles both cases where the build directory exists or needs to be created.

README.md (1)

21-100: Excellent documentation improvements!

The comprehensive prerequisites section with detailed installation steps for each dependency greatly improves the developer experience. The step-by-step TensorFlow Lite build instructions are particularly helpful.

🧰 Tools
🪛 LanguageTool

[grammar] ~90-~90: The verb ‘include’ does not usually follow articles like ‘the’. Check that ‘include’ is spelled correctly; using ‘include’ as a noun may be non-standard.
Context: .../libtensorflowlite.so 5. **Note the include directory**: /Users/agent-he...

(A_INFINITIVE)

🪛 markdownlint-cli2 (0.17.2)

86-86: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


92-92: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


97-97: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
.github/workflows/c-cpp.yml (4)

17-24: Cache Homebrew packages to speed up workflow
Running brew update and reinstalling dependencies on every execution adds significant latency. Consider caching Homebrew’s cache directory and installed formulae using actions/cache, then restoring it on subsequent runs.

Proposed diff:

-      - name: Install dependencies
-        run: |
-          brew update
-          brew unlink bazelisk || true
-          brew install pkg-config opencv libimobiledevice libplist flatbuffers cmake make bazel
+      - name: Cache Homebrew packages
+        uses: actions/cache@v3
+        with:
+          path: ~/Library/Caches/Homebrew
+          key: ${{ runner.os }}-brew-${{ hashFiles('**/Brewfile') }}
+
+      - name: Install dependencies
+        run: |
+          brew update --quiet
+          brew unlink bazelisk || true
+          brew install --quiet pkg-config opencv libimobiledevice libplist flatbuffers cmake make bazel

27-27: Shallow-clone TensorFlow to reduce clone time
Cloning the entire TensorFlow history can be slow and bandwidth-heavy. Since only the latest state is needed for TFLite, use a depth-limited clone:

- git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+ git clone --depth 1 https://github.com/tensorflow/tensorflow.git src/tensorflow

31-36: Optimize Bazel build performance with caching and parallelism
Building TensorFlow Lite from source each run is costly. You can:

  1. Cache Bazel’s output base (~/.cache/bazel) across runs.
  2. Increase parallelism with --jobs=$(sysctl -n hw.ncpu).

Example caching step:

- name: Cache Bazel output
  uses: actions/cache@v3
  with:
    path: ~/.cache/bazel
    key: ${{ runner.os }}-bazel-${{ hashFiles('**/BUILD') }}

And update the build command:

- bazel build -c opt //tensorflow/lite:libtensorflowlite.dylib
+ bazel build -c opt --jobs=$(sysctl -n hw.ncpu) //tensorflow/lite:libtensorflowlite.dylib

42-46: Refactor hardcoded paths into reusable environment variables
The CMake invocation repeats long paths. Defining an env block at the job or step level improves readability and reduces maintenance overhead. For example:

env:
  TFLITE_ROOT: ${{ github.workspace }}/src/tensorflow
  TFLITE_LIB: ${{ env.TFLITE_ROOT }}/bazel-bin/tensorflow/lite/libtensorflowlite.dylib

- cmake \
-   -DTFLITE_INCLUDE_DIR=${{ github.workspace }}/src/tensorflow \
-   -DTFLITE_LIB=${{ github.workspace }}/src/tensorflow/bazel-bin/tensorflow/lite/libtensorflowlite.dylib \
+ cmake \
+   -DTFLITE_INCLUDE_DIR=${{ env.TFLITE_ROOT }} \
+   -DTFLITE_LIB=${{ env.TFLITE_LIB }} \
    -DCMAKE_BUILD_TYPE=Release \
    ..

This makes future path adjustments much easier.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8722d1b and 0303026.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
.github/workflows/c-cpp.yml (4)

17-17: Leverage caching to speed up macOS builds
Running brew update on every workflow run can slow down CI. Consider using actions/cache to cache Homebrew’s keg files or the Homebrew/homebrew-cache action to improve build latency.


19-21: Unlinking Bazelisk is safe but can be streamlined
The brew unlink bazelisk || true step correctly avoids failures if Bazelisk isn't installed. Alternatively, you could use brew link --overwrite bazelisk to ensure a clean link, or manage all deps via a Brewfile and brew bundle for consistency.


22-23: Pin Homebrew dependencies or use a Brewfile
Installing packages without version constraints can lead to unexpected upgrades and breakages. For reproducible builds, consider using a Brewfile with brew bundle or explicitly pin critical packages (e.g., bazelisk@X.Y.Z).
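As a sketch, a Brewfile for this workflow might look like the following — the formula list is an assumption based on the packages mentioned elsewhere in this PR:

```ruby
# Brewfile — install with `brew bundle` for a reproducible CI setup
brew "bazelisk"
brew "cmake"
brew "libimobiledevice"
brew "libplist"
brew "opencv"
```

The individual brew install lines can then be replaced with a single brew bundle --file=Brewfile step.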


35-38: Optimize TensorFlow checkout and build steps
Cloning the full TensorFlow repo and running ./configure without parameters can be time-consuming. To reduce CI latency:

  • Shallow clone: git clone --depth 1 …
  • Supply explicit flags to ./configure instead of piping empty input.
  • Cache Bazel outputs with actions/cache keyed on workspace and Bazel version.

These changes will make your TFLite build faster and more robust.
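The first two suggestions could be combined into one workflow step — the TensorFlow tag and TF_* flags below are assumptions; pick the release your TFLite build actually targets:

```yaml
- name: Clone and configure TensorFlow
  env:
    # Answer configure prompts non-interactively (assumed feature set)
    TF_NEED_CUDA: "0"
    TF_NEED_ROCM: "0"
    TF_ENABLE_XLA: "0"
  run: |
    git clone --depth 1 --branch v2.16.1 \
      https://github.com/tensorflow/tensorflow.git src/tensorflow
    cd src/tensorflow
    ./configure
```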

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0303026 and 55fc392.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (1)
.github/workflows/c-cpp.yml (1)

44-48: Verify CMake include/lib paths for TFLite
Ensure TFLITE_INCLUDE_DIR points to the actual header folder (e.g., src/tensorflow/tensorflow/lite) and not the repo root. For example:

- -DTFLITE_INCLUDE_DIR=${{ github.workspace }}/src/tensorflow \
+ -DTFLITE_INCLUDE_DIR=${{ github.workspace }}/src/tensorflow/tensorflow/lite

Also confirm your CMakeLists.txt uses these variables in target_include_directories and target_link_libraries.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
.github/workflows/c-cpp.yml (4)

17-18: Eliminate redundant brew unlink bazelisk
The brew install bazelisk step will handle linking automatically. Unlinking Bazelisk beforehand adds an extra step without clear benefit.


27-30: Use shallow clone for TensorFlow to reduce latency
Cloning the full TensorFlow repository can be very slow. A shallow clone will speed up CI:

- git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+ git clone --depth 1 https://github.com/tensorflow/tensorflow.git src/tensorflow

31-37: Cache Bazelisk and Bazel build outputs to improve CI latency
Building TensorFlow Lite from scratch on every run is time-consuming. Consider adding actions/cache entries for:

  1. Bazelisk installation (e.g., the Homebrew Cellar path)
  2. Bazel’s cache directory (~/.cache/bazel)

This will significantly reduce build times for subsequent workflows.


59-63: Re-enable CI tests for build verification
The commented-out check and distcheck targets help ensure regressions are caught early. Consider enabling at least a basic test step to validate the new NSFW detector integration.
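Even without a full test suite, a smoke-test step can catch broken binaries — a minimal sketch, where the build directory and executable name are assumptions:

```yaml
- name: Smoke-test build
  run: |
    # Fail fast if the expected executable was not produced
    test -x build/iPurity
```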

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddde90e and 065decc.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (2)
.github/workflows/c-cpp.yml (2)

23-26: Appropriate Bazelisk PATH configuration
Appending Bazelisk’s bin directory to $GITHUB_PATH avoids permission issues on macOS runners and ensures bazel resolves to Bazelisk. This effectively addresses the previous symlink concern.


45-47: Verify TensorFlow Lite include path in CMake configuration
-DTFLITE_INCLUDE_DIR=${{ github.workspace }}/src/tensorflow may not directly point to the headers needed (e.g., tensorflow/lite/interpreter.h). Please confirm that your CMakeLists resolves the correct include subdirectories under src/tensorflow.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
.github/workflows/c-cpp.yml (4)

17-25: Cache Homebrew packages to speed up workflow
brew update and reinstalling packages on every run adds latency. Consider using actions/cache to persist Homebrew’s keg cellar (e.g., /usr/local/Homebrew/Cellar) and reusing installed formulae between jobs. You can key the cache on the macOS runner and brewfile checksum to avoid full updates each time.


35-42: Cache Bazel build outputs for speed
Building TFLite from source with Bazelisk can be slow on each run. Use actions/cache to persist Bazel’s output directories (e.g., ~/.cache/bazel, src/tensorflow/bazel-*) and add --disk_cache=$HOME/.cache/bazel to the Bazelisk command to reuse prior builds.
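One possible shape for this, pairing actions/cache with Bazel's --disk_cache flag — paths and cache keys are assumptions:

```yaml
- name: Cache Bazel disk cache
  uses: actions/cache@v3
  with:
    path: ~/.cache/bazel-disk
    key: ${{ runner.os }}-tflite-${{ hashFiles('src/tensorflow/.bazelversion') }}

- name: Build TFLite
  working-directory: src/tensorflow
  run: |
    # --disk_cache lets Bazel reuse action outputs restored by the cache step
    bazelisk build -c opt --disk_cache="$HOME/.cache/bazel-disk" \
      //tensorflow/lite:libtensorflowlite.dylib
```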


44-58: Quote variables in CMake invocation for robustness
While paths usually lack spaces, wrapping expansions in quotes prevents potential errors on different environments. For example:

cmake \
  -DTFLITE_INCLUDE_DIR="${{ github.workspace }}/src/tensorflow" \
  -DTFLITE_LIB="${{ github.workspace }}/src/tensorflow/bazel-bin/tensorflow/lite/libtensorflowlite.dylib" \
  ...

Also consider exporting BREW_PREFIX at the top of this block to clean up repeated computation.


65-73: Enable Test and Dist Check steps when ready
The commented-out Test and Dist Check targets are valuable for catching regressions. Once you have CTest targets or Bazel tests for the NSFW detector and scanner, consider re-enabling these to validate builds automatically.

Would you like help defining CTest targets in your CMakeLists.txt and unblocking these steps?
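For illustration, a minimal CTest hookup might look like this — the --help flag and test name are assumptions, not part of the current CMakeLists:

```cmake
# Minimal CTest wiring so CI's `check` step has something to run.
enable_testing()
# `--help` is an assumed flag; substitute any invocation that exits 0.
add_test(NAME ipurity_smoke COMMAND iPurity --help)
```

With this in place, the commented-out Test step could simply run ctest --output-on-failure from the build directory.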

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 065decc and 3e5415f.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (2)
.github/workflows/c-cpp.yml (2)

27-29: Good: Dynamically adding Bazelisk to PATH
Using brew --prefix bazelisk and appending to $GITHUB_PATH avoids symlink permission issues on macOS runners and ensures bazel resolves to Bazelisk correctly.


60-63: Approve: CMake build step
The invocation of cmake --build . correctly leverages the prior configuration and aligns with standard CMake workflows.

Comment on lines +31 to +33
- name: Clone TensorFlow
run: |
git clone https://github.com/tensorflow/tensorflow.git src/tensorflow

🛠️ Refactor suggestion

Shallow-clone TensorFlow to reduce clone time
Cloning the full TensorFlow repo is time-consuming. A shallow clone (e.g., git clone --depth 1 ...) will greatly speed up this step.

🤖 Prompt for AI Agents
In .github/workflows/c-cpp.yml at lines 31 to 33, the git clone command clones
the entire TensorFlow repository, which is slow. Modify the git clone command to
perform a shallow clone by adding the --depth 1 option to reduce clone time
significantly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
configure (3)

18-26: Consider relaxing NO_DEFAULT_PATH restrictions for better portability.

The NO_DEFAULT_PATH flag prevents CMake from searching standard system locations, which could cause issues on systems where dependencies are installed in standard locations not explicitly listed in PATHS.

Consider this approach for better portability:

 find_path(
     TFLITE_INCLUDE_DIR
     NAMES tensorflow/lite/interpreter.h
     PATHS
       ${PROJECT_SOURCE_DIR}/src/tensorflow
       /usr/local/include
       /opt/homebrew/include
-      NO_DEFAULT_PATH
 )
 find_library(
     TFLITE_LIB
     NAMES tensorflow-lite tensorflowlite
     PATHS
       ${PROJECT_SOURCE_DIR}/src/tensorflow/bazel-bin/tensorflow/lite
       /usr/local/lib
       /opt/homebrew/lib
-      NO_DEFAULT_PATH
 )

Also applies to: 31-35


37-48: Enhance error message with cross-platform instructions.

The error message provides helpful Bazel build instructions but is macOS-specific and could be improved for cross-platform compatibility.

Consider updating the error message:

 if (NOT TFLITE_INCLUDE_DIR OR NOT TFLITE_LIB)
     message(FATAL_ERROR
         "TensorFlow Lite not found!\n"
         " • Build TFLite via Bazel:\n"
         "     cd ${PROJECT_SOURCE_DIR}/src/tensorflow\n"
-        "     bazel build -c opt //tensorflow/lite:libtensorflowlite.so\n"
+        "     bazel build -c opt //tensorflow/lite:tensorflowlite\n"
         " • Then rerun CMake from project root with:\n"
         "     -DTFLITE_INCLUDE_DIR=${PROJECT_SOURCE_DIR}/src/tensorflow \\\n"
-        "     -DTFLITE_LIB=${PROJECT_SOURCE_DIR}/src/tensorflow/bazel-bin/tensorflow/lite/libtensorflowlite.so \\\n"
+        "     -DTFLITE_LIB=<path_to_built_library> \\\n"
         "     -DCMAKE_BUILD_TYPE=Release"
     )
 endif()

55-63: Apply NO_DEFAULT_PATH consistency recommendations.

Similar to the TensorFlow Lite discovery, these dependency searches would benefit from removing NO_DEFAULT_PATH for better cross-platform compatibility.

Also applies to: 79-83, 84-88, 89-93

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e5415f and 09e8959.

📒 Files selected for processing (2)
  • .github/workflows/c-cpp.yml (1 hunks)
  • configure (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/c-cpp.yml
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (2)
configure (2)

1-7: LGTM: Clean CMake setup with appropriate C++ standard.

The basic CMake configuration is well-structured with a reasonable minimum version requirement and proper C++17 standard enforcement.


108-114: ✅ Verification successful

Verify source file existence and consistency.

Ensure all listed source files exist and match the current project structure, especially given the project rename to "iPurity".


🏁 Script executed:

#!/bin/bash
# Description: Verify all source files referenced in CMakeLists.txt exist
echo "Checking for source files referenced in CMakeLists.txt..."

files=(
    "src/main.cpp"
    "src/scanner.cpp" 
    "src/nsfw_detector.cpp"
    "src/afc_client_pool.cpp"
    "src/afc_helpers.cpp"
)

missing_files=()
for file in "${files[@]}"; do
    if [[ ! -f "$file" ]]; then
        missing_files+=("$file")
    fi
done

if [[ ${#missing_files[@]} -gt 0 ]]; then
    echo "Missing source files:"
    printf '%s\n' "${missing_files[@]}"
    exit 1
else
    echo "All source files found."
fi

# Also check if include directory exists
if [[ ! -d "include" ]]; then
    echo "Warning: include directory not found"
fi

Length of output: 652


✔️ Source files verified

All sources listed in configure (CMake) are present under src/, and the include/ directory exists. No further changes needed.

configure Outdated
Comment on lines +148 to +153
add_custom_target(configure
COMMAND ${CMAKE_COMMAND} -E echo "Ensure all dependencies are installed."
)
add_custom_target(check
COMMAND ${CMAKE_COMMAND} -E echo "No tests implemented."
)

🛠️ Refactor suggestion

Implement meaningful functionality for custom targets.

The configure and check targets currently provide minimal functionality. Consider implementing actual dependency checking and test execution.

 add_custom_target(configure
-    COMMAND ${CMAKE_COMMAND} -E echo "Ensure all dependencies are installed."
+    COMMAND ${CMAKE_COMMAND} -E echo "Configuration completed successfully. All dependencies found."
+    DEPENDS iPurity
 )
 add_custom_target(check
-    COMMAND ${CMAKE_COMMAND} -E echo "No tests implemented."
+    COMMAND ${CMAKE_COMMAND} -E echo "Running basic executable test..."
+    COMMAND $<TARGET_FILE:iPurity> --version 2>/dev/null || echo "iPurity executable built successfully"
+    DEPENDS iPurity
 )
🤖 Prompt for AI Agents
In the configure file around lines 148 to 153, the custom targets 'configure'
and 'check' currently only echo messages without performing real actions. Update
the 'configure' target to include actual dependency checks, such as verifying
required libraries or tools are installed, and update the 'check' target to run
the project's test suite or relevant validation commands. Replace the echo
commands with appropriate CMake commands or scripts that implement these
functionalities.

configure Outdated
${TFLITE_LIB}
${IMOBILEDEVICE_LIB}
${PLIST_LIB}
"-framework Accelerate" # TFLite on macOS needs Accelerate

🛠️ Refactor suggestion

Verify Accelerate framework requirement across platforms.

The hardcoded Accelerate framework is macOS-specific and may cause build failures on other platforms.

+# Platform-specific libraries
+if(APPLE)
     "-framework Accelerate"   # TFLite on macOS needs Accelerate
+elseif(UNIX AND NOT APPLE)
+    # Add Linux-specific libraries if needed
+endif()

This ensures the framework is only linked on macOS where it's available and required.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"-framework Accelerate" # TFLite on macOS needs Accelerate
# Platform-specific libraries
if(APPLE)
"-framework Accelerate" # TFLite on macOS needs Accelerate
elseif(UNIX AND NOT APPLE)
# Add Linux-specific libraries if needed
endif()
🤖 Prompt for AI Agents
In the configure file at line 142, the Accelerate framework is hardcoded, which
is specific to macOS and may cause build failures on other platforms. Modify the
script to conditionally include the "-framework Accelerate" flag only when the
build is targeting macOS, ensuring it is not linked on unsupported platforms.

configure Outdated
Comment on lines +154 to +158
add_custom_target(dist
COMMAND ${CMAKE_COMMAND} -E make_directory ${PROJECT_SOURCE_DIR}/dist
COMMAND ${CMAKE_COMMAND} -E chdir ${PROJECT_SOURCE_DIR} /usr/bin/tar -czf dist/iPurity.tar.gz CMakeLists.txt LICENSE README.md configure include src
COMMENT "Creating distribution tarball..."
)

🛠️ Refactor suggestion

Fix potential portability issues in dist target.

The hardcoded /usr/bin/tar path and file list could cause issues on different systems and may become outdated as the project evolves.

 add_custom_target(dist
     COMMAND ${CMAKE_COMMAND} -E make_directory ${PROJECT_SOURCE_DIR}/dist
-    COMMAND ${CMAKE_COMMAND} -E chdir ${PROJECT_SOURCE_DIR} /usr/bin/tar -czf dist/iPurity.tar.gz CMakeLists.txt LICENSE README.md configure include src
+    COMMAND ${CMAKE_COMMAND} -E tar czf ${PROJECT_SOURCE_DIR}/dist/iPurity.tar.gz CMakeLists.txt LICENSE README.md include src
     COMMENT "Creating distribution tarball..."
 )

This uses CMake's built-in tar functionality which is more portable and removes the hardcoded file list dependency.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In the configure file around lines 154 to 158, the dist target uses a hardcoded
path to tar and a fixed file list, which can cause portability issues. Replace
the /usr/bin/tar command with CMake's built-in tar functionality to create the
archive, and use a variable or glob pattern to dynamically include project files
instead of a hardcoded list. This will improve portability and maintainability.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (3)
CMakeLists.txt (3)

159-159: Address platform-specific Accelerate framework linking.

The hardcoded Accelerate framework will cause build failures on non-macOS platforms.

This was previously identified in past reviews. The Accelerate framework should be conditionally linked only on macOS.


166-170: Custom targets need meaningful implementation.

As noted in previous reviews, these targets still provide minimal functionality and should implement actual dependency checking and test execution.


173-173: Fix hardcoded tar path and file list in dist target.

As previously identified, the hardcoded /usr/bin/tar path and fixed file list cause portability issues and maintenance overhead.

🧹 Nitpick comments (5)
CMakeLists.txt (3)

25-42: Improve TensorFlow Lite path discovery robustness.

The current NO_DEFAULT_PATH approach may be too restrictive and could miss valid installations in standard locations.

Consider this more flexible approach:

 find_path(
     TFLITE_INCLUDE_DIR
     NAMES tensorflow/lite/interpreter.h
     PATHS
       ${PROJECT_SOURCE_DIR}/src/tensorflow
       /usr/local/include
       /opt/homebrew/include
-      NO_DEFAULT_PATH
 )
 find_library(
     TFLITE_LIB
     NAMES tensorflow-lite tensorflowlite
     PATHS
       ${PROJECT_SOURCE_DIR}/src/tensorflow/bazel-bin/tensorflow/lite
       /usr/local/lib
       /opt/homebrew/lib
-      NO_DEFAULT_PATH
 )

67-92: Simplify FlatBuffers discovery logic.

The current approach is overly complex and could be streamlined while maintaining functionality.

Consider this simplified approach:

-# Bazel always places the FlatBuffers v24 headers under:
-#   <TF-SRC>/bazel-bin/external/flatbuffers/include/flatbuffers/flatbuffers.h
-#
-# Assume CMakeLists.txt is in project root, and TFLite was built via Bazel
-set(FLAT_CANDIDATE1
-    "${CMAKE_SOURCE_DIR}/src/tensorflow/bazel-bin/external/flatbuffers/include/flatbuffers/flatbuffers.h"
-    )
-if (EXISTS "${FLAT_CANDIDATE1}")
-    # Remove "/flatbuffers/flatbuffers.h" to point FLATBUFFERS_INCLUDE_DIR at "…/include"
-    get_filename_component(FLATBUFFERS_INCLUDE_DIR "${FLAT_CANDIDATE1}" DIRECTORY)
-    message(STATUS "→ Using Bazel's FlatBuffers headers: ${FLATBUFFERS_INCLUDE_DIR}")
-else()
-    # Fallback: check for Homebrew‐installed FlatBuffers v24 (Intel or Apple Silicon)
-    find_path(
-        FLATBUFFERS_INCLUDE_DIR
-        NAMES flatbuffers/flatbuffers.h
-        PATHS /usr/local/include /opt/homebrew/include
-        NO_DEFAULT_PATH
-    )
-    if (NOT FLATBUFFERS_INCLUDE_DIR)
-        message(FATAL_ERROR
-            "FlatBuffers headers not found!\n"
-            "  • Bazel's copy (expected at ${CMAKE_SOURCE_DIR}/src/tensorflow/bazel-bin/external/flatbuffers/include/flatbuffers/flatbuffers.h) is missing.\n"
-            "  • Or Homebrew copy not found. Please run:\n"
-            "      brew install flatbuffers@24\n"
-            "    (or copy Bazel's FlatBuffers into /usr/local/include/flatbuffers)."
-        )
-    endif()
-    message(STATUS "→ Using system FlatBuffers headers: ${FLATBUFFERS_INCLUDE_DIR}")
-endif()
+find_path(
+    FLATBUFFERS_INCLUDE_DIR
+    NAMES flatbuffers/flatbuffers.h
+    PATHS
+        ${CMAKE_SOURCE_DIR}/src/tensorflow/bazel-bin/external/flatbuffers/include
+        /usr/local/include
+        /opt/homebrew/include
+)
+if (NOT FLATBUFFERS_INCLUDE_DIR)
+    message(FATAL_ERROR
+        "FlatBuffers headers not found!\n"
+        "  • Install via Homebrew: brew install flatbuffers\n"
+        "  • Or build TensorFlow Lite via Bazel to generate FlatBuffers headers"
+    )
+endif()
+message(STATUS "→ FlatBuffers headers: ${FLATBUFFERS_INCLUDE_DIR}")

96-110: Remove NO_DEFAULT_PATH for better flexibility.

Similar to TensorFlow Lite, the NO_DEFAULT_PATH restriction may prevent finding valid system installations.

 find_path(
     IMOBILEDEVICE_INCLUDE_DIR
     NAMES libimobiledevice/libimobiledevice.h
-    PATHS /usr/local/include /opt/homebrew/include NO_DEFAULT_PATH
+    PATHS /usr/local/include /opt/homebrew/include
 )
 find_library(
     IMOBILEDEVICE_LIB
     NAMES imobiledevice
-    PATHS /usr/local/lib /opt/homebrew/lib NO_DEFAULT_PATH
+    PATHS /usr/local/lib /opt/homebrew/lib
 )
 find_library(
     PLIST_LIB
     NAMES plist
-    PATHS /usr/local/lib /opt/homebrew/lib NO_DEFAULT_PATH
+    PATHS /usr/local/lib /opt/homebrew/lib
 )
configure (2)

94-94: Consider making the script more portable.

The script is currently Apple Silicon specific, which may limit its usefulness as the project grows.

Consider parameterizing the architecture check:

-PROJECT_ROOT="$(cd "$(dirname "$0")" && pwd)"
+PROJECT_ROOT="$(cd "$(dirname "$0")" && pwd)"
+ARCH=$(uname -m)
+
+# Define platform-specific paths
+if [ "$ARCH" = "arm64" ]; then
+    HOMEBREW_PREFIX="/opt/homebrew"
+elif [ "$ARCH" = "x86_64" ]; then
+    HOMEBREW_PREFIX="/usr/local"
+else
+    echo "Warning: Unsupported architecture: $ARCH"
+    HOMEBREW_PREFIX="/usr/local"
+fi

Then use $HOMEBREW_PREFIX throughout instead of hardcoded paths.


53-88: Improve temporary file cleanup robustness.

The current approach could leave temporary files if the script exits unexpectedly.

Consider using a trap for cleanup:

+# Set up cleanup trap
+cleanup() {
+    rm -f "$TMPFILE" 2>/dev/null || true
+}
+trap cleanup EXIT
+
 TMPFILE=$(mktemp /tmp/check_imobile.XXXXXX.cpp)
 cat << EOF > "$TMPFILE"
 #include <libimobiledevice/libimobiledevice.h>
 int main() { return 0; }
 EOF

 # Compile test (Homebrew on Apple Silicon typically lives under /opt/homebrew/include)
 if ! g++ -I/opt/homebrew/include -c "$TMPFILE" -o /dev/null 2>/dev/null; then
   echo "Error: Could not compile a test program with libimobiledevice."
   echo "       Make sure libimobiledevice is installed: brew install libimobiledevice"
-  rm -f "$TMPFILE"
   exit 1
 fi
-rm -f "$TMPFILE"
 echo "libimobiledevice: found."

Apply the same pattern to the libplist check.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09e8959 and e3715e9.

📒 Files selected for processing (3)
  • .github/workflows/c-cpp.yml (1 hunks)
  • CMakeLists.txt (1 hunks)
  • configure (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/c-cpp.yml
🔇 Additional comments (5)
CMakeLists.txt (2)

2-2: LGTM: Project name updated consistently.

The project name change from "ipurity" to "iPurity" improves branding consistency.


4-6: Good: C++ standard modernization.

Upgrading from C++11 to C++17 is appropriate for modern TensorFlow Lite integration and provides access to better language features.

configure (3)

92-116: Well-implemented TensorFlow Lite header detection.

The comprehensive path checking with clear error messages and actionable instructions is excellent. The logic correctly prioritizes different installation locations.


118-138: Good TensorFlow Lite library detection logic.

The library detection follows the same robust pattern as the header detection, providing multiple fallback paths and helpful error messages.


140-167: FlatBuffers detection is thorough and well-structured.

The logic correctly prioritizes Bazel's output over system installations and provides comprehensive error messages with installation instructions.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/c-cpp.yml (1)

53-53: 🛠️ Refactor suggestion

Shallow clone and pin TensorFlow version for reproducible, faster builds.

Cloning the full TensorFlow repo on every run is both slow and non-reproducible. Consider:

- git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+ git clone --depth 1 --branch v2.11.0 https://github.com/tensorflow/tensorflow.git src/tensorflow

Adjust v2.11.0 (or your desired release) to match the TFLite compatibility matrix.

🧹 Nitpick comments (2)
.github/workflows/c-cpp.yml (2)

17-23: Install dependencies step is correctly configured.

Updating Homebrew and installing core packages and Bazelisk prepares the environment appropriately. To further improve CI build times, consider caching Homebrew and Bazel artifacts using actions/cache.


27-30: Shallow clone FlatBuffers to reduce clone time.

Instead of cloning the entire repo then checking out a tag, combine both into one command:

- git clone https://github.com/google/flatbuffers.git flatbuffers_src
- cd flatbuffers_src && git checkout v24.3.25
+ git clone --depth 1 --branch v24.3.25 https://github.com/google/flatbuffers.git flatbuffers_src

This cuts network overhead and eliminates the separate checkout step.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3715e9 and bfd82d7.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (4)
.github/workflows/c-cpp.yml (4)

32-40: FlatBuffers build and install steps look solid.

The CMake configuration with a private install prefix and subsequent make install ensures headers and libraries are correctly staged under $GITHUB_WORKSPACE/flatbuffers_install.


42-45: Documentation comments for FlatBuffers artifacts are clear and accurate.


49-49: Dynamically adding Bazelisk to PATH aligns with best practices.

Using echo "$(brew --prefix bazelisk)/bin" >> $GITHUB_PATH avoids permission issues on macOS runners and ensures the bazel command resolves to Bazelisk per .bazelversion.


68-81: Verify CMake variable paths for TFLite and FlatBuffers.

Please confirm that:

  • TFLITE_INCLUDE_DIR is pointing at the actual TensorFlow Lite headers (e.g. src/tensorflow/tensorflow/lite).
  • TFLITE_LIB matches the output path of bazel-bin/tensorflow/lite/libtensorflowlite.dylib.

Also, consider quoting all ${…} expansions to guard against potential spaces in the workspace path.

Comment on lines +58 to +61
# Use defaults for configure (we only need the C++ TFLite build)
yes "" | ./configure
# bazelisk will download Bazel 7.4.1 per the .bazelversion
bazelisk build -c opt //tensorflow/lite:libtensorflowlite.dylib

🛠️ Refactor suggestion

Make TensorFlow configure step non-interactive and explicit.

Piping yes "" can break if new prompts are introduced. Instead, export the required TF_… environment variables:

env:
  TF_NEED_CUDA: 0
  TF_NEED_ROCM: 0
  TF_NEED_OPENCL_SYCL: 0
  TF_ENABLE_XLA: 0
  TF_NEED_TENSORRT: 0
run: |
  ./configure

This approach is more robust and documents exactly which TFLite features are enabled.

🤖 Prompt for AI Agents
In .github/workflows/c-cpp.yml around lines 58 to 61, replace the current
TensorFlow configure step that pipes 'yes ""' to make it non-interactive and
explicit by exporting the necessary TF_ environment variables such as
TF_NEED_CUDA, TF_NEED_ROCM, TF_NEED_OPENCL_SYCL, TF_ENABLE_XLA, and
TF_NEED_TENSORRT all set to 0 before running ./configure. This change ensures
the configure step is robust against new prompts and clearly documents the
enabled TFLite features.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
.github/workflows/c-cpp.yml (2)

53-54: Duplicate: Shallow-clone TensorFlow to speed up cloning
As noted in an earlier review, cloning the full TensorFlow repo adds significant time. Replace with:

- git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+ git clone --depth 1 --branch master https://github.com/tensorflow/tensorflow.git src/tensorflow

58-60: 🛠️ Refactor suggestion

Duplicate: Make TensorFlow configure non-interactive and explicit
Piping yes "" can break if new prompts are introduced. Instead, export explicit TF_* environment variables:

- yes "" | ./configure
+ env TF_NEED_CUDA=0 TF_NEED_ROCM=0 TF_NEED_OPENCL_SYCL=0 TF_ENABLE_XLA=0 TF_NEED_TENSORRT=0 ./configure

This documents exactly which features are enabled and guards against future interactive prompts.

🧹 Nitpick comments (3)
.github/workflows/c-cpp.yml (3)

17-24: Consider caching Homebrew dependencies and minimizing brew update overhead
Running brew update on every CI run can be slow, and reinstalling the same packages adds latency. You could:

  • Use actions/cache to cache Homebrew’s download/build cache (~/Library/Caches/Homebrew).
  • Only run brew update when necessary (e.g., on a schedule or when Brewfile.lock.json changes).

This will significantly reduce tool latency in your workflow.


28-42: Shallow-clone FlatBuffers and cache its build artifacts
Cloning the entire FlatBuffers repo and rebuilding on every run adds time. Consider:

  1. Shallow-cloning at the tagged version:
- git clone https://github.com/google/flatbuffers.git flatbuffers_src
+ git clone --depth 1 --branch v24.3.25 https://github.com/google/flatbuffers.git flatbuffers_src
  2. Caching flatbuffers_install (or the build directory) with actions/cache to avoid a full rebuild each CI run.

77-81: Specify build type and verify CMake variables
By default, CMake may not use an optimized build. Consider adding:

- cmake \
+ cmake -DCMAKE_BUILD_TYPE=Release \

Also, please verify that:

  1. TFLITE_INCLUDE_DIR points to the correct header path (e.g., src/tensorflow/tensorflow/lite/include).
  2. CMakeLists properly locates both the FlatBuffers include and library (you may need to pass a -DFLATBUFFERS_LIB or adjust find_library).
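For the second point, a hypothetical find_library call mirroring the flatbuffers_install prefix used earlier in this workflow might look like this — the path and variable name are assumptions:

```cmake
find_library(
    FLATBUFFERS_LIB
    NAMES flatbuffers
    PATHS "$ENV{GITHUB_WORKSPACE}/flatbuffers_install/lib"
)
if (NOT FLATBUFFERS_LIB)
    message(FATAL_ERROR "FlatBuffers library not found; build it first (see workflow)")
endif()
```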
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bfd82d7 and 7c84522.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (1)
.github/workflows/c-cpp.yml (1)

49-50: Dynamic PATH addition for Bazelisk is correct
Appending Bazelisk’s bin directory to $GITHUB_PATH avoids permission issues on macOS runners and resolves previous symlink failures.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
.github/workflows/c-cpp.yml (1)

59-63: ✔️ Bazelisk PATH update addresses earlier symlink issue
This implementation adds Bazelisk’s bin directory to GITHUB_PATH instead of creating a symlink under /usr/local/bin, resolving the permission concerns on GitHub macOS runners.

🧹 Nitpick comments (5)
.github/workflows/c-cpp.yml (5)

12-12: Expand CI to Linux runners for broader coverage
Currently the build is only running on macOS (macos-latest). For cross-platform support and to catch Linux-specific issues, consider using a build matrix that includes ubuntu-latest (and potentially Windows too if needed).


19-27: Cache or streamline Homebrew setup to reduce CI latency
Running brew update and installing multiple packages on every CI run can be time-consuming. Consider caching Homebrew’s download caches or pre-installing dependencies in a custom environment, or using actions/cache on /usr/local/Homebrew/Library/Homebrew to speed up subsequent runs.


38-42: Use a shallow clone for FlatBuffers to speed up checkout
The workflow clones the entire FlatBuffers history before checking out v24.3.25. You can reduce clone time and data transfer by using:

-git clone https://github.com/google/flatbuffers.git flatbuffers_src
+git clone --branch v24.3.25 --depth 1 https://github.com/google/flatbuffers.git flatbuffers_src

65-69: Shallow clone TensorFlow repository to accelerate CI
Cloning the full TensorFlow repo is time-consuming. Consider:

-git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+git clone --depth 1 https://github.com/tensorflow/tensorflow.git src/tensorflow

This will significantly reduce clone time.


87-90: Remove redundant yes "" piping now that TF_* vars are set
Since all TensorFlow configure prompts are answered via environment variables, piping yes "" is no longer necessary and may mask future prompts. You can simplify to:

-yes "" | ./configure 2>&1 | tee configure.log
+./configure 2>&1 | tee configure.log
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5df51e and 0259389.

📒 Files selected for processing (2)
  • .github/workflows/c-cpp.yml (1 hunks)
  • CMakeLists.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CMakeLists.txt
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (8)
.github/workflows/c-cpp.yml (8)

28-35: Logging installed versions is helpful for debugging, and this block looks correct.


43-57: The subsequent FlatBuffers build and installation steps follow best practices and include clear logging. No issues detected.


71-79: Explicitly exporting the TF_* environment variables ensures the TensorFlow configure script runs non-interactively and documents which features are enabled.


102-118: Printing directory contents and paths here aids in debugging build failures. This debug block looks intentional and correct.


143-150: Explicitly echoing all CMake variables before configuration improves traceability and helps pinpoint misconfigurations.


153-160: The CMake invocation correctly passes all required variables, and piping to tee ensures logs are captured for debugging.


162-163: Capturing the first 200 lines of the configuration log is a good balance between verbosity and relevance.


167-170: The final build step logs output succinctly and provides a tail of the log for quick inspection.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)
.github/workflows/c-cpp.yml (6)

36-57: Consider Shallow Clone for FlatBuffers

Cloning the full FlatBuffers repo can be time-consuming. You could speed up this step by performing a shallow clone of the specific tag:

- git clone https://github.com/google/flatbuffers.git flatbuffers_src
- cd flatbuffers_src
- git checkout v24.3.25
+ git clone --depth 1 --branch v24.3.25 https://github.com/google/flatbuffers.git flatbuffers_src
+ cd flatbuffers_src

58-64: PATH Update Visibility

Writing to $GITHUB_PATH correctly extends the PATH for subsequent steps, but echoing $PATH within the same step will still show the old value. Consider removing or relocating the echo "→ Updated PATH: $PATH" to a later step for clarity.


67-70: Speed Up TensorFlow Clone

Cloning the entire TensorFlow repository is slow. Consider a shallow clone to reduce clone time:

- git clone https://github.com/tensorflow/tensorflow.git src/tensorflow
+ git clone --depth 1 https://github.com/tensorflow/tensorflow.git src/tensorflow

73-89: TensorFlow Configure Step

Setting explicit TF_* environment variables is great for non-interactive configuration. To further simplify, you can remove the yes "" piping since the env vars suppress prompts:

- yes "" | ./configure 2>&1 | tee configure.log
+ ./configure 2>&1 | tee configure.log

91-98: Build Invocation & Caching Idea

The Bazelisk build for TensorFlow Lite is well-instrumented. For faster CI runs, consider adding an actions/cache@v3 step for the Bazel output or workspace so subsequent builds reuse unchanged artifacts.


101-162: Verbose Debug Logging in CMake Step

You’ve added extensive debug echos to validate include paths, headers, libraries, and CMake variables. Once this step is stable, consider trimming or gating these logs (e.g., via a DEBUG environment variable) to reduce workflow noise and speed up log processing.
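A minimal sketch of what gating the logs could look like. The `debug` helper and the echoed variable value are fabricated for the demo; only the `DEBUG=1` convention itself is the suggestion:

```shell
#!/bin/bash
# Gate verbose CI echos behind a DEBUG environment variable so logs stay
# quiet unless explicitly requested with DEBUG=1.
debug() {
    [[ "${DEBUG:-0}" == "1" ]] && echo "[debug] $*"
    return 0   # keep a clean exit status so `set -e` scripts don't trip
}

debug "TFLITE_INCLUDE_DIR=/opt/homebrew/include"   # silent by default
echo "configure step done"
DEBUG=1 debug "now visible"                        # printed because DEBUG=1
```

Existing `echo` lines in the workflow would then become `debug` calls, and a single `DEBUG: 1` in the workflow's `env` block turns them all back on.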

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0259389 and 9e5e5a6.

📒 Files selected for processing (1)
  • .github/workflows/c-cpp.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (3)
.github/workflows/c-cpp.yml (3)

14-15: Correct and Clear Checkout Step

The checkout step uses actions/checkout@v4 with a clear name and up-to-date action.


19-34: Dependency Installation Steps Are Solid

The brew update, unlink, install core libraries, and version checks comprehensively handle both Intel and Apple Silicon prefixes.


165-171: Final Build Step Looks Good

The CMake build invocation and log summary provide clear feedback on the build outcome.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
install.sh (3)

30-34: Split local declaration and assignment to satisfy SC2155
ShellCheck recommends separating local and assignment to avoid masking return codes:

-    local tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
+    local tmpdir
+    tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)

This change adheres to best practices and makes failures clearer.

🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 32-32: Declare and assign separately to avoid masking return values.

(SC2155)
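A self-contained sketch of the masking behavior SC2155 warns about (the function names are made up for the demo). The exit status of the command substitution is hidden by `local` in the first form and visible in the second:

```shell
#!/bin/bash
f() {
    local out=$(false)   # $? now reflects `local` (always 0), not `false`
    echo "masked:$?"
}
g() {
    local out
    out=$(false)         # $? now reflects the command substitution
    echo "visible:$?"
}
f
g
```

Running this prints `masked:0` then `visible:1`, which is exactly the failure a `set -e` script would silently swallow in the first form.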


54-59: Extend install prefix logic beyond macOS arm64 vs others
Currently the script only distinguishes arm64 (M1/M2 mac) vs /usr/local. For Linux or Linuxbrew installs, you may want to detect /home/linuxbrew/.linuxbrew or allow an override:

if [[ -n "$IPURITY_INSTALL_PREFIX" ]]; then
    INSTALL_PREFIX="$IPURITY_INSTALL_PREFIX"
elif [[ "$(uname -s)" == "Darwin" && "$(uname -m)" == "arm64" ]]; then
    INSTALL_PREFIX="/opt/homebrew"
else
    INSTALL_PREFIX="/usr/local"
fi

This makes the script more portable.
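A runnable sketch of that detection order, as a hypothetical `detect_prefix` helper (the `IPURITY_INSTALL_PREFIX` override name follows the suggestion above; the Linuxbrew path is the conventional `/home/linuxbrew/.linuxbrew`):

```shell
#!/bin/bash
# Pick an install prefix: explicit override first, then Apple Silicon
# Homebrew, then Linuxbrew if present, else the classic /usr/local.
detect_prefix() {
    if [[ -n "${IPURITY_INSTALL_PREFIX:-}" ]]; then
        echo "$IPURITY_INSTALL_PREFIX"
    elif [[ "$(uname -s)" == "Darwin" && "$(uname -m)" == "arm64" ]]; then
        echo "/opt/homebrew"
    elif [[ -d /home/linuxbrew/.linuxbrew ]]; then
        echo "/home/linuxbrew/.linuxbrew"
    else
        echo "/usr/local"
    fi
}

IPURITY_INSTALL_PREFIX=/custom/prefix detect_prefix
```

The env-prefix call on the last line shows the override winning regardless of platform.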


153-158: Avoid ls | grep for symlink target discovery (SC2010)
Using ls piped to grep can mis-handle complex filenames. Instead, use shell globbing and a loop:

-    PLIST=$(ls libplist-*.dylib 2>/dev/null | grep -v '++' | head -n 1)
+    PLIST=""
+    for file in libplist-*.dylib; do
+        [[ -e "$file" && "$file" != *'++'* ]] && PLIST="$file" && break
+    done

Repeat similarly for libimobiledevice and libplist++.

🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 154-154: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e5e5a6 and b0f089c.

📒 Files selected for processing (1)
  • install.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
install.sh

[warning] 32-32: Declare and assign separately to avoid masking return values.

(SC2155)


[warning] 154-154: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build

install.sh Outdated
Comment on lines +26 to +52
# Function to install FlatBuffers v24.3.25 from source
install_flatbuffers() {
    local install_dir="$1"
    echo "=== Installing FlatBuffers v24.3.25 from source ==="

    # Create a temporary directory for building
    local tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
    cd "$tmpdir"

    echo "Cloning FlatBuffers v24.3.25..."
    git clone https://github.com/google/flatbuffers.git flatbuffers_src
    cd flatbuffers_src
    git checkout v24.3.25

    echo "Building and installing FlatBuffers..."
    mkdir -p build && cd build
    cmake -DCMAKE_BUILD_TYPE=Release \
          -DCMAKE_INSTALL_PREFIX="${install_dir}" \
          ..
    cmake --build . --target install

    # Clean up
    cd /tmp
    rm -rf "$tmpdir"

    echo "FlatBuffers v24.3.25 installed to ${install_dir}"
}

🛠️ Refactor suggestion

Ensure install_flatbuffers always cleans up its temp directory
If the build or install steps fail, the temporary directory remains. Introduce a trap to remove it on any exit, for example:

 install_flatbuffers() {
     local install_dir="$1"
     echo "=== Installing FlatBuffers v24.3.25 from source ==="
-    
-    # Create a temporary directory for building
-    local tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
-    cd "$tmpdir"
+    # Create a temporary directory for building
+    local tmpdir
+    tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
+    # Clean up on exit (success or failure)
+    trap 'rm -rf "$tmpdir"' EXIT
+    cd "$tmpdir"
 
     echo "Cloning FlatBuffers v24.3.25..."
     git clone https://github.com/google/flatbuffers.git flatbuffers_src
     cd flatbuffers_src
     git checkout v24.3.25
     …
 }

This guarantees no orphaned build directories.

🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 32-32: Declare and assign separately to avoid masking return values.

(SC2155)

🤖 Prompt for AI Agents
In install.sh around lines 26 to 52, the install_flatbuffers function does not
clean up the temporary directory if the build or install steps fail. To fix
this, add a trap command at the start of the function to remove the temporary
directory on any exit, ensuring the temp directory is always deleted regardless
of success or failure. This involves setting trap 'rm -rf "$tmpdir"' EXIT right
after creating the tmpdir variable.
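A self-contained sketch of the guarantee the trap provides. The failing `false` stands in for a broken `cmake --build` step, and a subshell is used so the `EXIT` trap fires within the demo (in `install.sh` the trap would live inside `install_flatbuffers` and fire on script exit):

```shell
#!/bin/bash
tmpdir=$(mktemp -d /tmp/trap_demo.XXXXXX)

(
    trap 'rm -rf "$tmpdir"' EXIT       # fires on success, failure, or signal
    touch "$tmpdir/partial_build_artifact"
    false                              # simulate the build step failing
) || echo "build failed"

if [[ -d "$tmpdir" ]]; then
    echo "leaked: $tmpdir"
else
    echo "cleaned up"
fi
```

Even though the "build" fails, the directory is gone afterwards; without the trap, the `rm -rf` at the end of the function would never be reached.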

Comment on lines +1 to +4
#!/bin/bash

set -e


🛠️ Refactor suggestion

Enhance strict error handling with pipefail and undefined variable checks
To catch failures in piped commands and unset variables, extend the safety flags at the top of the script:

- set -e
+ set -euo pipefail

This will cause the script to fail fast on any undefined variable (-u) and on failures in any part of a pipeline (-o pipefail).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-#!/bin/bash
-set -e
+#!/bin/bash
+set -euo pipefail
🤖 Prompt for AI Agents
In install.sh at lines 1 to 4, enhance the script's error handling by adding the
flags `-u` and `-o pipefail` to the existing `set -e` command. This means
changing the line to `set -euo pipefail` to ensure the script exits on unset
variables and on any failure within piped commands.
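A short sketch of the difference `-o pipefail` makes (the pipeline here is fabricated; `false | true` stands in for something like `yes "" | ./configure` with a failing producer):

```shell
#!/bin/bash
# With plain `set -e`, a pipeline's exit status is that of its LAST command,
# so an early failure in the pipeline goes unnoticed.
set -e
false | true
echo "plain set -e: pipeline failure was ignored"

set -o pipefail
if false | true; then
    echo "unexpected"
else
    echo "pipefail: pipeline failure was caught"
fi
```

This is why `set -e` alone does not protect steps like `yes "" | ./configure` from a silently failing left-hand side.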

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
install.sh (2)

2-2: 🛠️ Refactor suggestion

Enhance strict error handling
Include undefined variable and pipeline failure checks to ensure the script fails fast on errors.

- set -e
+ set -euo pipefail

25-47: 🛠️ Refactor suggestion

Ensure cleanup on failure in install_flatbuffers
Add a trap 'sudo rm -rf "$tmpdir"' EXIT right after creating the temporary directory to guarantee cleanup on any exit (success or failure).

 install_flatbuffers() {
     local install_dir="$1"
     echo "=== Installing FlatBuffers v24.3.25 from source into ${install_dir} ==="
-    local tmpdir
-    tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
+    local tmpdir
+    tmpdir=$(mktemp -d /tmp/flatbuffers_install.XXXXXX)
+    # Clean up on exit (success or failure)
+    trap 'sudo rm -rf "$tmpdir"' EXIT
     echo "→ Cloning FlatBuffers into temporary dir: $tmpdir"
     git clone https://github.com/google/flatbuffers.git "$tmpdir/flatbuffers_src"
     ...
🧹 Nitpick comments (5)
CMakeLists.txt (3)

151-157: Include directories setup
Explicit ordering for include paths is good. Consider adding PLIST_INCLUDE_DIR if a separate include path for libplist headers is needed.


162-164: Compiler warning flags
Adding -Wall -Wextra for GCC/Clang is appropriate. Consider adding -Werror in CI to catch warnings early.


181-183: Custom configure target
This target echoes dependency checks. Consider whether this adds necessary value versus using cmake --help.

install.sh (2)

121-158: TensorFlow Lite build process
The non-interactive configure and build steps are handled cleanly, with logs captured.

Consider exposing CUDA or other flags as script parameters if needed in future builds.


169-174: Use glob loop for libplist symlink
Similarly, replace ls | grep for libplist++ and libplist:

- PLIST_VER=$(ls libplist-*.dylib 2>/dev/null | grep -v '++' | head -n1)
+ for f in libplist-*.dylib; do
+   if [[ -f "$f" && "$f" != *"++"* ]]; then
+     PLIST_VER="$f"
+     sudo ln -sf "$PLIST_VER" libplist.dylib
+     break
+   fi
+ done
🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 169-169: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0f089c and aee44cb.

📒 Files selected for processing (3)
  • CMakeLists.txt (1 hunks)
  • configure (5 hunks)
  • install.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • configure
🧰 Additional context used
🪛 Shellcheck (0.10.0)
install.sh

[warning] 169-169: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (23)
CMakeLists.txt (14)

2-6: Project renaming and C++ standard enforcement
The project name is updated to "iPurity" and C++17 is correctly enforced with CMAKE_CXX_STANDARD_REQUIRED.


12-15: Cache variable for FlatBuffers include path
Allowing users to override FLATBUFFERS_INCLUDE_DIR via CMake cache is a good enhancement.


18-23: Cache variables for TensorFlow Lite
Introducing TFLITE_INCLUDE_DIR and TFLITE_LIB as cache variables offers flexibility for custom installs.


26-34: Cache variables for libimobiledevice & libplist
The user-override pattern for headers and libraries is consistently implemented.


39-42: OpenCV discovery
Using find_package(OpenCV REQUIRED) with status messages ensures clear feedback on include dirs and libs.


46-52: Finding TensorFlow Lite headers
The fallback to standard system paths and clear error in case of failure aligns with best practices.


54-60: Finding TensorFlow Lite library
The approach correctly handles both macOS .dylib and .so naming conventions.


62-69: Error handling for missing TensorFlow Lite
The fatal error provides succinct guidance on installing dependencies or specifying paths.


70-71: Verbose TFLite status messages
Echoing the resolved include directory and library path helps with debugging.


76-92: FlatBuffers discovery & fallback
The two-stage lookup (system vs user-provided) with clear status or error messages covers common scenarios.


98-135: libimobiledevice & libplist discovery
The script locates headers and libraries with user overrides, fallback find_path/find_library, and explicit fatal errors. This pattern is consistent and well-documented.


140-146: Executable target definition
Including nsfw_detector.cpp aligns with the new NSFW detection class, and the source ordering is clear.


169-175: Linking libraries
The list correctly includes OpenCV, TensorFlow Lite, libimobiledevice, plist, and macOS Accelerate framework.


191-192: Installation target
Installing the runtime under bin follows conventions.

install.sh (9)

7-23: Helper functions structure
The command_exists, brew_package_exists, and install_brew_package functions provide a clean abstraction for package checks and installs.


52-60: Homebrew prefix detection
The logic correctly differentiates between Intel and Apple Silicon prefixes.


62-69: Ensuring Homebrew installation
The check-and-install pattern for Homebrew covers both presence and absence scenarios.


73-82: Brew package installation idempotency
Using install_brew_package ensures packages are only installed if missing.


83-107: Bazel symlink handling
Unlinking the bazel formula and symlinking bazelisk provides a robust solution to avoid conflicts.


109-117: Validate OpenCV availability
The check ensures OpenCV is found via pkg-config and provides clear error messaging.


118-120: FlatBuffers install invocation
The script delegates FlatBuffers installation to install_flatbuffers, aligning with CMake expectations.


159-182: Create unversioned symlinks for mobiledevice & plist
Automating the symlink creation improves usability. See below for glob-based improvements.

🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 169-169: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)


201-206: Clear final instructions for building with CMake
The summary instructions correctly guide users to create a build directory and run CMake.

Comment on lines 186 to 190
 add_custom_target(dist
     COMMAND ${CMAKE_COMMAND} -E make_directory ${PROJECT_SOURCE_DIR}/dist
-    COMMAND ${CMAKE_COMMAND} -E chdir ${PROJECT_SOURCE_DIR} /usr/bin/tar -czf dist/ipurity.tar.gz CMakeLists.txt LICENSE README.md configure include src
+    COMMAND ${CMAKE_COMMAND} -E chdir ${PROJECT_SOURCE_DIR} /usr/bin/tar -czf dist/iPurity.tar.gz CMakeLists.txt LICENSE README.md configure include src
     COMMENT "Creating distribution tarball..."
 )

 # Installation
 install(TARGETS ipurity
     RUNTIME DESTINATION bin
 )

🛠️ Refactor suggestion

Distribution target portability
Using an absolute path /usr/bin/tar may not be portable. Consider using CMake -E tar or locating tar via find_program.

🤖 Prompt for AI Agents
In CMakeLists.txt around lines 186 to 190, the dist target uses an absolute path
/usr/bin/tar which is not portable across systems. Replace the hardcoded tar
command with CMake's built-in tar command using `cmake -E tar` or use
`find_program` to locate the tar executable dynamically. Update the COMMAND line
to use the portable tar invocation to ensure compatibility on different
platforms.

Comment on lines +186 to +190
echo ""
echo "=== Installation Summary ==="
echo " • pkg-config installed"
echo " • cmake installed"
echo " • make installed"

🛠️ Refactor suggestion

Distribution target portability
Using an absolute path /usr/bin/tar may not be portable. Consider using CMake -E tar or locating tar via find_program.

🤖 Prompt for AI Agents
In install.sh around lines 186 to 190, the script uses an absolute path
/usr/bin/tar which may not be portable across different distributions. Replace
the hardcoded path by either using CMake's built-in command `cmake -E tar` for
tar operations or locate the tar executable dynamically using `find_program` to
ensure portability. Update all tar command invocations accordingly to avoid
relying on a fixed path.

Comment on lines +163 to +168
IMOBILE_VER=$(ls libimobiledevice-*.dylib 2>/dev/null | head -n1)
if [[ -n "$IMOBILE_VER" ]]; then
    echo "   Found $IMOBILE_VER → symlinking to libimobiledevice.dylib"
    sudo ln -sf "$IMOBILE_VER" libimobiledevice.dylib
fi


🛠️ Refactor suggestion

Replace ls | grep with a safe glob loop
Using ls | grep can break on weird filenames. Use a simple glob to find the first matching file:

- IMOBILE_VER=$(ls libimobiledevice-*.dylib 2>/dev/null | head -n1)
- if [[ -n "$IMOBILE_VER" ]]; then
+ for f in libimobiledevice-*.dylib; do
+   if [[ -f "$f" ]]; then
+     IMOBILE_VER="$f"
     echo "   Found $IMOBILE_VER → symlinking to libimobiledevice.dylib"
     sudo ln -sf "$IMOBILE_VER" libimobiledevice.dylib
- fi
+     break
+   fi
+ done
📝 Committable suggestion


Suggested change

-IMOBILE_VER=$(ls libimobiledevice-*.dylib 2>/dev/null | head -n1)
-if [[ -n "$IMOBILE_VER" ]]; then
-    echo "   Found $IMOBILE_VER → symlinking to libimobiledevice.dylib"
-    sudo ln -sf "$IMOBILE_VER" libimobiledevice.dylib
-fi
+for f in libimobiledevice-*.dylib; do
+    if [[ -f "$f" ]]; then
+        IMOBILE_VER="$f"
+        echo "   Found $IMOBILE_VER → symlinking to libimobiledevice.dylib"
+        sudo ln -sf "$IMOBILE_VER" libimobiledevice.dylib
+        break
+    fi
+done
🤖 Prompt for AI Agents
In install.sh around lines 163 to 168, replace the use of `ls` piped to `grep`
with a safer glob pattern loop to find the first matching
libimobiledevice-*.dylib file. Use a shell glob to iterate over matching files
and assign the first found filename to IMOBILE_VER, then proceed with the
symlink creation as before. This avoids issues with filenames containing spaces
or special characters.
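A runnable sketch of the glob-loop pattern in an isolated directory (the filenames are fabricated to mimic what Homebrew installs; the `++` filter mirrors the original `grep -v '++'`):

```shell
#!/bin/bash
# Pick the first versioned libplist dylib without parsing `ls` output.
workdir=$(mktemp -d)
cd "$workdir"
touch "libplist-2.6.0.dylib" "libplist++-2.6.0.dylib" "file with spaces.dylib"

PLIST=""
for f in libplist-*.dylib; do
    # Skip the C++ variant and anything the glob left unexpanded
    [[ -e "$f" && "$f" != *'++'* ]] && PLIST="$f" && break
done
echo "picked: ${PLIST:-none}"
```

The glob never word-splits, so filenames with spaces or other unusual characters are handled correctly, and the `-e` check covers the no-match case where the pattern stays literal.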

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
src/nsfw_detector.cpp (1)

93-108: 🛠️ Refactor suggestion

Remove unused legacy code.

This static function appears to be leftover from the old naive skin detection implementation and is not used anywhere in the current class-based approach.

Remove the unused static function:

-/**
- * Check if a pixel (in YCrCb) is within a naive "skin" range.
- * This function assumes the pixel is in [Y, Cr, Cb] order.
- */
-static bool isSkinPixel(const cv::Vec3b& ycrcb) {
-    uchar Cr = ycrcb[1];
-    uchar Cb = ycrcb[2];
-
-    // Example naive thresholds for skin detection:
-    //   140 < Cr < 175
-    //   100 < Cb < 135
-    if (Cr >= 140 && Cr <= 175 && Cb >= 100 && Cb <= 135) {
-        return true;
-    }
-    return false;
-}
♻️ Duplicate comments (1)
src/nsfw_detector.cpp (1)

42-63: ⚠️ Potential issue

Critical: Duplicate issue - unused parameter and inefficient implementation.

This issue was already identified in a previous review. The output parameter is unused and the pixel-by-pixel copying is inefficient.

Apply the previously suggested fix to remove the unused parameter and improve efficiency:

-bool NSFWDetector::preprocessImage(const cv::Mat& input, float* output) {
+bool NSFWDetector::preprocessImage(const cv::Mat& input) {
+    if (!interpreter) {
+        return false;
+    }
+    
     cv::Mat resized;
     cv::resize(input, resized, cv::Size(INPUT_SIZE, INPUT_SIZE));
     
     // Convert to RGB and normalize
     cv::Mat rgb;
     cv::cvtColor(resized, rgb, cv::COLOR_BGR2RGB);
     rgb.convertTo(rgb, CV_32FC3, 1.0/255.0);
     
     // Copy data to input tensor
     float* input_tensor = interpreter->typed_input_tensor<float>(0);
-    for (int y = 0; y < INPUT_SIZE; y++) {
-        for (int x = 0; x < INPUT_SIZE; x++) {
-            cv::Vec3f pixel = rgb.at<cv::Vec3f>(y, x);
-            *input_tensor++ = pixel[0];
-            *input_tensor++ = pixel[1];
-            *input_tensor++ = pixel[2];
-        }
-    }
+    // More efficient memory copy
+    if (rgb.isContinuous()) {
+        memcpy(input_tensor, rgb.ptr<float>(), INPUT_SIZE * INPUT_SIZE * 3 * sizeof(float));
+    } else {
+        for (int y = 0; y < INPUT_SIZE; y++) {
+            memcpy(input_tensor + y * INPUT_SIZE * 3, 
+                   rgb.ptr<float>(y), 
+                   INPUT_SIZE * 3 * sizeof(float));
+        }
+    }
     
     return true;
 }
🧹 Nitpick comments (1)
src/nsfw_detector.cpp (1)

14-40: Enhance model initialization with file validation.

The TensorFlow Lite model initialization logic is sound, but could benefit from additional validation.

Consider adding file existence validation before attempting to load the model:

 bool NSFWDetector::initialize(const std::string& modelPath) {
+    // Check if model file exists
+    std::ifstream file(modelPath);
+    if (!file.good()) {
+        std::cerr << "Model file does not exist: " << modelPath << std::endl;
+        return false;
+    }
+    
     // Load the TensorFlow Lite model
     model = tflite::FlatBufferModel::BuildFromFile(modelPath.c_str());
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aee44cb and 2af655d.

📒 Files selected for processing (2)
  • include/nsfw_detector.h (1 hunks)
  • src/nsfw_detector.cpp (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/nsfw_detector.cpp (1)
src/nsfw_detector.h (5)
  • NSFWDetector (13-13)
  • NSFWDetector (14-14)
  • modelPath (16-16)
  • input (24-24)
  • image (17-17)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (3)
include/nsfw_detector.h (1)

1-11: LGTM! Clean refactoring.

The removal of the naive function declaration and keeping only the essential constants aligns well with the new class-based TensorFlow Lite implementation.

src/nsfw_detector.cpp (2)

6-6: LGTM! Proper constructor initialization.

The constructor correctly initializes the isInitialized flag to false.


8-12: Good resource management in destructor.

Proper cleanup of TensorFlow Lite resources using smart pointer reset.

Comment on lines 1 to +5
 #include "nsfw_detector.h"

-#include <iostream>
-#include <opencv2/opencv.hpp>
-#include <string>
+#include <opencv2/opencv.hpp>
+#include <iostream>
+#include <fstream>


⚠️ Potential issue

Critical: Missing TensorFlow Lite includes.

The implementation uses TensorFlow Lite classes and constants but is missing the required header includes. This will cause compilation errors.

Add the missing TensorFlow Lite includes:

 #include "nsfw_detector.h"
 #include <opencv2/opencv.hpp>
+#include <tensorflow/lite/model.h>
+#include <tensorflow/lite/interpreter.h>
+#include <tensorflow/lite/kernels/register.h>
+#include <tensorflow/lite/c/common.h>
 #include <iostream>
 #include <fstream>
📝 Committable suggestion


Suggested change

 #include "nsfw_detector.h"
 #include <opencv2/opencv.hpp>
+#include <tensorflow/lite/model.h>
+#include <tensorflow/lite/interpreter.h>
+#include <tensorflow/lite/kernels/register.h>
+#include <tensorflow/lite/c/common.h>
 #include <iostream>
 #include <fstream>
🤖 Prompt for AI Agents
In src/nsfw_detector.cpp around lines 1 to 5, the file is missing the necessary
TensorFlow Lite header includes required for using TensorFlow Lite classes and
constants. Add the appropriate TensorFlow Lite include directives at the top of
the file to ensure the code compiles correctly without missing symbol errors.

Comment on lines +65 to +91
float NSFWDetector::detectNSFW(const cv::Mat& image) {
if (!isInitialized) {
std::cerr << "Detector not initialized" << std::endl;
return -1.0f;
}

if (image.empty()) {
std::cerr << "Input image is empty" << std::endl;
return -1.0f;
}

// Preprocess image
float* input = interpreter->typed_input_tensor<float>(0);
if (!preprocessImage(image, input)) {
return -1.0f;
}

// Run inference
if (interpreter->Invoke() != kTfLiteOk) {
std::cerr << "Failed to run inference" << std::endl;
return -1.0f;
}

// Get output
float* output = interpreter->typed_output_tensor<float>(0);
return output[1]; // Assuming output[1] is the NSFW probability
}

⚠️ Potential issue

Fix function call and add output validation.

The method has good error handling but needs updates to match the preprocessImage fix and validate model output.

Apply these fixes:

     // Preprocess image
-    float* input = interpreter->typed_input_tensor<float>(0);
-    if (!preprocessImage(image, input)) {
+    if (!preprocessImage(image)) {
         return -1.0f;
     }

     // Run inference
     if (interpreter->Invoke() != kTfLiteOk) {
         std::cerr << "Failed to run inference" << std::endl;
         return -1.0f;
     }

     // Get output
     float* output = interpreter->typed_output_tensor<float>(0);
+    // Validate output tensor
+    TfLiteTensor* output_tensor = interpreter->output_tensor(0);
+    if (output_tensor->dims->size < 1 || output_tensor->dims->data[0] < 2) {
+        std::cerr << "Invalid output tensor shape" << std::endl;
+        return -1.0f;
+    }
     return output[1]; // Assuming output[1] is the NSFW probability

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/nsfw_detector.cpp between lines 65 and 91, update the detectNSFW function
to call preprocessImage with the correct parameters as per its updated
signature, and add validation for the model output to ensure it is within
expected bounds before returning output[1]. Adjust error handling to log or
handle invalid output cases appropriately.



Development

Successfully merging this pull request may close these issues.

Replaced the naive skin detection with a TensorFlow Lite-based deep learning model
