Feature main #54
Closed
Shubham-Khichi wants to merge 9 commits into
- Fixed MCP Docker Build Failure: Resolved the build error for the mcp service by removing the invalid readme reference in fast-markdown-mcp/pyproject.toml.
- Refactored File Handling (Removed In-Memory Storage):
  - Investigated the complex in-memory file handling mechanism and its inconsistencies.
  - Removed the in-memory storage logic from backend/app/crawler.py.
  - Removed the associated API endpoints (/api/memory-files, /api/memory-files/{file_id}) from backend/app/main.py.
  - Added a new backend API endpoint (/api/storage/file-content) to read files directly from the storage/markdown directory.
  - Deleted the old frontend API proxy route (app/api/memory-file/route.ts).
  - Created a new frontend API proxy route (app/api/storage/file-content/route.ts).
  - Updated frontend components (StoredFiles.tsx, DiscoveredFiles.tsx) to use the new API route for downloading file content.
- Documentation: Created markdown plans for the MCP build fix and the in-memory feature removal.
This simplifies the architecture by relying solely on disk-based consolidated files in storage/markdown. Please remember to test the file download functionality after restarting the services.
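To illustrate what reading "directly from the storage/markdown directory" can involve, here is a minimal sketch of the file lookup such an endpoint might perform. The function name `read_storage_file` and the `base_dir` parameter are hypothetical, and the path-escape check is an assumption about what a safe implementation would want, not a description of the code in this PR.

```python
from pathlib import Path

# Assumption: consolidated files live under this directory, as the PR describes.
STORAGE_DIR = Path("storage/markdown")


def read_storage_file(file_path: str, base_dir: Path = STORAGE_DIR) -> str:
    """Return the text of a file under base_dir, rejecting paths that escape it."""
    base = base_dir.resolve()
    target = (base / file_path).resolve()
    # Refuse anything that resolves outside the storage directory (e.g. "../x").
    if not target.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes storage directory: {file_path}")
    if not target.is_file():
        raise FileNotFoundError(file_path)
    return target.read_text(encoding="utf-8")
```

A web handler could map `ValueError` to a 400 response and `FileNotFoundError` to a 404; that mapping is likewise an assumption.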
This commit addresses several issues and implements enhancements across the crawling workflow:

Fixes:
- Resolved 400 Bad Request error caused by incorrect query parameter (`file_path`) in the file content API route.
- Fixed backend `NameError` (`set_task_context`) in crawler.py that prevented result file saving.
- Corrected 500 Internal Server Error caused by Docker networking issue (localhost vs. service name) in the file content API route proxy.
- Ensured 'Data Extracted' statistic is correctly saved in the backend status and displayed in the UI.

UI Enhancements:
- Made "Consolidated Files" section persistent, rendering as soon as a job ID is available.
- Relocated "Crawl Selected" button inline with status details.
- Updated "Crawl Selected" button to show dynamic count and disable appropriately.
- Renamed "Job Status" section title to "Discovered Pages".
- Renamed "Processing Summary" section title to "Statistics".
- Removed the unused "Extracted Content" display section.

Backend Enhancements:
- Implemented file appending logic in crawler.py for consolidated `.md` and `.json` files. Subsequent crawls for the same job now append data and update timestamps instead of overwriting.

Changelog:

### Added
- Backend logic to append new crawl results to existing consolidated `.md` and `.json` files for the same job ID.
- Dynamic count display to "Crawl Selected" button.

### Changed
- "Consolidated Files" section now appears persistently once a job is initiated.
- "Crawl Selected" button relocated inline with status details and disables after initiating crawl.
- Renamed "Job Status" section title to "Discovered Pages".
- Renamed "Processing Summary" section title to "Statistics".
- Updated backend status management to correctly store and transmit the 'Data Extracted' statistic.

### Fixed
- Resolved 400 Bad Request error when fetching file content due to incorrect query parameter name.
- Fixed backend `NameError` in crawler that prevented saving crawl results.
- Resolved 500 Internal Server Error when fetching `.json` file content due to Docker networking issue in API proxy route.
- Corrected display issue where 'Data Extracted' statistic showed "N/A" instead of the actual value.

### Removed
- Removed the unused "Extracted Content" display section from the UI.
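The append-instead-of-overwrite behavior for consolidated files can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in crawler.py: the function name `append_crawl_results`, the `<job_id>.md` / `<job_id>.json` naming, and the JSON layout (`pages`, `created_at`, `updated_at`) are all hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_crawl_results(job_id: str, pages: list[dict], storage_dir: Path) -> None:
    """Append pages to <job_id>.md and <job_id>.json, creating them if absent."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    now = datetime.now(timezone.utc).isoformat()

    # Markdown: opening in append mode preserves output from earlier crawls.
    md_path = storage_dir / f"{job_id}.md"
    with md_path.open("a", encoding="utf-8") as md:
        for page in pages:
            md.write(f"## {page['url']}\n\n{page['content']}\n\n")

    # JSON cannot be appended textually, so load, extend, and rewrite it.
    json_path = storage_dir / f"{job_id}.json"
    if json_path.exists():
        data = json.loads(json_path.read_text(encoding="utf-8"))
    else:
        data = {"job_id": job_id, "created_at": now, "pages": []}
    data["pages"].extend(pages)
    data["updated_at"] = now  # only the update timestamp changes on later crawls
    json_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

The two formats are handled differently on purpose: a Markdown file stays valid when text is appended, whereas a JSON document has to be parsed and rewritten as a whole.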
feat(frontend): Update Consolidated Files component for polling and downloads

- Implements polling every 10 seconds in ConsolidatedFiles.tsx to automatically refresh the list of files from the /api/storage endpoint, ensuring newly added files appear in the UI.
- Modifies the MD and JSON icon links to point to the /api/storage/download endpoint and adds the 'download' attribute, triggering file downloads instead of opening content in the browser.
Introduces a new `CrawlUrls` component to display and manage discovered URLs during a crawl job. This component utilizes Shadcn UI elements (Table, Checkbox, Badge, Tooltip) to provide a detailed view of individual URL statuses, handle URL selection for targeted actions, and display status updates driven by polling managed in `app/page.tsx`.

Key changes include:
- Creation of the `CrawlUrls` component for URL list display and interaction.
- Refactoring of `CrawlStatusMonitor` to focus solely on displaying the overall job status within a Dialog component.
- Updates to `app/page.tsx` to manage essential state (job ID, job status, selected URLs) and orchestrate the polling mechanism for fetching URL-specific status updates.
- Fixed UI bugs where status icons were not updating correctly and checkbox selection state was inconsistent.
- Adjusted the styling of the info icon button for better contrast as per user feedback.

These frontend enhancements align with the ongoing backend redesign, supporting the new job-based status management and polling architecture for more granular progress tracking.

Updated documentation in `docs/features/` (adjust_info_button_style_plan.md, fix_discovered_pages_ui_bugs.md, create_crawl_urls_component_plan.md, crawl_status_monitoring_plan.md) to reflect the completion of related tasks.
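The job-based polling described here lives in the TypeScript frontend; the sketch below only illustrates the general shape of such a loop, written in Python to match the backend. Everything in it is an assumption for illustration: the function name `poll_job_status`, the terminal status names, the `max_polls` cap, and the 10-second default interval (borrowed from the ConsolidatedFiles polling commit).

```python
import time
from typing import Callable

# Assumed status names; the real backend may use different ones.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job_status(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job reaches a terminal status or polls run out."""
    status = fetch_status()
    for _ in range(max_polls - 1):
        if status.get("status") in TERMINAL_STATUSES:
            break
        sleep(interval_seconds)  # wait before asking the backend again
        status = fetch_status()
    return status
```

Passing `fetch_status` and `sleep` in as parameters keeps the loop testable without a running backend or real delays.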