fix: Make playwright optional for Railway deployment#4
Conversation
- Add full support for national teams across all club endpoints
- Add new /clubs/{club_id}/competitions endpoint to retrieve club competitions
- Add isNationalTeam field to Club Profile response schema
- Make Club Profile fields optional to accommodate national teams
- Enhance Club Players endpoint to handle national team HTML structure
- Update XPath expressions to support both club and national team structures
- Add intelligent detection logic for national teams
- Maintain backward compatibility with existing club endpoints
This update enables the API to work seamlessly with both regular clubs
and national teams, providing a unified interface for all club-related
data retrieval.
- Add GET /competitions/{competition_id}/seasons endpoint
- Implement TransfermarktCompetitionSeasons service to scrape season data
- Add CompetitionSeason and CompetitionSeasons Pydantic schemas
- Support both cross-year (e.g., 25/26) and single-year (e.g., 2025) seasons
- Handle historical seasons correctly (e.g., 99/00 -> 1999-2000)
- Extract seasons from competition page dropdown/table structure
- Return season_id, season_name, start_year, and end_year for each season
- Sort seasons by start_year descending (newest first)
Closes #[issue-number]
- Detect national team competitions (FIWC, EURO, COPA, AFAC, GOCU, AFCN) - Use /teilnehmer/pokalwettbewerb/ URL for national team competitions - Handle season_id correctly (year-1 for national teams in URL) - Add XPath expressions for participants table - Limit participants to expected tournament size to exclude non-qualified teams - Make season_id optional in CompetitionClubs schema - Update Dockerfile PYTHONPATH configuration
- Add length validation for ids and names before zip() to prevent silent data loss - Raise descriptive ValueError with logging if ids and names mismatch - Simplify seasonId assignment logic for national teams - Remove unnecessary try/except block (isdigit() prevents ValueError) - Clean up unreachable fallback code
- Add tournament size configuration to Settings class with environment variable support - Replace hardcoded dict with settings.get_tournament_size() method - Add warning logging when tournament size is not configured (instead of silent truncation) - Proceed without truncation when size is unavailable (no silent data loss) - Add validation for tournament sizes (must be positive integers) - Add comprehensive unit tests for both configured and fallback paths - Update README.md with new environment variables documentation This prevents silent truncation when tournament sizes change (e.g., World Cup expanding to 48) and allows easy configuration via environment variables.
- Remove extra HTTP request to fetch club profile just to read isNationalTeam - Set is_national_team=None to let TransfermarktClubPlayers use DOM heuristics - Remove broad except Exception that silently swallowed all errors - Improve performance by eliminating redundant network call - Players class already has robust DOM-based detection for national teams
- Move datetime and HTTPException imports from method level to module level - Improves code readability and marginally improves performance - Follows Python best practices for import organization
- Move datetime and HTTPException imports from method level to module level - Improves code readability and marginally improves performance - Follows Python best practices for import organization
- Keep imports at module level in clubs/competitions.py (from CodeRabbit review) - Preserve is_national_team flag logic in clubs/players.py - Keep name padding in competitions/search.py - Add .DS_Store to .gitignore
- Remove whitespace from blank lines (W293) - Add missing trailing commas (COM812) - Split long XPath lines to comply with E501 line length limit
- Format XPath strings to comply with line length - Format list comprehensions - Format is_season condition
- Fix session initialization issue causing all HTTP requests to fail - Improve block detection to avoid false positives - Optimize browser scraping delays (reduce from 12-13s to 0.4-0.8s) - Update XPath definitions for clubs, competitions, and players search - Fix nationalities parsing in player search (relative to each row) - Add comprehensive monitoring endpoints - Update settings for anti-scraping configuration Performance improvements: - HTTP success rate: 0% → 100% - Response time: 12-13s → 0.4-0.8s - Browser fallback: Always → Never needed - All endpoints now working correctly
Resolved conflicts: - app/services/clubs/players.py: Kept improved nationalities parsing with trim() - app/settings.py: Kept anti-scraping configuration settings - app/utils/xpath.py: Combined URL from HEAD with robust NAME fallbacks from main
- Fix import sorting - Add trailing commas - Replace single quotes with double quotes - Add noqa comments for long lines (User-Agent strings, XPath definitions) - Remove unused variables - Fix whitespace issues
- Change padding logic for players_joined_on, players_joined, and players_signed_from - Use "" instead of None to match the default value when elements are None - Fixes CodeRabbit review: inconsistent placeholder values
- Add try/except for playwright import to handle missing dependency - Make _browser_scraper optional (None if playwright unavailable) - Add checks in make_request_with_browser_fallback and get_monitoring_stats - Update test_browser_scraping endpoint to handle missing playwright - Add playwright to requirements.txt - App can now start without playwright, browser scraping disabled if unavailable
- Add playwright install chromium step in Dockerfile - Only runs if playwright is installed (graceful fallback) - Ensures browser binaries are available for Railway deployment
|
Warning Rate limit exceeded@eskoubar95 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 53 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
- Keep optional playwright import in base.py - Keep playwright availability check in test_browser_scraping endpoint - Maintains Railway deployment compatibility
- Keep optional playwright import and checks - Maintain browser scraper optional initialization - Preserve playwright availability checks in monitoring stats - All conflicts resolved, ready for merge
🔧 Railway Deployment Fix
This PR fixes the Railway deployment issue where the app failed to start due to missing
playwrightmodule.🐛 Problem
ModuleNotFoundError: No module named 'playwright'✅ Solution
Optional Playwright Import
Graceful Fallback
_browser_scraperisNoneif playwright unavailableDockerfile Update
playwright install chromiumstepRequirements Update
playwright==1.48.0torequirements.txtTest Endpoint Fix
/test/browser-scrapingto handle missing playwright gracefully📊 Impact
🧪 Testing
📝 Additional Fixes
__init__methodsReady for Review ✅