# 🛍️ E-commerce Web Scraper & Analytics Dashboard
![Python](https://img.shields.io/badge/Python-3.8%2B-blue) ![Next.js](https://img.shields.io/badge/Next.js-14-black) ![React](https://img.shields.io/badge/React-18-blue) ![License](https://img.shields.io/badge/License-MIT-green)
A powerful Python-based web scraper that collects real-time product data from e-commerce sites and displays actionable insights through a beautiful React/Next.js dashboard. Perfect for competitor analysis, price monitoring, and market research.
![Dashboard screenshot](https://via.placeholder.com/800x400/3B82F6/FFFFFF?text=E-commerce+Analytics+Dashboard)
*Replace with actual screenshot*

## ✨ Features

### 🔍 Web Scraping Capabilities
- **Multi-site Support:** Extract data from various e-commerce platforms
- **Automated Data Collection:** Products, prices, ratings, availability, and reviews
- **Smart Rate Limiting:** Polite scraping with configurable delays
- **Custom URL Input:** Users can add specific product URLs to monitor
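The rate limiting above can be sketched as a small helper. This is an illustrative sketch, not the project's actual implementation: the class name is an assumption, and the default delay simply mirrors the `REQUEST_DELAY=2` setting documented in the Configuration section.

```python
import time

class RateLimiter:
    """Enforce a minimum delay between consecutive requests to a site."""

    def __init__(self, delay_seconds: float = 2.0):
        self.delay = delay_seconds
        self._last_call = None  # monotonic timestamp of the previous request

    def wait(self) -> None:
        # Sleep just long enough so calls are at least `delay` seconds apart.
        if self._last_call is not None:
            elapsed = time.monotonic() - self._last_call
            if elapsed < self.delay:
                time.sleep(self.delay - elapsed)
        self._last_call = time.monotonic()
```

A scraper would call `limiter.wait()` immediately before each HTTP request, so bursts of URLs are automatically spaced out.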
### 📊 Analytics & Visualization

- **Interactive Dashboard:** Beautiful charts and data visualizations
- **Price Distribution Analysis:** Understand market pricing trends
- **Rating Statistics:** Track product ratings and reviews
- **Category Insights:** Analyze products by category
- **Export Functionality:** Download data as CSV or JSON reports
### 🛠️ Technical Features

- **Python Backend:** Robust scraping with BeautifulSoup and Scrapy
- **React/Next.js Frontend:** Modern, responsive dashboard
- **SQLite Database:** Lightweight data storage
- **RESTful API:** Clean separation between backend and frontend
- **TypeScript:** Fully typed for a better development experience
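As a rough sketch of the data layer a REST endpoint like `/api/products` might wrap, the SQLite side could look like the following. The table and column names here are hypothetical, not taken from the project's actual schema.

```python
import json
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema; the project's real column names may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(name TEXT, price REAL, rating REAL, category TEXT)"
    )

def products_as_json(conn: sqlite3.Connection) -> str:
    """Serialize every row the way a products endpoint might return it."""
    rows = conn.execute("SELECT name, price, rating, category FROM products")
    return json.dumps(
        [{"name": n, "price": p, "rating": r, "category": c} for n, p, r, c in rows]
    )
```

The dashboard would then fetch this JSON from the API and render it, keeping the frontend entirely decoupled from the database.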
## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- Node.js 16+
- Git

### Installation
1. **Clone the repository**

   ```bash
   git clone https://github.com/your-username/web-scraper-analytics.git
   cd web-scraper-analytics
   ```
2. **Set up the Backend**

   ```bash
   cd backend
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```
3. **Set up the Frontend**

   ```bash
   cd ../dashboard
   npm install
   ```
4. **Run the Application**

   In one terminal, start the backend:

   ```bash
   cd backend
   source venv/bin/activate
   python main.py
   ```

   In a second terminal, start the frontend:

   ```bash
   cd dashboard
   npm run dev
   ```
5. **Open your browser** and navigate to http://localhost:3000
## 📖 Usage Guide

### Adding Products to Scrape
1. Open the dashboard at http://localhost:3000
2. Enter product URLs in the input form (one per line)
3. Run the scraper from the backend directory:

   ```bash
   cd backend
   python main.py
   ```

4. View results in the dashboard; the data will refresh automatically
### Supported E-commerce Sites

The scraper currently supports:

- ✅ Books to Scrape (testing)
- ⚡ Amazon (configuration needed)
- ⚡ eBay (configuration needed)
- ⚡ Custom sites (easily extendable)
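Extending the scraper to a new site might look like the following sketch. The base-class name and `parse` signature are assumptions rather than the project's actual interface, and where the project uses BeautifulSoup/Scrapy, this example uses plain regular expressions to stay dependency-free.

```python
import re
from abc import ABC, abstractmethod
from typing import Dict, List

class SiteScraper(ABC):
    """Hypothetical base class a module in backend/scrapers/ would subclass."""

    @abstractmethod
    def parse(self, html: str) -> List[Dict]:
        """Extract product records from one page of HTML."""

class BooksToScrapeScraper(SiteScraper):
    # Books to Scrape renders listings roughly like:
    #   <h3><a href="..." title="A Light in the Attic">...</a></h3>
    #   <p class="price_color">£51.77</p>
    TITLE_RE = re.compile(r'<h3><a [^>]*title="([^"]+)"')
    PRICE_RE = re.compile(r'<p class="price_color">£([\d.]+)</p>')

    def parse(self, html: str) -> List[Dict]:
        titles = self.TITLE_RE.findall(html)
        prices = [float(p) for p in self.PRICE_RE.findall(html)]
        return [{"title": t, "price": p} for t, p in zip(titles, prices)]
```

A new platform would only need its own subclass with site-specific selectors; the rest of the pipeline stays unchanged.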
### Exporting Data

Download your scraped data in multiple formats:

- **CSV Export:** Perfect for spreadsheets and data analysis
- **JSON Export:** Ideal for developers and API integrations
- **Report Generation:** Custom analytics reports
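A minimal sketch of the CSV/JSON export helpers, assuming products are held as a list of dicts; the function names are illustrative, and the real export code may read directly from the SQLite database instead.

```python
import csv
import io
import json

def export_csv(products: list) -> str:
    """Render product records as CSV text, columns taken from the first record."""
    if not products:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(products[0]))
    writer.writeheader()
    writer.writerows(products)
    return buf.getvalue()

def export_json(products: list) -> str:
    """Render product records as pretty-printed JSON."""
    return json.dumps(products, indent=2)
```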
## 🏗️ Project Architecture

```text
web-scraper-analytics/
├── backend/                 # Python scraping application
│   ├── scrapers/            # Site-specific scraping modules
│   ├── main.py              # Main scraping script
│   ├── data_processor.py    # Data cleaning and analysis
│   └── requirements.txt     # Python dependencies
├── dashboard/               # Next.js React dashboard
│   ├── src/
│   │   ├── app/             # Next.js app router
│   │   ├── components/      # React components
│   │   └── types/           # TypeScript definitions
│   └── package.json         # Node.js dependencies
├── data/                    # Data storage (auto-created)
└── docs/                    # Documentation
```
## 🔧 Configuration

### Backend Settings
Create a `backend/.env` file:

```env
DB_PATH=../data/products.db
REQUEST_DELAY=2
TIMEOUT=30
MAX_RETRIES=3
SCRAPE_BOOKS_TO_SCRAPE=true
```
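A sketch of how the backend might read these settings, assuming the variables have been exported into the process environment (for example via `python-dotenv`); the function name and the returned keys are illustrative, while the defaults match the values shown above.

```python
import os

def load_settings() -> dict:
    """Read backend settings from the environment, with documented defaults."""
    return {
        "db_path": os.getenv("DB_PATH", "../data/products.db"),
        "request_delay": float(os.getenv("REQUEST_DELAY", "2")),
        "timeout": int(os.getenv("TIMEOUT", "30")),
        "max_retries": int(os.getenv("MAX_RETRIES", "3")),
        "scrape_books": os.getenv("SCRAPE_BOOKS_TO_SCRAPE", "true").lower() == "true",
    }
```

Note that `os.getenv` does not read `.env` files by itself; a loader such as `python-dotenv` has to populate the environment first.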
### Frontend Settings

Create a `dashboard/.env.local` file:

```env
NEXT_PUBLIC_APP_NAME="E-commerce Scraper Dashboard"
NEXT_PUBLIC_API_URL=/api
```
## 🌐 Deployment

### Frontend Deployment (Vercel/Netlify)
The dashboard is ready for deployment on modern platforms:
1. Connect your GitHub repository to Vercel or Netlify
2. Set the build settings:
   - Root directory: `dashboard`
   - Build command: `npm run build`
   - Output directory: `.next`
3. Deploy: changes will auto-deploy on every git push
### Backend Deployment (Optional)

For production use, consider deploying the backend to:

- **Heroku:** For Python application hosting
- **AWS EC2:** For full control and scalability
- **DigitalOcean:** Simple droplet deployment
> **Note:** The backend is designed to run on a schedule (e.g., daily scraping).

## 🛡️ Ethical Scraping Practices
This project follows ethical web scraping guidelines:
- ✅ Respects robots.txt directives
- ✅ Implements rate limiting to avoid overwhelming servers
- ✅ Uses descriptive user agents
- ✅ Caches responses to minimize repeated requests
- ✅ Provides clear identification of the scraper
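The robots.txt check can be done entirely with the standard library's `urllib.robotparser`. The helper below is a sketch; its name and signature are assumptions, not the project's API.

```python
from urllib import robotparser

def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt rules permit `user_agent` to fetch `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

In practice the scraper would download `https://<site>/robots.txt` once per host, parse it, and skip any URL for which `can_fetch` returns `False`.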
## 🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
1. Fork the project
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🚨 Disclaimer
This project is for educational and portfolio purposes. Always:
- Check website terms of service before scraping
- Respect rate limits and access policies
- Use scraped data responsibly and ethically
- Consider using official APIs when available
## 📊 Project Status

**Active Development** - This project is actively maintained and regularly updated.
- ✅ Core scraping functionality
- ✅ Dashboard visualization
- ✅ Data export features
- 🔄 Adding more e-commerce platforms
- 🔄 Enhanced analytics capabilities
- 🔄 User authentication system
## 🙋‍♂️ Support
If you have any questions or need help:
- Check the FAQ
- Open an Issue
- Contact me at your.email@example.com
## 📈 SEO Keywords
E-commerce web scraper, price monitoring tool, competitor analysis dashboard, product tracking system, web scraping with Python, React analytics dashboard, BeautifulSoup scraper, Next.js dashboard, market research tools, price intelligence platform, product data extraction, e-commerce analytics, web scraping portfolio project, Python web scraper, product monitoring solution.
⭐ **Star this repo if you found it helpful!**

![GitHub stars](https://img.shields.io/github/stars/your-username/web-scraper-analytics?style=social)

*Built with ❤️ using Python, React, Next.js, and BeautifulSoup*