
# 🛍️ E-commerce Web Scraper & Analytics Dashboard

![Python](https://img.shields.io/badge/Python-3.8%2B-blue) ![Next.js](https://img.shields.io/badge/Next.js-14-black) ![React](https://img.shields.io/badge/React-18-blue) ![License](https://img.shields.io/badge/License-MIT-green)

A powerful Python-based web scraper that collects real-time product data from e-commerce sites and displays actionable insights through a beautiful React/Next.js dashboard. Perfect for competitor analysis, price monitoring, and market research.

![Dashboard screenshot](https://via.placeholder.com/800x400/3B82F6/FFFFFF?text=E-commerce+Analytics+Dashboard)
*Replace with an actual screenshot*

## ✨ Features

### 🔍 Web Scraping Capabilities

- **Multi-site Support**: Extract data from various e-commerce platforms
- **Automated Data Collection**: Products, prices, ratings, availability, and reviews
- **Smart Rate Limiting**: Polite scraping with configurable delays
- **Custom URL Input**: Users can add specific product URLs to monitor
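The polite, rate-limited fetch loop can be sketched roughly as follows. The repository's real implementation lives under `backend/scrapers/`; the `fetch_page` helper, the user-agent string, and the hard-coded delay below are illustrative assumptions, not the project's actual code:

```python
import time

import requests
from bs4 import BeautifulSoup

REQUEST_DELAY = 2  # seconds between requests (configurable via backend/.env)

def fetch_page(url: str, session: requests.Session) -> BeautifulSoup:
    """Fetch one page politely: identify the scraper, then wait before the next request."""
    response = session.get(
        url,
        headers={"User-Agent": "web-scraper-analytics/1.0 (educational project)"},
        timeout=30,
    )
    response.raise_for_status()
    time.sleep(REQUEST_DELAY)  # simple fixed-delay rate limiting
    return BeautifulSoup(response.text, "html.parser")
```

A fixed delay is the simplest polite policy; production scrapers often add jitter or exponential backoff on errors.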

### 📊 Analytics & Visualization

- **Interactive Dashboard**: Beautiful charts and data visualizations
- **Price Distribution Analysis**: Understand market pricing trends
- **Rating Statistics**: Track product ratings and reviews
- **Category Insights**: Analyze products by category
- **Export Functionality**: Download data as CSV or JSON reports
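The price-distribution figures reduce to descriptive statistics over the scraped prices. A stdlib-only sketch of the kind of summary the dashboard charts are built from (the function name and exact fields are assumptions, not the project's API):

```python
from statistics import mean, median, stdev
from typing import Dict, List

def price_summary(prices: List[float]) -> Dict[str, float]:
    """Descriptive statistics behind a price-distribution chart."""
    return {
        "count": len(prices),
        "min": min(prices),
        "max": max(prices),
        "mean": round(mean(prices), 2),
        "median": median(prices),
        "stdev": round(stdev(prices), 2) if len(prices) > 1 else 0.0,
    }

# Example: five scraped prices
summary = price_summary([12.99, 19.99, 7.50, 24.00, 15.25])
```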

### 🛠️ Technical Features

- **Python Backend**: Robust scraping with BeautifulSoup and Scrapy
- **React/Next.js Frontend**: Modern, responsive dashboard
- **SQLite Database**: Lightweight data storage
- **RESTful API**: Clean separation between backend and frontend
- **TypeScript**: Fully typed for better development experience
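A lightweight SQLite layer needs nothing beyond the Python standard library. The schema below is a sketch of the kind of table the scraper might write to `data/products.db`; the actual column layout in the repository may differ:

```python
import sqlite3

def init_db(path: str = "../data/products.db") -> sqlite3.Connection:
    """Open the database and create the products table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS products (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            url TEXT UNIQUE NOT NULL,
            name TEXT NOT NULL,
            price REAL,
            rating REAL,
            category TEXT,
            in_stock INTEGER,
            scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()
    return conn
```

Using `url` as a `UNIQUE` key lets repeated scrapes upsert rather than duplicate rows.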

## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- Node.js 16+
- Git

### Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/your-username/web-scraper-analytics.git
   cd web-scraper-analytics
   ```

2. **Set up the backend**

   ```bash
   cd backend
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Set up the frontend**

   ```bash
   cd ../dashboard
   npm install
   ```

4. **Run the application**

   ```bash
   # Terminal 1 - start the backend
   cd backend
   source venv/bin/activate
   python main.py
   ```

   ```bash
   # Terminal 2 - start the frontend
   cd dashboard
   npm run dev
   ```

5. **Open your browser** and navigate to http://localhost:3000

## 📖 Usage Guide

### Adding Products to Scrape

1. Open the dashboard at http://localhost:3000
2. Enter product URLs in the input form (one per line)
3. Run the scraper from the backend directory:

   ```bash
   cd backend
   python main.py
   ```

4. View the results in the dashboard - the data will refresh automatically

### Supported E-commerce Sites

The scraper currently supports:

- ✅ **Books to Scrape** (testing)
- ⚡ **Amazon** (configuration needed)
- ⚡ **eBay** (configuration needed)
- ⚡ **Custom sites** (easily extendable)
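Extending the scraper to a new site typically means adding one module under `backend/scrapers/` that knows the site's markup. The `SiteScraper` base class and `Product` dataclass below are hypothetical, shown only to illustrate the pattern; the CSS selectors match the public books.toscrape.com test site:

```python
from dataclasses import dataclass
from typing import Optional

from bs4 import BeautifulSoup

@dataclass
class Product:
    name: str
    price: float
    rating: Optional[float] = None

class SiteScraper:
    """Hypothetical base class: subclasses supply site-specific parsing."""

    def parse_product(self, html: str) -> Product:
        raise NotImplementedError

class BooksToScrapeScraper(SiteScraper):
    def parse_product(self, html: str) -> Product:
        # books.toscrape.com markup: title in <h1>, price in <p class="price_color">
        soup = BeautifulSoup(html, "html.parser")
        name = soup.select_one("h1").get_text(strip=True)
        price_text = soup.select_one("p.price_color").get_text(strip=True)
        return Product(name=name, price=float(price_text.lstrip("£")))
```

Registering the new subclass in the scraper's dispatch table (however the project wires that up) would then be the only remaining step.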

### Exporting Data

Download your scraped data in multiple formats:

- **CSV Export**: Perfect for spreadsheets and data analysis
- **JSON Export**: Ideal for developers and API integrations
- **Report Generation**: Custom analytics reports
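Both export formats are straightforward dumps of the product rows. A stdlib-only sketch (the function names are illustrative; the dashboard's actual export endpoints may differ):

```python
import csv
import io
import json
from typing import Dict, List

def to_csv(products: List[Dict]) -> str:
    """Serialize scraped product rows as CSV text."""
    if not products:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(products[0].keys()))
    writer.writeheader()
    writer.writerows(products)
    return buf.getvalue()

def to_json(products: List[Dict]) -> str:
    """Serialize scraped product rows as pretty-printed JSON."""
    return json.dumps(products, indent=2, ensure_ascii=False)
```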

## 🏗️ Project Architecture

```text
web-scraper-analytics/
├── backend/                # Python scraping application
│   ├── scrapers/           # Site-specific scraping modules
│   ├── main.py             # Main scraping script
│   ├── data_processor.py   # Data cleaning and analysis
│   └── requirements.txt    # Python dependencies
├── dashboard/              # Next.js React dashboard
│   ├── src/
│   │   ├── app/            # Next.js app router
│   │   ├── components/     # React components
│   │   └── types/          # TypeScript definitions
│   └── package.json        # Node.js dependencies
├── data/                   # Data storage (auto-created)
└── docs/                   # Documentation
```

## 🔧 Configuration

### Backend Settings

Create a `backend/.env` file:

```env
# Database settings
DB_PATH=../data/products.db

# Scraping settings
REQUEST_DELAY=2
TIMEOUT=30
MAX_RETRIES=3

# Target sites
SCRAPE_BOOKS_TO_SCRAPE=true
```
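On the Python side these settings can be read with `os.getenv`, with the `.env` defaults repeated as fallbacks. Whether the repository loads the file via `python-dotenv` or some other mechanism is an assumption; this sketch only shows the typed conversion each variable needs:

```python
import os

# Defaults mirror backend/.env; real environment variables take precedence
DB_PATH = os.getenv("DB_PATH", "../data/products.db")
REQUEST_DELAY = float(os.getenv("REQUEST_DELAY", "2"))       # seconds between requests
TIMEOUT = int(os.getenv("TIMEOUT", "30"))                    # per-request timeout
MAX_RETRIES = int(os.getenv("MAX_RETRIES", "3"))             # retries per URL
SCRAPE_BOOKS_TO_SCRAPE = os.getenv("SCRAPE_BOOKS_TO_SCRAPE", "true").lower() == "true"
```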

### Frontend Settings

Create a `dashboard/.env.local` file:

```env
NEXT_PUBLIC_APP_NAME="E-commerce Scraper Dashboard"
NEXT_PUBLIC_API_URL=/api
```

## 🌐 Deployment

### Frontend Deployment (Vercel/Netlify)

The dashboard is ready for deployment on modern platforms:

1. Connect your GitHub repository to Vercel or Netlify
2. Set the build settings:
   - Root directory: `dashboard`
   - Build command: `npm run build`
   - Output directory: `.next`
3. Deploy - changes will auto-deploy on every `git push`

### Backend Deployment (Optional)

For production use, consider deploying the backend to:

- **Heroku**: For Python application hosting
- **AWS EC2**: For full control and scalability
- **DigitalOcean**: Simple droplet deployment

> **Note:** The backend is designed to run on a schedule (e.g., daily scraping).

## 🛡️ Ethical Scraping Practices

This project follows ethical web scraping guidelines:

- ✅ Respects robots.txt directives
- ✅ Implements rate limiting to avoid overwhelming servers
- ✅ Uses descriptive user agents
- ✅ Caches responses to minimize repeated requests
- ✅ Provides clear identification of the scraper
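Honoring robots.txt takes only the standard library. A sketch of the check these guidelines call for, written against already-fetched robots.txt content so it can be tested offline (the helper name is an assumption; in practice you would first download `https://<host>/robots.txt`):

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "web-scraper-analytics"

def allowed_by(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt content permits fetching this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling this before every crawl, and treating an unreachable robots.txt as a denial, keeps the scraper on the conservative side.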

## 🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

1. Fork the project
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🚨 Disclaimer

This project is for educational and portfolio purposes. Always:

- Check website terms of service before scraping
- Respect rate limits and access policies
- Use scraped data responsibly and ethically
- Consider using official APIs when available

## 📊 Project Status

**Active Development** - This project is actively maintained and regularly updated.

- ✅ Core scraping functionality
- ✅ Dashboard visualization
- ✅ Data export features
- 🔄 Adding more e-commerce platforms
- 🔄 Enhanced analytics capabilities
- 🔄 User authentication system

## 🙋‍♂️ Support

If you have any questions or need help:

- Check the FAQ
- Open an Issue
- Contact me at your.email@example.com

## 📈 SEO Keywords

E-commerce web scraper, price monitoring tool, competitor analysis dashboard, product tracking system, web scraping with Python, React analytics dashboard, BeautifulSoup scraper, Next.js dashboard, market research tools, price intelligence platform, product data extraction, e-commerce analytics, web scraping portfolio project, Python web scraper, product monitoring solution.

⭐ **Star this repo if you found it helpful!**

![GitHub stars](https://img.shields.io/github/stars/your-username/web-scraper-analytics?style=social)

*Built with ❤️ using Python, React, Next.js, and BeautifulSoup*
