This project demonstrates a complete end-to-end data analytics pipeline using Python.
It includes web scraping, data preprocessing, exploratory data analysis (EDA), data visualization, and sentiment analysis.
The data is collected from a sample e-commerce website and transformed into structured datasets for analysis and insights.
- Extract data from websites using web scraping
- Perform data cleaning and preprocessing
- Analyze the dataset using EDA techniques
- Visualize patterns using graphs
- Apply sentiment analysis
- Python
- BeautifulSoup
- Requests
- Pandas
- Matplotlib
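
As a rough illustration of how the scraping step might use the tools listed above, here is a minimal sketch that parses product cards with BeautifulSoup. The HTML snippet, the tag and class names, the `data-stars` attribute, and the `parse_products` helper are all assumptions made for this example; the real site's markup and the actual code in `Task_1.py` may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real site's tags,
# class names, and attributes may be different.
SAMPLE_HTML = """
<article class="product">
  <h3><a title="Sample Book A">Sample Book A</a></h3>
  <p class="price">£51.77</p>
  <p class="rating" data-stars="3"></p>
</article>
<article class="product">
  <h3><a title="Sample Book B">Sample Book B</a></h3>
  <p class="price">£13.99</p>
  <p class="rating" data-stars="5"></p>
</article>
"""


def parse_products(html):
    """Return one dict (title, price, rating) per product card in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("article.product"):
        rows.append({
            "title": card.h3.a["title"],
            # Price is kept as raw text here; cleaning happens in a later step.
            "price": card.select_one("p.price").get_text(strip=True),
            "rating": int(card.select_one("p.rating")["data-stars"]),
        })
    return rows


rows = parse_products(SAMPLE_HTML)
for row in rows:
    print(row)
```

In a live run, the HTML would come from something like `requests.get(url, timeout=10).text`, and the resulting rows could be written out with `pandas.DataFrame(rows).to_csv(...)`.
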
Web-Scraping-and-Data-Analysis-using-Python/
│
├── Task_1.py
├── Task_2.py
├── Task_3.py
├── Task_4.py
│
├── data/
│ ├── books_data.csv
│ ├── books_with_sentiment.csv
│
├── outputs/
│ ├── rating_chart.png
│
├── report.docx

git clone https://github.com/Rohitkoli1096/Web-Scraping-and-Data-Analysis-using-Python.git
cd Web-Scraping-and-Data-Analysis-using-Python

pip3 install requests beautifulsoup4 pandas matplotlib

python3 Task_1.py
python3 Task_2.py
python3 Task_3.py
python3 Task_4.py

- 📄 books_data.csv → Raw scraped dataset
- 📄 books_with_sentiment.csv → Dataset with sentiment classification
- 📈 rating_chart.png → Visualization of rating distribution
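
A chart like the rating distribution listed above could be produced roughly as follows. This is only a sketch: the `plot_rating_distribution` helper, the sample ratings, and the demo file name are made up for illustration and are not taken from the project's scripts.

```python
from collections import Counter

import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt


def plot_rating_distribution(ratings, out_path):
    """Save a bar chart of how many items have each star rating (1 to 5)."""
    counts = Counter(ratings)
    stars = [1, 2, 3, 4, 5]
    heights = [counts.get(s, 0) for s in stars]

    fig, ax = plt.subplots()
    ax.bar(stars, heights)
    ax.set_xlabel("Rating (stars)")
    ax.set_ylabel("Number of products")
    ax.set_title("Rating distribution")
    fig.savefig(out_path)
    plt.close(fig)  # release the figure's memory
    return heights


# Made-up ratings, used only to exercise the function.
heights = plot_rating_distribution([5, 4, 4, 3, 1], "rating_chart_demo.png")
```

Returning the bar heights alongside saving the image makes it easy to check the counts without opening the file.
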
- Most products have ratings between 3 and 5
- Positive sentiment dominates the dataset
- Price distribution varies across products
- ⭐ 4–5 → Positive
- ⭐ 3 → Neutral
- ⭐ 1–2 → Negative
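
The rating-to-sentiment rule above can be written as a small function. This is a minimal sketch, assuming ratings are integers from 1 to 5; the `rating_to_sentiment` name is hypothetical and may not match the project's code.

```python
def rating_to_sentiment(rating):
    """Map a 1-5 star rating to a sentiment label."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating >= 4:
        return "Positive"
    if rating == 3:
        return "Neutral"
    return "Negative"


labels = [rating_to_sentiment(r) for r in [5, 4, 3, 2, 1]]
print(labels)
```

With pandas, the same function could be applied to a whole column, for example `df["sentiment"] = df["rating"].map(rating_to_sentiment)`, where the column names are assumptions.
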
- Handling HTML structure
- Cleaning special characters (currency symbols)
- Managing Python environment setup
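
For the currency-symbol challenge in particular, one possible approach is to drop every character that is not a digit or a decimal point before converting to a number. The sketch below assumes prices use a dot as the decimal separator; the `clean_price` helper and the sample strings are hypothetical.

```python
import re


def clean_price(text):
    """Strip everything except digits and the decimal point, then return a float."""
    digits = re.sub(r"[^0-9.]", "", text)
    if not digits:
        raise ValueError(f"no numeric value found in {text!r}")
    return float(digits)


# Sample inputs: a plain symbol, a mis-decoded symbol, and a thousands separator.
prices = [clean_price(p) for p in ["£51.77", "Â£13.99", "$1,299.00"]]
print(prices)
```

The same helper could be applied to a pandas column with `df["price"].map(clean_price)`, again assuming that column name.
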
- Use Selenium for dynamic websites
- Scrape real-world e-commerce platforms
- Apply machine learning for sentiment analysis
- Build interactive dashboards
Rohit Koli
CodeAlpha Internship
MIT License
Thanks to CodeAlpha for providing this internship opportunity.