Scrapy 🕷️

A Web Scraper Retrieval-Augmented Generation (RAG) System
Built using Selenium, Streamlit, and Bright Data Web Scraping API

Note: If a popular website does not return the expected output, it may be using anti-bot or other security mechanisms that block automated access.

📌 Overview

Scrapy is a modular web scraping application that integrates RAG (Retrieval-Augmented Generation) capabilities to extract and process structured data from web pages. This system can be used to:

Extract dynamic content from websites
Interface with users via a Streamlit web UI
Leverage Bright Data’s scraping infrastructure for enhanced reliability
Run analysis or pass scraped data into LLM pipelines
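As a rough illustration of the last point, scraped text usually has to be split into smaller pieces before it is handed to an LLM. The sketch below is a minimal, standard-library example and is not taken from this repository: the function name `split_into_chunks` and the `max_chars` and `overlap` parameters are assumptions chosen for illustration.

```python
def split_into_chunks(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    # Collapse runs of whitespace so chunk sizes reflect actual content.
    cleaned = " ".join(text.split())

    step = max_chars - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + max_chars])
        # Stop once the current window already reaches the end of the text.
        if start + max_chars >= len(cleaned):
            break
    return chunks


if __name__ == "__main__":
    sample = "Scraped page text goes here. " * 40
    for i, chunk in enumerate(split_into_chunks(sample, max_chars=200, overlap=20)):
        print(i, len(chunk))
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a boundary still appears whole in at least one of them when the chunks are small.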

⚙️ Tech Stack

Python
Selenium – For headless browser automation and dynamic content rendering
Streamlit – For the interactive web UI
Bright Data Web Scraping API – For robust scraping of websites with anti-bot measures
LangChain / RAG Architecture (optional) – For integrating with LLMs
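To make the Selenium part of the stack concrete, here is a minimal sketch of rendering a page in headless Chrome and reducing the HTML to visible text. It assumes Selenium 4 with a local Chrome install. The names `fetch_rendered_html` and `extract_text` are hypothetical and may not match what `scrape.py` or `parse.py` actually do, and the Bright Data integration is deliberately left out.

```python
from html.parser import HTMLParser


def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chrome and return the rendered HTML."""
    # Imported lazily so the parsing helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document, one fragment per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


if __name__ == "__main__":
    # example.com is a placeholder URL used only for illustration.
    print(extract_text(fetch_rendered_html("https://example.com")))
```

Keeping the fetch step and the text-extraction step in separate functions means the extraction can be tested on a saved HTML string without launching a browser.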

🖼️ Screenshots

🔧 Streamlit Interface

🌐 Website Scraped Example

Features

Dynamic scraping with Selenium
Bright Data API integration for bypassing complex anti-bot challenges
Clean, interactive interface with Streamlit
Modular structure for plugging in RAG pipelines or further NLP processing
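The "plug in a RAG pipeline" idea can be sketched as a retrieval step followed by prompt assembly. The example below is an assumption-laden illustration, not this project's implementation. It scores chunks by simple word overlap as a stand-in for embedding-based retrieval, and the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k chunks sharing the most tokens with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(chunk)), chunk) for chunk in chunks]
    # Highest score first; the sort is stable, so ties keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no tokens with the query at all.
    return [chunk for score, chunk in ranked[:top_k] if score > 0]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the given context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    scraped = [
        "The laptop costs 999 dollars.",
        "Shipping takes five days.",
        "The laptop has 16 GB of RAM.",
    ]
    question = "laptop price in dollars"
    print(build_prompt(question, retrieve(question, scraped)))
```

In a fuller pipeline the overlap score would typically be replaced by vector similarity (for example through LangChain), while the prompt-assembly step stays much the same.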

Potential Use Cases

Market competitor analysis
E-commerce price tracking
Data feed generation for LLMs

Repository Files

.gitignore
README.md
Test.png
main.py
page.png
parse.py
requirements.txt
scrape.py
