AI Web Scraper with Streamlit

This project is a web-based tool, built with Streamlit and BeautifulSoup, that extracts specific data (tables, headings, or rows) from webpages. It lets you preview the scraped data, chart it, and export it to CSV.

Features

Input any webpage URL
Select the type of data to extract: full table, headings, or specific rows
Visualize the extracted data using basic charts
Export the data to a CSV file
Simple and user-friendly interface
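
As a rough illustration of the extraction step, the sketch below pulls headings and table rows out of an HTML string. It is not the project's scraper.py: it uses Python's standard-library html.parser instead of BeautifulSoup so that it runs without any installed packages, and the names TableAndHeadingParser, extract, and SAMPLE_HTML are made up for this example.

```python
from html.parser import HTMLParser


class TableAndHeadingParser(HTMLParser):
    """Collects heading text and table rows from an HTML document."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    CELL_TAGS = {"td", "th"}

    def __init__(self):
        super().__init__()
        self.headings = []  # text of every <h1>..<h6>
        self.rows = []      # one list of cell strings per <tr>
        self._row = None    # cells of the <tr> currently open, if any
        self._text = None   # text fragments of the open heading or cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in self.HEADING_TAGS:
            self._text = []
        elif tag in self.CELL_TAGS and self._row is not None:
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a heading or a table cell.
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._text is not None:
            self.headings.append("".join(self._text).strip())
            self._text = None
        elif tag in self.CELL_TAGS and self._row is not None and self._text is not None:
            self._row.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract(html):
    """Return (headings, rows) found in the given HTML string."""
    parser = TableAndHeadingParser()
    parser.feed(html)
    parser.close()
    return parser.headings, parser.rows


# Hypothetical sample page used only for this illustration.
SAMPLE_HTML = """
<h1>Prices</h1>
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
"""

headings, rows = extract(SAMPLE_HTML)
```

Here the first entry of rows is the header row, so "specific rows" can be selected by slicing the list, for example rows[1:].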

Getting Started

1. Clone the repository

git clone https://github.com/yourusername/ai-web-scraper.git
cd ai-web-scraper

2. Create and activate a virtual environment (recommended)

On Windows:

python -m venv venv
venv\Scripts\activate

On macOS/Linux:

python3 -m venv venv
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Run the app

streamlit run app.py

Project Structure

app/
├── app.py               # Main Streamlit app
├── scraper.py           # Contains scraping logic
├── interpreter.py       # AI prompt → selector logic (optional)
├── utils.py             # CSV export, error handling
└── requirements.txt
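
The CSV export that utils.py is described as handling could look something like the sketch below. The function name rows_to_csv and its signature are assumptions made for illustration, not the actual contents of utils.py. It uses only the standard-library csv module and returns the CSV as a string.

```python
import csv
import io


def rows_to_csv(rows, header=None):
    """Return the rows as CSV text; header is an optional list of column names."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Fields that contain a comma are quoted automatically by the csv module.
csv_text = rows_to_csv(
    [["Apple", "1.20"], ["Pear, green", "0.95"]],
    header=["Item", "Price"],
)
```

In a Streamlit app, a string like csv_text can be offered for download by passing it as the data argument of st.download_button.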

Secrets Setup (Optional)

If the app uses an OpenAI API key or other secrets, create .streamlit/secrets.toml in the project root like this:

[openai]
api_key = "your-openai-api-key"

Access it in your code as:

st.secrets["openai"]["api_key"]
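
Because the secrets setup is optional, the lookup may need a fallback. The sketch below is one possible approach, not code from this repository: the helper get_openai_key is hypothetical, and the OPENAI_API_KEY environment variable is an assumed fallback. It accepts any dict-like secrets object so it can be tried without Streamlit installed.

```python
import os


def get_openai_key(secrets, environ=os.environ):
    """Return the key from secrets["openai"]["api_key"], else OPENAI_API_KEY, else None."""
    section = secrets.get("openai", {}) if secrets else {}
    return section.get("api_key") or environ.get("OPENAI_API_KEY")


# Plain dicts stand in for st.secrets and os.environ in this illustration.
from_file = get_openai_key({"openai": {"api_key": "key-from-secrets"}}, environ={})
from_env = get_openai_key({}, environ={"OPENAI_API_KEY": "key-from-env"})
missing = get_openai_key(None, environ={})
```

In the app itself, st.secrets would be passed as the first argument. Streamlit may raise an error when no secrets file exists, so that call may need to be wrapped in a try/except.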

Dependencies

streamlit
pandas
beautifulsoup4
lxml
matplotlib

Deployment

You can deploy this project on Streamlit Community Cloud. Push the code to a GitHub repository, create the app from that repository, and paste the contents of your secrets.toml into the Settings → Secrets section of your app.

Repository: shivah12/WebScrapper-python
