Alex-stack-cell/movie-web-scrapping

Web Scraping Project

A Python web scraping project that extracts data from a movie website with requests and BeautifulSoup, then processes it with pandas.

🚀 Features

  • Web scraping using requests and BeautifulSoup
  • Data processing with pandas
  • CSV export functionality
  • Virtual environment setup
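
The features above can be sketched as a small fetch, parse, export pipeline. This is only an illustration, not the contents of web_scrapping.py: the markup (div.movie, h2.title, span.year), the placeholder sample data, and the function names fetch_html and parse_movies are all assumptions made for the example.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP error statuses."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_movies(html: str) -> pd.DataFrame:
    """Build one row per movie from the assumed (hypothetical) page layout."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("div.movie"):  # hypothetical CSS class
        title = item.select_one("h2.title")
        year = item.select_one("span.year")
        if title is None or year is None:
            continue  # skip entries that do not match the assumed layout
        rows.append(
            {
                "title": title.get_text(strip=True),
                "year": int(year.get_text(strip=True)),
            }
        )
    return pd.DataFrame(rows, columns=["title", "year"])


# Placeholder markup so the parser can be tried without any network access.
SAMPLE_HTML = """
<div class="movie"><h2 class="title">Example Movie A</h2><span class="year">1999</span></div>
<div class="movie"><h2 class="title">Example Movie B</h2><span class="year">2005</span></div>
"""
```

With these definitions, parse_movies(SAMPLE_HTML).to_csv("movies.csv", index=False) would write a two-row CSV file. For a live page, the HTML would come from fetch_html(url) instead of the sample string.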

📋 Requirements

  • Python 3.x
  • Virtual environment (venv)

🛠️ Installation

  1. Clone the repository (if applicable):

    git clone <repository-url> web_scrapping  # clone into web_scrapping/ so the next command works
    cd web_scrapping
  2. Create and activate virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # On macOS/Linux
    # or
    venv\Scripts\activate     # On Windows
  3. Install dependencies:

    pip install -r requirements.txt

📦 Dependencies

  • pandas - Data manipulation and analysis
  • requests - HTTP library for making requests
  • beautifulsoup4 - HTML/XML parsing library
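
To confirm that the three dependencies are installed in the active environment, a check along these lines may help. It is a sketch that uses only the standard library; the helper name installed_versions is hypothetical and not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in this README.
REQUIRED = ("pandas", "requests", "beautifulsoup4")


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED suggests that pip install -r requirements.txt was run outside the virtual environment, or not at all.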

🎯 Usage

  1. Activate the virtual environment:

    source venv/bin/activate  # On macOS/Linux; on Windows use venv\Scripts\activate
  2. Run the web scraping script:

    python web_scrapping.py
  3. Deactivate when done:

    deactivate

📁 Project Structure

web_scrapping/
├── venv/                    # Virtual environment
├── .gitignore              # Git ignore rules
├── requirements.txt         # Python dependencies
├── README.md               # This file
└── web_scrapping.py        # Main scraping script

🔒 Git Ignored Files

The .gitignore file tells Git to ignore the following:

  • *.csv - Data files
  • venv/ - Virtual environment
  • __pycache__/ - Python cache
  • .DS_Store - macOS system files
  • Various temporary and IDE files
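
As a rough illustration of how the patterns above behave, the sketch below approximates Git's matching for simple patterns that contain no slash other than a trailing one. It is not Git's actual algorithm (negation, anchored patterns, and ** are not handled), it assumes every input is a file path, and the name is_ignored is hypothetical. For authoritative answers, git check-ignore can be run inside the repository.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns named in this README; the real .gitignore also lists temporary and IDE files.
PATTERNS = ("*.csv", "venv/", "__pycache__/", ".DS_Store")


def is_ignored(path: str, patterns=PATTERNS) -> bool:
    """Approximate gitignore matching for a file path and simple patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory-only pattern: compare against the parent directories.
            candidates, name = parts[:-1], pattern.rstrip("/")
        else:
            # Plain pattern: may match the file itself or any directory above it.
            candidates, name = parts, pattern
        if any(fnmatchcase(part, name) for part in candidates):
            return True
    return False
```

Under these assumptions, is_ignored("movies.csv") and is_ignored("venv/bin/python") are true, while is_ignored("web_scrapping.py") is false.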

📝 Notes

  • Always activate the virtual environment before running the script
  • The script generates CSV files, which Git ignores because *.csv is listed in .gitignore
  • On macOS/Linux, ./venv/bin/python web_scrapping.py runs the script with the environment's interpreter without activating it first
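
A script can guard against the first point by checking whether it is running inside a virtual environment. The sketch below relies on sys.prefix differing from sys.base_prefix inside an environment created by venv. The helper names are hypothetical, and nothing here is confirmed to exist in web_scrapping.py.

```python
import sys
from typing import Optional


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != sys.base_prefix


def venv_warning() -> Optional[str]:
    """Return a reminder message when no virtual environment is active."""
    if in_virtualenv():
        return None
    return "Virtual environment not active; run 'source venv/bin/activate' first."
```

A script could print the result of venv_warning() at startup, or exit early when it is not None.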

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Commit and push
  5. Create a pull request

📄 License

This project is open source and available under the MIT License.
