An automated flight scraping tool powered by Puppeteer to extract real-time data from Google Flights.
- Customizable flight search (origin, destination, dates)
- Headless browser automation using Puppeteer
- Outputs structured JSON data
- Captures airline, duration, price, stops, and more
- No API required: scrapes live data directly
GFlightScraper/
├── scraper/ # Core Puppeteer scraping logic
├── fileHandler/ # JSON file handling
├── utils/ # Helper utilities
├── main.js # Main script
├── package.json # Project dependencies
└── README.md # Documentation
# 1. Clone this repository
git clone https://github.com/Faheem798/GFlightScraper.git
# 2. Move into the project folder
cd GFlightScraper
# 3. Install required packages
npm install
- Add your Google Flights links in utils/data.js like this:
module.exports = [
"https://www.google.com/travel/flights/search?tfs=...&tfu=...",
"https://www.google.com/travel/flights/search?tfs=...&tfu=...",
// Add more links as needed
];
- Run the script:
node main.js
- Check your output:
The scraped data will be saved in the output/ folder as a .json file.
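Every field in the sample record shown in this README is a string (`"$420"`, `"7h 40m"`, `"1 stop"`), so sorting or filtering usually needs a normalization pass first. The following is a minimal sketch of such a pass, not part of the project itself. The helper names (`parsePrice`, `parseDuration`, `parseStops`, `normalize`) are hypothetical, the second record is made up for illustration, and the `"Nonstop"` label is an assumption about how direct flights are reported.

```javascript
// Sketch: turn scraped string fields into numbers for sorting and filtering.
// Field names follow the README's sample record; helper names are hypothetical.

// "$420" -> 420, "$1,234" -> 1234; null when the text contains no digits.
// Assumes "." is the decimal separator and "," a thousands separator.
function parsePrice(text) {
  const digits = String(text).replace(/[^0-9.]/g, "");
  return digits === "" ? null : Number(digits);
}

// "7h 40m" -> 460 (minutes); also accepts "2h" or "55m". null if neither part is found.
function parseDuration(text) {
  const hours = /(\d+)\s*h/.exec(text);
  const minutes = /(\d+)\s*m/.exec(text);
  if (!hours && !minutes) return null;
  return (hours ? Number(hours[1]) * 60 : 0) + (minutes ? Number(minutes[1]) : 0);
}

// "1 stop" -> 1, "2 stops" -> 2. "Nonstop" -> 0 is an assumed label.
function parseStops(text) {
  if (/non-?stop/i.test(text)) return 0;
  const match = /(\d+)/.exec(text);
  return match ? Number(match[1]) : null;
}

// Keep the original strings and add numeric companions next to them.
function normalize(record) {
  return {
    ...record,
    priceValue: parsePrice(record.price),
    durationMinutes: parseDuration(record.duration),
    stopCount: parseStops(record.stops),
  };
}

const sample = [
  // Made-up record, for illustration only.
  { airline: "Example Air", price: "$1,050", duration: "55m", stops: "Nonstop" },
  // Record copied from the README's sample output.
  { airline: "Qatar Airways", price: "$420", departureTime: "02:45 AM",
    arrivalTime: "10:25 AM", duration: "7h 40m", stops: "1 stop" },
];

const cheapestFirst = sample.map(normalize).sort((a, b) => a.priceValue - b.priceValue);
console.log(cheapestFirst[0].airline, cheapestFirst[0].priceValue);
```

To run this against real results, the inline `sample` array could be replaced with `JSON.parse(fs.readFileSync(path, "utf8"))`, where `path` points at whichever file the scraper wrote into `output/`.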
[
{
"airline": "Qatar Airways",
"price": "$420",
"departureTime": "02:45 AM",
"arrivalTime": "10:25 AM",
"duration": "7h 40m",
"stops": "1 stop"
}
]
- To see browser activity, launch Puppeteer in non-headless mode:
puppeteer.launch({ headless: false });
- If selectors stop working, inspect the DOM and update the scraping logic accordingly.
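One way to soften the impact of selector changes is to keep several candidate selectors per field and use the first one that matches. The sketch below illustrates that idea only. It is not how this project's scraper is written, `firstMatch` is a hypothetical helper, and the selector strings are placeholders rather than real Google Flights selectors. A plain object stands in for the DOM so the example runs under Node without a browser.

```javascript
// Sketch: try candidate selectors in order and return the first match.
// `query` is any function that maps a selector string to an element or null.
function firstMatch(selectors, query) {
  for (const selector of selectors) {
    const element = query(selector);
    if (element) return { selector, element };
  }
  return null; // none of the candidates matched
}

// Placeholder selectors, NOT taken from Google Flights or from this project.
const PRICE_SELECTORS = ['[data-test="price"]', ".price", ".fare-amount"];

// Stand-in for a DOM lookup so the sketch is runnable without a browser.
const fakeDom = { ".price": { textContent: "$420" } };
const hit = firstMatch(PRICE_SELECTORS, (sel) => fakeDom[sel] || null);

if (hit) {
  console.log(`matched ${hit.selector}: ${hit.element.textContent}`);
} else {
  console.log("no selector matched; the page layout may have changed");
}
```

Inside a page context, `query` could be `(sel) => document.querySelector(sel)`. Puppeteer's `page.$(selector)` returns a promise, so using it directly would need an `async` variant of the loop that awaits each lookup.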
Contributions are welcome!
# Create a new feature branch
git checkout -b feature/your-feature
# Commit your changes
git commit -m "Add awesome feature"
# Push and create a pull request
git push origin feature/your-feature
This project is licensed under the MIT License.
Made with ❤️ by Faheem
Got questions or suggestions? Open an issue.
⚠️ Disclaimer: Google Flights may change its layout or block automated access. Use this tool responsibly and for educational purposes only.