Automates apartment hunting by scraping Facebook groups, extracting structured data with an LLM, and generating ready-to-send messages to landlords.
Because scrolling Facebook groups for hours is boring.
- Scrape — opens Facebook groups in Chrome, scrolls to the last unseen post, and collects new posts (author, text, timestamp, URL)
- Parse — sends each post to an LLM to extract structured data: price, location, property type, utilities, move-in date, etc.
- Filter — applies your criteria from user_config.yaml (max price, location, property type, etc.)
- Review — manually go through filtered apartments and keep/remove them
- Message — generates ready-to-send messages for each remaining apartment
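The Filter step can be pictured with a small sketch. The `Apartment` and `Criteria` classes and their field names are hypothetical stand-ins, not the project's actual models in models.py, and treating a missing field as "not disqualifying" is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Apartment:
    """Hypothetical subset of the fields the Parse step extracts."""
    price: Optional[int]
    location: Optional[str]
    property_type: Optional[str]


@dataclass
class Criteria:
    """Hypothetical criteria; the real keys live in user_config.yaml."""
    max_price: int
    locations: set[str]
    property_types: set[str]


def matches(apartment: Apartment, criteria: Criteria) -> bool:
    """Return True when every known field satisfies the criteria.

    Assumption: a missing (None) field does not disqualify a post,
    so it still reaches manual review.
    """
    if apartment.price is not None and apartment.price > criteria.max_price:
        return False
    if (apartment.location is not None
            and apartment.location.lower() not in criteria.locations):
        return False
    if (apartment.property_type is not None
            and apartment.property_type.lower() not in criteria.property_types):
        return False
    return True


criteria = Criteria(
    max_price=1200,
    locations={"downtown"},
    property_types={"studio", "flat"},
)
posts = [
    Apartment(price=1000, location="Downtown", property_type="studio"),
    Apartment(price=1500, location="Downtown", property_type="flat"),
    Apartment(price=None, location="downtown", property_type=None),
]
relevant = [p for p in posts if matches(p, criteria)]
print(len(relevant))
```

Letting incomplete posts through keeps the decision with the Review step instead of silently dropping listings the LLM could not fully parse.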
make monitor-apartments
Runs the scraper and parser in a loop every N minutes (configurable in system_config.yaml). Prompts for manual review after each cycle if new relevant apartments are found.
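As a rough sketch of what such a loop might look like (not the actual code in runner.py): `run_cycle` and `review` are hypothetical callables standing in for the scrape-and-parse step and the manual review prompt, and `max_cycles` and the injectable `sleep` exist only to make the example easy to try.

```python
import time
from typing import Callable, Optional


def monitor(
    run_cycle: Callable[[], int],
    review: Callable[[], None],
    interval_minutes: float,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cycles on a fixed interval and return how many were completed.

    run_cycle (hypothetical) scrapes and parses, returning the number of
    new relevant apartments. review (hypothetical) prompts the user.
    max_cycles=None loops until interrupted.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        new_relevant = run_cycle()
        if new_relevant > 0:
            # Only interrupt the user when there is something to look at.
            review()
        cycles += 1
        # Skip the final wait when a cycle limit has been reached.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_minutes * 60)
    return cycles


# Dry run with fake results and a recorded sleep instead of a real wait.
results = iter([0, 2, 0])
reviews = []
naps = []
done = monitor(
    run_cycle=lambda: next(results),
    review=lambda: reviews.append("review"),
    interval_minutes=15,
    max_cycles=3,
    sleep=naps.append,
)
print(done, reviews, naps)
```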
Open demo.ipynb and run cells one by one:
- Login — opens Chrome, auto-fills credentials if provided, waits for CAPTCHA
- Scrape — collects new posts from all configured groups
- Parse — extracts structured data and filters by criteria
- Manual operations — review, add, or generate messages
cp .env.example .env
cp user_config.example.yaml user_config.yaml
- Add FB credentials + Groq API key (or use local Ollama) in .env
- Edit user_config.yaml with your groups, message template and criteria
- Optionally edit system_config.yaml to change LLM, scrape interval, and timing
make monitor-apartments # run scraper + parser in a loop
make filter-raw # reprocess raw scraped files
make add-apartments # manually add apartments
make review-apartments # review and remove apartments
make generate-messages # generate messages for all relevant apartments
fb-rental-scanner/
├── demo.ipynb # step-by-step Jupyter walkthrough
├── user_config.yaml # groups, criteria, message template
├── system_config.yaml # LLM, scraper, and runner settings
├── .env # secrets
├── scr/
│ ├── runner.py # main loop: scrape → parse → review
│ ├── scraper.py # selenium scraping logic
│ ├── parser.py # LLM extraction and filtering
│ ├── models.py # pydantic models
│ ├── setup.py # config loading and logging setup
│ └── manual_operations.py # manual review, add, and message generation
├── data/
│ ├── raw_data/ # raw scraped CSVs (auto-deleted after parsing)
│ ├── unsorted_apartments.csv # all extracted apartments before filtering
│ ├── relevant_apartments.csv # filtered apartments
│ └── messages_<timestamp>.txt # generated messages ready to send
└── makefile
user_config.yaml — things you change often:
- facebook_groups — list of group URLs to scrape
- criteria — price range, property type, location, etc.
- message_template — message sent to landlords
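To illustrate how a message template could be filled from an extracted apartment, here is a minimal sketch using Python's `string.Template`. The placeholder names (`$author`, `$property_type`, `$location`, `$price`) and the `render_message` helper are assumptions for illustration; the project's real template syntax may differ.

```python
from string import Template

# Hypothetical template text; the real one is set in user_config.yaml.
message_template = Template(
    "Hi $author, is the $property_type in $location "
    "still available for $price?"
)


def render_message(template: Template, apartment: dict) -> str:
    """Fill the template from an apartment record.

    substitute() raises KeyError when a placeholder has no value,
    which surfaces incomplete records instead of sending a broken message.
    """
    return template.substitute(apartment)


apartment = {
    "author": "Dana",
    "property_type": "studio",
    "location": "Downtown",
    "price": 1000,
}
message = render_message(message_template, apartment)
print(message)
```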
system_config.yaml — infrastructure settings:
- llm_config — model type (groq or local), model name, prompt
- scraper_config — cooldown, scroll speed, CSS/XPath selectors
- runner_config — scrape interval in minutes
Supports any model via LangChain. Tested with:
- Groq (meta-llama/llama-4-scout-17b-16e-instruct) — fast, free tier available
- Ollama (llama3) — fully local, no API key needed
Change model_type in system_config.yaml to switch.
- Facebook requires manual CAPTCHA solving on first login
- Images and videos are blocked after login to speed up scraping
- last_visited per group is saved automatically in user_config.yaml after each run
- LLM uses structured output — null values and type coercion are handled automatically
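The role of last_visited can be sketched as a simple cutoff. This is an illustration under stated assumptions, not the scraper's actual logic: posts are assumed to arrive newest-first, each with a `timestamp` value, and `collect_new_posts` is a hypothetical helper.

```python
from datetime import datetime
from typing import Iterable, Optional


def collect_new_posts(posts: Iterable[dict], last_visited: Optional[datetime]):
    """Return (new_posts, updated_marker).

    Assumes posts are ordered newest-first. Collection stops at the first
    post that is not newer than last_visited; with no marker yet, every
    post counts as new.
    """
    new_posts = []
    for post in posts:
        if last_visited is not None and post["timestamp"] <= last_visited:
            break
        new_posts.append(post)
    # Keep the old marker when nothing new was found.
    newest = max((p["timestamp"] for p in new_posts), default=last_visited)
    return new_posts, newest


feed = [
    {"text": "c", "timestamp": datetime(2024, 5, 3)},
    {"text": "b", "timestamp": datetime(2024, 5, 2)},
    {"text": "a", "timestamp": datetime(2024, 5, 1)},
]
new, marker = collect_new_posts(feed, datetime(2024, 5, 1))
print([p["text"] for p in new], marker)

# A second run with the saved marker finds nothing new.
again, marker_again = collect_new_posts(feed, marker)
print(again, marker_again)
```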
This project was built for personal and educational purposes only. Scraping Facebook may violate their Terms of Service. The author is not responsible for any misuse or consequences of using this tool. Use at your own risk.