A collection of Python scrapers for monitoring government procurement and research opportunities from various federal agencies. All scrapers integrate with Monday.com for tracking and Slack for notifications.
- 20+ scrapers covering SAM.gov, SBIR/STTR, OTA consortia, DARPA, DIU, DHS, and more
- BaseScraper pattern: New scrapers inherit from BaseScraper for consistent Monday.com/Slack integration and deduplication
- Slack Integration: Automatic notifications when new opportunities are found
- Monday.com Integration: Auto-create items in Monday.com boards (opportunities board + event dashboard)
- Environment-based Configuration: Secure API key management via .env
- Selenium support: Headless browser scraping for JS-heavy and login-required sites
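
To make the BaseScraper pattern concrete, here is a minimal sketch. This README does not show the real BaseScraper interface, so the class below, its `fetch` and `run` methods, and the `seen_urls` set are all assumptions made for illustration. A plain set stands in for whatever store the project actually checks for duplicates.

```python
from abc import ABC, abstractmethod


class BaseScraper(ABC):
    """Hypothetical stand-in for the repository's BaseScraper (interface assumed)."""

    source_name = "unknown"

    def __init__(self, seen_urls=None):
        # URLs already tracked; a plain set is used here purely for illustration.
        self.seen_urls = set(seen_urls or [])

    @abstractmethod
    def fetch(self):
        """Return a list of opportunity dicts from the source."""

    def run(self):
        """Fetch opportunities and return only those not seen before."""
        new_items = []
        for item in self.fetch():
            if item["url"] in self.seen_urls:
                continue
            self.seen_urls.add(item["url"])
            new_items.append(item)
        return new_items


class ExampleScraper(BaseScraper):
    """Hypothetical subclass: only fetch() is source-specific."""

    source_name = "Example Agency"

    def fetch(self):
        # A real scraper would call an API or parse HTML here.
        return [
            {"title": "Topic A", "url": "https://example.com/a"},
            {"title": "Topic B", "url": "https://example.com/b"},
        ]


new_items = ExampleScraper(seen_urls=["https://example.com/a"]).run()
```

The point of the pattern is that a subclass supplies only the source-specific fetch logic, while deduplication and any downstream posting live in one shared place.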
These scrapers inherit from BaseScraper and post to the main opportunities Monday.com board.
| Scraper | Source | Method | Verified | Notes |
|---|---|---|---|---|
| dod_sbirsttr_scraper.py | DoD SBIR/STTR | REST API | Yes | Public API, fetches open/pre-release topics |
| darpa_scraper.py | DARPA | RSS feed | Yes | Parses RSS, extracts deadlines from descriptions |
| erdcwerx_scraper.py | ERDCWERX | WordPress API + HTML | Yes | WP REST API for listing, HTML scraping for deadlines |
| diu_scraper.py | DIU | HTML (Nuxt SSR) | Yes | Server-rendered, no JS needed |
| grantsgov_scraper.py | Grants.gov | REST API | Yes | Filters by for-profit/small-biz eligibility codes |
| colosseum_scraper.py | Colosseum (ONI) | HTML | Yes | Public homepage, no login needed |
| challenge_gov_scraper.py | USA.gov Challenges | HTML | Yes | Detail page enrichment with deadlines, prizes, agencies |
| dhs_sbir_scraper.py | DHS SBIR | Selenium | No | Cloudflare-protected; falls back to sbir.gov |
| tradewind_scraper.py | Tradewind AI | Selenium | No | Wix site, CSS selectors need live validation |
| vulcan_sof_scraper.py | Vulcan SOF | Selenium (visible) | No | Requires login + 2FA; runs non-headless for manual 2FA entry |
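
As an illustration of the "parses RSS, extracts deadlines from descriptions" approach noted for the DARPA scraper, the following is a minimal sketch using only the standard library. The deadline phrasing matched by `DEADLINE_RE`, the sample feed, and the function name `parse_feed` are assumptions; the actual feed wording and the scraper's real logic may differ.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumed phrasing, e.g. "Response Date: March 5, 2025" in the item description.
DEADLINE_RE = re.compile(
    r"(?:Response Date|Deadline|Due Date)\s*:\s*([A-Z][a-z]+ \d{1,2}, \d{4})"
)


def parse_feed(xml_text):
    """Return one opportunity dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        description = item.findtext("description", default="")
        match = DEADLINE_RE.search(description)
        deadline = None
        if match:
            # Normalize to YYYY-MM-DD.
            parsed_date = datetime.strptime(match.group(1), "%B %d, %Y")
            deadline = parsed_date.strftime("%Y-%m-%d")
        results.append({
            "title": item.findtext("title", default="").strip(),
            "url": item.findtext("link", default="").strip(),
            "deadline": deadline,
        })
    return results


# Made-up sample feed for demonstration only.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Example BAA</title><link>https://example.com/baa</link>
<description>Abstracts welcome. Response Date: March 5, 2025</description></item>
<item><title>No Date Notice</title><link>https://example.com/n</link>
<description>Details to follow.</description></item>
</channel></rss>"""

parsed = parse_feed(SAMPLE)
```

Items without a recognizable date keep `deadline` as `None` rather than a guessed value, so they can be reviewed by hand.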
These scrapers predate BaseScraper and have their own Monday.com/Slack integration.
| Scraper | Source | Description |
|---|---|---|
| custom_samgov_search.py | SAM.gov | Custom SAM.gov search template |
| small_biz_samgov_search.py | SAM.gov | NAICS 541715 small business set-aside opportunities |
| industry_day_scraper.py | SAM.gov | Industry Day events; posts to the Event Dashboard board |
| Scraper | Source | Description |
|---|---|---|
| sda_scraper.py | SDA | Space Development Agency opportunities |
| cfic/ | CFIC / ARCYBER | CyberFIC collaboration events, webinars, and assessments |
Note: Not all scrapers have been verified end-to-end with live Monday.com/Slack integration. The "Verified" column above indicates whether the scraper has been tested against the live source and confirmed to fetch/parse data correctly. Selenium-based scrapers in particular need live validation of CSS selectors, which may change when sites update their layouts.
- Clone this repository:

      git clone https://github.com/jhilly20/GovCon.git
      cd GovCon

- Install dependencies:

      pip install -r requirements.txt

- Set up environment variables:

      cp .env.example .env
      # Edit .env with your API keys and configuration

Create a .env file based on .env.example:

    # Monday.com (optional - for tracking opportunities)
    MONDAY_API_KEY=your_monday_api_key_here
    MONDAY_BOARD_ID=your_board_id_here
    MONDAY_EVENT_BOARD_ID=your_event_board_id_here # Event Dashboard board (industry days)
    # Slack (optional - for notifications)
    SLACK_BOT_TOKEN=xoxb-your-slack-bot-token-here
    SLACK_CHANNEL=your_slack_channel_id_here
    # SAM.gov (optional - search works without it)
    SAM_API_KEY=your_sam_api_key_here
    # Vulcan SOF (required for vulcan_sof_scraper only)
    VULCAN_SOF_EMAIL=your_email_here
    VULCAN_SOF_PASSWORD=your_password_here
    # Colosseum credentials (if login required)
    COLOSSEUM_EMAIL=your_email_here
    COLOSSEUM_PASSWORD=your_password_here

- Monday.com: Visit https://developer.monday.com/api/docs/authentication
- Slack: Create a bot app at https://api.slack.com/apps
- SAM.gov: Request an API key at https://sam.gov/api/key-request
- Grants.gov: No key needed (public search2 API)
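
Because most of these keys are optional, a scraper needs to tell which integrations are actually configured. The sketch below shows one way this might be done with the variable names from .env.example. The `Settings` class and `load_settings` function are hypothetical, and the sketch assumes the variables are already in the process environment (for example, loaded from .env by a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    """Hypothetical container for the optional integration keys."""

    monday_api_key: Optional[str]
    monday_board_id: Optional[str]
    slack_bot_token: Optional[str]
    slack_channel: Optional[str]
    sam_api_key: Optional[str]

    @property
    def monday_enabled(self) -> bool:
        return bool(self.monday_api_key and self.monday_board_id)

    @property
    def slack_enabled(self) -> bool:
        return bool(self.slack_bot_token and self.slack_channel)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the optional keys; blank values are treated as unset."""

    def get(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None

    return Settings(
        monday_api_key=get("MONDAY_API_KEY"),
        monday_board_id=get("MONDAY_BOARD_ID"),
        slack_bot_token=get("SLACK_BOT_TOKEN"),
        slack_channel=get("SLACK_CHANNEL"),
        sam_api_key=get("SAM_API_KEY"),
    )


# Example with only Slack configured.
settings = load_settings({"SLACK_BOT_TOKEN": "xoxb-test", "SLACK_CHANNEL": "C123"})
```

Passing the environment in as a mapping keeps the function easy to exercise without touching real credentials.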
Run each scraper from the repository root:

    # SAM.gov scrapers
    python scrapers/custom_samgov_search.py  # Custom SAM.gov search
    python scrapers/small_biz_samgov_search.py  # Small business set-asides
    python scrapers/industry_day_scraper.py  # Industry Day events (Event Dashboard)
    # BaseScraper-based opportunity scrapers
    python scrapers/dod_sbirsttr_scraper.py  # DoD SBIR/STTR topics
    python scrapers/darpa_scraper.py  # DARPA opportunities (RSS)
    python scrapers/erdcwerx_scraper.py  # ERDCWERX tech challenges
    python scrapers/diu_scraper.py  # DIU open solicitations
    python scrapers/grantsgov_scraper.py  # Grants.gov (for-profit eligible)
    python scrapers/colosseum_scraper.py  # Colosseum / ONI challenges
    python scrapers/challenge_gov_scraper.py  # USA.gov challenge competitions
    # Selenium-based scrapers (require browser)
    python scrapers/dhs_sbir_scraper.py  # DHS SBIR (Cloudflare)
    python scrapers/tradewind_scraper.py  # Tradewind AI (Wix)
    python scrapers/vulcan_sof_scraper.py  # Vulcan SOF (login + 2FA)
    # Other
    python scrapers/sda_scraper.py  # Space Development Agency
    python -m scrapers.cfic  # CyberFIC events

The custom_samgov_search.py file shows how to create targeted searches. Key parameters to modify:

    # Example search parameters
    params = {
        "q": "your search terms here",
        "naics": "541715",         # NAICS code for your industry
        "set_aside": "SBP,SBA",    # Small business set-asides
        "notice_type": "p"         # Presolicitations only
    }

The small_biz_samgov_search.py scraper is specifically configured for:
- NAICS 541715: Research and Development in the Physical, Engineering, and Life Sciences (except Nanotechnology and Biotechnology)
- Set-asides: Partial Small Business (SBP) and Total Small Business (SBA) set-asides
- Custom Slack labeling: "small biz setaside 541715 sam.gov"
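
To show how the search parameters above might be varied for a new targeted search, here is a minimal sketch that merges overrides into a set of defaults and encodes them as a query string. The endpoint in `BASE_URL` is a placeholder and `build_search_url` is a hypothetical helper; this README does not show the real endpoint or how the scrapers send their parameters.

```python
from urllib.parse import urlencode

# Placeholder endpoint, not the real SAM.gov URL.
BASE_URL = "https://example.invalid/search"

# Defaults mirroring the small business configuration described above.
DEFAULT_PARAMS = {
    "q": "",
    "naics": "541715",
    "set_aside": "SBP,SBA",
    "notice_type": "p",
}


def build_search_url(overrides=None, base_url=BASE_URL):
    """Merge overrides into the defaults, drop empty values, and URL-encode."""
    params = {**DEFAULT_PARAMS, **(overrides or {})}
    params = {key: value for key, value in params.items() if value}
    return f"{base_url}?{urlencode(params)}"


# Example: search special notices mentioning "autonomy".
url = build_search_url({"q": "autonomy", "notice_type": "s"})
```

Dropping empty values keeps an unset search term from being sent as a blank filter.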
The industry_day_scraper.py searches for:
- Industry Day events: Government-hosted industry days and conferences
- Search term: "Industry Day" on SAM.gov
- Notice type: Special notices (type "s")
- Event Dashboard board: Posts to MONDAY_EVENT_BOARD_ID (separate from the opportunities board)
- Deduplication: By solicitation number to prevent recreating existing items
- Detail enrichment: Fetches v2 detail endpoint for authoritative links and topic numbers
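
The deduplication step listed above can be sketched as follows. The field name `solicitation_number`, the function `filter_new_notices`, and the choice to keep notices that lack a number are assumptions for illustration, not the scraper's confirmed behavior.

```python
def filter_new_notices(notices, existing_numbers):
    """Return notices whose solicitation number is not already tracked."""
    # Normalize case and whitespace so trivially different spellings match.
    seen = {number.strip().upper() for number in existing_numbers}
    fresh = []
    for notice in notices:
        number = (notice.get("solicitation_number") or "").strip().upper()
        if not number:
            # Nothing to match on; keep the notice so it is not silently lost.
            fresh.append(notice)
            continue
        if number in seen:
            continue
        seen.add(number)
        fresh.append(notice)
    return fresh


# Made-up notices for demonstration only.
notices = [
    {"title": "Industry Day A", "solicitation_number": "FA8750-25-S-0001"},
    {"title": "Industry Day A (repost)", "solicitation_number": "fa8750-25-s-0001 "},
    {"title": "Industry Day B", "solicitation_number": "W911NF-25-R-0002"},
    {"title": "Industry Day C", "solicitation_number": None},
]
fresh = filter_new_notices(notices, existing_numbers=["W911NF-25-R-0002"])
```

In practice, `existing_numbers` would come from the items already on the Event Dashboard board.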
The cfic/ package scrapes upcoming events from CyberFIC.org:
- Collaboration Events (CE): In-person events with purpose/synopsis, RSVP deadlines, PDF releases
- Assessment Events (AE): Targeted problem events with desirements
- Connector Series Webinars: Virtual speaker series with key takeaways
- Q & A Sessions: Pre-submission Q&A with ARCYBER stakeholders
- Automatically follows detail page links to collect full event information
- Syncs to Monday.com and sends Slack notifications for new events
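
As a rough illustration of the Slack notification step, the sketch below builds a message payload with the `channel` and `text` fields that Slack's `chat.postMessage` method accepts. The helper name `build_slack_payload`, the message wording, and the event fields are assumptions; the package's real message format is not shown in this README.

```python
def build_slack_payload(channel, event, label="CyberFIC"):
    """Build a chat.postMessage-style payload for a newly found event."""
    # Slack mrkdwn: *bold* text and <url|label> links.
    lines = [f"*New {label} event:* {event['title']}"]
    if event.get("deadline"):
        lines.append(f"RSVP deadline: {event['deadline']}")
    if event.get("url"):
        lines.append(f"<{event['url']}|Event details>")
    return {"channel": channel, "text": "\n".join(lines)}


# Made-up event for demonstration only.
payload = build_slack_payload(
    "C0123456789",
    {
        "title": "Collaboration Event 12",
        "deadline": "2025-06-01",
        "url": "https://example.com/ce12",
    },
)
```

Building the payload separately from sending it makes the message format easy to check without a Slack token.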
If you want to use Monday.com, update the column mappings in each scraper's config section:

    # Monday.com column mappings
    TITLE_COLUMN = "your_title_column_id"
    DESCRIPTION_COLUMN = "your_description_column_id"
    URL_COLUMN = "your_url_column_id"
    DEADLINE_COLUMN = "your_deadline_column_id"
    AGENCY_COLUMN = "your_agency_column_id"

Each scraper returns a list of opportunity dictionaries with the following structure:

    {
        "title": "Opportunity Title",
        "description": "Full description",
        "url": "Direct link to opportunity",
        "deadline": "YYYY-MM-DD",
        "agency": "Agency Name",
        "posted_date": "YYYY-MM-DD"
    }

- Fork the repository
- Create a feature branch: git checkout -b feature/new-scraper
- Make your changes
- Add tests if applicable
- Submit a pull request
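
For contributors adding a scraper, the column mappings and opportunity dictionary described earlier could be combined roughly as follows. This sketch assumes the description and agency columns are text columns, the URL column is a link column, and the deadline column is a date column; your board may use different column types. `to_column_values` is a hypothetical helper, not a function from this repository.

```python
import json

# Placeholder column IDs; replace with the IDs from your own board.
DESCRIPTION_COLUMN = "your_description_column_id"
URL_COLUMN = "your_url_column_id"
DEADLINE_COLUMN = "your_deadline_column_id"
AGENCY_COLUMN = "your_agency_column_id"


def to_column_values(opportunity):
    """Serialize an opportunity dict into a Monday.com column_values JSON string."""
    values = {
        # Text columns take plain strings (assumed column type).
        DESCRIPTION_COLUMN: opportunity.get("description", ""),
        AGENCY_COLUMN: opportunity.get("agency", ""),
    }
    if opportunity.get("url"):
        # Link columns take a url/text pair (assumed column type).
        values[URL_COLUMN] = {
            "url": opportunity["url"],
            "text": opportunity.get("title", ""),
        }
    if opportunity.get("deadline"):
        # Date columns take {"date": "YYYY-MM-DD"} (assumed column type).
        values[DEADLINE_COLUMN] = {"date": opportunity["deadline"]}
    return json.dumps(values)


# Made-up opportunity for demonstration only.
column_values = to_column_values({
    "title": "Example Topic",
    "description": "Full description",
    "url": "https://example.com/topic",
    "deadline": "2025-03-05",
    "agency": "Example Agency",
})
```

The title is left out of the column values because Monday.com items carry their name separately from their columns.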
The following scrapers and integrations are planned for future development:
| Source | Status | Notes |
|---|---|---|
| MITRE AiDA OTA Consortia | In progress | Per-consortium opportunity parsing; prepend consortium name to titles |
| CyberFIC | Done | Already implemented in cfic/ |
| ICWERX | Planned | |
| NASA SBIR/STTR | Planned | |
| DOE SBIR | Planned | |
| ConnectWERX | Planned | |
| EnergyWERX | Planned | |
| HSWERX | Planned | |
| Source | Status | Notes |
|---|---|---|
| NAM Consortium | Planned | |
| TechConnect | Planned | WordPress REST API available |
| ARL DEVCOM | Planned | |
| NSPIRES (NASA) | Planned | |
| ARPA-E | Planned | |
| EERE Exchange | Planned | |
| DOE PAMS | Planned | |
| DHS Forecast | Planned | |
| Volpe DOT SBIR | Planned | Cloudflare-protected |
| ARPA-I | Planned | |
| NIST SBIR | Planned | |
| NOAA SBIR | Planned | |
Future integration with event calendars for automatic syncing:
- Google Calendar (DEF.org events, imported calendars)
- CTO Innovation events
- NCSI calendar
- These scrapers are provided for educational and research purposes
- Always respect website terms of service and rate limits
- Some sites may require additional authentication or have anti-scraping measures
- Consider adding delays between requests to avoid overwhelming servers
- Not all scrapers have been verified end-to-end yet. The API-, RSS-, and HTML-based scrapers (DoD SBIR, DARPA, ERDCWERX, DIU, Grants.gov, Colosseum, Challenge.gov) have been tested against live sources. Selenium-based scrapers (DHS SBIR, Tradewind, Vulcan SOF) still need live validation of CSS selectors.
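
The advice above about adding delays between requests can be sketched as a small helper. The function name `fetch_all`, the default one-second delay, and the injectable `sleep` parameter are assumptions for illustration; an appropriate delay depends on the site.

```python
import time


def fetch_all(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


# Demonstration with a fake fetcher and a recorded (not real) sleep.
pauses = []
pages = fetch_all(
    ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
    fetch=lambda url: f"fetched {url}",
    delay_seconds=2.0,
    sleep=pauses.append,
)
```

In a real scraper, `fetch` would be something like a function wrapping `requests.get`, and `sleep` would be left at its default.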
This software is not affiliated with any government agency. Users are responsible for ensuring compliance with all applicable laws and terms of service.
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with Requests, BeautifulSoup, and Selenium
- Inspired by the need to streamline opportunity discovery for researchers and small businesses