Scraper Data Examples

This is a guide for getting started with Vergil, Directory, and Handshake data. Links to the repos for each scraper can be found below.

Setup

$ git clone git@github.com:NewsroomDevelopment/scraper-examples.git
$ cd scraper-examples

Create a .env file with the contents below. (See this Google Doc for the MongoDB user credentials.) Make sure .env is always listed in your .gitignore file.

# MongoDB credentials

MDB_USERNAME=USERNAME
MDB_PASSWORD=PASSWORD
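
As a rough sketch of how these two variables might be used from a notebook: pipenv loads .env into the environment when you run pipenv shell, so the credentials can be read from os.environ and turned into a MongoDB connection string. The helper name build_mongo_uri, the mongodb+srv scheme, and the host cluster.example.net are assumptions for illustration only; use the actual host from the credentials doc.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(host, env=os.environ):
    """Build a MongoDB URI from MDB_USERNAME and MDB_PASSWORD.

    `host` is a placeholder argument; the real cluster host is not
    given in this README. The mongodb+srv scheme is also an assumption.
    """
    # Fail early with a clear message if .env was not loaded.
    missing = [k for k in ("MDB_USERNAME", "MDB_PASSWORD") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing in environment: " + ", ".join(missing))

    # Percent-encode the credentials so characters like '@' or ':'
    # do not break the URI.
    user = quote_plus(env["MDB_USERNAME"])
    password = quote_plus(env["MDB_PASSWORD"])
    return f"mongodb+srv://{user}:{password}@{host}"
```

Inside the pipenv shell, calling build_mongo_uri("cluster.example.net") would return a string that can be passed to a MongoDB client. Avoid printing the result, since it contains the password.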

Follow this tutorial to set up the AWS credentials needed for some of the scrapers.
If you're using Python: run pipenv install to install the necessary packages, then run pipenv shell to launch the virtual environment and get access to those packages. If you do not have pipenv, run pip install pipenv. If you do not have pip, install it first; it is bundled with current Python releases.
In the shell, run python -m ipykernel install --user --name=scraper-kernel
Open up Jupyter Notebook and change the kernel by going to Kernel -> Change kernel -> scraper-kernel.
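
To confirm the notebook is actually running on scraper-kernel, a quick check like the one below can be pasted into the first cell. It is only a sketch: the function name in_virtualenv is made up for this example, and it assumes the pipenv environment is a standard virtual environment, where sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment
    # while sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


# The interpreter path should point into the pipenv environment,
# not the system Python.
print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints False, the notebook is likely still on the default kernel.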

Open Example Scraper Data.ipynb in Jupyter Notebook and test out the scrapers!
