Scraper Data Examples

This is a guide for getting started with Vergil, Directory, and Handshake data. Links to the scrapers' repos can be found below.

Setup

  1. Clone the repository and move into it:
$ git clone git@github.com:NewsroomDevelopment/scraper-examples.git
$ cd scraper-examples
  2. Create a .env file with the contents below. (See this Google Doc for the MongoDB user credentials.) Make sure .env is always listed in your .gitignore file.
# MongoDB credentials

MDB_USERNAME=USERNAME
MDB_PASSWORD=PASSWORD
  3. Follow this tutorial to set up the AWS credentials needed for some of the scrapers.

  4. If you're using Python: Run pipenv install to install the necessary packages, then run pipenv shell to launch the virtual environment and get access to those packages. If you do not have pipenv, run pip install pipenv. If you do not have pip, install it first.

  5. In the shell, run python -m ipykernel install --user --name=scraper-kernel

  6. Open Jupyter Notebook and change the kernel by going to Kernel -> Change kernel -> scraper-kernel.
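Once the .env file exists, a notebook needs to read the credentials from it. The sketch below is one possible way to do that using only the Python standard library. It is not taken from the scrapers themselves: the helper names `load_env` and `build_mongo_uri`, the `mongodb+srv` scheme, and the host `cluster.example.mongodb.net` are all assumptions for illustration, so substitute whatever connection details the Google Doc gives you.

```python
from pathlib import Path
from urllib.parse import quote_plus


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file, skipping blanks and # comments.

    Hypothetical helper; the scrapers may load credentials differently.
    """
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def build_mongo_uri(username, password, host):
    """Build a connection URI, percent-encoding the credentials.

    Assumes a mongodb+srv style URI; the real scheme and host are not
    stated in this README.
    """
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}/"


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        env = load_env(env_path)
        # HYPOTHETICAL host: replace with the real cluster address.
        uri = build_mongo_uri(
            env["MDB_USERNAME"], env["MDB_PASSWORD"], "cluster.example.mongodb.net"
        )
        # Avoid printing the URI itself, since it contains the password.
        print(f"Loaded credentials for user {env['MDB_USERNAME']}")
    else:
        print("No .env file found in the current directory.")
```

Percent-encoding the username and password matters when either contains characters such as `@` or `:`, which would otherwise break the URI.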

Usage

Open Jupyter Notebook and test out the scrapers!
