This repository was archived by the owner on Aug 1, 2023. It is now read-only.

VIVYNet/IMSLPScraper

IMSLPScraper

This is a Python script that scrapes information about every song on imslp.org and stores all information about each song in a specified MongoDB server. Beautiful Soup and the lxml parser are used for HTML parsing. Multithreading is used to speed up the scraping process.
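The multithreaded fan-out described above might look roughly like the sketch below. It is not the code in main.py: `scrape_all`, `parse_page`, and the `fetch` callable are hypothetical names, and the standard library `HTMLParser` stands in for Beautiful Soup and lxml so the sketch runs without extra dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html):
    """Hypothetical parse step: turn one page of HTML into a record."""
    parser = TitleParser()
    parser.feed(html)
    return {"title": parser.title.strip()}


def scrape_all(urls, fetch, max_workers=8):
    """Fetch and parse every URL on a pool of worker threads.

    `fetch` is any callable that maps a URL to an HTML string
    (an assumption; the real script's download code is not shown).
    Results come back in the same order as `urls`.
    """
    def work(url):
        record = parse_page(fetch(url))
        record["url"] = url
        return record

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(work, urls))
```

Threads suit this workload because most of the time is spent waiting on network responses rather than on parsing. Each returned record is a plain dictionary, which is the shape a MongoDB insert would expect.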

main.py - Python script that performs the imslp.org web scraping. The command to run it is listed below.

py main.py

login.json - Holds the MongoDB connection settings. A remote MongoDB server can be used by editing this file.

requirements.txt - Text file listing the libraries needed to run the script.
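As a rough illustration of how a file like login.json could be turned into a MongoDB connection string, here is a minimal sketch. The key names (`username`, `password`, `host`, `port`) and both function names are assumptions made for the example; the actual schema of login.json is not documented here.

```python
import json
from urllib.parse import quote_plus


def build_mongo_uri(config):
    """Build a mongodb:// URI from a dict of connection settings.

    The keys used here are hypothetical, not the real login.json schema.
    Credentials are percent-encoded so characters such as '@' or ':'
    do not break the URI.
    """
    user = quote_plus(config["username"])
    password = quote_plus(config["password"])
    host = config["host"]
    port = config.get("port", 27017)  # 27017 is MongoDB's default port
    return f"mongodb://{user}:{password}@{host}:{port}/"


def load_mongo_uri(path="login.json"):
    """Read the JSON settings file and return the connection URI."""
    with open(path, encoding="utf-8") as f:
        return build_mongo_uri(json.load(f))
```

The resulting string can be passed to a MongoDB client such as `pymongo.MongoClient`, assuming pymongo is among the installed requirements.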

Questions or Issues?

If you have any questions, please let me know! You can open an issue under the Issues tab, or you can email me at b10@asu.edu* or BenHerrera1044@outlook.com. If you send an email, expect a response within 24 hours.

* I cannot access emails sent to b10@asu.edu after May 2026

About

Scrapes imslp.org for documents listed on the website. Utilizes bs4 and lxml for html parsing, and MongoDB for storage of metadata from scraped webpage data. Multithreading is used for quicker operations.
