This script scrapes the .docx files of parliament speeches from the
official website of the Lithuanian Seimas.
I used it for my master's thesis.
Since this was an annoying step to do, the code is publicly available here to make the work easier to reproduce.
The code is written quite poorly, but it did the job. Any suggestions are welcome. Feel free to use it however you like.
To set up an environment:
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
To run:
python main.py