LEGAL STUDIES 190, FALL 2022
Natural Language Processing & Law
This course examines natural language processing tools and techniques and how they can be applied to legal text data.
Module      Summary                                                                            Berkeley Datahub Link
Notebook 1  Introduction to Anaconda
Notebook 2  Introduction to Python, Jupyter Notebooks, Pandas, and Visualizations
Notebook 3  Introduction to case.law API
Notebook 4  N-Grams, Preprocessing, Tokenization, and non-Machine Learning Approaches to Text
Notebook 5  Supervised Machine Learning - Text Classification
Notebook 6  Unsupervised Machine Learning - Topic Modeling and Clustering
Notebook 7  Word Embeddings - Word2Vec and Doc2Vec
Notebook 8  Contextualized Word Embeddings - NLP with Transformers
Developer Team Lead: Arushi Sharma
Developer Team: Eddie Guo, Ukiah Heasley, Charlie Cheng-Jie Ji, Parth Shisode
These notebooks borrow code from multiple sources:
Spring 2022 UC Berkeley, Legal Studies 123 "Law, Data and Prediction", instructed by Jonathan Marshall - https://github.com/ds-modules/Legalst-123
Spring 2022 ETH Course, "Natural Language Processing for Law and Social Science", instructed by Elliott Ash - https://github.com/elliottash/nlp_lss_2022
case.law API example notebooks - https://github.com/harvard-lil/cap-examples
Jupyter notebooks for the "Natural Language Processing with Transformers" book (2022) - https://github.com/nlp-with-transformers/notebooks