Introduction to natural language processing. The course presents methods for representing human language at different levels, moving from simple bag-of-words representations, through sequential models, to syntactic and semantic structures. At each level, the course develops several machine learning models and algorithms for solving natural language processing tasks such as text classification, sequence labeling, syntactic parsing, and semantic parsing. The course discusses models such as Naïve Bayes, logistic regression, hidden Markov models, and various deep neural networks, along with appropriate learning algorithms for estimating their parameters. The course emphasizes the implementation of both classical and modern natural language processing models.
By the end of this course, the student will be able to:
- Model the standard levels of linguistic structure using formal grammars or statistical and computational models.
- Identify and carry out proper experimental methodology for training and evaluating natural language processing systems.
- Manipulate probabilities and estimate parameters of structured models using supervised training methods.
- Implement simple models of language, and employ and adapt them to solve natural language processing problems.