Natural Language Processing
The course covers the full arc of NLP, from classical word representation methods to modern large language models, with an emphasis on mathematical foundations and hands-on implementation.
The lecture slides and notebooks are available on GitHub, and the course is structured as follows:
| # | Topic | Key Concepts |
|---|---|---|
| 1 | Word Embeddings | Distributional hypothesis, co-occurrence matrices, TF-IDF, PMI, LSA, LDA, Word2Vec, GloVe |
| 2 | Embedding exercises | Practical problems on vector spaces and similarity measures |
| 3 | LSA & Word2Vec notebook | SVD-based dimensionality reduction, skip-gram training, analogy tasks |
| 4 | Transformers | Self-attention, positional encoding, encoder/decoder architectures, BERT, GPT, fine-tuning (SFT, RLHF) |
| 5 | Transformer exercises | Conceptual questions on attention, masking, and training objectives |
| 6 | GPT from scratch notebook | Character-level GPT implementation in PyTorch following Karpathy’s nanoGPT |
| 7 | Transformers in Practice | Downstream tasks: text classification, QA, seq2seq, generation; fine-tuning regimes |
| 8 | BERT exercises | Fine-tuning BERT for sentiment analysis; feature extraction vs. full fine-tuning |
| 9 | BERT sentiment notebook | Fine-tune BERT on SST-2; frozen vs. fine-tuned embeddings, error analysis, attention visualisation |
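To give a flavour of the first topic, here is a minimal sketch of building positive pointwise mutual information (PPMI) vectors from a word–word co-occurrence matrix. The toy vocabulary and counts are invented for illustration; they are not taken from the course materials.

```python
import numpy as np

# Hypothetical toy vocabulary and symmetric co-occurrence counts.
vocab = ["cat", "dog", "pet", "car"]
C = np.array([
    [0, 4, 6, 1],
    [4, 0, 5, 1],
    [6, 5, 0, 0],
    [1, 1, 0, 0],
], dtype=float)

def ppmi(C):
    """PPMI(i, j) = max(0, log p(i, j) / (p(i) * p(j))), with 0 for unseen pairs."""
    total = C.sum()
    p_ij = C / total                                # joint probabilities
    p_i = C.sum(axis=1, keepdims=True) / total      # row marginals
    p_j = C.sum(axis=0, keepdims=True) / total      # column marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / (p_i * p_j))
    pmi[~np.isfinite(pmi)] = 0.0                    # zero-count pairs get 0
    return np.maximum(pmi, 0.0)

M = ppmi(C)
```

Each row of `M` is then a sparse embedding for one word; in the course's arc, LSA applies a truncated SVD to exactly this kind of matrix to obtain dense vectors.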
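The transformer topics centre on scaled dot-product self-attention with optional causal masking. A single-head NumPy sketch, with invented toy dimensions (not from the course notebooks), looks like this:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv, causal=False):
    """Single-head scaled dot-product self-attention over a sequence X.

    X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of each query to each key
    if causal:
        # Forbid attending to future positions (as in GPT-style decoders).
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, w = self_attention(X, Wq, Wk, Wv, causal=True)
```

With `causal=True` the attention matrix is lower-triangular, which is the masking discussed in the transformer exercises and used by the character-level GPT notebook.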
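Positional encoding, also covered in the transformer lectures, can be sketched with the sinusoidal scheme from the original transformer paper; the sequence length and model dimension below are arbitrary examples.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # even dimension indices
    angles = pos / (10000 ** (i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(seq_len=16, d_model=32)
```

Because each dimension oscillates at a different frequency, every position receives a distinct pattern that the model can use to recover token order.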