TTIC101: Statistical Methods for Artificial Intelligence, Spring 2005

Instructor: David McAllester

MWF 9:30-10:20

Course Description

This course will give a survey of mathematical methods in statistical modeling, inference, and learning with an emphasis on techniques widely used in speech recognition, computational linguistics and vision.  Each lecture is organized around a theorem, algorithm, or technical concepts such as the channel  capacity theorem,  PAC-Bayesian theorem, the junction tree algorithm, boosting, support vector machines, VC dimensions and manifolds.  No prerequisites.

Problem Sets

The notes for each lecture contain problems at the end. The problems for the lectures in any given week are due at lecture on the Friday of the following week. There is a 25% grade penalty for late problem sets if they are no more than one week late. Problem sets late by more than one week have a 50% greade reduction.

General Outline (Subject to Change)

Part I: Information Theory, Modeling, Perception and Inference.

·       Entropy and Data Compression

·       Convexity, Jensen's Inequality, and KL Divergence

·       Random Variables, Mutual Information, and Channel Capacity

·       Hidden Markov Models. Viterbi and Forward-Backward.

·       Probabilistic Context Free Grammars (PCFGs).  Viterbi and Inside-Outside.

·       A*

·       Dijkstra Lightest Parse, Generaized A* , Paper on Generalized A*

·       Bayesian Networks and Markov Random Fields

·       Junction Trees, Tree Width, and the Hammersley-Clifford Theorem

·       Loopy Belief Propagation

·       Min-Cut Segmentation.

·       Spectral Clustering.

Part II: Learning

·       The Covariance Matrix. Multivariate Central Limit Theorem. Principle Component Analysis.

·       Conjugate Priors. Dirichlett Priors. Gaussian Priors.

·       Least Squares Regression. Ridge Regression.

·       The PAC-Bayesian Theorem.

·       Margin Bounds. SVMs.

·       Convex Duality

·       Kernels.

·       VC Dimension.

·       Logistic Regression.   (more slides)

·       General EM.

·       MAP EM, Structural EM, Leave-One-Out EM.

·       Boosting.

·       On-line learning.

·       Good-Turing.

·       Semi-Supervised Regression.