**MS Level Course: ADAPTATION AND LEARNING**

In this course, students learn to master tools, algorithms, and core concepts related to inference from data, data analysis, and adaptation and learning theories. *Emphasis is on the theoretical underpinnings and statistical limits of learning theory.* In particular, the course covers topics related to optimal inference, estimation theory, regularization methods, proximal methods, online and batch methods, stochastic learning, generalization and statistical learning theories, Bayes and naive classifiers, nearest-neighbor rules, self-organizing maps, decision trees, logistic regression, discriminant analysis, the Perceptron, support vector machines, kernel methods, bagging, boosting, random forests, cross-validation, and principal component analysis. Project themes are selected by students in consultation with the instructor.

**REFERENCES**

- A. H. Sayed, *Adaptation and Learning*, lecture notes by the instructor, 2015.
- A. H. Sayed, *Adaptive Filters*, Wiley, NY, 2008.

**TOPICS COVERED**

**Part A: Background Material**

- Optimal inference
- Bayesian inference
- Maximum likelihood, Expectation-maximization
- Mixture models
- Regression analysis, Data fusion
- Least-squares problems
- Convex functions, L2 regularization, L1 regularization
- Subgradients, proximal operators
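
To make the last two items concrete: the proximal operator of the L1 regularizer has a well-known closed form, elementwise soft-thresholding. A minimal sketch (the threshold value and test vector below are illustrative):

```python
import numpy as np

def prox_l1(z, t):
    """Proximal operator of f(x) = t * ||x||_1, i.e. the minimizer of
    t * ||x||_1 + 0.5 * ||x - z||^2: elementwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Entries with magnitude below t are set exactly to zero;
# the remaining entries shrink toward zero by t.
print(prox_l1(np.array([3.0, 0.5, -1.2]), 1.0))
```

This shrinkage is what makes L1 regularization promote sparse solutions, in contrast to the smooth scaling induced by L2 regularization.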

**Part B: Stochastic Learning**

- Batch learning
- Stochastic gradient learning
- Stochastic subgradient learning
- Stochastic proximal learning
- Variance-reduced stochastic learning
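
The stochastic-gradient recursion can be illustrated with the classical LMS update for a linear regression model; the dimensions, step size, and noise level below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear model: d(i) = u_i^T w_true + noise
w_true = np.array([1.0, -2.0, 0.5])
U = rng.standard_normal((2000, 3))
d = U @ w_true + 0.01 * rng.standard_normal(2000)

# Stochastic-gradient (LMS) recursion, one sample per iteration:
#   w <- w + mu * u_i * (d_i - u_i^T w)
mu = 0.01
w = np.zeros(3)
for u_i, d_i in zip(U, d):
    w += mu * u_i * (d_i - u_i @ w)

print(w)  # converges toward w_true
```

Unlike batch learning, which processes the full data set per iteration, each update here touches a single sample, which is what makes the recursion suitable for online operation.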

**Part C: Classification and Clustering**

- Naive Bayes
- Nearest-neighbor (NN) rule
- k-means clustering
- Self-organizing maps
- Decision trees
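
A minimal k-means sketch on two synthetic Gaussian clusters illustrates the alternation between assignment and update steps (the initialization scheme and cluster parameters are illustrative, and no empty-cluster handling is included):

```python
import numpy as np

def kmeans(X, k, iters=50):
    # Deterministic init: spread the initial centers across the data.
    # A minimal sketch; production code would guard against empty clusters.
    centers = X[:: len(X) // k][:k].astype(float)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        # Update step: each center becomes the mean of its assigned points
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)),   # cluster around (0, 0)
               rng.normal(3.0, 0.3, (100, 2))])  # cluster around (3, 3)
centers, labels = kmeans(X, 2)
```

Both steps monotonically decrease the within-cluster sum of squares, so the recursion converges, though only to a local minimum that depends on the initialization.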

**Part D: Generalization and Learning**

- Generalization theory
- Logistic regression
- Discriminant analysis
- The Perceptron
- Support vector machines
- Kernel-based learning
- Bagging and boosting
- Principal component analysis
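
For instance, principal component analysis reduces to an eigendecomposition of the sample covariance matrix. A minimal sketch on synthetic 2-D data whose variance is concentrated along the direction [1, 1] (the data-generation choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# 2-D data varying mainly along [1, 1], plus small isotropic noise
t = rng.standard_normal(500)
X = np.outer(t, [1.0, 1.0]) + 0.1 * rng.standard_normal((500, 2))

Xc = X - X.mean(axis=0)               # center the data
C = (Xc.T @ Xc) / (len(Xc) - 1)       # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
top = eigvecs[:, -1]                  # principal direction (up to sign)
scores = Xc @ top                     # 1-D projection onto that direction
```

The ratio of the top eigenvalue to the trace of the covariance matrix gives the fraction of variance retained by the one-dimensional projection.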

**Ph.D. Level Course: INFERENCE OVER NETWORKS**

The course deals with the topic of information processing over graphs. It covers results and techniques for the analysis and design of networks that solve optimization, adaptation, and learning problems in an efficient and distributed manner through localized interactions among their agents. The treatment covers three intertwined topics: (a) how to perform distributed optimization over networks; (b) how to perform distributed adaptation over networks; and (c) how to perform distributed learning over networks. In each of these domains, the course examines and compares the advantages and limitations of non-cooperative, centralized, and distributed solutions.

There are many good reasons for the piqued interest in distributed implementations, especially in this day and age when the word “network” has become commonplace, whether one is referring to social networks, power networks, transportation networks, biological networks, or other types of networks. Some of these reasons have to do with the benefits of cooperation in terms of improved performance and improved robustness and resilience to failure. Others relate to privacy and secrecy considerations, where agents may not be comfortable sharing their data with remote fusion centers. In other situations, the data may already be available in dispersed locations, as happens with cloud computing. One may also be interested in learning and extracting information through data mining from Big Data sets.

The course devotes considerable effort to quantifying the limits of performance of distributed solutions and to discussing design procedures that can realize their potential more fully. The presentation adopts a statistical perspective and derives tight performance results that elucidate the mean-square stability, convergence, and steady-state behavior of the learning networks.

The course also illustrates how distributed processing over graphs gives rise to some revealing phenomena due to the coupling effect among the agents. The course overviews such phenomena in the context of adaptive networks and considers examples related to distributed sensing, intrusion detection, distributed estimation, online adaptation, clustering, network system theory, and machine learning applications.
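
As one concrete instance of the distributed strategies studied in the course, an adapt-then-combine (ATC) diffusion LMS recursion can be sketched as follows; the network size, combination weights, step size, and noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, mu = 3, 2, 0.02          # agents, model dimension, step size
w_true = np.array([1.0, -1.0])  # common model all agents estimate

# Left-stochastic combination matrix: A[l, k] is the weight agent k
# assigns to neighbor l's intermediate estimate (columns sum to one).
A = np.array([[0.6, 0.2, 0.0],
              [0.4, 0.5, 0.3],
              [0.0, 0.3, 0.7]])

W = np.zeros((N, M))  # one estimate per agent (rows)
for _ in range(3000):
    # Adapt: each agent takes a stochastic-gradient (LMS) step on its own data
    Psi = np.empty_like(W)
    for k in range(N):
        u = rng.standard_normal(M)
        d = u @ w_true + 0.05 * rng.standard_normal()
        Psi[k] = W[k] + mu * u * (d - u @ W[k])
    # Combine: each agent convexly combines its neighbors' intermediate estimates
    W = A.T @ Psi
```

Each agent relies only on its own streaming data and its neighbors' intermediate estimates, yet all agents converge toward the common model, which is the cooperation benefit the course quantifies.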

**REFERENCES**

- A. H. Sayed, “Adaptation, learning, and optimization over networks,” *Foundations and Trends in Machine Learning*, vol. 7, no. 4-5, pp. 311-801, NOW Publishers, July 2014. **[Main Text]**
- A. H. Sayed, “Adaptive networks,” *Proc. IEEE*, vol. 102, no. 4, pp. 460-497, April 2014. **[Proceedings Article]**
- A. H. Sayed et al., “Diffusion strategies for adaptation and learning over networks,” *IEEE Signal Processing Magazine*, vol. 30, no. 3, pp. 155-171, May 2013. **[Magazine Article]**
- A. H. Sayed, “Diffusion adaptation over networks,” in *Academic Press Library in Signal Processing*, vol. 3, pp. 323-454, Academic Press, Elsevier, 2014. **[Book Chapter]**
- List of problems

**TOPICS COVERED**

**Part A: Background Material**

- Linear Algebra and Matrix Theory Results
- Complex Gradients and Complex Hessian Matrices
- Convexity, Strict Convexity, and Strong Convexity
- Mean-Value Theorems, Lipschitz Conditions

**Part B: Single-Agent Adaptation and Learning**

- Single-Agent Optimization
- Stochastic-Gradient Optimization
- Convergence and Stability Properties
- Mean-Square-Error Performance

**Part C: Centralized Adaptation and Learning**

- Batch and Centralized Processing
- Convergence, Stability, and Performance
- Comparison to Single-Agent Processing

**Part D: Multi-Agent Network Model**

- Graph Properties; Connected and Strongly-Connected Networks
- Multi-Agent Inference Strategies
- Limit Point and Pareto Optimality
- Evolution of Network Dynamics

**Part E: Multi-Agent Network Stability and Performance**

- Stability of Network Dynamics
- Long-Term Error Dynamics
- Performance of Multi-Agent Networks
- Benefits of Cooperation
- Role of Informed Agents
- Adaptive Combination Strategies
- Gossip and Asynchronous Strategies
- Constrained Optimization
- Proximal Strategies
- ADMM Strategies
- Clustering

**LECTURES**

- Lecture #1: Motivation and Examples (slides)
- Lecture #2: Complex Gradient Vectors (slides)
- Lecture #3: Complex Hessian Matrices (slides)
- Lecture #4: Convex Functions (slides)
- Lecture #5: Logistic Regression (slides)
- Lecture #6: Mean-Value Theorems (slides)
- Lecture #7: Lipschitz Conditions (slides)
- Lecture #8: Useful Matrix Results (slides)
- Lecture #9: Optimization by Single Agents (slides)
- Lecture #10: Stochastic Optimization by Single Agents (slides)
- Lecture #11: Stability and Long-Term Dynamics (slides)
- Lecture #12: Performance by Single Agents (slides)
- Lecture #13: Centralized Adaptation and Learning (slides)
- Lecture #14: Multi-Agent Network Model (slides)
- Lecture #15: Multi-Agent Distributed Strategies (slides)
- Lecture #16: Evolution of Multi-Agent Networks (slides)
- Lecture #17: Stability of Multi-Agent Networks (slides)
- Lecture #18: Mean-Error Network Stability (slides)
- Lecture #19: Long-Term Network Dynamics (slides)
- Lecture #20: Performance of Multi-Agent Networks, I (slides)
- Lecture #21: Performance of Multi-Agent Networks, II (slides)
- Lecture #22: Benefits of Cooperation (slides)
- Lecture #23: Role of Informed Agents (slides)
- Lecture #24: Combination Policies (slides)
- Lecture #25: Extensions (slides)