In this dissertation, we study optimization, adaptation, and learning problems over connected networks. In these problems, each agent collects and learns from its own local data and is able to communicate with its neighbors. While no single node in the network may be capable of sophisticated behavior on its own, the agents collaborate to solve large-scale and challenging learning problems.
Different approaches have been proposed in the literature to boost the learning capabilities of networked agents. Among these approaches, the class of diffusion strategies has been shown to be particularly well-suited because of its wider stability range compared with other methods and its improved performance in adaptive scenarios. However, diffusion implementations suffer from a small inherent bias in the iterates. When a constant step-size is employed to solve a deterministic optimization problem, the iterates generated by the diffusion strategy converge to a small neighborhood around the desired global solution, but not to the exact solution itself. This bias is not due to any gradient noise arising from stochastic approximation; it is instead due to the update structure of diffusion implementations. The existence of this bias raises three questions: (1) What is the origin of the bias? (2) Can it be eliminated? (3) Does correcting the bias bring benefits to distributed optimization, distributed adaptation, or distributed learning?
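For reference, the standard adapt-then-combine diffusion strategy can be sketched as follows, where $J_k(\cdot)$ denotes the local cost at agent $k$, $\mu$ a constant step-size, $\mathcal{N}_k$ the neighborhood of agent $k$, and $a_{\ell k}$ the combination weight that agent $k$ assigns to neighbor $\ell$ (the notation here is illustrative; precise definitions are given in the body of the dissertation):
\begin{align*}
  \psi_{k,i} &= w_{k,i-1} - \mu \nabla J_k(w_{k,i-1}) && \text{(adaptation)} \\
  w_{k,i} &= \sum_{\ell \in \mathcal{N}_k} a_{\ell k}\, \psi_{\ell,i} && \text{(combination)}
\end{align*}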
This dissertation provides affirmative answers to these questions. Specifically, we design a new exact diffusion approach that eliminates the inherent bias in diffusion. Exact diffusion has almost the same structure as diffusion, with the addition of a “correction” step between the adaptation and combination steps, as sketched below. Next, the dissertation studies the performance of exact diffusion in the settings of distributed optimization, distributed adaptation, and distributed learning. For distributed optimization, exact diffusion is proven to converge exponentially fast to the exact global solution under proper conditions. For distributed adaptation, exact diffusion is proven to attain a better steady-state mean-square error than diffusion, and this advantage is shown analytically to be more pronounced for sparsely connected networks such as line, cycle, and grid topologies. For distributed learning, exact diffusion can be integrated with the amortized variance-reduced gradient (AVRG) method so that it converges exponentially fast to the exact global solution while employing only stochastic gradients in each iteration. The dissertation also compares exact diffusion with other state-of-the-art methods in the literature. Extensive numerical simulations are provided to illustrate the theoretical results.
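As a sketch, and in the same illustrative notation as above, exact diffusion inserts a correction step between the adaptation and combination steps, where $\bar{a}_{\ell k}$ denotes suitably adjusted combination weights whose exact construction is specified in the dissertation:
\begin{align*}
  \psi_{k,i} &= w_{k,i-1} - \mu \nabla J_k(w_{k,i-1}) && \text{(adaptation)} \\
  \phi_{k,i} &= \psi_{k,i} + w_{k,i-1} - \psi_{k,i-1} && \text{(correction)} \\
  w_{k,i} &= \sum_{\ell \in \mathcal{N}_k} \bar{a}_{\ell k}\, \phi_{\ell,i} && \text{(combination)}
\end{align*}
In this sketch, the correction step adds only one extra vector addition and subtraction per agent per iteration, and it is this step that removes the bias and allows the iterates to converge to the exact global solution.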
This work was supported in part by NSF grants ECCS-1407712 and CCF-1524250. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.