Swiss AI Research Overview Platform

DOPE - Deep Optimization

Abstract

Thanks to neural networks (NNs), faster computation, and massive datasets, machine learning (ML) is under increasing pressure to provide automated solutions to ever harder real-world tasks, with beyond-human performance and ever faster response times, owing to the potentially huge technological and societal benefits. Unsurprisingly, NN learning formulations present a fundamental challenge to the back-end learning algorithms despite their scalability, in particular due to the existence of traps in the non-convex optimization landscape, such as saddle points, that can prevent algorithms from obtaining “good” solutions.
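To make the saddle-point issue concrete, the following minimal sketch (illustrative only, not project code) runs plain gradient descent on the toy non-convex function f(x, y) = 0.5(x² − y²), whose only stationary point is a saddle at the origin; the function, step size, and iteration count are arbitrary choices for the example.

```python
import numpy as np

def grad_f(w):
    # gradient of f(x, y) = 0.5 * (x**2 - y**2), which has a saddle at the origin
    x, y = w
    return np.array([x, -y])

def gradient_descent(w0, lr=0.1, steps=200):
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad_f(w)
    return w

# Started exactly on the saddle's stable manifold (y = 0), GD converges to
# the saddle (0, 0) even though it is not a minimizer.
print(gradient_descent([1.0, 0.0]))
# A tiny perturbation in y does escape, but only after many iterations.
print(gradient_descent([1.0, 1e-8]))
```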
Our recent research has demonstrated that the non-convex optimization dogma is false by showing that scalable stochastic optimization algorithms can avoid such traps and rapidly obtain locally optimal solutions. Coupled with progress in representation learning, such as over-parameterized neural networks, these local solutions can be globally optimal. Unfortunately, we have also proved that the central min-max optimization problems in ML, such as generative adversarial networks (GANs) and distributionally robust ML, contain spurious attractors that do not include any stationary points of the original learning formulation. Indeed, algorithms face a grander challenge, including unavoidable convergence failures, which explains the stagnation in their progress despite the impressive earlier demonstrations.
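These convergence failures appear already on the textbook bilinear game min_x max_y xy. The sketch below (a standard illustration, not one of the project's algorithms) runs simultaneous gradient descent-ascent, whose iterates spiral away from the unique equilibrium at the origin.

```python
import numpy as np

def gda(x, y, lr=0.1, steps=100):
    # simultaneous updates: descend in x, ascend in y for f(x, y) = x * y
    for _ in range(steps):
        gx, gy = y, x                      # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy
    return np.array([x, y])

# The distance to the equilibrium (0, 0) grows by sqrt(1 + lr**2) per step,
# so the iterates spiral outward instead of converging.
print(gda(1.0, 1.0))
```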
As a result, the proposed Deep Optimization Project (DOPE) will confront this grand challenge in ML by building unified optimization and representation foundations for how we capture functions via non-linear representations, how we set up the learning objectives that govern our fundamental goals, and how we optimize these objectives to obtain numerical solutions in a scalable fashion. We contend that optimization problems, such as non-convex non-concave min-max problems, cannot be studied in isolation from the context in which they are formulated. By exploiting the properties of the representations, we can invoke structures that yield favorable convergence, or actively discover which types of external oracles are necessary to guarantee convergence to “good” solutions. To this end, DOPE integrates three inter-related thrusts:
Thrust I: Limits of online learning in games. Many central ML problems, including min-max optimization, fall under the prototypical setting of online learning in games. This thrust develops a new unified theoretical framework for rigorously characterizing the convergence behavior of optimization methods in this setting, using a stochastic approximation perspective that not only identifies to what extent improvements are possible through function representations but also develops new approaches that match theoretical lower bounds on their runtime efficiency.
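As a concrete, standard example of a method from this game-theoretic optimization literature (a classical technique, not a DOPE contribution), the extragradient method adds a lookahead step and converges on the same bilinear game where plain descent-ascent diverges:

```python
import numpy as np

def vector_field(z):
    # descent-ascent field (df/dx, -df/dy) for the bilinear game f(x, y) = x * y
    x, y = z
    return np.array([y, -x])

def extragradient(z0, lr=0.1, steps=1000):
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        z_look = z - lr * vector_field(z)      # extrapolation (lookahead) step
        z = z - lr * vector_field(z_look)      # update using the lookahead gradient
    return z

# Unlike plain descent-ascent, these iterates contract toward the equilibrium (0, 0).
print(extragradient([1.0, 1.0]))
```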
Thrust II: Fundamental trade-offs in representation and optimization. This thrust rethinks how we represent functions by developing new uncertainty relations for non-linear representations and characterizing their sample complexities as well as their computational and robustness trade-offs. We also develop a new universal optimization backbone that can seamlessly obtain efficient solutions by automatically adapting to the hidden structures of learning problems, including min-max and min-min settings.
Thrust III: Bridging the theory and the practice. This thrust validates and provides feedback to our theory and methods in a key engineering application: circuit design. To achieve these desiderata, we develop a new optimization framework for automated control applications that can be formulated using Markov decision processes and test it on our open-source Electronic Design Automation interface.
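To make the Markov-decision-process formulation concrete, here is a generic tabular value-iteration sketch on a randomly generated toy MDP; the states, actions, and rewards are placeholders and are unrelated to the project's Electronic Design Automation interface.

```python
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] = next-state distribution
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # expected reward r(s, a)

V = np.zeros(n_states)
for _ in range(1000):                        # Bellman optimality iterations
    Q = R + gamma * P @ V                    # Q[s, a] = r(s, a) + gamma * E[V(next state)]
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:     # stop near the Bellman fixed point
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)                    # greedy policy w.r.t. the optimal values
print("V* =", V, "policy =", policy)
```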
Real progress in ML requires a coordinated effort based on theoretical and algorithmic foundations that balances, for a given statistical risk, the competing roles of data and representation size, computation, and robustness. Our goal of systematically understanding and expanding on this emerging perspective is ambitious and requires a significant departure from the state of the art. However, our confidence that substantial progress can be made is backed by promising preliminary research results.
Impact: A joint study of optimization and function representations is currently in its infancy, even for basic minimization formulations in ML. By taking a unified approach in the broader setting of online learning in games, we develop new methods for variational inequalities, stochastic approximation in PDEs, and sampling with Langevin dynamics. These methods are applicable well beyond data generation, compression, domain adaptation, and control tasks, and are expected to change the way we treat data across science and engineering, promising substantial learning capabilities in diverse domains.
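As a minimal, generic illustration of the Langevin-dynamics sampling mentioned above (not code from the project), the unadjusted Langevin algorithm below draws approximate samples from a one-dimensional standard Gaussian target:

```python
import numpy as np

def grad_log_density(x):
    return -x                                # d/dx log N(0, 1), up to a constant

def ula(steps=50_000, step_size=0.01, seed=0):
    rng = np.random.default_rng(seed)
    x, samples = 0.0, []
    for _ in range(steps):
        # Euler-Maruyama discretization of the Langevin diffusion
        x = x + step_size * grad_log_density(x) + np.sqrt(2.0 * step_size) * rng.normal()
        samples.append(x)
    return np.array(samples)

s = ula()
print(s.mean(), s.var())                     # roughly 0 and 1 for the N(0, 1) target
```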

Last updated: 22.03.2023

Prof. Volkan Cevher
Kimon Antonakopoulos
Fanghui Liu