Berlin Oberseminar:
Optimization, Control and Inverse Problems


This seminar serves as a knowledge exchange and networking platform for the broad area of mathematical optimization and related applications within Berlin.




22.03.2023 Dr. Constantin Christof (Technische Universität München, Germany)
10:00 - 11:00

On the identification and optimization of nonsmooth superposition operators in semilinear elliptic PDEs

We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation which minimizes the distance between the PDE solution and a given desired state. In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a priori only known to be an element of $H^1_{\mathrm{loc}}$. This makes the studied problem class a suitable point of departure for the rigorous analysis of training problems for learning-informed PDEs in which an unknown superposition operator is approximated by means of a neural network with nonsmooth activation functions (ReLU, leaky-ReLU, etc.). We establish that, despite the low regularity of the controls, it is possible to derive a classical stationarity system for local minimizers and to solve the considered problem by means of a gradient projection method. It is also shown that the established first-order necessary optimality conditions imply that locally optimal superposition operators share various characteristic properties with commonly used activation functions: they are always sigmoidal, continuously differentiable away from the origin, and typically possess a distinct kink at zero.

Past Events

10.11.2022 Dr. Jonas Latz (Heriot-Watt University, Edinburgh, Scotland)

Analysis of stochastic gradient descent in continuous time

Optimisation problems with discrete and continuous data appear in statistical estimation, machine learning, functional data science, robust optimal control, and variational inference. The 'full' target function in such an optimisation problem is given by the integral over a family of parameterised target functions with respect to a discrete or continuous probability measure. Such problems can often be solved by stochastic optimisation methods: performing optimisation steps with respect to the parameterised target function with randomly switched parameter values. In this talk, we discuss a continuous-time variant of the stochastic gradient descent algorithm. This so-called stochastic gradient process couples a gradient flow minimising a parameterised target function and a continuous-time 'index' process which determines the parameter. We first briefly introduce the stochastic gradient process for finite, discrete data, which uses pure jump index processes. Then, we move on to continuous data. Here, we allow for very general index processes: reflected diffusions, pure jump processes, as well as other Lévy processes on compact spaces. Thus, we study multiple sampling patterns for the continuous data space. We show that the stochastic gradient process can approximate the gradient flow minimising the full target function to any accuracy. Moreover, we give convexity assumptions under which the stochastic gradient process with constant learning rate is geometrically ergodic. In the same setting, we also obtain ergodicity and convergence to the minimiser of the full target function when the learning rate decreases over time sufficiently slowly.
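
As a rough illustration of the finite-data case described above, the following sketch simulates a stochastic gradient process for a toy problem. Everything in it is an assumption made for illustration and is not taken from the talk: the quadratic target functions f_i(x) = (x - a_i)^2 / 2, the explicit Euler discretisation of the gradient flow, the exponential waiting times whose mean equals the learning rate, and all names such as `stochastic_gradient_process`.

```python
import random


def stochastic_gradient_process(data, learning_rate, t_end, dt=0.002, seed=0):
    """Simulate dx/dt = -(x - data[i(t)]) with a pure jump index process i(t).

    Returns the time average of x over the second half of [0, t_end].
    """
    rng = random.Random(seed)
    x = 0.0
    index = rng.randrange(len(data))
    # Waiting times are exponential with mean `learning_rate` (an assumption).
    next_jump = rng.expovariate(1.0 / learning_rate)
    n_steps = int(t_end / dt)
    total, count = 0.0, 0
    for step in range(n_steps):
        t = step * dt
        # Index process: resample the data index at random jump times.
        while t >= next_jump:
            index = rng.randrange(len(data))
            next_jump += rng.expovariate(1.0 / learning_rate)
        # Gradient flow for the currently active target, one explicit Euler step.
        x -= dt * (x - data[index])
        if step >= n_steps // 2:
            total += x
            count += 1
    return total / count


data = [-1.0, 0.0, 2.0, 3.0]
full_minimiser = sum(data) / len(data)  # minimiser of the averaged target
time_average = stochastic_gradient_process(data, learning_rate=0.01, t_end=100.0)
```

With a small constant learning rate, the time average of the process should lie close to the minimiser of the full target function, which in this toy setting is the mean of the data.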

30.05.2022 Pier Luigi Dragotti (Imperial College London)

Computational Imaging and Sensing: Theory and Applications

The revolution in sensing, with the emergence of many new imaging techniques, offers the possibility of gaining unprecedented access to the physical world, but this revolution can only bear fruit through the skilful interplay between the physical and computational realms. This is the domain of computational imaging which advocates that, to develop effective imaging systems, it will be necessary to go beyond the traditional decoupled imaging pipeline where device physics, image processing and the end-user application are considered separately. Instead, we need to rethink imaging as an integrated sensing and inference model.

In the first part of the talk we highlight the centrality of sampling theory in computational imaging and investigate new sampling modalities which are inspired by the emergence of new sensing mechanisms. We discuss time-based sampling which is connected to event-based cameras where pixels behave like neurons and fire when an event happens. We derive sufficient conditions and propose novel algorithms for the perfect reconstruction of classes of non-bandlimited functions from time-based samples. We then develop the interplay between learning and computational imaging and present a model-based neural network for the reconstruction of video sequences from events. The architecture of the network is model-based and is designed using the unfolding technique; some elements of the acquisition device are part of the network and are learned with the reconstruction algorithm.

In the second part of the talk, we focus on the heritage sector which is experiencing a digital revolution driven in part by the increasing use of non-invasive, non-destructive imaging techniques. These new imaging methods provide a way to capture information about an entire painting and can give us information about features at or below the surface of the painting. We focus on Macro X-Ray Fluorescence (XRF) scanning which is a technique for the mapping of chemical elements in paintings and introduce a method that can process XRF scanning data from paintings. The results presented show the ability of our method to detect and separate weak signals related to hidden chemical elements in the paintings. We analyse the results on Leonardo's 'The Virgin of the Rocks' and show that our algorithm is able to reveal, more clearly than ever before, the hidden drawings of a previous composition that Leonardo then abandoned for the painting that we can now see.

This is joint work with R. Alexandru, R. Wang, Siying Liu, J. Huang and Y. Su from Imperial College London; C. Higgitt and N. Daly from The National Gallery in London; and Thierry Blu from the Chinese University of Hong Kong.


Bio: Pier Luigi Dragotti is Professor of Signal Processing in the Electrical and Electronic Engineering Department at Imperial College London and Fellow of the IEEE. He received the Laurea Degree (summa cum laude) in Electronic Engineering from the University Federico II, Naples, Italy, in 1997; the Master's degree in Communications Systems from the Swiss Federal Institute of Technology of Lausanne (EPFL), Switzerland, in 1998; and the PhD degree from EPFL, Switzerland, in 2002. He has held several visiting positions. In particular, he was a visiting student at Stanford University, Stanford, CA in 1996, a summer researcher in the Mathematics of Communications Department at Bell Labs, Lucent Technologies, Murray Hill, NJ in 2000, a visiting scientist at Massachusetts Institute of Technology (MIT) in 2011 and a visiting scholar at Trinity College Cambridge in 2020.

Dragotti was Editor-in-Chief of the IEEE Transactions on Signal Processing (2018-2020), Technical Co-Chair for the European Signal Processing Conference in 2012, and Associate Editor of the IEEE Transactions on Image Processing from 2006 to 2009. He was also an elected member of the IEEE Computational Imaging Technical Committee and the recipient of an ERC starting investigator award for the project RecoSamp. Currently, he is an IEEE SPS Distinguished Lecturer.

His research interests include sampling theory, wavelet theory and its applications, computational imaging and sparsity-driven signal processing.


06.12.2021 Juan Carlos de los Reyes (Escuela Politécnica Nacional, Ecuador)

Bilevel learning for inverse problems

In recent years, novel optimization ideas have been applied to several inverse problems in combination with machine learning approaches, to improve the inversion by optimally choosing different quantities/functions of interest. A fruitful approach in this sense is bilevel optimization, where the inverse problems are considered as lower-level constraints, while on the upper level a loss function based on a training set is used. When confronted with inverse problems with nonsmooth regularizers or nonlinear operators, however, the structure of the bilevel optimization problem becomes difficult to analyze, as classical nonlinear or bilevel programming results cannot be directly utilized. In this talk, I will discuss the different challenges that these problems pose, and provide some analytical results as well as a numerical solution strategy.

05.07.2021 Patrick Farrell (University of Oxford, UK)

Computing disconnected bifurcation diagrams of partial differential equations

Computing the distinct solutions $u$ of an equation $f(u, \lambda) = 0$ as a parameter $\lambda \in \mathbb{R}$ is varied is a central task in applied mathematics and engineering. The solutions are captured in a bifurcation diagram, plotting (some functional of) $u$ as a function of $\lambda$. In this talk I will present a new algorithm, deflated continuation, for this task.

Deflated continuation has three advantages. First, it is capable of computing disconnected bifurcation diagrams; previous algorithms only aimed to compute that part of the bifurcation diagram continuously connected to the initial data. Second, its implementation is very simple: it only requires a minor modification to an existing Newton-based solver. Third, it can scale to very large discretisations if a good preconditioner is available; no auxiliary problems must be solved.

We will present applications to hyperelastic structures, liquid crystals, and Bose-Einstein condensates, among others.
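
The deflation idea behind the algorithm described above can be sketched on a scalar toy problem at a single, fixed parameter value. This is only an illustration under stated assumptions, not the speaker's implementation: the residual f(u, λ) = u^3 - λu with λ = 1, the deflation factor 1/|u - r|^p + shift with p = 2 and shift = 1, the finite-difference Newton solver, and all function names are hypothetical choices. The continuation in λ is omitted.

```python
def f(u, lam=1.0):
    """Toy residual f(u, lambda) = u^3 - lambda * u (roots 0 and +-sqrt(lambda))."""
    return u ** 3 - lam * u


def deflated(u, roots, power=2, shift=1.0):
    """Residual multiplied by one deflation factor per already known root."""
    value = f(u)
    for r in roots:
        value *= 1.0 / abs(u - r) ** power + shift
    return value


def newton(residual, u0, tol=1e-10, max_iter=100, h=1e-7):
    """Undamped Newton iteration with a central-difference derivative.

    Convergence is checked on the original residual f, not the deflated one.
    Returns None if the iteration does not converge.
    """
    u = u0
    try:
        for _ in range(max_iter):
            if abs(f(u)) < tol:
                return u
            slope = (residual(u + h) - residual(u - h)) / (2.0 * h)
            u -= residual(u) / slope
    except (ZeroDivisionError, OverflowError):
        pass
    return None


def find_roots(u0, max_roots=3):
    """Repeatedly run Newton from the same guess, deflating each root found."""
    roots = []
    for _ in range(max_roots):
        u = newton(lambda v: deflated(v, roots), u0)
        # Stop if Newton failed or landed on a root that is already known.
        if u is None or any(abs(u - r) < 1e-6 for r in roots):
            break
        roots.append(u)
    return roots


roots = find_roots(u0=0.4)
```

The only change relative to a plain Newton solve is the multiplication of the residual by the deflation factors, which matches the "minor modification" mentioned in the abstract. In this toy setting the first two Newton runs from `u0 = 0.4` should reach the roots near 0 and 1; whether further roots are found from the same guess depends on the behaviour of the undamped Newton iteration.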

14.06.2021 Ozan Öktem (KTH, Sweden)

Data driven large-scale convex optimisation

This joint work with Jevgenija Rudzusika (KTH), Sebastian Banert (Lund University) and Jonas Adler (DeepMind) introduces a framework for using deep learning to accelerate optimisation solvers with convergence guarantees. The approach builds on ideas from the analysis of accelerated forward-backward schemes, like FISTA. Instead of the classical approach of proving convergence for a choice of parameters, such as a step-size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set through a handcrafted method, we train a deep neural network to pick the best update. The method is applicable to several smooth and non-smooth convex optimisation problems and it outperforms established accelerated solvers.
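
For orientation, the classical, handcrafted FISTA iteration that the abstract refers to can be sketched as follows. This is the standard scheme with a fixed step size and the usual momentum sequence, not the learned method of the talk. The separable test problem (a diagonal quadratic plus an l1 term) and the names `fista` and `soft_threshold` are assumptions chosen so that the result can be compared against a closed-form minimiser.

```python
import math


def soft_threshold(value, threshold):
    """Proximal map of threshold * |.| (the backward step)."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def fista(diag, b, lam, n_iter=2000):
    """FISTA for min_x sum_i diag[i] * (x_i - b[i])^2 / 2 + lam * |x_i|."""
    step = 1.0 / max(diag)  # 1 / Lipschitz constant of the smooth gradient
    x = [0.0] * len(b)
    y = list(x)
    t = 1.0
    for _ in range(n_iter):
        # Forward (gradient) step at the extrapolated point y, then prox step.
        x_new = [
            soft_threshold(yi - step * d * (yi - bi), step * lam)
            for yi, d, bi in zip(y, diag, b)
        ]
        # Handcrafted momentum parameter; the talk replaces such fixed
        # choices by a learned selection from an admissible set.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        momentum = (t - 1.0) / t_new
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


diag, b, lam = [1.0, 4.0], [3.0, -0.1], 1.0
x_fista = fista(diag, b, lam)
# Closed-form minimiser of this separable problem, used only as a check.
x_exact = [soft_threshold(bi, lam / d) for d, bi in zip(diag, b)]
```

The step size and the momentum sequence are exactly the kind of fixed parameter choices for which convergence is classically proved.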

03.05.2021 Lars Ruthotto (Emory University, USA)
This talk was also part of the SPP 1962 Priority Program 2021 Keynote Presentation series.

A Machine Learning Framework for Mean Field Games and Optimal Control

We consider the numerical solution of mean field games and optimal control problems whose state space dimension is in the tens or hundreds. In this setting, most existing numerical solvers are affected by the curse of dimensionality (CoD). To mitigate the CoD, we present a machine learning framework that combines the approximation power of neural networks with the scalability of Lagrangian PDE solvers. Specifically, we parameterize the value function with a neural network and train its weights using the objective function with additional penalties that enforce the Hamilton-Jacobi-Bellman equations. A key benefit of this approach is that no training data is needed, i.e., no numerical solutions to the problem need to be computed before training. We illustrate our approach and its efficacy using numerical experiments. To show the framework's generality, we consider applications such as optimal transport, deep generative modeling, mean field games for crowd motion, and multi-agent optimal control.

29.03.2021 Serge Gratton (ENSEEIHT, Toulouse, France)

On a multilevel Levenberg-Marquardt method for the training of artificial neural networks and its application to the solution of partial differential equations

We propose a new multilevel Levenberg-Marquardt optimizer for the training of artificial neural networks with quadratic loss function. When the least-squares problem arises from the training of artificial neural networks, the variables subject to optimization are not related by any geometrical constraints and the standard interpolation and restriction operators cannot be employed any longer. A heuristic, inspired by algebraic multigrid methods, is then proposed to construct the multilevel transfer operators. We test the new optimizer on an important application: the approximate solution of partial differential equations by means of artificial neural networks. The learning problem is formulated as a least squares problem, choosing the nonlinear residual of the equation as a loss function, whereas the multilevel method is employed as a training method. Numerical experiments show encouraging results related to the efficiency of the new multilevel optimization method compared to the corresponding one-level procedure in this context.