Research Group "Stochastic Algorithms and Nonparametric Statistics"

Seminar "Modern Methods in Applied Stochastics and Nonparametric Statistics" Winter Semester 2022/23

04.10.2022

11.10.2022 Thomas Wagenhofer (TU Berlin)
Weak error estimates for rough volatility models
We consider a rough volatility model where the volatility is a (smooth) function of a Riemann-Liouville Brownian motion with Hurst parameter H in (0,1/2). When simulating these models, one often uses a discretization of stochastic integrals as an approximation. These integrals can be interpreted as log-stock-prices. In applications such as pricing, the most relevant quantities are expectations of (payoff) functions. Our main result is that moments of these integrals have a weak error rate of order 3H+1/2 if H<1/6 and order 1 otherwise. For this we first derive a moment formula for both the discretization and the true stochastic integral. We then use this formula and properties of Gaussian random variables to prove our main theorems. We furthermore show that this convergence rate also holds for slightly more general payoffs, and we also provide a lower bound. Note that our rate of 3H+1/2 is in stark contrast to the strong error rate, which is of order H. This is joint work with Peter Friz and William Salkeld.
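To make the setting concrete, here is a minimal Monte Carlo sketch (not the speakers' code): it simulates a left-point discretization of the Riemann-Liouville fractional Brownian motion W^H and of the stochastic integral playing the role of the log-stock-price, and estimates a second moment on two grids. The volatility function f, the correlation rho and all numerical parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H, T, rho = 0.1, 1.0, -0.7         # Hurst parameter, horizon, correlation (illustrative)
f = np.exp                          # a smooth volatility function (assumption)

def discretized_integral(n, n_paths=5000):
    """Left-point discretization of int_0^T f(W^H_s) dB_s on an n-point grid."""
    dt = T / n
    dW = rng.normal(scale=np.sqrt(dt), size=(n_paths, n))   # increments driving W^H
    dB = rho * dW + np.sqrt(1 - rho**2) * rng.normal(scale=np.sqrt(dt), size=(n_paths, n))
    # left-point approximation of W^H_{t_k} = sqrt(2H) int_0^{t_k} (t_k - s)^(H - 1/2) dW_s
    kernel = np.zeros((n, n))
    for k in range(1, n):
        s = np.arange(k) * dt
        kernel[k, :k] = np.sqrt(2 * H) * (k * dt - s) ** (H - 0.5)
    WH = dW @ kernel.T
    return np.sum(f(WH) * dB, axis=1)                        # Riemann-sum "log-price" integral

for n in (50, 400):
    X = discretized_integral(n)
    print(f"n = {n:4d}   estimated second moment: {np.mean(X**2):.4f}")
```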
18.10.2022 Egor Gladin (Humboldt Universität zu Berlin)
Algorithm for constrained Markov decision process with linear convergence (hybrid talk)
The problem of a constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual approach is proposed, integrating two ingredients: an entropy-regularized policy optimizer and Vaidya's dual optimizer, both of which are critical to achieving faster convergence. A finite-time error bound for the proposed approach is provided. Despite the challenge of the nonconcave objective subject to nonconcave constraints, the proposed approach is shown to converge (with linear rate) to the global optimum. The complexity, expressed in terms of the optimality gap and the constraint violation, significantly improves upon existing primal-dual approaches.
25.10.2022 Luca Pelizzari (WIAS Berlin)
Polynomial Volterra processes and rough polynomial models (hybrid talk)
01.11.2022 Alexandra Suvorikova (WIAS Berlin)
Robust k-means clustering (hybrid talk)
In this work we investigate the theoretical properties of robust k-means clustering under the assumption of adversarial data corruption. We provide non-asymptotic rates for the excess distortion under weak model assumptions on the moments of the distribution.
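As a toy illustration, the following Python sketch implements one standard robustification of k-means (trimmed Lloyd iterations, which discard a fixed fraction of the points farthest from their centers); the estimator and the corruption model analysed in the talk may differ, and all parameters are illustrative.

```python
import numpy as np

def trimmed_kmeans(X, k, trim=0.1, n_iter=50, seed=0):
    """Lloyd iterations in which a fraction `trim` of the points is ignored as outliers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    keep = int((1 - trim) * len(X))                 # number of points treated as inliers
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels, dist = d.argmin(1), d.min(1)
        inliers = np.argsort(dist)[:keep]           # drop the points farthest from their centers
        for j in range(k):
            pts = X[inliers][labels[inliers] == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers, labels

# two well-separated clusters plus a few adversarial outliers
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2)),
               rng.uniform(-50, 50, (8, 2))])
centers, _ = trimmed_kmeans(X, k=2)
print(centers)
```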
08.11.2022

15.11.2022
ESH, Mohrenstr. 39
22.11.2022 Pavel Dvurechensky (WIAS Berlin)
Generalized self-concordant analysis of Frank-Wolfe algorithms (hybrid talk)
We propose several variants of the Frank-Wolfe method for minimizing generalized self-concordant (GSC) functions over compact sets. Such problems are ill-conditioned and are motivated by machine learning applications such as inverse covariance estimation or distance-weighted discrimination problems in support vector machines. We obtain O(1/k) convergence rate guarantees in the general situation and linear convergence under strong convexity and additional assumptions.
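For orientation, here is a minimal Python sketch of the classical Frank-Wolfe template on an l1-ball-constrained logistic loss (the logistic loss being a standard example of a generalized self-concordant function); the adaptive step-size and backtracking variants analysed in the talk are not reproduced, and the data and radius are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
b = rng.integers(0, 2, 100) * 2 - 1            # labels in {-1, +1} (toy logistic-regression data)
radius = 5.0                                    # l1-ball constraint (compact feasible set)

def loss(x):
    return np.mean(np.log1p(np.exp(-b * (A @ x))))

def grad(x):
    z = -b * (A @ x)
    return A.T @ (-b / (1 + np.exp(-z))) / len(b)

x = np.zeros(20)
for k in range(200):
    g = grad(x)
    i = np.argmax(np.abs(g))
    v = np.zeros_like(x)
    v[i] = -radius * np.sign(g[i])              # linear minimization oracle on the l1 ball
    x += 2 / (k + 2) * (v - x)                  # classical step size giving the O(1/k) rate
print("final logistic loss:", loss(x))
```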
29.11.2022 Thomas Wagenhofer (TU Berlin)
Reconstructing volatility: Pricing of index options under rough volatility (hybrid talk)
In [ABOBF02, ABOBF03] Avellaneda et al. pioneered the pricing and hedging of index options - products highly sensitive to implied volatility and correlation assumptions - with large deviations methods, assuming local volatility dynamics for all components of the index. We here present an extension applicable to non-Markovian dynamics and in particular the case of rough volatility dynamics.
06.12.2022 Alexandra Suvorikova (WIAS Berlin)
Anomaly detection in biometric authentication (hybrid talk)
In this work we suggest a novel framework for detecting user behaviour anomalies, based on a transfer learning approach combined with optimal transport techniques.
13.12.2022 Amal Alphonse (WIAS Berlin)
Risk-averse optimal control of random elliptic variational inequalities (hybrid talk)
In this talk, I will discuss a risk-averse optimal control problem governed by an elliptic variational inequality (VI) subject to random inputs. I will derive two forms of first-order stationarity conditions for the problem by passing to the limit in a penalised and smoothed approximating control problem. The lack of regularity with respect to the uncertain parameters and complexities induced by the presence of the risk measure give rise to delicate analytical challenges seemingly unique to the stochastic setting. To finish, I will briefly discuss a path-following stochastic approximation algorithm and demonstrate it on an example.
20.12.2022 No Seminar

03.01.2023 No Seminar

10.01.2023 Robert Gruhlke (WIAS Berlin)
Wasserstein polynomial chaos and Langevin dynamics (hybrid talk)
An unsupervised learning approach for the computation of an explicit functional representation of a random vector is presented, which relies only on a finite set of samples from an unknown distribution. Motivated by recent advances in computational optimal transport for estimating Wasserstein distances, we develop a generative model denoted Wasserstein multi-element polynomial chaos expansion (WPCE). It relies on the minimization of a regularized empirical Wasserstein metric known as the debiased Sinkhorn divergence. Since the PCE used grows exponentially in the number of underlying input random coordinates, we introduce an appropriate low-rank format given as stacks of tensor trains. This alleviates the curse of dimensionality, leading to only linear dependence on the input dimension. Ensemble methods are nowadays ubiquitous for the solution of Bayesian inference problems. We discuss recent advances based on state-of-the-art samplers such as Affine Invariant Langevin Dynamics (ALDI) that allow for increased convergence speed and in turn reduce the required number of forward calls encoded in the drift term of the underlying stochastic differential equation. This improvement is realised through possible adaptive ensemble enrichment and an adapted Langevin dynamics based on a homotopy formalism. Optionally, the history of particles obtained within the solution process of the Langevin dynamics can then be used to numerically construct a push-forward map in terms of the WPCE. Once computed, this provides functional access to the posterior without the need for further forward model evaluations.
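To make the fitting criterion concrete, the Python sketch below evaluates a debiased Sinkhorn divergence between target samples and the push-forward of a Gaussian reference through a simple one-dimensional polynomial map; the actual WPCE construction, including the multi-element structure and the tensor-train compression, is not reproduced, and the data and candidate maps are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_cost(X, Y, eps=0.1, n_iter=300):
    """Entropic OT cost between two empirical measures (log-domain Sinkhorn iterations)."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    loga, logb = -np.log(len(X)), -np.log(len(Y))
    f, g = np.zeros(len(X)), np.zeros(len(Y))
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps + logb, axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + loga, axis=0)
    logP = (f[:, None] + g[None, :] - C) / eps + loga + logb
    return np.sum(np.exp(logP) * C)

def sinkhorn_divergence(X, Y, eps=0.1):
    """Debiasing removes the entropic bias terms OT_eps(mu, mu) and OT_eps(nu, nu)."""
    return sinkhorn_cost(X, Y, eps) - 0.5 * (sinkhorn_cost(X, X, eps) + sinkhorn_cost(Y, Y, eps))

rng = np.random.default_rng(0)
target = 2 * np.tanh(rng.normal(size=(300, 1)))   # samples from an "unknown" target distribution
xi = rng.normal(size=(300, 1))                    # Gaussian reference input

def push(c):                                      # degree-3 polynomial surrogate map
    return c[0] + c[1] * xi + c[2] * xi**2 + c[3] * xi**3

for c in ([0.0, 1.0, 0.0, 0.0], [0.0, 2.0, 0.0, -0.5]):   # two candidate coefficient vectors
    print(c, sinkhorn_divergence(push(c), target))
```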
17.01.2023 No Seminar

24.01.2023 Oleg Butkovsky (WIAS Berlin)
Stochastic equations with singular drift driven by fractional Brownian motion (hybrid talk)
31.01.2023 Matthias Liero (WIAS Berlin)
On the geometry of the Hellinger-Kantorovich space
07.02.2023 Alain Rossier (University of Oxford)
Asymptotic analysis of deep residual networks
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation (SDE), or neither of these. Furthermore, we are able to formally prove the linear convergence of gradient descent to a global optimum for the training of deep residual networks with constant layer width and smooth activation function. We further prove that if the trained weights, as a function of the layer index, admit a scaling limit as the depth increases, then the limit has finite 2-variation.
14.02.2023 Tim Laux (Universität Bonn)
The large-data limit of the MBO scheme for data clustering (hybrid talk)
The MBO scheme is an efficient algorithm for data clustering, the task of partitioning a given dataset into several meaningful clusters. In this talk, I will present the first rigorous analysis of this scheme in the large-data limit. The starting point for the first part of the talk is that each iteration of the MBO scheme corresponds to one step of implicit gradient descent for the thresholding energy on the similarity graph of the dataset. It is then natural to think that outcomes of the MBO scheme are (local) minimizers of this energy. We prove that the algorithm is consistent, in the sense that these (local) minimizers converge to (local) minimizers of a suitably weighted optimal partition problem. To study the dynamics of the scheme, we use the theory of viscosity solutions. The main ingredients are (i) a new abstract convergence result based on quantitative estimates for heat operators and (ii) the derivation of these estimates in the setting of random geometric graphs. To implement the scheme in practice, two important parameters are the number of eigenvalues for computing the heat operator and the step size of the scheme. Our results give a theoretical justification for the choice of these parameters in relation to sample size and interaction width. This is joint work with Jona Lelmi (U Bonn).
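As a small illustration of the scheme itself (not of the large-data analysis), the Python sketch below runs graph MBO on a toy two-cluster dataset: diffuse a labeling with a spectrally truncated graph heat operator, then threshold. The kernel bandwidth, the number of eigenpairs and the step size are exactly the kind of parameters the results address; the values used here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
n = len(X)

# similarity graph and unnormalized graph Laplacian
W = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / 0.5)
L = np.diag(W.sum(1)) - W

# heat operator exp(-tau * L), truncated to the m smallest eigenpairs
vals, vecs = np.linalg.eigh(L)
m, tau = 20, 0.3
heat = vecs[:, :m] @ np.diag(np.exp(-tau * vals[:m])) @ vecs[:, :m].T

truth = np.r_[np.zeros(100), np.ones(100)]
u = np.where(rng.random(n) < 0.2, 1 - truth, truth)   # noisy initial labeling (20% of labels flipped)
for _ in range(10):
    u = (heat @ u > 0.5).astype(float)                # one MBO step: diffuse, then threshold
print("agreement with ground truth:", (u == truth).mean())
```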
21.02.2023

28.02.2023

09.03.2023 Thomas O'Leary-Roseberry (Oden Institute, University of Texas)
Enabling efficient UQ and optimization with derivative-informed neural operators
Due to the WIAS Days, the seminar is postponed to Thursday at 2 p.m.!
Outer-loop problems arising in scientific applications, such as Bayesian uncertainty quantification and optimization under uncertainty, require repeated evaluation of computationally intensive numerical models for varying parameters, making their solution intractable when one is constrained to use a high-fidelity model. Neural operators offer a means of explicitly learning maps from model parameters to outputs, thus enabling efficient solution of these outer-loop problems. However, an essential ingredient for the scalable solution of these high-dimensional problems is parametric derivative information, which can have the effect of reducing the dimensionality of the problem and improving algorithmic convergence. In this talk we will present efficient strategies for learning high-dimensional derivative information via neural operators. By exploiting compactness of high-dimensional maps, if it exists, one can both generate and learn high-dimensional derivative information where the dominant computational costs can be made independent of the discretization dimensions. Numerical results demonstrate that this additional derivative information improves the accuracy of the function approximation and, additionally, is necessary to produce neural operators with reliable approximations of high-dimensional parametric derivatives. Numerical examples will demonstrate how these derivative-informed neural operators can be used to accelerate the solution of stochastic optimization problems and high-dimensional inference problems. This work is a collaboration with Omar Ghattas, Dingcheng Luo, Peng Chen and Umberto Villa.
14.03.2023 Emilio Ferrucci (University of Oxford)
Branched Itô formula and Itô-Stratonovich correction (hybrid talk)
21.03.2023

28.03.2023 Luca Pelizzari (WIAS Berlin)
Primal-Dual optimal stopping with signatures (hybrid talk)
04.04.2023 Robert A. Vandermeulen (TU Berlin)
Beating the nonparametric curse of dimensionality using multi-view density estimators
The curse of dimensionality is a famous statistical phenomenon that is perhaps nowhere better exemplified than in the task of nonparametric density estimation. Here the curse of dimensionality is not only evident in practical applications; there is also a very robust corpus of theoretical work demonstrating it. This talk will cover recent work on nonparametric density estimation that utilizes multi-view models, which are a type of low-rank model, to obviate the curse of dimensionality and give dimension-independent rates of convergence.
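As a toy illustration of the multi-view (low-rank) structure, the Python sketch below fits a rank-2 mixture of products of one-dimensional histograms to binned 2D samples using multiplicative KL-NMF updates (equivalent to EM for this model). The low-rank structure keeps the number of parameters linear, rather than exponential, in the dimension; the estimators and rates in the talk are more general, and all modelling choices here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy two-component data: the true joint density is itself of rank 2
z = rng.random(5000) < 0.5
X = np.where(z, rng.normal(-1, 0.3, 5000), rng.normal(1, 0.3, 5000))
Y = np.where(z, rng.normal(1, 0.3, 5000), rng.normal(-1, 0.3, 5000))

bins = np.linspace(-2.5, 2.5, 31)
H, _, _ = np.histogram2d(X, Y, bins=[bins, bins])
H /= H.sum()                                          # empirical joint probabilities

k = 2
A, B = rng.random((30, k)), rng.random((30, k))       # per-dimension nonnegative factors
for _ in range(300):                                  # multiplicative KL-NMF updates (EM)
    R = H / np.maximum(A @ B.T, 1e-12)
    A *= (R @ B) / np.maximum(B.sum(0), 1e-12)
    B *= (R.T @ A) / np.maximum(A.sum(0), 1e-12)
P = A @ B.T
print("total variation to empirical histogram:", 0.5 * np.abs(P / P.sum() - H).sum())
```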
11.04.2023 Alexandra Suvorikova (WIAS Berlin)
Data ordering in general metric spaces


last reviewed: March 27, 2023 by Christine Schneider