Publications
Monographs

M. Danilova, P. Dvurechensky, A. Gasnikov, E. Gorbunov, S. Guminov, D. Kamzolov, I. Shibaev, Chapter: Recent Theoretical Advances in Nonconvex Optimization, A. Nikeghbali, P.M. Pardalos, A.M. Raigorodskii, M.Th. Rassias, eds., 191 of Springer Optimization and Its Applications, Springer, Cham, 2022, pp. 79163, (Chapter Published), DOI 10.1007/9783031008320_3 .

M. Hintermüller, K. Papafitsoros, Chapter 11: Generating Structured Nonsmooth Priors and Associated Primaldual Methods, in: Processing, Analyzing and Learning of Images, Shapes, and Forms: Part 2, R. Kimmel, X.Ch. Tai, eds., 20 of Handbook of Numerical Analysis, Elsevier, 2019, pp. 437502, (Chapter Published), DOI 10.1016/bs.hna.2019.08.001 .

M. Hintermüller, M. Hinze, J. Sokołowski, S. Ulbrich, eds., Special issue to honour Guenter Leugering on his 65th birthday, 1 of Control & Cybernetics, Systems Research Institute, Polish Academy of Sciences, Warsaw, 2019, (Collection Published).

M. Hintermüller, J.F. Rodrigues, eds., Topics in Applied Analysis and Optimisation  Partial Differential Equations, Stochastic and Numerical Analysis, CIM Series in Mathematical Sciences, Springer Nature Switzerland AG, Cham, 2019, 396 pages, (Collection Published).

P. Deuflhard, M. Grötschel, D. Hömberg, U. Horst, J. Kramer, V. Mehrmann, K. Polthier, F. Schmidt, Ch. Schütte, M. Skutella, J. Sprekels, eds., MATHEON  Mathematics for Key Technologies, 1 of EMS Series in Industrial and Applied Mathematics, European Mathematical Society Publishing House, Zurich, 2014, 453 pages, (Collection Published).
Articles in Refereed Journals

A. Ivanova, P. Dvurechensky, E. Vorontsova, D. Pasechnyuk, A. Gasnikov, D. Dvinskikh, A. Tyurin, Oracle complexity separation in convex optimization, Journal of Optimization Theory and Applications, 193 (2022), pp. 462490, DOI 10.1007/s10957022020387 .
Abstract
Ubiquitous in machine learning regularized empirical risk minimization problems are often composed of several blocks which can be treated using different types of oracles, e.g., full gradient, stochastic gradient or coordinate derivative. Optimal oracle complexity is known and achievable separately for the full gradient case, the stochastic gradient case, etc. We propose a generic framework to combine optimal algorithms for different types of oracles in order to achieve separate optimal oracle complexity for each block, i.e. for each block the corresponding oracle is called the optimal number of times for a given accuracy. As a particular example, we demonstrate that for a combination of a full gradient oracle and either a stochastic gradient oracle or a coordinate descent oracle our approach leads to the optimal number of oracle calls separately for the full gradient part and the stochastic/coordinate descent part. 
E. Gorbunov, P. Dvurechensky, A. Gasnikov, An accelerated method for derivativefree smooth stochastic convex optimization, SIAM Journal on Optimization, 32 (2022), pp. 12101238, DOI 10.1137/19M1259225 .
Abstract
We consider an unconstrained problem of minimization of a smooth convex function which is only available through noisy observations of its values, the noise consisting of two parts. Similar to stochastic optimization problems, the first part is of a stochastic nature. On the opposite, the second part is an additive noise of an unknown nature, but bounded in the absolute value. In the twopoint feedback setting, i.e. when pairs of function values are available, we propose an accelerated derivativefree algorithm together with its complexity analysis. The complexity bound of our derivativefree algorithm is only by a factor of n??? larger than the bound for accelerated gradientbased algorithms, where n is the dimension of the decision variable. We also propose a nonaccelerated derivativefree algorithm with a complexity bound similar to the stochasticgradientbased algorithm, that is, our bound does not have any dimensiondependent factor. Interestingly, if the solution of the problem is sparse, for both our algorithms, we obtain better complexity bound if the algorithm uses a 1norm proximal setup, rather than the Euclidean proximal setup, which is a standard choice for unconstrained problems. 
P. Dvurechensky, K. Safin, S. Shtern, M. Staudigl, Generalized selfconcordant analysis of FrankWolfe algorithms, Mathematical Programming. A Publication of the Mathematical Programming Society, 198 (2023), pp. 255323 (published online on 29.01.2022), DOI 10.1007/s10107022017711 .
Abstract
Projectionfree optimization via different variants of the FrankWolfe method has become one of the cornerstones of large scale optimization for machine learning and computational statistics. Numerous applications within these fields involve the minimization of functions with selfconcordance like properties. Such generalized selfconcordant functions do not necessarily feature a Lipschitz continuous gradient, nor are they strongly convex, making them a challenging class of functions for firstorder methods. Indeed, in a number of applications, such as inverse covariance estimation or distanceweighted discrimination problems in binary classification, the loss is given by a generalized selfconcordant function having potentially unbounded curvature. For such problems projectionfree minimization methods have no theoretical convergence guarantee. This paper closes this apparent gap in the literature by developing provably convergent Frank?Wolfe algorithms with standard O(1/k) convergence rate guarantees. Based on these new insights, we show how these sublinearly convergent methods can be accelerated to yield linearly convergent projectionfree methods, by either relying on the availability of a local liner minimization oracle, or a suitable modification of the awaystep FrankWolfe method. 
M. Eigel, R. Gruhlke, M. Marschall, Lowrank tensor reconstruction of concentrated densities with application to Bayesian inversion, Statistics and Computing, 32 (2022), pp. 27/127/27, DOI 10.1007/s11222022100871 .
Abstract
A novel method for the accurate functional approximation of possibly highly concentrated probability densities is developed. It is based on the combination of several modern techniques such as transport maps and nonintrusive reconstructions of lowrank tensor representations. The central idea is to carry out computations for statistical quantities of interest such as moments with a convenient reference measure which is approximated by an numerical transport, leading to a perturbed prior. Subsequently, a coordinate transformation leads to a beneficial setting for the further function approximation. An efficient layer based transport construction is realized by using the Variational Monte Carlo (VMC) method. The convergence analysis covers all terms introduced by the different (deterministic and statistical) approximations in the Hellinger distance and the KullbackLeibler divergence. Important applications are presented and in particular the context of Bayesian inverse problems is illuminated which is a central motivation for the developed approach. Several numerical examples illustrate the efficacy with densities of different complexity. 
E. Vorontsova, A. Gasnikov, P. Dvurechensky, A. Ivanova, D. Pasechnyuk, Numerical methods for the resource allocation problem in a computer network, Computational Mathematics and Mathematical Physics, 61 (2021), pp. 297328, DOI 10.1134/S0965542521020135 .

A. Gasnikov, D. Dvinskikh, P. Dvurechensky, D. Kamzolov, V. Matyukhin, D. Pasechnyuk, N. Tupitsa, A. Chernov, Accelerated metaalgorithm for convex optimization, Computational Mathematics and Mathematical Physics, 61 (2021), pp. 1728, DOI 10.1134/S096554252101005X .

N. Tupitsa, P. Dvurechensky, A. Gasnikov, S. Guminov, Alternating minimization methods for strongly convex optimization, Journal of Inverse and IllPosed Problems, 29 (2021), pp. 721739, DOI 10.1515/jiip20200074 .
Abstract
We consider alternating minimization procedures for convex optimization problems with variable divided in many block, each block being amenable for minimization with respect to its variable with freezed other variables blocks. In the case of two blocks, we prove a linear convergence rate for alternating minimization procedure under PolyakŁojasiewicz condition, which can be seen as a relaxation of the strong convexity assumption. Under strong convexity assumption in manyblocks setting we provide an accelerated alternating minimization procedure with linear rate depending on the square root of the condition number as opposed to condition number for the nonaccelerated method. 
A. Rastogi, G. Blanchard, P. Mathé, Convergence analysis of Tikhonov regularization for nonlinear statistical inverse problems, Electronic Journal of Statistics, 14 (2020), pp. 27982841, DOI 10.1214/20EJS1735 .
Abstract
We study a nonlinear statistical inverse problem, where we observe the noisy image of a quantity through a nonlinear operator at some random design points. We consider the widely used Tikhonov regularization (or method of regularization) approach to estimate the quantity for the nonlinear illposed inverse problem. The estimator is defined as the minimizer of a Tikhonov functional, which is the sum of a data misfit term and a quadratic penalty term. We develop a theoretical analysis for the minimizer of the Tikhonov regularization scheme using the concept of reproducing kernel Hilbert spaces. We discuss optimal rates of convergence for the proposed scheme, uniformly over classes of admissible solutions, defined through appropriate source conditions. 
P. Mathé, Bayesian inverse problems with noncommuting operators, Mathematics of Computation, 88 (2019), pp. 28972912, DOI 10.1090/mcom/3439 .
Abstract
The Bayesian approach to illposed operator equations in Hilbert space recently gained attraction. In this context, and when the prior distribution is Gaussian, then two operators play a significant role, the one which governs the operator equation, and the one which describes the prior covariance. Typically it is assumed that these operators commute. Here we extend this analysis to noncommuting operators, replacing the commutativity assumption by a link condition. We discuss its relation to the commuting case, and we indicate that this allows to use interpolation type results to obtain tight bounds for the contraction of the posterior Gaussian distribution towards the data generating element. 
M. Eigel, M. Marschall, R. Schneider, Samplingfree Bayesian inversion with adaptive hierarchical tensor representations, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 34 (2018), pp. 035010/1035010/29, DOI 10.1088/13616420/aaa998 .
Abstract
The statistical Bayesian approach is a natural setting to resolve the illposedness of inverse problems by assigning probability densities to the considered calibration parameters. Based on a parametric deterministic representation of the forward model, a samplingfree approach to Bayesian inversion with an explicit representation of the parameter densities is developed. The approximation of the involved randomness inevitably leads to several high dimensional expressions, which are often tackled with classical sampling methods such as MCMC. To speed up these methods, the use of a surrogate model is beneficial since it allows for faster evaluation with respect to calibration parameters. However, the inherently slow convergence can not be remedied by this. As an alternative, a complete functional treatment of the inverse problem is feasible as demonstrated in this work, with functional representations of the parametric forward solution as well as the probability densities of the calibration parameters, determined by Bayesian inversion. The proposed samplingfree approach is discussed in the context of hierarchical tensor representations, which are employed for the adaptive evaluation of a random PDE (the forward problem) in generalized chaos polynomials and the subsequent highdimensional quadrature of the loglikelihood. This modern compression technique alleviates the curse of dimensionality by hierarchical subspace approximations of the involved low rank (solution) manifolds. All required computations can be carried out efficiently in the lowrank format. A priori convergence is examined, considering all approximations that occur in the method. Numerical experiments demonstrate the performance and verify the theoretical results. 
S. Lu, P. Mathé, S. Pereverzyev, Balancing principle in supervised learning for a general regularization scheme, Applied and Computational Harmonic Analysis. TimeFrequency and TimeScale Analysis, Wavelets, Numerical Algorithms, and Applications, 48 (2020), pp. 123148, DOI 10.1016/j.acha.2018.03.001 .

A. Anikin, A. Gasnikov, P. Dvurechensky, A. Turin, A. Chernov, Dual approaches to the minimization of strongly convex functionals with a simple structure under affine constraints, Computational Mathematics and Mathematical Physics, 57 (2017), pp. 12621276.

D. Belomestny, H. Mai, J.G.M. Schoenmakers, Generalized PostWidder inversion formula with application to statistics, Journal of Mathematical Analysis and Applications, 455 (2017), pp. 89104.
Abstract
In this work we derive an inversion formula for the Laplace transform of a density observed on a curve in the complex domain, which generalizes the well known PostWidder formula. We establish convergence of our inversion method and derive the corresponding convergence rates for the case of a Laplace transform of a smooth density. As an application we consider the problem of statistical inference for variancemean mixture models. We construct a nonparametric estimator for the mixing density based on the generalized PostWidder formula, derive bounds for its root mean square error and give a brief numerical example. 
S. Bürger, P. Mathé, Discretized Lavrent'ev regularization for the autoconvolution equation, Applicable Analysis. An International Journal, 96 (2017), pp. 16181637, DOI 10.1080/00036811.2016.1212336 .
Abstract
Lavrent?ev regularization for the autoconvolution equation was considered by Janno J. in Lavrent?ev regularization of illposed problems containing nonlinear neartomonotone operators with application to autoconvolution equation, Inverse Prob. 2000;16:333?348. Here this study is extended by considering discretization of the Lavrent?ev scheme by splines. It is shown how to maintain the known convergence rate by an appropriate choice of spline spaces and a proper choice of the discretization level. For piecewise constant splines the discretized equation allows for an explicit solver, in contrast to using higher order splines. This is used to design a fast implementation by means of postsmoothing, which provides results, which are indistinguishable from results obtained by direct discretization using cubic splines. 
M. Hintermüller, C.N. Rautenberg, T. Wu, A. Langer, Optimal selection of the regularization function in a generalized total variation model. Part II: Algorithm, its analysis and numerical tests, Journal of Mathematical Imaging and Vision, 59 (2017), pp. 515533.
Abstract
Based on the generalized total variation model and its analysis pursued in part I (WIAS Preprint no. 2235), in this paper a continuous, i.e., infinite dimensional, projected gradient algorithm and its convergence analysis are presented. The method computes a stationary point of a regularized bilevel optimization problem for simultaneously recovering the image as well as determining a spatially distributed regularization weight. Further, its numerical realization is discussed and results obtained for image denoising and deblurring as well as Fourier and wavelet inpainting are reported on. 
M. Hintermüller, C.N. Rautenberg, Optimal selection of the regularization function in a weighted total variation model. Part I: Modeling and theory, Journal of Mathematical Imaging and Vision, 59 (2017), pp. 498514.
Abstract
Based on the generalized total variation model and its analysis pursued in part I (WIAS Preprint no. 2235), in this paper a continuous, i.e., infinite dimensional, projected gradient algorithm and its convergence analysis are presented. The method computes a stationary point of a regularized bilevel optimization problem for simultaneously recovering the image as well as determining a spatially distributed regularization weight. Further, its numerical realization is discussed and results obtained for image denoising and deblurring as well as Fourier and wavelet inpainting are reported on. 
P. Mathé, S.V. Pereverzev, Complexity of linear illposed problems in Hilbert space, Journal of Complexity, 38 (2017), pp. 5067.

D. Belomestny, J.G.M. Schoenmakers, Statistical inference for timechanged Lévy processes via Mellin transform approach, Stochastic Processes and their Applications, 126 (2016), pp. 20922122.

K. Lin, S. Lu, P. Mathé, Oracletype posterior contraction rates in Bayesian inverse problems, Inverse Problems and Imaging, 9 (2015), pp. 895915.

P. Mathé, Adaptive discretization for signal detection in statistical inverse problems, Applicable Analysis. An International Journal, 94 (2015), pp. 494505.

S. Anzengruber, B. Hofmann, P. Mathé, Regularization properties of the sequential discrepancy principle for Tikhonov regularization in Banach spaces, Applicable Analysis. An International Journal, 93 (2014), pp. 13821400.

S. Lu, P. Mathé, Discrepancy based model selection in statistical inverse problems, Journal of Complexity, 30 (2014), pp. 290308.

C. Marteau, P. Mathé, General regularization schemes for signal detection in inverse problems, Mathematical Methods of Statistics, 23 (2014), pp. 176200.

R.I. Boţ, B. Hofmann, P. Mathé, Regularizability of illposed problems and the modulus of continuity, Analysis and Applications, 32 (2013), pp. 299312.

E. Burnaev, A. Zaytsev, V. Spokoiny, Nonasymptotic properties for Gaussian field regression, Automation and Remote Control, 74 (2013), pp. 16451655.

Q. Jin, P. Mathé, Oracle inequality for a statistical RausGfrerertype rule, SIAM ASA J. Uncertainty Quantification, 1 (2013), pp. 386407.

A. Zaitsev, E. Burnaev, V. Spokoiny, Properties of the posterior distribution of a regression model based on Gaussian random fields, Automation and Remote Control, 74 (2013), pp. 16451655.

B. Hofmann, P. Mathé, Some note on the modulus of continuity for illposed problems in Hilbert space, Trudy Instituta Matematiki i Mekhaniki UrO RAN, 18 (2012), pp. 3441.

G. Blanchard, P. Mathé, Discrepancy principle for statistical inverse problems with application to conjugate gradient iteration, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 28 (2012), pp. 115011/1115011/23.

B. Hofmann, P. Mathé, Parameter choice in Banach space regularization under variational inequalities, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 28 (2012), pp. 104006/1104006/17.

S.M.A. Becker, Regularization of statistical inverse problems and the Bakushinskii veto, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 27 (2011), pp. 115010/1115010/22.
Abstract
In the deterministic context Bakushinskii's theorem excludes the existence of purely datadriven convergent regularization for illposed problems. We will prove in this work that in the statistical setting we can either construct a counter example or develop an equivalent formulation depending on the considered class of probability distributions. Hence, Bakushinskii's theorem does not generalize to the statistical context, although this has often been assumed in the past. To arrive at this conclusion, we will deduce from the classic theory new concepts for a general study of statistical inverse problems and perform a systematic clarification of the key ideas of statistical regularization. 
F. Bauer, P. Mathé, Parameter choice methods using minimization schemes, Journal of Complexity, 27 (2011), pp. 6885.
Abstract
In this paper we establish a generalized framework, which allows to prove convergenence and optimality of parameter choice schemes for inverse problems based on minimization in a generic way. We show that the well known quasioptimality criterion falls in this class. Furthermore we present a new parameter choice method and prove its convergence by using this newly established tool. 
J. Flemming, B. Hofmann, P. Mathé, Sharp converse results for the regularization error using distance functions, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 27 (2011), pp. 025006/1025006/18.
Abstract
In the analysis of illposed inverse problems the impact of solution smoothness on accuracy and convergence rates plays an important role. For linear illposed operator equations in Hilbert spaces and with focus on the linear regularization schema we will establish relations between the different kinds of measuring solution smoothness in a pointwise or integral manner. In particular we discuss the interplay of distribution functions, profile functions that express the regularization error, index functions generating source conditions, and distance functions associated with benchmark source conditions. We show that typically the distance functions and the profile functions carry the same information as the distribution functions, and that this is not the case for general source conditions. The theoretical findings are accompanied with examples exhibiting applications and limitations of the approach. 
P. Mathé, U. Tautenhahn, Enhancing linear regularization to treat large noise, Journal of Inverse and IllPosed Problems, 19 (2011), pp. 859879.

P. Mathé, U. Tautenhahn, Regularization under general noise assumptions, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 27 (2011), pp. 035016/1035016/15.
Abstract
The authors explain how the major results which were obtained recently in Eggermont et al (2009 Inverse Problems 25 115018) can be derived from a more general perspective of recent regularization theory. By pursuing this further, the authors provide a general view on regularization under general noise assumptions, including weakly and strongly controlled noise. The prospect is not to generalize previous work in this direction, but rather to envision the intrinsic structure present in regularization under general noise assumptions. In particular, the authors find variants of the discrepancy and the Lepski$vrm i$ principle to choose the regularization parameter, albeit within different context and under different assumptions. 
D. Belomestny, Spectral estimation of the fractional order of a Lévy process, The Annals of Statistics, 38 (2010), pp. 317351.

B. Hoffmann, P. Mathé, H. VON Weizsäcker, Regularization in Hilbert space under unbounded operators and general source conditions, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 25 (2009), pp. 115013/1115013/15.

D. Belomestny, G.N. Milstein, V. Spokoiny, Regression methods in pricing American and Bermudan options using consumption processes, Quantitative Finance, 9 (2009), pp. 315327.
Abstract
Here we develop methods for efficient pricing multidimensional discretetime American and Bermudan options by using regression based algorithms together with a new approach towards constructing upper bounds for the price of the option. Applying sample space with payoffs at the optimal stopping times, we propose sequential estimates for continuation values, values of the consumption process, and stopping times on the sample paths. The approach admits constructing both low and upper bounds for the price by Monte Carlo simulations. The methods are illustrated by pricing Bermudan swaptions and snowballs in the Libor market model. 
P. Mathé, S.V. Pereverzev, The use of higher order finite difference schemes is not dangerous, Journal of Complexity, 25 (2009), pp. 310.

V. Spokoiny, C. Vial, Parameter tuning in pointwise adaptation using a propagation approach, The Annals of Statistics, 37 (2009), pp. 27832807.
Abstract
This paper discusses the problem of adaptive estimating a univariate object like the value of a regression function at a given point or a linear functional in a linear inverse problem. We consider an adaptive procedure originated from Lepski (1990) which selects in a datadriven way one estimate out of a given class of estimates ordered by their variability. A serious problem with using this and similar procedures is the choice of some tuning parameters like thresholds. Numerical results show that the theoretically recommended proposals appear to be too conservative and lead to a strong oversmoothing effects. A careful choice of the parameters of the procedure is extremely important for getting the reasonable quality of estimation. The main contribution of this paper is the new approach for choosing the parameters of the procedure by providing the prescribed behavior of the resulting estimate in the simple parametric situation. We establish a nonasymptotical “oracle” bound which shows that the estimation risk is, up to a logarithmic multiplier, equal to the risk of the “oracle” estimate which is optimally selected from the given family. A numerical study demonstrates the nice performance of the resulting procedure in a number of simulated examples. 
B. Hofmann, P. Mathé, M. Schieck, Modulus of continuity for conditionally stable illposed problems in Hilbert space, Journal of Inverse and IllPosed Problems, 16 (2008), pp. 567585.

P. Mathé, B. Hofmann, Direct and inverse results in variable Hilbert scales, Journal of Approximation Theory, 154 (2008), pp. 7789.

P. Mathé, B. Hofmann, How general are general source conditions?, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 24 (2008), pp. 015009/1015009/5.

P. Mathé, N. Schöne, Regularization by projection in variable Hilbert scales, Applicable Analysis. An International Journal, 2 (2008), pp. 201 219.

B. Hofmann, P. Mathé, S.V. Pereverzev, Regularization by projection: Approximation theoretic aspects and distance functions, Journal of Inverse and IllPosed Problems, 15 (2007), pp. 527545.

P. Mathé, U. Tautenhahn, Error bounds for regularization methods in Hilbert scales by using operator monotonicity, Far East Journal of Mathematical Sciences (FJMS), 24 (2007), pp. 121.
Abstract
For solving linear illposed problems with noisy data regularization methods are required. In the present paper regularized approximations are obtained by a general regularization scheme in Hilbert scales. We exploit operator monotonicity of certain functions for deriving order optimal error bounds that characterize the accuracy of the regularized approximations. These error bounds are obtained under general smoothness conditions 
P. Mathé, B. Hofmann, Analysis of profile functions for general linear regularization methods, SIAM Journal on Numerical Analysis, 45 (2007), pp. 11221141.
Abstract
The stable approximate solution of illposed linear operator equations in Hilbert spaces requires regularization. Tight bounds for the noisefree part of the regularization error are constitutive for bounding the overall error. Norm bounds of the noisefree part which decrease to zero along with the regularization parameter are called profile functions and are subject of our analysis. The interplay between properties of the regularization and certain smoothness properties of solution sets, which we shall describe in terms of sourcewise representations is crucial for the decay of associated profile functions. On the one hand, we show that a given decay rate is possible only if the underlying true solution has appropriate smoothness. On the other hand, if smoothness fits the regularization, then decay rates are easily obtained. If smoothness does not fit, then we will measure this in terms of some distance function. Tight bounds for these allow us to obtain profile functions. Finally we study the most realistic case when smoothness is measured with respect to some operator which is related to the one governing the original equation only through a link condition. In many parts the analysis is done on geometric basis, extending classical concepts of linear regularization theory in Hilbert spaces. We emphasize intrinsic features of linear illposed problems which are frequently hidden in the classical analysis of such problems. 
A. Goldenshluger, V. Spokoiny, Recovering convex edges of image from noisy tomographic data, IEEE Transactions on Information Theory, 52 (2006), pp. 13221334.

D. Belomestny, M. Reiss, Spectral calibration of exponential Lévy models, Finance and Stochastics, 10 (2006), pp. 449474.
Abstract
We investigate the problem of calibrating an exponential Lévy model based on market prices of vanilla options. We show that this inverse problem is in general severely illposed and we derive exact minimax rates of convergence. The estimation procedure we propose is based on the explicit inversion of the option price formula in the spectral domain and a cutoff scheme for high frequencies as regularisation. 
P. Mathé, U. Tautenhahn, Interpolation in variable Hilbert scales with application to inverse problems, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 22 (2006), pp. 22712297.
Abstract
For solving linear illposed problems with noisy data regularization methods are required. In the present paper regularized approximations in Hilbert scales are obtained by a general regularization scheme. The analysis of such schemes is based on new results for interpolation in Hilbert scales. Error bounds are obtained under general smoothness conditions. 
J. Polzehl, V. Spokoiny, Propagationseparation approach for local likelihood estimation, Probability Theory and Related Fields, 135 (2006), pp. 335362.
Abstract
The paper presents a unified approach to local likelihood estimation for a broad class of nonparametric models, including, e.g., regression, density, Poisson and binary response models. The method extends the adaptive weights smoothing (AWS) procedure introduced by the authors [Adaptive weights smoothing with applications to image sequentation. J. R. Stat. Soc., Ser. B 62, 335354 (2000)] in the context of image denoising. The main idea of the method is to describe a greatest possible local neighborhood of every design point in which the local parametric assumption is justified by the data. The method is especially powerful for model functions having large homogeneous regions and sharp discontinuities. The performance of the proposed procedure is illustrated by numerical examples for density estimation and classification. We also establish some remarkable theoretical nonasymptotic results on properties of the new algorithm. This includes the “propagation” property which particularly yields the root$n$ consistency of the resulting estimate in the homogeneous case. We also state an “oracle” result which implies rate optimality of the estimate under usual smoothness conditions and a “separation” result which explains the sensitivity of the method to structural changes. 
D. Belomestny, Reconstruction of the general distribution from the distribution of some statistics, Theory of Probability and its Applications, 49 (2005), pp. 115.
Abstract
We investigate the problem of characterizing the distribution of independent identically distributed random variables $X_1,ldots,X_m$ (general distribution) by the distribution of linear statistics and statistics of maximum with positive coefficients. Necessary and sufficient conditions are found under which such a characterization takes place. 
A. Goldenshluger, V. Spokoiny, On the shapefrommoments problem and recovering edges from noisy Radon data, Probability Theory and Related Fields, 128 (2004), pp. 123140.

D. Belomestny, Constraints on distributions imposed by properties of linear forms, ESAIM. Probability and Statistics, 7 (2003), pp. 313328.
Abstract
Let $(X_1,Y_1)...,(X_m,Y_m)$ be $m$ independent identically distributed bivariate vectors and $L_1=b_1X_1+...+b_mX_m$ ,$L_2=b_1Y_1+...+b_mY_m$ are two linear forms with positive coefficients. We study two problems: under what conditions does the equidistribution of $L_1$ and $L_2$ imply the same property for $X_1$ and $Y_1$, and under what conditions does the independence of $L_1$ and $L_2$ entail independence of $X_1$ and $Y_1$? Some analytical sufficient conditions are obtained and it is shown that in general they can not be weakened. 
P. Mathé, S.V. Pereverzev, Discretization strategy for linear illposed problems in variable Hilbert scales, Inverse Problems. An International Journal on the Theory and Practice of Inverse Problems, Inverse Methods and Computerized Inversion of Data, 19 (2003), pp. 12631277.
Abstract
The authors study the regularization of projection methods for solving linear illposed problems with compact and injective linear operators in Hilbert spaces. Smoothness of the unknown solution is given in terms of general source conditions, such that the framework of variable Hilbert scale s is suitable. The structure of the error is analyzed in terms of the noise level, the regularization parameter and as a function of other parameters, driving the discretization. As a result, a strategy is proposed, which automatically adapts to the unknown source condition, uniformly for certain classes, and provides the optimal order of accuracy. 
P. Mathé, S.V. Pereverzev, Direct estimation of linear functionals from indirect noisy observations, Journal of Complexity, 18 (2002), pp. 500516.
Abstract
The authors study the efficiency of the linear functional strategy, as introduced by Anderssen (1986), for inverse problems with observations blurred by Gaussian white noise with known intensity $delta$. The optimal accuracy is presented and it is shown, how this can be achieved by a linearfunctional strategy based on the noisy observations. This optimal linearfunctional strategy is obtained from Tikhonov regularization of some dual problem. Next, the situation is treated, when only a finite number of noisy observations, given beforehand is available. Under appropriate smoothness assumptions best possible accuracy still can be attained, if the number of observations corresponds to the noise intensity in a proper way. It is also shown, that, at least asymptotically this number of observations cannot be reduced. 
P. Mathé, Stable summation of orthogonal series with noisy coefficients, Journal of Approximation Theory, 117 (2002), pp. 6680.
Abstract
We study the recovery of continuous functions from Fourier coefficients with respect to certain given orthonormal systems, blurred by noise. For deterministic noise this is a classical illposed problem. Emphasis is laid on a priori smoothness assumptions on the solution, which allows to apply regularization to reach the best possible accuracy. Results are obtained for systems obeying norm growth conditions. In the white noise setting mild additional assumptions have to be made to have accurate bounds. We finish our study with the recovery of functions from noisy coefficients with respect to the Haar system. 
P. Mathé, S.V. Pereverzev, Optimal discretization of inverse problems in Hilbert scales. Regularization and selfregularization of projection methods, SIAM Journal on Numerical Analysis, 38 (2001), pp. 19992021.
Abstract
We study the efficiency of the approximate solution of illposed problems, based on discretized observations, which we assume to be given aforehand. We restrict ourselves to problems which can be formulated in Hilbert scales. Within this framework we shall quantify the degree of illposedness, provide general conditions on projection schemes to achieve the best possible order of accuracy. We pay particular attention on the problem of selfregularization vs. Tikhonov regularization. Moreover, we study the information complexity. Asymptotically, any method, which achieves the best possible order of accuracy must use at least such amount of noisy observations. We accomplish our study with two specific problems, Abel's integral equation and the recovery of continuous functions from noisy coefficients with respect to a given orthonormal system, both classical illposed problems.
Contributions to Collected Editions

A. Agafonov, P. Dvurechensky, G. Scutari, A. Gasnikov, D. Kamzolov, A. Lukashevich, A. Daneshmand, An accelerated secondorder method for distributed stochastic optimization, in: 60th IEEE Conference on Decision and Control (CDC), IEEE, 2021, pp. 24072413, DOI 10.1109/CDC45484.2021.9683400 .

A. Daneshmand, G. Scutari, P. Dvurechensky, A. Gasnikov, Newton method over networks is fast up to the statistical precision, in: Proceedings of the 38th International Conference on Machine Learning, 139 of Proceedings of Machine Learning Research, 2021, pp. 23982409.

E. Gladin, A. Sadiev, A. Gasnikov, P. Dvurechensky, A. Beznosikov, M. Alkousa, Solving smooth minmin and minmax problems by mixed oracle algorithms, in: Mathematical Optimization Theory and Operations Research: Recent Trends, A. Strekalovsky, Y. Kochetov, T. Gruzdeva, A. Orlov , eds., 1476 of Communications in Computer and Information Science book series (CCIS), Springer International Publishing, Basel, 2021, pp. 1940, DOI 10.1007/9783030864330_2 .
Abstract
In this paper, we consider two types of problems that have some similarity in their structure, namely, minmin problems and minmax saddlepoint problems. Our approach is based on considering the outer minimization problem as a minimization problem with an inexact oracle. This inexact oracle is calculated via an inexact solution of the inner problem, which is either minimization or maximization problem. Our main assumption is that the available oracle is mixed: it is only possible to evaluate the gradient w.r.t. the outer block of variables which corresponds to the outer minimization problem, whereas for the inner problem, only zerothorder oracle is available. To solve the inner problem, we use the accelerated gradientfree method with zerothorder oracle. To solve the outer problem, we use either an inexact variant of Vaidya's cuttingplane method or a variant of the accelerated gradient method. As a result, we propose a framework that leads to nonasymptotic complexity bounds for both minmin and minmax problems. Moreover, we estimate separately the number of first and zerothorder oracle calls, which are sufficient to reach any desired accuracy. 
S. Guminov, P. Dvurechensky, N. Tupitsa, A. Gasnikov, On a combination of alternating minimization and Nesterov's momentum, in: Proceedings of the 38th International Conference on Machine Learning, 139 of Proceedings of Machine Learning Research, 2021, pp. 38863898.
Abstract
Alternating minimization (AM) optimization algorithms have been known for a long time and are of importance in machine learning problems, among which we are mostly motivated by approximating optimal transport distances. AM algorithms assume that the decision variable is divided into several blocks and minimization in each block can be done explicitly or cheaply with high accuracy. The ubiquitous Sinkhorn's algorithm can be seen as an alternating minimization algorithm for the dual to the entropyregularized optimal transport problem. We introduce an accelerated alternating minimization method with a $1/k^2$ convergence rate, where $k$ is the iteration counter. This improves over known bound $1/k$ for general AM methods and for the Sinkhorn's algorithm. Moreover, our algorithm converges faster than gradienttype methods in practice as it is free of the choice of the stepsize and is adaptive to the local smoothness of the problem. We show that the proposed method is primaldual, meaning that if we apply it to a dual problem, we can reconstruct the solution of the primal problem with the same convergence rate. We apply our method to the entropy regularized optimal transport problem and show experimentally, that it outperforms Sinkhorn's algorithm. 
A. Rogozin, M. Bochko, P. Dvurechensky, A. Gasnikov, V. Lukoshkin, An accelerated method for decentralized distributed stochastic optimization over timevarying graphs, in: 2021 IEEE 60th Annual Conference on Decision and Control (CDC), IEEE, 2021, pp. 33673373, DOI 10.1109/CDC45484.2021.9683400 .

A. Sadiev , A. Beznosikov, P. Dvurechensky, A. Gasnikov, Zerothorder algorithms for smooth saddlepoint problems, in: Mathematical Optimization Theory and Operations Research: Recent Trends, A. Strekalovsky, Y. Kochetov, T. Gruzdeva, A. Orlov , eds., 1476 of Communications in Computer and Information Science book series (CCIS), Springer International Publishing, Basel, 2021, pp. 7185, DOI 10.1007/9783030864330_5 .
Abstract
Saddlepoint problems have recently gained an increased attention from the machine learning community, mainly due to applications in training Generative Adversarial Networks using stochastic gradients. At the same time, in some applications only a zerothorder oracle is available. In this paper, we propose several algorithms to solve stochastic smooth (strongly) convexconcave saddlepoint problems using zerothorder oracles, and estimate their convergence rate and its dependence on the dimension n of the variable. In particular, our analysis shows that in the case when the feasible set is a direct product of two simplices, our convergence rate for the stochastic term is only by a factor worse than for the firstorder methods. Finally, we demonstrate the practical performance of our zerothorder methods on practical problems. 
N. Tupitsa, A. Gasnikov, P. Dvurechensky, S. Guminov, Strongly convex optimization for the dual formulation of optimal transport, in: Mathematical Optimization Theory and Operations Research, A. Kononov, M. Khachay, V.A. Kalyagin, P. Pardalos, eds., 1275 of Theoretical Computer Science and General Issues, Springer International Publishing AG, Cham, 2020, pp. 192204, DOI 10.1007/9783030586577_17 .
Abstract
In this paper we experimentally check a hypothesis, that dual problem to discrete entropy regularized optimal transport problem possesses strong convexity on a certain compact set. We present a numerical estimation technique of parameter of strong convexity and show that such an estimate increases the performance of an accelerated alternating minimization algorithm for strongly convex functions applied to the considered problem. 
N. Tupitsa , P. Dvurechensky, A. Gasnikov, C.A. Uribe, Multimarginal optimal transport by accelerated alternating minimization, in: 2020 59th IEEE Conference on Decision and Control (CDC), IEEE, 2020, pp. 61326137, DOI 10.1109/CDC42340.2020.9304010 .

D. Dvinskikh, A. Ogaltsov, A. Gasnikov, P. Dvurechensky, V. Spokoiny, On the linesearch gradient methods for stochastic optimization, in: Proceedings of the 21th IFAC World Congress, R. Findeisen, S. Hirche, K. Janschek, M. Mönnigmann, eds., 53 of IFAC PapersOnLine, Elsevier, 2020, pp. 17151720, DOI 10.1016/j.ifacol.2020.12.2284 .
Abstract
In this paper we propose several adaptive gradient methods for stochastic optimization. Our methods are based on Armijotype line search and they simultaneously adapt to the unknown Lipschitz constant of the gradient and variance of the stochastic approximation for the gradient. We consider an accelerated gradient descent for convex problems and gradient descent for nonconvex problems. In the experiments we demonstrate superiority of our methods to existing adaptive methods, e.g. AdaGrad and Adam. 
P. Dvurechensky, A. Gasnikov, E. Nurminski, F. Stonyakin, Advances in lowmemory subgradient optimization, in: Numerical Nonsmooth Optimization, A.M. Bagirov, M. Gaudioso, N. Karmitsa, M.M. Mäkelä, S. Taheri, eds., Springer International Publishing, Cham, 2020, pp. 1959, DOI 10.1007/9783030349103_2 .

P. Dvurechensky, A. Gasnikov, S. Omelchenko, A. Tiurin, A stable alternative to Sinkhorn's algorithm for regularized optimal transport, in: Mathematical Optimization Theory and Operations Research. MOTOR 2020, A. Kononov, M. Khachay, V.A. Kalyagin, P. Pardalos, eds., Lecture Notes in Computer Science, Springer, Cham, 2020, pp. 406423, DOI 10.1007/9783030499884_28 .

P. Dvurechensky, P. Ostroukhov, K. Safin, S. Shtern, M. Staudigl, Selfconcordant analysis of FrankWolfe algorithms, in: Proceedings of the 37th International Conference on Machine Learning, H. Daumé Iii, A. Singh, eds., 119 of Proceedings of Machine Learning Research (online), 2020, pp. 28142824.
Abstract
Projectionfree optimization via different variants of the FrankWolfe (FW), a.k.a. Conditional Gradient method has become one of the cornerstones in optimization for machine learning since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a selfconcordant (SC) function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k) after k iterations. If the problem admits a stronger local linear minimization oracle, we construct a novel FW method with linear convergence rate for SC functions. 
D. Dvinskikh, E. Gorbunov, A. Gasnikov, A. Dvurechensky, C.A. Uribe, On primal and dual approaches for distributed stochastic convex optimization over networks, in: 2019 IEEE 58th Conference on Decision and Control (CDC), IEEE Xplore, 2019, pp. 74357440, DOI 10.1109/CDC40024.2019.9029798 .
Abstract
We introduce a primaldual stochastic gradient oracle method for distributed convex optimization problems over networks. We show that the proposed method is optimal in terms of communication steps. Additionally, we propose a new analysis method for the rate of convergence in terms of duality gap and probability of large deviations. This analysis is based on a new technique that allows to bound the distance between the iteration sequence and the optimal point. By the proper choice of batch size, we can guarantee that this distance equals (up to a constant) to the distance between the starting point and the solution. 
P. Dvurechensky, D. Dvinskikh, A. Gasnikov, C.A. Uribe, A. Nedić, Decentralize and randomize: Faster algorithm for Wasserstein barycenters, in: Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. CesaBianchi, R. Garnett, eds., Curran Associates, Inc., 2018, pp. 1076010770.

L. Bogulubsky, P. Dvurechensky, A. Gasnikov, G. Gusev, Y. Nesterov, A. Raigorodskii, A. Tikhonov, M. Zhukovskii, Learning supervised PageRank with gradientbased and gradientfree optimization methods, in: Advances in Neural Information Processing Systems 29, D.D. Lee, M. Sugiyama, U.V. Luxburg, I. Guyon, R. Garnett, eds., Curran Associates, Inc., 2016, pp. 49074915.

A. Chernov, P. Dvurechensky, A. Gasnikov, Fast primaldual gradient method for strongly convex minimization problems with linear constraints, in: Discrete Optimization and Operations Research  9th International Conference, DOOR 2016, Vladivostok, Russia, September 1923, 2016, Proceedings, Y. Kochetov, M. Khachay, V. Beresnev, E. Nurminski, P. Pardalos, eds., 9869 of Theoretical Computer Science and General Issues, Springer International Publishing Switzerland, Cham, 2016, pp. 391403.

M. Arias Chao, D.S. Lilley, P. Mathé, V. Schlosshauer, Calibration and uncertainty quantification of gas turbines performance models, in: ASME Turbo Expo 2015: Turbine Technical Conference and Exposition, Volume 7A: Structures and Dynamics, ASME and Alstom Technologie AG, 2015, pp. V07AT29A001V07AT29A012.

G. Blanchard, N. Krämer, Kernel partial least squares is universally consistent, in: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010), Y.W. Teh, M. Titterington, eds., 9 of JMLR Workshop and Conference Proceedings, Journal of Machine Learning Research, Cambridge, MA, USA, 2010, pp. 5764.
Preprints, Reports, Technical Reports

M. Eigel, R. Gruhlke, D. Sommer, Less interaction with forward models in Langevin dynamics, Preprint no. 2987, WIAS, Berlin, 2022, DOI 10.20347/WIAS.PREPRINT.2987 .
Abstract, PDF (1423 kByte)
Ensemble methods have become ubiquitous for the solution of Bayesian inference problems. Stateoftheart Langevin samplers such as the Ensemble Kalman Sampler (EKS), Affine Invariant Langevin Dynamics (ALDI) or its extension using weighted covariance estimates rely on successive evaluations of the forward model or its gradient. A main drawback of these methods hence is their vast number of required forward calls as well as their possible lack of convergence in the case of more involved posterior measures such as multimodal distributions. The goal of this paper is to address these challenges to some extend. First, several possible adaptive ensemble enrichment strategies that successively enlarge the number of particles in the underlying Langevin dynamics are discusses that in turn lead to a significant reduction of the total number of forward calls. Second, analytical consistency guarantees of the ensemble enrichment method are provided for linear forward models. Third, to address more involved target distributions, the method is extended by applying adapted Langevin dynamics based on a homotopy formalism for which convergence is proved. Finally, numerical investigations of several benchmark problems illustrates the possible gain of the proposed method, comparing it to stateoftheart Langevin samplers. 
R. Gruhlke, M. Eigel, Lowrank Wasserstein polynomial chaos expansions in the framework of optimal transport, Preprint no. 2927, WIAS, Berlin, 2022, DOI 10.20347/WIAS.PREPRINT.2927 .
Abstract, PDF (10 MByte)
A unsupervised learning approach for the computation of an explicit functional representation of a random vector Y is presented, which only relies on a finite set of samples with unknown distribution. Motivated by recent advances with computational optimal transport for estimating Wasserstein distances, we develop a new Wasserstein multielement polynomial chaos expansion (WPCE). It relies on the minimization of a regularized empirical Wasserstein metric known as debiased Sinkhorn divergence.As a requirement for an efficient polynomial basis expansion, a suitable (minimal) stochastic coordinate system X has to be determined with the aim to identify ideally independent random variables. This approach generalizes representations through diffeomorphic transport maps to the case of noncontinuous and noninjective model classes M with different input and output dimension, yielding the relation Y=M(X) in distribution. Moreover, since the used PCE grows exponentially in the number of random coordinates of X, we introduce an appropriate lowrank format given as stacks of tensor trains, which alleviates the curse of dimensionality, leading to only linear dependence on the input dimension. By the choice of the model class M and the smooth loss function, higher order optimization schemes become possible. It is shown that the relaxation to a discontinuous model class is necessary to explain multimodal distributions. Moreover, the proposed framework is applied to a numerical upscaling task, considering a computationally challenging microscopic random nonperiodic composite material. This leads to tractable effective macroscopic random field in adopted stochastic coordinates.
Talks, Poster

A. Selahi, Homogenization in PDEbased battery models and recovery of ageing dynamics via Bayesian inference (online talk), Second Conference of Young Applied Mathematicians (Hybrid Event), September 18  22, 2022, Arenzano, Italy, September 21, 2022.

P. Dvurechensky, Accelerated alternating minimization methods, 20th French German Portuguese Conference in Optimization, May 3  6, 2022, University of Porto, School of Economics and Management, Portugal, May 5, 2022.

P. Dvurechensky, Generalized selfconcordant analysis of FrankWolfe algorithms, 19th Workshop on Advances in Continuous Optimization (EUROPT 2022), July 29  30, 2022, NOVA University Lisbon, School of Science and Technology, Portugal, July 30, 2022.

P. Dvurechensky, Multimarginal optimal transport by accelerated alternating minimization (online talk), SIAM Conference on Imaging Science (IS22) (Online Event), Minisymposium MS94: ``MultiMarginal Optimal Transport'', March 21  25, 2022, March 24, 2022.

O. Klein, On forward and inverse uncertainty quantification for a model for a magneto mechanical device involving a hysteresis operator (joint work with Carmine Stefano Clemente and Daniele Davino, Universitá degli Studi di Sannio, Italy), MURPHYS 2022  Interdisciplinary Conference on Multiple Scale Systems, Systems with Hysteresis, May 29  June 3, 2022, Silesian University, Ostravice, Czech Republic, May 31, 2022.

M. Landstorfer, A. Selahi, M. Heida, M. Eigel, Recovery of battery ageing dynamics with multiple timescales, MATH+Day 2022, Technische Universität Berlin, November 18, 2022.

V. Spokoiny, Bayesian inference for nonlinear inverse problems, SFB 1294 Annual Meeting 2022, September 13  14, 2022, Universität Potsdam, Institut für Mathematik, September 13, 2022.

V. Spokoiny, Bayesian optimization by Laplace iterations, Workshop on statistical inference and convex optimization, June 13  15, 2022, Université Grenoble Alpes, Laboratoire Jean Kuntzmann, France, June 13, 2022.

P. Dvurechensky, Accelerated gradient methods and their applications to Wasserstein barycenter problem (online talk), The XIII International Scientific Conference and Young Scientist School ``Theory and Numerics of Inverse and Illposed Problems'' (Online Event), April 12  22, 2021, Mathematical Center in Akademgorodok, Novosibirsk, Russian Federation, April 14, 2021.

P. Dvurechensky, Newton method over networks is fast up to the statistical precision (online talk), Thirtyeighth International Conference on Machine Learning (Online Event), July 18  24, 2021, Carnegie Mellon University, Pittsburgh, USA, July 20, 2021.

P. Dvurechensky, On a combination of alternating minimization and Nesterov's momentum (online talk), Thirtyeighth International Conference on Machine Learning (Online Event), July 18  24, 2021, Carnegie Mellon University, Pittsburgh, USA, July 20, 2021.

P. Dvurechensky, Primaldual accelerated gradient methods with alternating minimization (online talk), Conference Optimization without Borders, July 12  18, 2021, Sirius University of Science and Technology, Sochi, Russian Federation, July 15, 2021.

P. Dvurechensky, Wasserstein barycenters from the computational perspective (online talk), Moscow Conference on Combinatorics and Applications (Online Event), May 31  June 4, 2021, Moscow Institute of Physics and Technology, School of Applied Mathematics and Computer Science, Moscow, Russian Federation, June 2, 2021.

M. Hintermüller, Mathematics of quantitative MRI (online talk), The 5th International Symposium on Image Computing and Digital Medicine (ISICDM 2021), December 17  20, 2021, Guilin, China, December 18, 2021.

M. Hintermüller, Optimization with learninginformed differential equation constraints and its applications (online talk), Deep Learning and Inverse Problems (MDLW02), September 27  October 1, 2021, Isaac Newton Institute for Mathematical Sciences (Hybrid Event), Oxford, UK, October 1, 2021.

M. Hintermüller, Quantitative imaging: Physics integrated and machine learning based models in MRI (online talk), MATHIMS Joint Applied Mathematics Colloquium Series, The Chinese University of Hong Kong, Center for Mathematical Artificial Intelligence, China, December 3, 2021.

K. Papafitsoros, A. Kofler, Classical vs. data driven regularization methods in imaging (online tutorial), MATH+ Thematic Einstein Semester on Mathematics of Imaging in RealWord Challenges, Berlin, October 29, 2021.

M. Hintermüller, A function space framework for structural total variation regularization with applications in inverse problems, 71st Workshop: Advances in Nonsmooth Analysis and Optimization (NAO2019), June 25  30, 2019, International School of Mathematics ``Guido Stampacchia'', Erice, Italy, June 26, 2019.

M. Hintermüller, A function space framework for structural total variation regularization with applications in inverse problems, Thematic Programme ``Modern Maximal Monotone Operator Theory: From Nonsmooth Optimization to Differential Inclusions'', Workshop ``Nonsmooth and Variational Analysis'', January 28  February 1, 2019, Erwin Schrödinger International Institute for Mathematics and Physics, Vienna, Austria, February 1, 2019.

M. Hintermüller, Applications in image processing, Workshop on Efficient Operator Splitting Techniques for Complex System and Large Scale Data Analysis, January 15  18, 2019, Sanya, China, January 14, 2019.

M. Hintermüller, Structural total variation regularization with applications in inverse problems, 90th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM), DFG Priority Programme 1962 ``Non Smooth and ComplementarityBased Distributed Parameter Systems: Simulation and Hierarchical Optimization'', February 18  22, 2019, Technische Universität Wien, Austria, February 19, 2019.

M. Marschall, Adaptive lowrank approximation in Bayesian inverse problems, 3rd International Conference on Uncertainty Quantification in Computational Sciences and Engineering (UNCECOMP 2019), Minisymposium 6IV ``Uncertainty Computations with Reduced Order Models and LowRank Representations'', June 24  26, 2019, Crete, Greece, June 25, 2019.

M. Marschall, Complexity reduction in Bayesian inverse problems by lowrank tensor representation, Robert Bosch GmbH, Corporate Research  Advanced Engineering Computer Vision Systems (CR/AEC4), Hildesheim, April 16, 2019.

M. Marschall, Lowrank surrogates in Bayesian inverse problems, 19th FrenchGermanSwiss Conference on Optimization (FGS'2019), Minisymposium 1 ``Recent Trends in Nonlinear Optimization 1'', September 17  20, 2019, Nice, France, September 17, 2019.

M. Marschall, Random domains in PDE problems with lowrank surrogates. Forward and backward, PhysikalischTechnische Bundesanstalt, Arbeitsgruppe 8.41 ``Mathematische Modellierung und Datenanalyse'', Berlin, April 10, 2019.

P. Mathé, Relating direct and inverse Bayesian problems via the modulus of continuity, Stochastic Computation and Complexity (ibcparis2019), April 15  16, 2019, Institut Henri Poincaré, Paris, France, April 16, 2019.

P. Mathé, Relating direct and inverse problems via the modulus of continuity, The Chemnitz Symposium on Inverse Problems 2019, September 30  October 2, 2019, Technische Universität Chemnitz, Fakultät für Mathematik, Frankfurt a. M., October 1, 2019.

P. Mathé, The role of the modulus of continuity in inverse problems, Forschungsseminar Inverse Probleme, Technische Universität Chemnitz, Fachbereich Mathematik, August 13, 2019.

M. Eigel, Adaptive Galerkin FEM for stochastic forward and inverse problems, Optimisation and Numerical Analysis Seminars, University of Birmingham, School of Mathematics, UK, February 15, 2018.

M. Eigel, Adaptive tensor methods for forward and inverse problems, SIAM Conference on Uncertainty Quantification (UQ18), Minisymposium 122 ``LowRank Approximations for the Forward and the Inverse Problems III'', April 16  19, 2018, Garden Grove, USA, April 19, 2018.

M. Hintermüller, Automated regularization parameter choice rule in image processing, Workshop ``New Directions in Stochastic Optimisation'', August 19  25, 2018, Mathematisches Forschungsinstitut Oberwolfach, August 23, 2018.

M. Hintermüller, Bilevel optimisation in automated regularisation parameter selection in image processing, WIASPGMO Workshop on Nonsmooth and Stochastic Optimization, June 26, 2018, HumboldtUniversität zu Berlin, June 26, 2018.

M. Hintermüller, Bilevel optimization and some "parameter learning" applications in image processing, SIAM Conference on Imaging Science, Minisymposium MS5 ``Learning and Adaptive Approaches in Image Processing'', June 5  8, 2018, Bologna, Italy, June 5, 2018.

M. Marschall, Bayesian inversion with adaptive lowrank approximation, Analysis, Control and Inverse Problems for PDEs  Workshop of the FrenchGermanItalian LIA (Laboratoire International Associe) COPDESC on Applied Analysis, November 26  30, 2018, University of Naples Federico II and Accademia Pontaniana, Italy, November 29, 2018.

P. Mathé, Bayesian inverse problems with noncommuting operators, Chemnitz September of Applied Mathematics 2018, Chemnitz Symposium on Inverse Problems, September 27  28, 2018, Technische Universität Chemnitz, Fakultät für Mathematik, September 28, 2018.

P. Mathé, Complexity of linear illposed problems in Hilbert space, Stochastisches Kolloquium, GeorgAugustUniversität Göttingen, Institut für Mathematische Stochastik, February 7, 2018.

M. Eigel, Adaptive stochastic FE for explicit Bayesian inversion with hierarchical tensor representations, Institut National de Recherche en Informatique et en Automatique (INRIA), SERENA (Simulation for the Environment: Reliable and Efficient Numerical Algorithms) research team, Paris, France, June 1, 2017.

R. Gruhlke, Multiscale failure analysis with polymorphic uncertainties for optimal design of rotor blades, Frontiers of Uncertainty Quantification in Engineering (FrontUQ 2017), September 6  8, 2017, München, September 6, 2017.

M. Hintermüller, Bilevel optimization and applications in imaging, Workshop ``Emerging Developments in Interfaces and Free Boundaries'', January 22  28, 2017, Mathematisches Forschungsinstitut Oberwolfach.

M. Hintermüller, Bilevel optimization and applications in imaging, Mathematisches Kolloquium, Universität Wien, Austria, January 18, 2017.

M. Hintermüller, Bilevel optimization and some ``parameter learning'' applications in image processing, LMS Workshop ``Variational Methods Meet Machine Learning'', September 18, 2017, University of Cambridge, Centre for Mathematical Sciences, UK, September 18, 2017.

A. Koziuk, Bootstrap for the regression problem with instrumental variables, Haindorf Seminar 2017, January 24  28, 2017, HumboldtUniversität zu Berlin, Wirtschaftswissenschaftliche Fakultät, Hejnice, Czech Republic, January 26, 2017.

M. Marschall, Bayesian inversion using hierarchical tensors, 88th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM 2017), Section S15 ``Uncertainty Quantification'', March 6  10, 2017, Bauhaus Universität Weimar/Technische Universität Ilmenau, Weimar, March 8, 2017.

M. Marschall, Samplingfree Bayesian inversion with adaptive hierarchical tensor representation, Frontiers of Uncertainty Quantification in Engineering (FrontUQ 2017), September 6  8, 2017, München, September 7, 2017.

M. Marschall, Samplingfree Bayesian inversion with adaptive hierarchical tensor representation, International Conference on Scientific Computation and Differential Equations (SciCADE2017), MS21 ``Tensor Approximations of MultiDimensional PDEs'', September 11  15, 2017, University of Bath, UK, September 14, 2017.

P. Mathé, Bayesian inverse problems with noncommuting operators, Statistical Foundations of Uncertainty Quantification for Inverse Problems Workshop, June 19  22, 2017, University of Cambridge, Center for Mathematical Sciences, UK, June 21, 2017.

P. Mathé, Complexity of supervised learning, ibcparis2017 : Information Based Complexity, HighDimensional Problems, March 14  15, 2017, Institut Henri Poincaré, Paris, France, March 15, 2017.

P. Mathé, Numerical integration (mini course), November 20  December 4, 2017, Fudan University, School of Mathematical Sciences, China.

N. Buzun, Multiplier bootstrap for change point detection, Mathematical Statistics and Inverse Problems, February 8  12, 2016, Faculté des Sciences de Luminy, France, February 11, 2016.

N. Buzun, Multiplier bootstrap for change point detection, Spring School ``Structural Inference 2016", March 13  18, 2016, DFG Forschergruppe FOR 1735, Lübeck, Germany, March 14, 2016.

M. Eigel, Bayesian inversion using hierarchical tensor approximations, SIAM Conference on Uncertainty Quantification, Minisymposium 67 ``Bayesian Inversion and Lowrank Approximation (Part II)'', April 5  8, 2016, Lausanne, Switzerland, April 6, 2016.

T. Wu, Bilevel optimization and applications in imaging sciences, August 24  25, 2016, Shanghai Jiao Tong University, Institute of Natural Sciences, China.

P. Dvurechensky, Gradient and gradientfree methods for pagerank algorithm learning, Workshop on Modern Statistics and Optimization, February 23  24, 2016, Russian Academy of Sciences, Institute for Information Transmission Problems, Moscow, Russian Federation, February 24, 2016.

P. Dvurechensky, Random gradientfree methods for webpage ranking model learning, 30th annual conference of the Belgian Operational Research Society, January 27  29, 2016, LouvainlaNeuve, Belgium, January 28, 2016.

M. Hintermüller, K. Papafitsoros, C. Rautenberg, A fine scale analysis of spatially adapted total variation regularisation, Imaging, Vision and Learning based on Optimization and PDEs, Bergen, Norway, August 29  September 1, 2016.

M. Hintermüller, Bilevel optimization and applications in imaging, Imaging, Vision and Learning based on Optimization and PDEs, August 29  September 1, 2016, Bergen, Norway, August 30, 2016.

M. Hintermüller, Bilevel optimization for a generalized totalvariation model, SIAM Conference on Imaging Science, Minisymposium ``NonConvex Regularization Methods in Image Restoration'', May 23  26, 2016, Albuquerque, USA, May 26, 2016.

M. Hintermüller, Optimal selection of the regularisation function in a localised TV model, SIAM Conference on Imaging Science, Minisymposium ``Analysis and Parameterisation of Derivative Based Regularisation'', May 23  26, 2016, Albuquerque, USA, May 24, 2016.

P. Mathé, Complexity of linear illposed problems in Hilbert space, IBC on the 70th anniversary of Henryk Wozniakowski, August 29  September 2, 2016, Banach Center, Bedlewo, Poland, August 31, 2016.

P. Mathé, Complexity of linear illposed problems in Hilbert space, Chemnitz Symposium on Inverse Problems, September 22  23, 2016, Technische Universität Chemnitz, Fakultät für Mathematik, September 22, 2016.

P. Mathé, Discrepancy based model selection in statistical inverse problems, Mathematical Statistics and Inverse Problems, February 8  12, 2016, Centre International de Rencontres Mathématiques (CIRM), Luminy, France, February 11, 2016.

V. Spokoiny, Gradient and gradientfree methods for pagerank algorithm learning, Workshop on Modern Statistics and Optimization, February 23  24, 2016, Russian Academy of Sciences, Institute for Information Transmission Problems, Moscow, Russian Federation.

P. Dvurechensky, Semisupervised pagerank model learning with gradientfree optimization methods, Traditional Youth School ``Control, Information and Optimization'', June 14  20, 2015, Moscow, Russian Federation, June 17, 2015.

P. Mathé, Minimax signal detection in statistical inverse problems, Algorithms and Complexity for Continuous Problems, September 21  25, 2015, Schloss Dagstuhl, September 25, 2015.

P. Mathé, A random surfer in the internet, GermanEstonian Academic Week ``Academica'', October 13  15, 2014, University of Tartu, Faculty of Mathematics and Computer Science, Estonia, October 14, 2014.

P. Mathé, Bayesian analysis of statistical inverse problems, Colloquium of the Faculty of Mathematics and Computer Science, University of Tartu, Estonia, October 13, 2014.

P. Mathé, Bayesian regularization of statistical inverse problems, Rencontres de Statistique Mathématique: Nouvelles Procédures pour Nouvelles Données, December 15  19, 2014, Centre International de Rencontres Mathématiques (CIRM), Marseille, France, December 17, 2014.

P. Mathé, Convergence of Bayesian schemes in inverse problems, Kolloquium über Angewandte Mathematik, GeorgAugustUniversität Göttingen, Fakultät für Mathematik und Informatik, June 10, 2014.

P. Mathé, Merging regularization theory into Bayesian inverse problems, Chemnitz Symposium on Inverse Problems, September 15  19, 2014, Universität Chemnitz, Fachbereich Mathematik, September 18, 2014.

P. Mathé, Oracletype posterior contraction rates in Bayesian inverse problems, International Workshop ``Advances in Optimization and Statistics'', May 15  16, 2014, Russian Academy of Sciences, Institute of Information Transmission Problems (Kharkevich Institute), Moscow, May 16, 2014.

P. Mathé, Parameter estimation by thresholding, Mathematical Modelling and Simulation Workshop, WIAS Berlin, April 15, 2014.

P. Mathé, Smoothness beyond differentiability, Seminar for Doctoral Candidates, University of Tartu, Faculty of Mathematics and Computer Science, Estonia, October 15, 2014.

A. Andresen, Finite sample analysis of maximum likelihood estimators and convergence of the alternating procedure, 29th European Meeting of Statisticians (EMS), July 20  25, 2013, Eötvös Loránd University, Budapest, Hungary, July 20, 2013.

H. Mai, Estimating a subordinators density, DynStoch 2013, April 17  19, 2013, University of Copenhagen, Department of Mathematical Sciences, Denmark, April 17, 2013.

P. Mathé, Bayes analysis for model calibration, Alstom Workshop, WIAS, May 7, 2013.

P. Mathé, Signal detection in inverse problems, PreMoLab Workshop on: Advances in predictive modeling and optimization, May 16  17, 2013, WIASBerlin, May 16, 2013.

P. Mathé, Signal detection in inverse problems, Mathematical Modelling and Analysis ( MMA2013 ) and Approximation Methods and Orthogonal Expansions ( AMOE2013 ), May 27  31, 2013, University of Tartu, Institute of Mathematics, Estonia, May 29, 2013.

P. Mathé, Statistical Inverse Problems, Applied Math Seminar, University of Warwick, Mathematics Institute, Coventry, UK, October 18, 2013.

S. Becker, Image processing via orientation scores, Workshop ``Computational Inverse Problems'', October 23  26, 2012, Mathematisches Forschungsinstitut Oberwolfach, October 25, 2012.

A. Andresen, Non asymptotic Wilks phenomenon in semiparametric estimation, PreMoLab: MoscowBerlin Stochastic and Predictive Modeling, May 31  June 1, 2012, Russian Academy of Sciences, Institute for Information Transmission Problems (Kharkevich Institute), Moscow, May 31, 2012.

A. Andresen, Nonasymptotic Wilks phenomenon in semiparametric estimation, 2. Structural Inference Day, WIAS, April 23, 2012.

A. Andresen, Nonasymptotic Wilks phenomenon in semiparametric estimation, Haindorf Seminar 2012 (Klausurtagung des SFB 649), February 9  12, 2012, HumboldtUniversität zu Berlin, Wirtschaftswissenschaftliche Fakultät, Hejnice, Czech Republic, February 10, 2012.

P. Mathé, An oracletype bound for a statistical RGrule, Nonparametric and Highdimensional Statistics, December 17  21, 2012, Centre International de Rencontres Mathématiques (CIRM), Marseille, France, December 20, 2012.

P. Mathé, Diskrepanzbasierte Parameterwahl in statistischen inversen Problemen, Technische Universität Chemnitz, Fakultät für Mathematik, September 19, 2012.

P. Mathé, Regularization of statistical inverse problems in Hilbert space, Journées Statistiques du Sud 2012, June 20  22, 2012, Université Toulouse, Institut National des Sciences Appliquées, France, June 20, 2012.

P. Mathé, Statistische Datenanalyse unter BOP, Workshop on Simulation in Industrial Process Engineering, WIAS, September 6, 2012.

P. Mathé, Using the discrepancy principle in statistical inverse problems, Regularisation symposium, Australian National University, Mathematical Sciences Institute, Canberra, November 22, 2012.

V. Spokoiny, Basics of modern parametric statistics, February 13  28, 2012, Independent University of Moscow, Center for Continuous Mathematical Education, Russian Federation.

V. Spokoiny, Bernsteinvon Mises theorem for quasi posteriors, International Workshop on Recent Advances in Time Series Analysis (RATS 2012), June 8  12, 2012, University of Cyprus, Department of Mathematics and Statistics, Protaras, June 9, 2012.

V. Spokoiny, Bernsteinvon Mises theorem for quasi posteriors, Workshop ``Frontiers in Nonparametric Statistics'', March 11  17, 2012, Mathematisches Forschungsinstitut Oberwolfach, March 12, 2012.

V. Spokoiny, Bernsteinvon Mises theorem for quasi posteriors, Workshop II on Financial Time Series Analysis: Highdimensionality, Nonstationarity and the Financial Crisis, June 19  22, 2012, National University of Singapore, Institute for Mathematical Sciences, June 21, 2012.

V. Spokoiny, Bernsteinvon Mises theorem for quasi posteriors, Workshop on Recent Developments in Statistical Multiscale Methods, July 16  18, 2012, GeorgAugustUniversität Göttingen, Institut für Mathematische Stochastik, July 17, 2012.

V. Spokoiny, Bernsteinvon Mises theorem for quasi posteriors, PreMoLab Seminar, Russian Academy of Sciences, Institute for Information Transmission Problems (Kharkevich Institute), Moscow, March 15, 2012.

V. Spokoiny, Parametric estimation: Modern view, PreMoDay I, February 24, 2012, Russian Academy of Sciences, Institute for Information Transmission Problems (Kharkevich Institute), Moscow, February 24, 2012.

V. Spokoiny, Some methods of modern statistics, Information Technology and Systems 2, August 19  25, 2012, Russian Academy of Sciences, Institute for Information Transmission Problems (Kharkevich Institute), Petrozavodsk, August 20, 2012.

P. Mathé, Conjugate gradient iteration with noisy data, Foundations of Computational Mathematics (FoCM'11), July 4  14, 2011, Budapest, Hungary, July 6, 2011.

V. Spokoiny, Alternating and semiparametric efficiency, École Nationale de la Statistique et de l'Analyse de l'Information (ENSAI), Rennes, France, September 16, 2011.

V. Spokoiny, Modern parametric theory, September 13  16, 2011, École Nationale de la Statistique et de l'Analyse de l'Information (ENSAI), Rennes, France.

P. Mathé, Conjugate gradient iteration under white noise, International Conference on Scientific Computing (SC2011), October 10  14, 2011, Università di Cagliari, Dipartimento di Matematica e Informatica, Cagliari, Italy, October 14, 2011.

V. Spokoiny, Semiparametric estimation alternating and efficiency, École Nationale de la Statistique et de l'Administration Économique (ENSAE), Paris, France, December 12, 2011.

M. Becker, Selfintersection local times: Exponential moments in subcritical dimensions, Excess SelfIntersections and Related Topics, December 6  10, 2010, Centre international de rencontres mathématiques (CIRM), Luminy, France, December 6, 2010.

S. Becker, Regularization of statistical inverse problems and the Bakushinskii veto, Rencontres de Statistiques Mathématiques 10, December 13  17, 2010, Centre International de Rencontres Mathématiques (CIRM), Luminy, France, December 17, 2010.

N. Krämer, Conjugate gradient regularization  A statistical framework for partial least squares regression, 4th Workshop on Partial Least Squares and Related Methods for Cuttingedge Research in Experimental Sciences, May 10  11, 2010, École Supérieure d'Électricité, Department of Information Systems and Decision Sciences, GifsurYvette, France, May 11, 2010.

D. Belomestny, Estimating the distribution of jumps in regular affine models: Uniform rates of convergence, Leipziger StochastikTage, March 1  5, 2010, Universität Leipzig, Fakultät für Mathematik und Informatik, March 2, 2010.

D. Belomestny, Statistical inference for multidimensional timechanged Levy processes based on lowfrequency data, 28th European Meeting of Statisticians, August 19  23, 2010, University of Piraeus, Department of Statistics and Insurance Science, Greece, August 23, 2010.

G. Blanchard, N. Krämer, Kernel partial least squares is universally consistent, AI & Statistics 2010, Sardinia, Italy, May 13  15, 2010.

G. Blanchard, N. Krämer, Optimal rates for conjugate gradient regularization, AI & Statistics 2010, Sardinia, Italy, May 13  15, 2010.

P. Mathé, Analysis of inverse problems under general smoothness assumptions, 5th International Conference on Inverse Problems: Modeling and Simulation, May 24  29, 2010, Izmir University, Department of Mathematics and Computer Sciences, Antalya, Turkey, May 25, 2010.

P. Mathé, Regularization under general noise assumptions, Chemnitz Symposium on Inverse Problems 2010, September 23  24, 2010, Chemnitz University of Technology, Department of Mathematics, September 23, 2010.

P. Mathé, Warum und wie rechnen wir mit allgemeinen Quelldarstellungen?, Seminar Inverse Probleme (Fr. C. Böckmann), Universität Potsdam, Institut für Mathematik, January 12, 2010.

D. Belomestny, Estimation of the jump activity of a Lévy process from low frequency data, Haindorf Seminar 2009, February 12  15, 2009, HumboldtUniversität zu Berlin, CASE  Center for Applied Statistics and Economics, Hejnice, Czech Republic, February 12, 2009.

D. Belomestny, Spectral estimation of the fractional order of a Lévy process, Workshop ``Statistical Inference for Lévy Processes with Applications to Finance'', July 15  17, 2009, EURANDOM, Eindhoven, Netherlands, July 16, 2009.

G. Blanchard, Convergence du gradient conjugué fonctionnel pour l'apprentissage statistique, École Normale Supérieure, Paris, France, March 16, 2009.

P. Mathé, Approximation theoretic aspects in variable Hilbert scales, 2nd Dolomites Workshop on Constructive Approximation and Applications, September 3  10, 2009, Università degli Studi di Verona, Dipartimento di Informatica, Italy, September 5, 2009.

P. Mathé, Discretization under general smoothness assumptions, Monday Lecture Series, Johann Radon Institute for Computational and Applied Mathematics (RICAM), Linz, Austria, June 29, 2009.

P. Mathé, Inverse problems in different settings, Seminar ``Algorithms and Complexity for Continuous Problems'', September 20  25, 2009, Schloss Dagstuhl, September 25, 2009.

P. Mathé, On some minimizationbased heuristic parameter choice in inverse problems, ChemnitzRICAM Symposium on Inverse Problems, Johann Radon Institute for Computational and Applied Mathematics (RICAM), Linz, Austria, July 14, 2009.

P. Mathé, Typical behavior in heuristic parameter choice, KickOff Meeting of Mini Special Semester on Inverse Problems, May 18  22, 2009, Johann Radon Institute for Computational and Applied Mathematics (RICAM), Linz, Austria, May 18, 2009.

V. Spokoiny, Saddle point model selection for inverse problems, Workshop ``Challenges in Statistical Theory: Complex Data Structures and Algorithmic Optimization'', August 23  29, 2009, Mathematisches Forschungsinstitut Oberwolfach, August 26, 2009.

P. Mathé, Average case analysis of inverse problems, Symposium on Inverse Problems 2008, September 25  26, 2008, TU Chemnitz, Fachbereich Mathematik, September 25, 2008.
External Preprints

N. Kornilov, A. Gasnikov, P. Dvurechensky, D. Dvinskikh, Gradient free methods for nonsmooth convex optimization with heavy tails on convex compact, Preprint no. arXiv:2304.02442, Cornell University, 2023, DOI 10.48550/arXiv.2304.02442 .

M. Alkousa, A. Gasnikov, P. Dvurechensky, A. Sadiev, L. Razouk, An approach for nonconvex uniformly concave structured saddle point problem, Preprint no. arXiv:2202.06376, Cornell University, 2022, DOI 10.48550/arXiv.2202.06376 .
Abstract
Recently, saddle point problems have received much attention due to their powerful modeling capability for a lot of problems from diverse domains. Applications of these problems occur in many applied areas, such as robust optimization, distributed optimization, game theory, and many applications in machine learning such as empirical risk minimization and generative adversarial networks training. Therefore, many researchers have actively worked on developing numerical methods for solving saddle point problems in many different settings. This paper is devoted to developing a numerical method for solving saddle point problems in the nonconvex uniformlyconcave setting. We study a general class of saddle point problems with composite structure and Höldercontinuous higherorder derivatives. To solve the problem under consideration, we propose an approach in which we reduce the problem to a combination of two auxiliary optimization problems separately for each group of variables, outer minimization problem w.r.t. primal variables, and inner maximization problem w.r.t the dual variables. For solving the outer minimization problem, we use the Adaptive Gradient Method, which is applicable for nonconvex problems and also works with an inexact oracle that is generated by approximately solving the inner problem. For solving the inner maximization problem, we use the Restarted Unified Acceleration Framework, which is a framework that unifies the highorder acceleration methods for minimizing a convex function that has Höldercontinuous higherorder derivatives. Separate complexity bounds are provided for the number of calls to the firstorder oracles for the outer minimization problem and higherorder oracles for the inner maximization problem. Moreover, the complexity of the whole proposed approach is then estimated. 
A. Gasnikov, A. Novitskii, V. Novitskii, F. Abdukhakimov, D. Kamzolov, A. Beznosikov, M. Takáč, P. Dvurechensky, B. Gu, The power of firstorder smooth optimization for blackbox nonsmooth problems, Preprint no. arXiv:2201.12289, Cornell University, 2022, DOI 10.48550/arXiv.2201.12289 .
Abstract
Gradientfree/zerothorder methods for blackbox convex optimization have been extensively studied in the last decade with the main focus on oracle calls complexity. In this paper, besides the oracle complexity, we focus also on iteration complexity, and propose a generic approach that, based on optimal firstorder methods, allows to obtain in a blackbox fashion new zerothorder algorithms for nonsmooth convex optimization problems. Our approach not only leads to optimal oracle complexity, but also allows to obtain iteration complexity similar to firstorder methods, which, in turn, allows to exploit parallel computations to accelerate the convergence of our algorithms. We also elaborate on extensions for stochastic optimization problems, saddlepoint problems, and distributed optimization. 
P. Dvurechensky, S. Shtern, M. Staudigl, A conditional gradient homotopy method with applications to semidefinite programming, Preprint no. arXiv:2207.03101, Cornell University, 2022, DOI 10.48550/arXiv.2207.03101 .

A. Agafonov, P. Dvurechensky, G. Scutari, A. Gasnikov, D. Kamzolov, A. Lukashevich, A. Daneshmand, An accelerated secondorder method for distributed stochastic optimization, Preprint no. arXiv:2103.14392, Cornell University Library, arXiv.org, 2021.
Abstract
We consider distributed stochastic optimization problems that are solved with master/workers computation architecture. Statistical arguments allow to exploit statistical similarity and approximate this problem by a finitesum problem, for which we propose an inexact accelerated cubicregularized Newton's method that achieves lower communication complexity bound for this setting and improves upon existing upper bound. We further exploit this algorithm to obtain convergence rate bounds for the original stochastic optimization problem and compare our bounds with the existing bounds in several regimes when the goal is to minimize the number of communication rounds and increase the parallelization by increasing the number of workers. 
E. Gladin, A. Sadiev, A. Gasnikov, P. Dvurechensky, A. Beznosikov, M. Alkousa, Solving smooth minmin and minmax problems by mixed oracle algorithms, Preprint no. arXiv:2103.00434, Cornell University Library, arXiv.org, 2021.
Abstract
In this paper, we consider two types of problems that have some similarity in their structure, namely, minmin problems and minmax saddlepoint problems. Our approach is based on considering the outer minimization problem as a minimization problem with inexact oracle. This inexact oracle is calculated via inexact solution of the inner problem, which is either minimization or a maximization problem. Our main assumptions are that the problem is smooth and the available oracle is mixed: it is only possible to evaluate the gradient w.r.t. the outer block of variables which corresponds to the outer minimization problem, whereas for the inner problem only zerothorder oracle is available. To solve the inner problem we use accelerated gradientfree method with zerothorder oracle. To solve the outer problem we use either inexact variant of Vaydya's cuttingplane method or a variant of accelerated gradient method. As a result, we propose a framework that leads to nonasymptotic complexity bounds for both minmin and minmax problems. Moreover, we estimate separately the number of first and zerothorder oracle calls which are sufficient to reach any desired accuracy. 
E. Gorbunov, M. Danilova, I. Shibaev, P. Dvurechensky, A. Gasnikov, Nearoptimal high probability complexity bounds for nonsmooth stochastic optimization with heavytailed noise, Preprint no. arXiv:2106.05958, Cornell University Library, arXiv.org, 2021.
Abstract
Thanks to their practical efficiency and random nature of the data, stochastic firstorder methods are standard for training largescale machine learning models. Random behavior may cause a particular run of an algorithm to result in a highly suboptimal objective value, whereas theoretical guarantees are usually proved for the expectation of the objective value. Thus, it is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for nonsmooth stochastic convex optimization have complexity bounds with the dependence on the confidence level that is either negativepower or logarithmic but under an additional assumption of subGaussian (lighttailed) noise distribution that may not hold in practice, e.g., in several NLP tasks. In our paper, we resolve this issue and derive the first highprobability convergence results with logarithmic dependence on the confidence level for nonsmooth convex stochastic optimization problems with nonsubGaussian (heavytailed) noise. To derive our results, we propose novel stepsize rules for two stochastic methods with gradient clipping. Moreover, our analysis works for generalized smooth objectives with Höldercontinuous gradients, and for both methods, we provide an extension for strongly convex problems. Finally, our results imply that the first (accelerated) method we consider also has optimal iteration and oracle complexity in all the regimes, and the second one is optimal in the nonsmooth setting. 
A. Rogozin, M. Bochko, P. Dvurechensky, A. Gasnikov, V. Lukoshkin, An accelerated method for decentralized distributed stochastic optimization over timevarying graphs, Preprint no. arXiv:2103.15598, Cornell University Library, arXiv.org, 2021.
Abstract
We consider a distributed stochastic optimization problem that is solved by a decentralized network of agents with only local communication between neighboring agents. The goal of the whole system is to minimize a global objective function given as a sum of local objectives held by each agent. Each local objective is defined as an expectation of a convex smooth random function and the agent is allowed to sample stochastic gradients for this function. For this setting we propose the first accelerated (in the sense of Nesterov's acceleration) method that simultaneously attains optimal up to a logarithmic factor communication and oracle complexity bounds for smooth strongly convex distributed stochastic optimization. We also consider the case when the communication graph is allowed to vary with time and obtain complexity bounds for our algorithm, which are the first upper complexity bounds for this setting in the literature. 
V. Tominin , Y. Tominin , E. Borodich , D. Kovalev, A. Gasnikov, P. Dvurechensky, On accelerated methods for saddlepoint problems with composite structure, Preprint no. arXiv:2103.09344, Cornell University Library, arXiv.org, 2021.
Abstract
We consider stronglyconvexstronglyconcave saddlepoint problems with general nonbilinear objective and different condition numbers with respect to the primal and the dual variables. First, we consider such problems with smooth composite terms, one of which having finitesum structure. For this setting we propose a variance reduction algorithm with complexity estimates superior to the existing bounds in the literature. Second, we consider finitesum saddlepoint problems with composite terms and propose several algorithms depending on the properties of the composite terms. When the composite terms are smooth we obtain better complexity bounds than the ones in the literature, including the bounds of a recently proposed nearlyoptimal algorithms which do not consider the composite structure of the problem. If the composite terms are proxfriendly, we propose a variance reduction algorithm that, on the one hand, is accelerated compared to existing variance reduction algorithms and, on the other hand, provides in the composite setting similar complexity bounds to the nearlyoptimal algorithm which is designed for noncomposite setting. Besides that, our algorithms allow to separate the complexity bounds, i.e. estimate, for each part of the objective separately, the number of oracle calls that is sufficient to achieve a given accuracy. This is important since different parts can have different arithmetic complexity of the oracle, and it is desired to call expensive oracles less often than cheap oracles. The key thing to all these results is our general framework for saddlepoint problems, which may be of independent interest. This framework, in turn is based on our proposed Accelerated MetaAlgorithm for composite optimization with probabilistic inexact oracles and probabilistic inexactness in the proximal mapping, which may be of independent interest as well. 
P. Dvurechensky, M. Staudigl, Hessian barrier algorithms for nonconvex conic optimization, Preprint no. arXiv:2111.00100, Cornell University Library, arXiv.org, 2021.
Abstract
We consider the minimization of a continuous function over the intersection of a regular cone with an affine set via a new class of adaptive first and secondorder optimization methods, building on the Hessianbarrier techniques introduced in [Bomze, Mertikopoulos, Schachinger, and Staudigl, Hessian barrier algorithms for linearly constrained optimization problems, SIAM Journal on Optimization, 2019]. Our approach is based on a potentialreduction mechanism and attains a suitably defined class of approximate first or secondorder KKT points with the optimal worstcase iteration complexity O(??2) (firstorder) and O(??3/2) (secondorder), respectively. A key feature of our methodology is the use of selfconcordant barrier functions to construct strictly feasible iterates via a disciplined decomposition approach and without sacrificing on the iteration complexity of the method. To the best of our knowledge, this work is the first which achieves these worstcase complexity bounds under such weak conditions for general conic constrained optimization problems. 
M. Danilova, P. Dvurechensky, A. Gasnikov, E. Gorbunov, S. Guminov, D. Kamzolov, I. Shibaev, Recent theoretical advances in nonconvex optimization, Preprint no. arXiv:2012.06188, Cornell University, 2020.

A. Rastogi, P. Mathé, Inverse learning in Hilbert scales, Preprint no. arXiv:2002.10208, Cornell University, 2020.
Abstract
We study the linear illposed inverse problem with noisy data in the statistical learning setting. Approximate reconstructions from random noisy data are sought with general regularization schemes in Hilbert scale. We discuss the rates of convergence for the regularized solution under the prior assumptions and a certain link condition. We express the error in terms of certain distance functions. For regression functions with smoothness given in terms of source conditions the error bound can then be explicitly established. 
A. Sadiev, A. Beznosikov, P. Dvurechensky, A. Gasnikov, Zerothorder algorithms for smooth saddlepoint problems, Preprint no. arXiv:2009.09908, Cornell University, 2020.
Abstract
In recent years, the importance of saddlepoint problems in machine learning has increased. This is due to the popularity of GANs. In this paper, we solve stochastic smooth (strongly) convexconcave saddlepoint problems using zerothorder oracles. Theoretical analysis shows that in the case when the optimization set is a simplex, we lose only logn times in the stochastic convergence term. The paper also provides an approach to solving saddlepoint problems, when the oracle for one of the variables has zero order, and for the second  first order. Subsequently, we implement zerothorder and 1/2thorder methods to solve practical problems. 
D. Tiapkin, A. Gasnikov, P. Dvurechensky, Stochastic saddlepoint optimization for Wasserstein barycenters, Preprint no. arXiv:2006.06763, Cornell University, 2020.
Abstract
We study the computation of nonregularized Wasserstein barycenters of probability measures supported on the finite set. The first result gives a stochastic optimization algorithm for the discrete distribution over the probability measures which is comparable with the current best algorithms. The second result extends the previous one to the arbitrary distribution using kernel methods. Moreover, this new algorithm has a total complexity better than the Stochastic Averaging approach via the Sinkhorn algorithm in many cases. 
N. Tupitsa, P. Dvurechensky, A. Gasnikov, C.A. Uribe , Multimarginal optimal transport by accelerated alternating minimization, Preprint no. arXiv:2004.02294, Cornell University Library, arXiv.org, 2020.
Abstract
We consider a multimarginal optimal transport, which includes as a particular case the Wasserstein barycenter problem. In this problem one has to find an optimal coupling between m probability measures, which amounts to finding a tensor of the order m. We propose an accelerated method based on accelerated alternating minimization and estimate its complexity to find the approximate solution to the problem. We use entropic regularization with sufficiently small regularization parameter and apply accelerated alternating minimization to the dual problem. A novel primaldual analysis is used to reconstruct the approximately optimal coupling tensor. Our algorithm exhibits a better computational complexity than the stateoftheart methods for some regimes of the problem parameters. 
P. Dvurechensky, K. Safin, S. Shtern, M. Staudigl, Generalized selfconcordant analysis of FrankWolfe algorithms, Preprint no. arXiv:2010.01009, Cornell University, 2020.
Abstract
Projectionfree optimization via different variants of the FrankWolfe (FW) method has become one of the cornerstones in large scale optimization for machine learning and computational statistics. Numerous applications within these fields involve the minimization of functions with selfconcordance like properties. Such generalized selfconcordant (GSC) functions do not necessarily feature a Lipschitz continuous gradient, nor are they strongly convex. Indeed, in a number of applications, e.g. inverse covariance estimation or distanceweighted discrimination problems in support vector machines, the loss is given by a GSC function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. This paper closes this apparent gap in the literature by developing provably convergent FW algorithms with standard O(1/k) convergence rate guarantees. If the problem formulation allows the efficient construction of a local linear minimization oracle, we develop a FW method with linear convergence rate. 
P. Dvurechensky, S. Shtern, M. Staudigl, P. Ostroukhov, K. Safin, Selfconcordant analysis of FrankWolfe algorithms, Preprint no. arXiv:2002.04320, Cornell University, 2020.
Abstract
Projectionfree optimization via different variants of the FrankWolfe (FW) method has become one of the cornerstones in optimization for machine learning since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a selfconcordant (SC) function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k), k being the iteration counter. If the problem can be represented by a local linear minimization oracle, we are the first to propose a FW method with linear convergence rate without assuming neither strong convexity nor a Lipschitz continuous gradient. 
A. Rastogi, G. Blanchard, P. Mathé, Convergence analysis of Tikhonov regularization for nonlinear statistical inverse learning problems, Preprint no. arXiv:1902.05404, Cornell University Library, arXiv.org, 2019.
Abstract
We study a nonlinear statistical inverse learning problem, where we observe the noisy image of a quantity through a nonlinear operator at some random design points. We consider the widely used Tikhonov regularization (or method of regularization, MOR) approach to reconstruct the estimator of the quantity for the nonlinear illposed inverse problem. The estimator is defined as the minimizer of a Tikhonov functional, which is the sum of a data misfit term and a quadratic penalty term. We develop a theoretical analysis for the minimizer of the Tikhonov regularization scheme using the ansatz of reproducing kernel Hilbert spaces. We discuss optimal rates of convergence for the proposed scheme, uniformly over classes of admissible solutions, defined through appropriate source conditions. 
F. Stonyakin, A. Gasnikov, A. Tyurin, D. Pasechnyuk, A. Agafonov, P. Dvurechensky, D. Dvinskikh, A. Kroshnin, V. Piskunova, Inexact model: A framework for optimization and variational inequalities, Preprint no. arXiv:1902.00990, Cornell University Library, arXiv.org, 2019.

C.A. Uribe, D. Dvinskikh, P. Dvurechensky, A. Gasnikov, A. Nedić, Distributed computation of Wasserstein barycenters over networks, Preprint no. arXiv:1803.02933, Cornell University Library, arXiv.org, 2018.

P. Dvurechensky, D. Dvinskikh, A. Gasnikov, C.A. Uribe, A. Nedić, Decentralize and randomize: Faster algorithm for Wasserstein barycenters, Preprint no. arXiv:1806.03915, Cornell University Library, arXiv.org, 2018.
Abstract
We study the problem of decentralized distributed computation of a discrete approximation for regularized Wasserstein barycenter of a finite set of continuous probability measures distributedly stored over a network. Particularly, we assume that there is a network of agents/machines/computers where each agent holds a private continuous probability measure, and seeks to compute the barycenter of all the measures in the network by getting samples from its local measure and exchanging information with its neighbors. Motivated by this problem, we develop and theoretically analyze a novel accelerated primaldual stochastic gradient method for general stochastic convex optimization problems with linear equality constraints. Then, we apply this method to the decentralized distributed optimization setting to propose a new algorithm for the distributed semidiscrete regularized Wasserstein barycenter problem. The proposed algorithm can be executed over arbitrary networks that are undirected, connected and static, using the local information only. Moreover, we show explicit nonasymptotic complexity in terms of the problem parameters. Finally, we show the effectiveness of our method on the distributed computation of the regularized Wasserstein barycenter of univariate Gaussian and von Mises distributions, as well as on some applications to image aggregation. 
P. Mathé, Bayesian inverse problems with noncommuting operators, Preprint no. arXiv:1801.09540, Cornell University Library, arXiv.org, 2018.
Abstract
The Bayesian approach to illposed operator equations in Hilbert space recently gained attraction. In this context, and when the prior distribution is Gaussian, then two operators play a significant role, the one which governs the operator equation, and the one which describes the prior covariance. Typically it is assumed that these operators commute. Here we extend this analysis to noncommuting operators, replacing the commutativity assumption by a link condition. We discuss its relation to the commuting case, and we indicate that this allows to use interpolation type results to obtain tight bounds for the contraction of the posterior Gaussian distribution towards the data generating element. 
M. Hintermüller, N. Strogies, On the identification of the friction coefficient in a semilinear system for gas transport through a network, Preprint, DFG SFB Transregio 154 ``Mathematical Modelling, Simulation and Optimization using the Example of Gas Networks'', 2017.

L. Bogolubsky, P. Dvurechensky, A. Gasnikov, G. Gusev, Y. Nesterov, A. Raigorodskii, A. Tikhonov, M. Zhukovskii, Learning supervised PageRank with gradientbased and gradientfree optimization methods, Preprint no. arXiv:1603.00717, Cornell University Library, arXiv.org, 2016.

A. Chernov, P. Dvurechensky, A. Gasnikov, Fast primaldual gradient method for strongly convex minimization problems with linear constraints, Preprint no. arXiv:1605.02970, Cornell University Library, arXiv.org, 2016.

S.W. Anzengruber, B. Hofmann, P. Mathé, Regularization properties of the discrepancy principle for Tikhonov regularization in Banach spaces, Preprint no. 12, Technische Universität Chemnitz, Fakultät für Mathematik, 2012.

B. Hofmann, P. Mathé, Parameter choice in Banach space regularization under variational inequalities, Preprint no. 05, Technische Universität Chemnitz, Fakultät für Mathematik, 2012.

V. Spokoiny, Parametric estimation. Finite sample theory, Preprint no. arXiv:1111.3029, Cornell University Library, arXiv.org, 2012.

V. Spokoiny, Roughness penalty, Wilks phenomenon, and Bernsteinvon Mises theorem, Preprint no. arXiv:1205.0498, Cornell University Library, arXiv.org, 2012.

R.I. Boţ, B. Hofmann, P. Mathé, Regularizability of illposed problems and the modulus of continuity, Preprint no. 17, Technische Universität Chemnitz, Fakultät für Mathematik, 2011.

B. Hofmann, P. Mathé, Some note on the modulus of continuity for illposed problems in Hilbert space, Preprint no. 07, Technische Universität Chemnitz, Fakultät für Mathematik, 2011.

G. Blanchard, P. Mathé, Discrepancy principle for statistical inverse problems with application to conjugate gradient iteration, Preprint no. 07, Universität Potsdam, Institut für Mathematik, 2011.