# Research Group "Stochastic Algorithms and Nonparametric Statistics"

## Research Seminar "Mathematical Statistics" Winter semester 2011/12

last reviewed: July 14, 2011, Christine Schneider

Rui M. Castro (Eindhoven University of Technology)

Adaptive Sensing for Sparse Signal Detection and Localization

Abstract: In this talk I will consider the problem of detection and localization of sparse signals in
noise. I consider in particular adaptive sensing approaches (also known as active learning or inference
using sequential design of experiments). This means the data collection process is sequential and the
collection of future data is guided by all data collected so far. This is in contrast with most traditional
approaches where all the data is collected passively before any analysis is done. The added flexibility of
sequential designs results in dramatic gains in performance, both for signal estimation and detection.
I consider in particular the setting where the signal of interest is an n-dimensional vector with only
s non-zero components, and show that if sequential designs are allowed, the minimum signal component
magnitude necessary to guarantee accurate detection or localization needs only to scale with s.
This is remarkable since there is no dependence on the extrinsic dimension n. On the other hand if
sequential designs are not allowed the magnitude of the signal components must necessarily scale with
the extrinsic dimension n. Our characterization involves both lower performance bounds, as well as
simple but powerful adaptive sensing procedures able to nearly achieve these fundamental limits for a
wide range of scenarios. Extensions of these ideas to a compressed sensing scenario might also be discussed.
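As a toy illustration of the gain from sequential designs, the following sketch implements a distilled-sensing-style procedure; this is an assumption on my part (the talk's actual procedures may differ), and the choices of n, s, mu and the number of rounds are purely illustrative:

```python
import numpy as np

def distilled_sensing(x, n_rounds=4, seed=0):
    """Sketch of a distilled-sensing pass: observe the active coordinates
    in unit-variance Gaussian noise and keep those with positive
    observations. Positive signal components survive each round with high
    probability, while about half of the zero coordinates are discarded."""
    rng = np.random.default_rng(seed)
    active = np.arange(x.size)
    for _ in range(n_rounds):
        y = x[active] + rng.standard_normal(active.size)
        active = active[y > 0]
    return active

# toy signal: n-dimensional, s non-zero components of magnitude mu
n, s, mu = 10_000, 10, 4.0
rng = np.random.default_rng(1)
support = rng.choice(n, size=s, replace=False)
x = np.zeros(n)
x[support] = mu

est = distilled_sensing(x)
# after a few rounds most zero coordinates are gone, while the
# signal support is (with high probability) retained
```

The point of the sketch is only the sequential structure: later measurements are concentrated on the coordinates that earlier measurements left plausible.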

Samory Kpotufe (MPI Tübingen)

The curse of dimension in nonparametric regression

Abstract: Nonparametric regression consists of learning an arbitrary mapping f: \X \to \Y from a data set of
(X,Y) pairs in which the Y values are corrupted by noise of mean zero.
This statistical task is
known to be subject to a so-called "curse of dimension": if \X is a
subset of R^D, and if the only
smoothness assumption on f is that it satisfies a Lipschitz condition,
it is known that any estimator
based on n data points will have an error rate (risk) of
\Omega(n^{-2/(2+D)}). In other words, a
data size exponential in D is required to approximate f, which is
infeasible in high dimension.
Fortunately, high-dimensional data often has low intrinsic complexity
due to correlations between features
and/or irrelevant features. Common examples in machine learning are data
that lie near a low-dimensional
manifold, and sparse data. Some nonparametric regressors perform better
in such situations as their risks
depend only on the "intrinsic dimension" of the data, appropriately
formalized. These methods, which are "adaptive" to intrinsic dimension, therefore require no dimension
reduction!
I will start with a short discussion of some adaptive traditional
methods such as k-NN regression,
then move on to computationally efficient adaptive methods which are
less well known. These efficient methods,
namely Random Projection tree and tree-kernel-hybrid regressors, have
strong theoretical guarantees
which are verifiable on a wide range of real-world data.
This talk is based on joint work with Sanjoy Dasgupta and Nakul Verma.
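For concreteness, here is a minimal k-NN regressor applied to data whose intrinsic dimension is 1 even though the ambient dimension is 20; the embedding, sample sizes, and bandwidth choices below are invented for illustration:

```python
import numpy as np

def knn_regress(X_train, y_train, X_test, k=5):
    """Plain k-NN regression: predict the mean y of the k nearest
    training points under Euclidean distance."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        idx = np.argsort(d)[:k]
        preds.append(y_train[idx].mean())
    return np.array(preds)

# data living on a 1-dimensional subset of R^20:
# the ambient dimension is 20, the intrinsic dimension is 1
rng = np.random.default_rng(0)
D, n = 20, 500
t = rng.uniform(0, 1, size=n)             # latent 1-D parameter
A = rng.standard_normal(D)                # embedding direction
X = np.outer(np.sin(2 * np.pi * t), A)    # samples in R^D
y = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(n)

t_test = rng.uniform(0, 1, size=100)
X_test = np.outer(np.sin(2 * np.pi * t_test), A)
y_hat = knn_regress(X, y, X_test)
mse = np.mean((y_hat - np.sin(2 * np.pi * t_test)) ** 2)
# the achieved risk reflects the intrinsic dimension, not D = 20
```

The nearest-neighbour distances here behave as in a one-dimensional problem, which is the adaptivity phenomenon the talk formalizes.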

Christian Hafner
(UCL, Belgium)

Volatility of price indices for heterogeneous goods

Abstract: Hedonic regression is a common tool to estimate price indices and has been widely
used to construct price indices of markets with heterogeneous goods. Two prominent examples are
the real estate and the art markets. Although some efforts have been made to improve the efficiency
of parameter estimates, there is no systematic treatment of volatility in these markets. Considering
heterogeneous goods as alternative investments, this lack of reliable volatility measures prevents an
objective assessment of investment opportunities based on classical mean-variance criteria. Moreover,
derivatives on subsets of the traded goods require a precise estimation of the underlying volatility. For
example, in art markets, auction houses are interested in derivatives for collections or individual art
objects in order to hedge their risks. In this paper we propose a new model which explicitly defines
an underlying stochastic process for the price index. The model can be estimated using maximum
likelihood and an extended version of the Kalman filter. We derive theoretical properties of the
volatility estimator and show that it outperforms the standard estimator. To illustrate the usefulness of
the model, we apply it to a large data set of international blue chip artists. (joint work with Fabian Bocart)
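The filtering step can be caricatured by a local-level model in which a latent log price index follows a random walk observed in noise; this is a deliberately simplified stand-in for the paper's model, and the variances q, r and the simulation are illustrative:

```python
import numpy as np

def kalman_local_level(y, q, r, m0=0.0, p0=1.0):
    """Kalman filter for a local-level model: the latent index m_t
    follows a random walk with innovation variance q, and each
    observation satisfies y_t = m_t + noise with variance r."""
    m, p = m0, p0
    filtered = []
    for obs in y:
        p = p + q                 # predict step
        k = p / (p + r)           # Kalman gain
        m = m + k * (obs - m)     # update step
        p = (1 - k) * p
        filtered.append(m)
    return np.array(filtered)

# simulate a random-walk index observed in noise
rng = np.random.default_rng(0)
T = 200
index = np.cumsum(rng.standard_normal(T) * 0.1)   # true latent index
y = index + rng.standard_normal(T) * 0.5          # noisy observations

m_hat = kalman_local_level(y, q=0.01, r=0.25)
# the filtered series tracks the latent index more closely
# than the raw observations do
```

In the hedonic setting the observations would be regression-adjusted sale prices rather than raw prices, but the filtering recursion is the same.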

Yoosoon Chang

Nonstationarity in Time Series of State Densities

Abstract: This paper proposes a new framework to analyze the nonstationarity in the time series of state densities, representing either cross-sectional or intra-period distributions of some underlying economic variables. We regard each state density as a realization of a Hilbertian random variable, and use a functional time series model to fit a given time series of state densities. This allows us to explore various sources of the nonstationarity of such time series. The potential unit roots are identified through functional principal component analysis, and subsequently tested by the generalized eigenvalues of the leading components of the normalized sample variance operator. The asymptotic null distribution of the test statistic is obtained and tabulated. We use the methodology developed in the paper to investigate the state densities given by the cross-sectional distributions of individual earnings and the intra-month distributions of stock returns. We find clear evidence for the presence of strong persistence in their time series.
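A bare-bones sketch of the functional principal component step, working with density evaluations on a common grid; the Gaussian toy data and the component count are assumptions for illustration, not the paper's setup:

```python
import numpy as np

def fpca_scores(densities, n_comp=2):
    """Functional PCA sketch: densities is a (T, G) array of T state
    densities evaluated on a common grid of G points. Returns the
    scores of the leading principal components."""
    mean = densities.mean(axis=0)
    centered = densities - mean
    # eigen-decomposition of the (G x G) sample covariance operator
    cov = centered.T @ centered / len(densities)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_comp]
    return centered @ vecs[:, order]

# toy data: Gaussian cross-sectional densities whose location
# follows a random walk, so the density series is nonstationary
rng = np.random.default_rng(0)
grid = np.linspace(-5, 5, 200)
T = 100
mu = np.cumsum(rng.standard_normal(T) * 0.2)
dens = np.exp(-(grid[None, :] - mu[:, None]) ** 2 / 2) / np.sqrt(2 * np.pi)

scores = fpca_scores(dens)   # (T, 2) array of component scores
```

In the paper's framework, unit-root behaviour of the density series would show up as persistence in such leading score series, which the proposed generalized-eigenvalue test then examines.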

Prof. Vladimir Spokoiny (WIAS/HU Berlin)

Semiparametric alternation: convergence and efficiency

Abstract: A general problem of semiparametric estimation is considered when
the target parameter \( \theta \) describing the model structure has to be estimated
under the presence of the high dimensional nuisance \( \eta \).
The alternating approach assumes that the partial estimation of \( \eta \) under the given
target \( \theta \) and vice versa can be done efficiently even in large dimension because
the related problem is convex or even quadratic.
This naturally leads to a procedure based on sequential optimization:
one starts from some initial value of \( \theta \) or \( \eta \) and then sequentially estimates one parameter with the other
one fixed.
Unfortunately, precise theoretical results addressing the overall quality of
such procedures are only available in special cases. One example is
given by linear models. In this case, an alternating procedure converges and is efficient under quite simple
and tractable identifiability conditions.
We aim at developing a novel approach to the systematic study of the quality and
efficiency of such iterative procedures based on the idea of approximating the original model by a
linear one in a local vicinity of the central point.
This allows for extending the algorithmic properties of the procedure like geometric
convergence from the linear to a general smooth case.
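For the linear case mentioned above, the alternating scheme reduces to ordinary alternating least squares; a minimal sketch (the dimensions, noise level, and iteration count are invented):

```python
import numpy as np

def alternate(y, X, Z, n_iter=50):
    """Alternating least squares: estimate theta with eta fixed, then
    eta with theta fixed, and repeat. Each partial problem is a simple
    least-squares fit; in the linear model the iteration converges
    geometrically under identifiability conditions."""
    theta = np.zeros(X.shape[1])
    eta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        theta, *_ = np.linalg.lstsq(X, y - Z @ eta, rcond=None)
        eta, *_ = np.linalg.lstsq(Z, y - X @ theta, rcond=None)
    return theta, eta

# partially linear toy model: target theta, nuisance eta
rng = np.random.default_rng(0)
n, p, q = 500, 3, 5
X = rng.standard_normal((n, p))
Z = rng.standard_normal((n, q))
theta0 = np.array([1.0, -2.0, 0.5])
eta0 = rng.standard_normal(q)
y = X @ theta0 + Z @ eta0 + 0.1 * rng.standard_normal(n)

theta_hat, eta_hat = alternate(y, X, Z)
# the alternation converges to the joint least-squares solution
```

The talk's local-linear-approximation argument transfers exactly this geometric convergence from the linear toy case to smooth models.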

Itai Dattner (TU Eindhoven)

On deconvolution of distribution functions

Abstract: It is well known that rates of convergence of estimators in deconvolution problems are affected
by the smoothness of the error density and the density to be estimated. However, the problem of distribution
deconvolution is more delicate than what has been considered so far. We derive different rates of convergence
with respect to the tail behavior of the error characteristic function. We develop an optimal-in-order
deconvolution estimator and construct its adaptive version which achieves the optimal rates within a
logarithmic factor. Simulation studies comparing the adaptive estimator to other methods are presented
and support the superiority of our method. We also present an example with real data and discuss a few
relevant extensions of our results.
Based on joint works with Alexander Goldenshluger, Benjamin Reiser and Anatoli Juditsky.
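A rough sketch of Fourier-based distribution deconvolution via the Gil-Pelaez inversion formula, assuming Laplace measurement error (an ordinary-smooth error whose characteristic function is easy to invert); the cutoff 1/h plays the role of a smoothing parameter, and this is not the authors' adaptive estimator:

```python
import numpy as np

def deconvolve_cdf(y, sigma, x_grid, h=0.3):
    """Estimate the distribution function of X from Y = X + e, where
    e has a Laplace(0, sigma) density with characteristic function
    1 / (1 + sigma^2 t^2). Uses a sharp frequency cutoff |t| <= 1/h."""
    t = np.linspace(-1 / h, 1 / h, 400)   # even count, so t = 0 is skipped
    # empirical characteristic function of the observations
    ecf = np.exp(1j * np.outer(t, y)).mean(axis=1)
    # correct for the error: phi_X(t) = phi_Y(t) * (1 + sigma^2 t^2)
    phi_x = ecf * (1 + sigma ** 2 * t ** 2)
    dt = t[1] - t[0]
    F = []
    for x in x_grid:
        # Gil-Pelaez: F(x) = 1/2 - (1/2pi) * int phi_X(t) e^{-itx}/(it) dt
        integrand = np.exp(-1j * t * x) * phi_x / (1j * t)
        F.append(0.5 - (integrand.sum() * dt).real / (2 * np.pi))
    return np.array(F)

# toy example: standard normal X observed with Laplace noise
rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
y = x + rng.laplace(0, 0.5, size=n)
F_hat = deconvolve_cdf(y, sigma=0.5, x_grid=np.array([-1.5, 0.0, 1.5]))
```

The tail behavior of the error characteristic function enters through the division (here a harmless polynomial factor); supersmooth errors would amplify the empirical characteristic function's noise far more aggressively, which is where the rate differences in the talk arise.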

Guillaume Lecué (Univ. Marne-la-Vallée)

General oracle inequalities for empirical risk minimization (ERM) procedures, penalized ERM and regularized ERM with applications to high dimensional data analysis

Abstract: We show that empirical risk minimization procedures, regularized empirical risk minimization procedures and penalized empirical risk minimization procedures satisfy non-exact oracle inequalities in a general unbounded framework. The main novelty, in addition to the boundedness-assumption-free setup, is that these inequalities can yield fast rates even in situations in which exact oracle inequalities only hold with slower rates.

Enno Mammen (Universität Mannheim)

Flexible Generalized Varying Coefficient Regression Models and
Locally Additive Interaction Models

Abstract: This talk discusses a very general model that can be used widely to analyze the relation
between a response and multiple covariates. The model is nonparametric, yet renders easy interpretation
of the effects of the covariates. The model accommodates both continuous and discrete random variables
for the response and covariates. It is flexible enough to cover both the generalized varying coefficient
models and the generalized additive models as special cases. Under a weak condition we give a general
theorem that the problem of estimating the multivariate mean function is equivalent to that of estimating
its univariate component functions. We discuss implications of the theorem for penalized least squares
estimators, and then investigate the outcomes in full detail for a kernel-type estimator. The kernel
estimator is given as a solution of a system of nonlinear integral equations. We provide an iterative
algorithm to solve the system of equations and discuss the theoretical properties of the estimator and the
algorithm. Finally, we give simulation results and illustrate the method with a data example. The talk
reports on joint work with Young K. Lee and Byeong U. Park.
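The iterative algorithm can be caricatured, for the plain additive-model special case, by classical backfitting with a Nadaraya-Watson smoother; the bandwidth and toy data are invented, and the paper's estimator solves a more general system of equations:

```python
import numpy as np

def backfit_additive(X, y, h=0.2, n_iter=15):
    """Backfitting sketch for y = sum_j f_j(X_j) + noise: cycle through
    the components, smoothing the partial residuals of each with a
    Nadaraya-Watson (Gaussian kernel) smoother, and center each fitted
    component for identifiability."""
    n, d = X.shape
    f = np.zeros((n, d))
    for _ in range(n_iter):
        for j in range(d):
            r = y - f.sum(axis=1) + f[:, j]   # partial residuals
            diff = X[:, j][:, None] - X[:, j][None, :]
            w = np.exp(-diff ** 2 / (2 * h ** 2))
            f[:, j] = (w @ r) / w.sum(axis=1)
            f[:, j] -= f[:, j].mean()
    return f

# toy additive model with two centered component functions
rng = np.random.default_rng(0)
n = 300
X = rng.uniform(-1, 1, size=(n, 2))
f1 = X[:, 0] ** 2 - 1 / 3
f2 = np.sin(np.pi * X[:, 1])
y = f1 + f2 + 0.1 * rng.standard_normal(n)

f_hat = backfit_additive(X, y)   # recovers both component functions
```

The univariate-equivalence theorem in the abstract is what justifies attacking the multivariate fit one component function at a time in this fashion.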

Paul von Bünau and Franz Kiraly (TU Berlin)

Abstract: Joint work with Frank Meinecke, Duncan Blythe and K.-R. Müller.
In the first part of this talk, we introduce Stationary Subspace Analysis (SSA) [1], a source-separation method
which linearly factorizes observed data into stationary and non-stationary components. We present an
SSA algorithm for the mean and covariance matrix, discuss conditions for identifiability, and present our
simulation results. In an application to EEG analysis, we demonstrate how SSA reveals meaningful stationary and
non-stationary brain sources.
In the second part of this talk, we present ideal regression, an algebraic-geometric approach to estimating
polynomial relations. Ideal regression is a generalization of classical regression setups. We demonstrate
how to obtain consistent estimators and how to prove properties for these estimators, using techniques
from algebraic geometry. We present a set of possible applications, and show that SSA can be viewed as a
special case of ideal regression [2], exhibiting an algorithmic estimator which we evaluate in simulations.
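A crude caricature of the stationarity criterion: split the series into epochs and take the directions along which the epoch means vary least. The actual SSA algorithm of [1] also matches epoch covariances and solves a proper optimization problem; the mixing matrix and sources below are invented:

```python
import numpy as np

def stationary_directions(X, n_epochs=10, n_stationary=1):
    """SSA-flavoured sketch: the stationary directions are taken as the
    smallest singular directions of the matrix of centered epoch means,
    i.e. the directions in which the epoch means stay nearly constant."""
    epochs = np.array_split(X, n_epochs)
    M = np.array([e.mean(axis=0) for e in epochs])   # (n_epochs, d)
    M -= M.mean(axis=0)
    _, _, vt = np.linalg.svd(M)
    return vt[-n_stationary:]

# mixture of one stationary and one non-stationary source
rng = np.random.default_rng(0)
T = 2000
s_stat = rng.standard_normal(T)               # stationary source
s_walk = np.cumsum(rng.standard_normal(T))    # non-stationary source
S = np.column_stack([s_stat, s_walk])
A = np.array([[1.0, 0.5], [0.3, 1.0]])        # mixing matrix
X = S @ A.T                                   # observed mixtures

w = stationary_directions(X)[0]
proj = X @ w   # nearly free of the random-walk component
```

The recovered direction w is (approximately) orthogonal to the mixed non-stationary direction, so the projected series behaves like the stationary source alone.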
References
[1] von Bünau, P., Meinecke, F. C., Király, F. J. and Müller, K.-R. Finding Stationary Subspaces in
Multivariate Time Series. Phys. Rev. Lett. 103(21):214101, 2009.
[2] Király, F. J., von Bünau, P., Meinecke, F. C., Blythe, D. A. J. and Müller, K.-R. Algebraic Geometric
Comparison of Probability Distributions. Oberwolfach preprint (OWP) 2011-30, 2011.