Research Group "Stochastic Algorithms and Nonparametric Statistics"
Research Seminar "Mathematical Statistics" Summer semester 2010
last reviewed: April 21, 2010, Christine Schneider
Song Song (HU Berlin)
Flexible Factor Modelling in Time and Space
Abstract:tba
Elmar Diederichs (WIAS)
The Development of Sparse NonGaussian Component Analysis
Abstract:Sparse NonGaussian Component Analysis (SNGCA), by now a classical tool for dimension reduction, has
several earlier ancestor versions. Non-Gaussian Component Analysis, for example, is an unsupervised
method for extracting a linear structure from high-dimensional data by estimating a low-dimensional
non-Gaussian data component. This talk gives an introduction to the basic ideas behind
the development of SNGCA into its current form, which uses the powerful tool of semidefinite relaxation.
The new procedure differs significantly from the earlier proposals: it improves the method's efficiency
and its sensitivity to a broad variety of deviations from normality, and it decreases the computational effort.
Fabrizio Durante (Freie Universität Bozen)
Tail dependence and copula
Abstract:The concept of a copula provides a way of describing the dependence
among the components of a given random vector. As such, it has gained a lot
of popularity in recent years, especially in view of its possible
applications to finance, insurance and natural sciences.
According to Sklar's Theorem, a variety of multivariate distribution functions
can be constructed by putting together univariate marginal distribution
functions and a suitable copula, expressing the association among the
variables of interest. In particular, copulas make it possible to construct flexible
multivariate models that exhibit various kinds of dependencies in the tails of
their distributions, a feature of great interest in risk management.
In this talk, we present some recent results concerning the use (and some
misuses) of copulas for capturing tail dependencies. In particular, we will
show how a multivariate distribution function having a specific tail behavior
can be obtained by considering special copula constructions.
Moreover, we introduce and discuss the concept of threshold copula,
emphasizing its possible application to the detection of (spatial) contagion
between two financial markets.
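
The construction behind Sklar's Theorem can be illustrated with a minimal sketch: samples from a copula are pushed through the quantile functions of the chosen marginals. The Gaussian copula, the exponential marginals and all function names below are assumptions made for illustration only; they are not taken from the talk. Note that a Gaussian copula with correlation strictly between -1 and 1 has no tail dependence, so a model aimed at joint extremes would need a different copula family.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u, v) with uniform marginals and Gaussian dependence."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal variable, correlated with the first through rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # The probability integral transform yields uniform marginals.
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_quantile(u, rate):
    """Inverse distribution function of an exponential law (illustrative marginal)."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0)
    return -math.log(1.0 - u) / rate


def sample_joint(n, rho, rate_x, rate_y, seed=0):
    """Combine the copula with two exponential marginals, as in Sklar's Theorem."""
    return [
        (exponential_quantile(u, rate_x), exponential_quantile(v, rate_y))
        for u, v in sample_gaussian_copula(n, rho, seed)
    ]


if __name__ == "__main__":
    sample = sample_joint(5, rho=0.7, rate_x=2.0, rate_y=0.5, seed=1)
    for x, y in sample:
        print(round(x, 3), round(y, 3))
```

The marginals and the dependence structure are chosen independently here: replacing exponential_quantile changes the marginals only, and replacing sample_gaussian_copula changes the dependence only.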
Andreas Christmann (Bayreuth)
Some Recent Results on Support Vector Machines
Abstract:Support vector machines (SVMs) play an important role in modern statistical learning
theory and are successfully applied to solve hard real-life problems even for high-dimensional
and complex data sets. SVMs are based on regularized empirical risk
minimization. The talk gives a short introduction to SVMs and their goals. Some
recent results on consistency and robustness properties of SVMs will be given in the
second part of the talk.
The success of SVMs partially results from the now proven fact that an SVM
is the solution of a well-posed mathematical problem in Hadamard’s sense: for any
data set there exists a unique solution which depends in a continuous way on the
data. The learning properties and the statistical robustness of SVMs depend on the
choice of the loss function and on the kernel used to define the reproducing kernel
Hilbert space of functions. Leading examples of SVMs are based on the following loss
functions: the hinge and the logistic loss function for classification, the ε-insensitive
loss function or Huber’s loss function for regression, and the pinball loss function for
quantile regression. A classical kernel is the Gaussian radial basis function kernel.
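
The loss functions and the kernel named above can be written down in a few lines. The following is a minimal sketch using their standard textbook definitions; the function names and default parameter values (eps, delta, tau, gamma) are assumptions chosen for illustration and do not come from the talk. Labels y are taken in {-1, +1} for classification, and f denotes the predicted value.

```python
import math


def hinge_loss(y, f):
    """Hinge loss for classification, y in {-1, +1}."""
    return max(0.0, 1.0 - y * f)


def logistic_loss(y, f):
    """Logistic loss log(1 + exp(-y f)), computed in a numerically stable way."""
    m = -y * f
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))


def epsilon_insensitive_loss(y, f, eps=0.1):
    """Residuals smaller than eps in absolute value cost nothing."""
    return max(0.0, abs(y - f) - eps)


def huber_loss(y, f, delta=1.0):
    """Quadratic for small residuals, linear for large ones."""
    r = abs(y - f)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def pinball_loss(y, f, tau=0.5):
    """Asymmetric absolute loss used for the tau-quantile."""
    r = y - f
    return tau * r if r >= 0 else (tau - 1.0) * r


def gaussian_rbf_kernel(x, z, gamma=1.0):
    """Gaussian radial basis function kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    print(hinge_loss(-1, 0.5))
    print(pinball_loss(2.0, 1.0, tau=0.9))
    print(gaussian_rbf_kernel([0.0, 1.0], [0.0, 1.0]))
```

All five losses are convex in f, which is one of the ingredients behind the well-posedness mentioned above; the hinge, ε-insensitive, Huber and pinball losses additionally grow at most linearly in the residual.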
Denis Belomestny (WIAS, Berlin)
Convergence rates of simulation based algorithms for optimal stopping problems
Abstract:In this talk we consider simulation-based optimization algorithms for solving discrete time optimal stopping problems. Using large deviation theory for the increments of empirical processes, we derive optimal convergence rates for the value function estimate and show that they cannot be improved in general. The rates derived provide a guide to the choice of the number of simulated paths needed in the optimization step, which is crucial for the good performance of any simulation-based optimization algorithm. Finally, we present a numerical example of solving an optimal stopping problem arising in finance that illustrates our theoretical findings.
Dennis Kristensen (Columbia NY)
Stochastic Demand and Revealed Preference
Abstract:tba
Leonid Pastur (Kharkiv, Ukraine)
Limiting laws for the eigenvalue statistics of large random matrices
Abstract:tba
Korbinian Strimmer (Universität Leipzig)
High-dimensional feature selection by decorrelation:
application in genomics and proteomics
Abstract:We present a novel approach to feature selection and variable
importance using "cat" and "car" scores. These are defined
as correlation-adjusted versions of t-scores and marginal
correlations; they thus take account of correlation among predictors
and are applicable to binary and continuous response, respectively.
"cat" and "car" scores follow naturally from a reformulation of the
linear model, with a canonical decomposition of the coefficient of
determination, natural incorporation of grouping structures, and a
simple link to classical (AIC, Cp, etc.) and adaptive (FDR) model
selection criteria. We propose a shrinkage estimator for "cat" and "car"
scores and apply them to feature selection in high-dimensional gene
expression and proteomics data. Furthermore, we highlight
some connections between FDR (false discovery rates), FNDR
(false non-discovery rates), and the recently proposed approach
of "Higher Criticism" methodologies.
tba
Natalia Bochkina (University of Edinburgh)
Bayesian wavelet estimators: optimality and a priori assumptions
Abstract:We consider Bayesian wavelet estimators in the context of
nonparametric regression. We discuss different choices of prior which
lead to optimal performance, considering in particular the pointwise
optimality in l_p norm of Bayes factor wavelet estimators. However, as
shown by Cai (2008), adaptive coefficient-by-coefficient
estimators considered above cannot achieve the global optimal rate
without a log factor. We discuss extensions of the considered Bayesian
models for wavelet coefficients that pool information across the
coefficients, some of them known to achieve the optimal global
rate exactly, and show that they also achieve the optimal local rate.
Bayesian wavelet modelling is usually done in the domain of wavelet
coefficients. We discuss how a priori assumptions in the wavelet domain
transfer to the function domain for the considered estimators.
Christoph Rothe (Toulouse)
Analyzing Counterfactual Distributions
Abstract:tba
Jelena Bradic (Princeton University)
Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection
Abstract:tba
Richard Samworth (Cambridge)
Maximum likelihood estimation of a multidimensional log-concave
density
Abstract:We show that if $X_1,...,X_n$ are a random sample from a density $f$ in $\mathbb{R}^d$, then with probability one there exists a unique log-concave maximum likelihood estimator $\hat{f}_n$ of $f$. The use of this estimator is attractive because, unlike kernel density estimation, the estimator is fully automatic, with no smoothing parameters to choose. We exhibit an iterative algorithm for computing the estimator and show how the method can be combined with the EM algorithm to fit finite mixtures of log-concave densities. Applications to classification, clustering and functional estimation problems will be discussed. The talk will be illustrated with pictures from the R package LogConcDEAD.
I will also discuss recent theoretical results on the performance of the estimator, which will cover both the case where the true density is log-concave and the case where this model is misspecified. These results will be applied to study both linear and isotonic regression problems.
Co-authors: Madeleine Cule (University of Cambridge), Lutz Duembgen (University of Bern), Robert Gramacy (University of Cambridge), Dominic Schuhmacher (University of Bern) and Michael Stewart (University of Sydney).