Evolution on Random Fitness Landscapes

Mutation-selection models describe evolution at the molecular level, where the two main driving forces of Darwinian evolution, mutation and selection, compete on the same time scale. The effects of a heterogeneous, highly complex fitness landscape have attracted considerable attention. In this respect, mathematical evolution models in which the fitness landscape is given by a collection of random variables are regarded as a good starting point. Mathematically, such a system is closely related to a branching random walk in a random environment, a connection that WIAS intends to explore further in the future. Here, evolution on the sequence space of DNA codes is studied, where the fitness landscape is given by an independent collection of random variables. For a wide range of distributions with sufficiently fat upper tails, the time scales are identified on which the make-up of the population shifts from a configuration in which all particles are of the initial type to one consisting of particles of the fittest type.
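
As a rough illustration of such dynamics, the following Python sketch solves a linear mutation-selection equation on the hypercube {0,1}^n and normalizes the solution by its total mass, in the spirit of the parabolic Anderson model formulation described in the preprint by Avena, Gün and Hesse listed below. It is only a sketch under stated assumptions, not the method of that work: the function names, the mutation rate, the Pareto-distributed fitness values and the small dimension n = 8 are all hypothetical choices made for illustration.

```python
import numpy as np


def hypercube_laplacian(n):
    """Laplacian (adjacency minus n * identity) of the hypercube {0,1}^n."""
    size = 1 << n
    lap = -n * np.eye(size)
    for x in range(size):
        for i in range(n):
            # Flipping bit i corresponds to a single point mutation.
            lap[x, x ^ (1 << i)] = 1.0
    return lap


def type_frequencies(fitness, start, times, mutation_rate=1.0):
    """Type frequencies of a population started at site `start`.

    Assumption: the unnormalized population solves
    du/dt = mutation_rate * Laplacian(u) + fitness * u,
    and the frequencies are u divided by its total mass.
    """
    fitness = np.asarray(fitness, dtype=float)
    n = len(fitness).bit_length() - 1
    if (1 << n) != len(fitness):
        raise ValueError("fitness must have length 2**n")
    generator = mutation_rate * hypercube_laplacian(n) + np.diag(fitness)
    eigvals, eigvecs = np.linalg.eigh(generator)  # generator is symmetric
    coeffs = eigvecs[start, :]  # expansion of the initial point mass
    top = eigvals.max()
    rows = []
    for t in times:
        # Subtracting `top` avoids overflow and cancels after normalization.
        u = eigvecs @ (np.exp((eigvals - top) * t) * coeffs)
        u = np.clip(u, 0.0, None)  # remove tiny negative rounding errors
        rows.append(u / u.sum())
    return np.array(rows)


if __name__ == "__main__":
    n = 8
    rng = np.random.default_rng(0)
    fitness = rng.pareto(2.0, size=1 << n)  # hypothetical fat-tailed landscape
    fittest = int(fitness.argmax())
    times = [0.0, 0.5, 1.0, 2.0, 5.0, 20.0]
    freqs = type_frequencies(fitness, start=0, times=times)
    for t, row in zip(times, freqs):
        print(f"t = {t:5.1f}  mass at start = {row[0]:.3f}  "
              f"mass at fittest = {row[fittest]:.3f}")
```

Running the script prints how the population mass at the starting type and at the fittest type changes with time for one sampled landscape. How sharply the mass moves depends on the assumed mutation rate and fitness distribution.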

Stochastic Encounter-Mating Models

Another type of selective force acting on evolution arises from the availability of a mate with which to produce offspring; this is known as sexual selection. The evolution of mating preferences, its effects on mating patterns, and the role of the encounter mechanism are important aspects of sexual selection. At WIAS, these questions were recently taken up by introducing a stochastic encounter-mating model that captures the different relations between these aspects within a single model. One of the most important results obtained is the classification of the mating preferences and encounter mechanisms that lead to panmictic mating patterns, that is, uniform mating. Moreover, another study showed that the dynamics of the mating patterns of infinite populations under individual encounter are related to certain Lotka-Volterra systems.

Other Aspects of Population Genetics

An interesting phenomenon observed in some organisms is the creation of seed banks: seeds that are kept dormant, to be used only later, as protection against possibly long-lasting hazardous external conditions. At WIAS, the ancestral relations of such populations are described by applying moment duality techniques to a mathematical seed-bank model.
Recently, experiments that aim to understand evolution in controlled environments, such as the famous Lenski experiment, have received great attention. At WIAS, a mathematical model for the Lenski experiment is studied. For this model, a logarithmic increase of the mean fitness is proved, in agreement with what is actually observed in these experiments.



  • Th. Dickhaus, Simultaneous Statistical Inference, Springer, Berlin et al., 2014, 180 pages, (Monograph Published).

  Articles in Refereed Journals

  • A. González Casanova Soberón, N. Kurt, A. Wakolbinger, L. Yuan, An individual-based mathematical model for the Lenski experiment, and the deceleration of the relative fitness, Stochastic Processes and their Applications, 126 (2016) pp. 2211--2252.

  • J. Blath, A. González Casanova Soberón, N. Kurt, M. Wilke-Berenguer, A new coalescent for seed-bank models, The Annals of Applied Probability, 26 (2016) pp. 857--891.

  • K. Schildknecht, S. Olek, Th. Dickhaus, Simultaneous statistical inference for epigenetic data, PLOS ONE, 10 (2015) pp. e0125587/1--e0125587/15.
    Epigenetic research leads to complex data structures. Since parametric model assumptions for the distribution of epigenetic data are hard to verify we introduce in the present work a nonparametric statistical framework for two-group comparisons. Furthermore, epigenetic analyses are often performed at various genetic loci simultaneously. Hence, in order to be able to draw valid conclusions for specific loci, an appropriate multiple testing correction is necessary. Finally, with technologies available for the simultaneous assessment of many interrelated biological parameters (such as gene arrays), statistical approaches also need to deal with a possibly unknown dependency structure in the data. Our statistical approach to the nonparametric comparison of two samples with independent multivariate observables is based on recently developed multivariate multiple permutation tests. We adapt their theory in order to cope with families of hypotheses regarding relative effects. Our results indicate that the multivariate multiple permutation test keeps the pre-assigned type I error level for the global null hypothesis. In combination with the closure principle, the family-wise error rate for the simultaneous test of the corresponding locus/parameter-specific null hypotheses can be controlled. In applications we demonstrate that group differences in epigenetic data can be detected reliably with our methodology.

  • J. Blath, A. González Casanova Soberón, B. Eldon, N. Kurt, M. Wilke-Berenguer, Genetic variability under the seedbank coalescent, Genetics, 200 (2015) pp. 921--934.
    We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially at reduced rate also on the dormant lineages. The presence of "dormant" lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this we provide a Wright-Fisher model with a seedbank component and mutation, motivated from recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to most recent common ancestor, number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.

  • Th. Dickhaus, Th. Royen, On multivariate chi-square distributions and their applications in testing multiple hypotheses, Statistics. A Journal of Theoretical and Applied Statistics, (2015) pp. 427--454.
    We are concerned with three different types of multivariate chi-square distributions. Their members play important roles as limiting distributions of vectors of test statistics in several applications of multiple hypotheses testing. We explain these applications and provide formulas for computing multiplicity-adjusted $p$-values under the respective global hypothesis.

  • M. Hesse, A. Kyprianou, The total mass of super-Brownian motion upon exiting balls and Sheu's compact support condition, Stochastic Processes and their Applications, 124 (2014) pp. 2003--2022.
    We study the total mass of a d-dimensional super-Brownian motion as it first exits an increasing sequence of balls. The total mass process is a time-inhomogeneous continuous-state branching process, where the increasing radii of the balls are taken as the time-parameter. We characterise its time-dependent branching mechanism and show that it converges, as time goes to infinity, towards the branching mechanism of the total mass of a one-dimensional super-Brownian motion as it first crosses above an increasing sequence of levels.

    Our results identify the compact support criterion in Sheu (1994) as Grey's condition (1974) for the aforementioned limiting branching mechanism.

  • V. Heinrich, T. Kamphans, J. Stange, D. Parkhomchuk, Th. Dickhaus, J. Hecht, P.N. Robinson, P.M. Krawitz, Estimating exome genotyping accuracy by comparing to data from large scale sequencing projects, Genome Medicine, 5 (2013) pp. 69/1--69/11.

  • I. Türbachova, T. Schwachula, I. Vasconcelos, A. Mustea, T. Baldinger, K.A. Jones, H. Bujard, A. Olek, K. Olek, K. Gellhaus, I. Braicu, D. Könsgen, Ch. Fryer, E. Ravot, A. Hellwag, N. Westerfeld, J.O. Gruss, M. Meissner, M. Hassan, M. Weber, U. Hofmüller, S. Zimmermann, Ch. Loddenkemper, S. Mahner, N. Babel, E. Berns, R. Adams, R. Zeilinger, U. Baron, I. Vergote, T. Maughan, F. Marme, Th. Dickhaus, J. Sehouli, S. Olek, The cellular ratio of immune tolerance (immunoCRIT) is a definite marker for aggressiveness of solid tumors and may explain tumor dissemination patterns, Epigenetics, 8 (2013) pp. 1226--1235.

  • M. Schreuder, J. Höhne, B. Blankertz, S. Haufe, Th. Dickhaus, M. Tangermann, Optimizing event-related potential based brain-computer interfaces: A systematic evaluation of dynamic stopping methods, Journal of Neural Engineering, 10 (2013) pp. 036025/1--036025/13.

  • Th. Dickhaus, J. Stange, Multiple point hypothesis test problems and effective numbers of tests for control of the family-wise error rate, Calcutta Statistical Association Bulletin, 65 (2013) pp. 123--144.

  Contributions to Collected Editions

  • J. Blath, B. Eldon, A. González Casanova Soberón, N. Kurt, Genealogy of a Wright--Fisher model with strong seedbank component, in: XI Symposium of Probability and Stochastic Processes, R.H. Mena, J.C. Pardo, V. Rivero, G. Uribe Bravo, eds., vol. 69 of Birkhäuser Progress in Probability, Springer International Publishing, Switzerland, 2015, pp. 81--100.

  • H.-J. Mucha, H.-G. Bartel, Resampling techniques in cluster analysis: Is subsampling better than bootstrapping?, in: Data Science, Learning by Latent Structures, and Knowledge Discovery, B. Lausen, S. Krolak-Schwerdt, M. Böhmer, eds., Studies in Classification, Data Analysis and Knowledge Organization, Springer, Berlin et al., 2015, pp. 113--122.

  Preprints, Reports, Technical Reports

  • L. Avena, O. Gün, M. Hesse, The parabolic Anderson model on the hypercube, Preprint no. 2319, WIAS, Berlin, 2016, DOI 10.20347/WIAS.PREPRINT.2319 .
    We consider the parabolic Anderson model (PAM) on the n-dimensional hypercube with random i.i.d. potentials. We parametrize time by volume and study the solution at the location of the k-th largest potential. Our main result is that, for a certain class of potential distributions, the solution exhibits a phase transition: for short time scales it behaves like a system without diffusion, whereas, for long time scales the growth is dictated by the principal eigenvalue and the corresponding eigenfunction of the Anderson operator, for which we give precise asymptotics. Moreover, the transition time depends only on the difference between the largest and k-th largest potential. One of our main motivations in this article is to investigate the mutation-selection model of population genetics on a random fitness landscape, which is given by the ratio of the solution of PAM to its total mass, with the field corresponding to the fitness landscape. We show that the phase transition of the solution translates to the mutation-selection model as follows: a population initially concentrated at the site of the k-th best fitness value moves completely to the site of the best fitness on time scales where the transition of growth rates happens. The class of potentials we consider includes the Random Energy Model (REM) of statistical physics which is studied as one of the main examples of a random fitness landscape.

  • A. Caiazzo, F. Caforio, G. Montecinos, L.O. Müller, P.J. Blanco, E.F. Toro, Assessment of reduced order Kalman filter for parameter identification in one-dimensional blood flow models using experimental data, Preprint no. 2248, WIAS, Berlin, 2016.
    This work presents a detailed investigation of a parameter estimation approach based on the reduced order unscented Kalman filter (ROUKF) in the context of one-dimensional blood flow models. In particular, the main aims of this study are (i) to investigate the effect of using real measurements vs. synthetic data (i.e., numerical results of the same in silico model, perturbed with white noise) for the estimation and (ii) to identify potential difficulties and limitations of the approach in clinically realistic applications in order to assess the applicability of the filter to such setups. For these purposes, our numerical study is based on the in vitro model of the arterial network described by [Alastruey et al. 2011, J. Biomech. 44], for which experimental flow and pressure measurements are available at few selected locations. In order to mimic clinically relevant situations, we focus on the estimation of terminal resistances and arterial wall parameters related to vessel mechanics (Young's modulus and thickness) using few experimental observations (at most a single pressure or flow measurement per vessel). In all cases, we first perform a theoretical identifiability analysis based on the generalized sensitivity function, comparing then the results obtained with the ROUKF, using either synthetic or experimental data, to results obtained using reference parameters and to available measurements.

  Talks, Poster

  • A. González Casanova Soberón, An individual-based model for the Lenski experiment, and the deceleration of the relative fitness, Workshop on Probabilistic Models in Biology, October 24 - 30, 2015, Playa del Carmen, Mexico, October 28, 2015.

  • A. González Casanova Soberón, Modeling the Lenski experiment, Genealogies in Evolution: Looking Backward and Forward, Workshop of the Priority Program (SPP) 1590 "Probabilistic Structures in Evolution", October 5 - 6, 2015, Goethe-Universität Frankfurt, October 6, 2015.

  • O. Gün, Stochastic encounter-mating model, Mathematical Models in Ecology and Evolution (MMEE 2015), July 7 - 13, 2015, Collège de France, Paris, France, July 8, 2015.

  • M. Hesse, Asymptotic growth of a branching random walk in a random environment on the hypercube, Friedrich-Alexander-Universität Erlangen-Nürnberg, Department Mathematik, December 4, 2014.

  • Th. Dickhaus, Simultaneous Bayesian analysis of contingency tables in genetic association studies, Bayesian Biostatistics 2014, July 2 - 5, 2014, University of Zurich, Switzerland, July 2, 2014.

  • Th. Dickhaus, Simultaneous Bayesian analysis of contingency tables in genetic association studies, International Workshop "Advances in Optimization and Statistics", May 15 - 16, 2014, Russian Academy of Sciences, Institute of Information Transmission Problems (Kharkevich Institute), Moscow, May 15, 2014.

