September 3, 2018 - September 7, 2018
Experimental or observational data of high or infinite dimensionality are becoming common in institutes of all sections of the Leibniz Association. This creates an increasing demand for adequate modern data analysis techniques. At the same time, the need for reproducibility of experiments and their statistical analyses leads to new requirements for good scientific practice and to calls for open source and open science.
Both topics were addressed in a way that provides knowledge transfer from mathematical and applied statistics into the various scientific communities and helps to develop skills in R programming, statistical modeling and reproducible data analysis. The school was problem-oriented: participants were asked to provide their data and problems in advance (see below).
The focus of the plenary lectures was on three subject areas:
- introduction to the R statistical environment, working with R and RStudio, writing dynamic documents with R Markdown, version control using Git and good scientific practice in general,
- methods and models for dimension reduction, both in classical statistics and when dimensionality is large compared to sample size, and strategies for multiple testing and variable selection,
- an introduction to modern functional data analysis models.
Another essential part of the program was the training sessions in small groups on data problems provided by the participants:
- Clustering of cells - mass cytometry data (problem and data provided by Marie Urbicht from DRFZ),
- Preprocessing and analysis of Raman spectroscopy data (provided by Jing Huang from IPHT),
- Modeling simultaneous measurements of indoor and outdoor particle concentrations (provided by Jiangyue Zhao from TROPOS),
- Sensitivity of high latitude winds to equatorial ionospheric dynamics (provided by Jerry Czarnecki from IAP),
- Air pollution data collected while walking the streets of Leipzig (provided by Honey Dawn Alas from TROPOS).
We thank all the students for their very engaged and active participation. According to their feedback, they were able to appreciably improve their skills in data processing and analysis using R and in doing reproducible research with dynamic documents and version control, and they received further training in interdisciplinary communication with researchers from other disciplines. They gained insights into modern statistical concepts and new ideas about how to proceed in their own research.
Particular thanks go to the speakers:
- Dr. Clara Happ (LMU Munich, AG Biostatistics): Functional Data Analysis
- Dr. Joerg Polzehl (WIAS): Modeling High-dimensional Data; person in charge of the scientific program
- Dr. Heidi Seibold (LMU Munich, Institute for Medical Information Processing, Biometry, and Epidemiology): R, Open Science, Reproducible Research
- Almond Stöcker (LMU Munich, AG Biostatistics): Functional Data Analysis
- Dr. Alexandra Suvorikova (University of Potsdam): Mathematical Statistics, Multiple Testing
Last but not least, we thank the Mathematical Research Institute Oberwolfach (MFO) for providing the venue and for supplying the group with rooms, equipment, coffee and assistance in any form necessary.