Statistical Learning from Dependent Data: Learning Theory, Robust Algorithms and Applications

Machine learning is one of the key technologies for the thorough analysis of empirical data. One of the most common assumptions in machine learning is that the empirical data are realizations of independent random variables. In practice, however, this assumption can be violated when the data exhibit temporal and spatial dependencies or are recorded under varying experimental conditions or in the presence of confounding factors. With this research program we work toward a theoretically sound and general framework for statistical learning from dependent data. At its heart lies the development of novel algorithms that enable learning in particular cases of these settings, together with their application to problems from the sciences and technology. A particular emphasis of the program is on understanding the theoretical foundations of learning in dependent settings, in order to explain under which circumstances the algorithms work well. All algorithms are embedded in a framework for the automatic and sound interpretation of the trained models in terms of p-values, in order to facilitate further analysis by domain experts.

Duration of project

Start date: 03/2015

End date: 02/2018