Bachelor/Master Theses, Semester Projects, and DAS DS Capstone Projects

If you are interested in one of the following topics, please contact the person listed with the respective topic.

These projects serve to illustrate the general nature of projects we offer. You are most welcome to inquire directly with Prof. Bölcskei about tailored research projects. Likewise, please contact Prof. Bölcskei in case you are interested in a bachelor thesis project.

Also, we have a list of finished theses on our website.

List of Semester Projects (SP)

List of Master Projects (MA)

List of DAS Data Science Capstone Projects (CP)

Learning in indefinite spaces (MA)

In classical learning theory, a symmetric, positive semidefinite and continuous kernel function is used to construct a reproducing kernel Hilbert space, which serves as a hypothesis space for learning algorithms [1].

However, in many applications the kernel function fails to be positive semidefinite [2], which leads to reproducing kernel Krein spaces with an indefinite inner product [3]. The goal of this project is to develop a theory of learning for reproducing kernel Krein spaces.
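To see indefiniteness numerically, the toy sketch below (illustration only, with arbitrarily chosen data and parameters) builds the Gram matrix of the sigmoid (tanh) kernel, a classical example of a kernel that is in general not positive semidefinite, and inspects its spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))  # 50 points in R^3

# Sigmoid (tanh) kernel: symmetric, but in general not positive semidefinite.
def sigmoid_kernel(X, a=1.0, b=-1.0):
    return np.tanh(a * X @ X.T + b)

K = sigmoid_kernel(X)
eigvals = np.linalg.eigvalsh(K)  # K is symmetric, so eigvalsh applies
print("smallest eigenvalue:", eigvals.min())  # negative -> indefinite Gram matrix
```

A Gram matrix with both positive and negative eigenvalues is exactly the situation in which the associated function space carries a Krein rather than a Hilbert space structure.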

Type of project: 100% theory
Prerequisites: Strong mathematical background, measure theory, functional analysis
Supervisor: Erwin Riegler
Professor: Helmut Bölcskei

[1] F. Cucker and D. X. Zhou, "Learning Theory," ser. Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, 2007.

[2] R. Luss and A. d’Aspremont, "Support vector machine classification with indefinite kernels," Mathematical Programming Computation, vol. 1, no. 2-3, pp. 97–118, Oct. 2009.

[3] A. Gheondea, "Reproducing kernel Krein spaces," Chapter 14 in D. Alpay, Operator Theory, Springer, 2015.

Neural collapse (SP/MA)

Recent experiments show that the last-layer features of prototypical neural networks trained by stochastic gradient descent on common classification datasets favor certain symmetric and geometric patterns. Moreover, the networks develop towards these patterns even when training is continued after the training error has already been driven to zero [1]. For example, the individual last-layer features collapse to their class means.

These patterns, observed in empirical network training, can help explain the generalization ability of deep networks. The phenomenon is called "neural collapse" and constitutes a type of inductive/implicit bias, which has been studied mathematically for linear networks trained by gradient descent on classification datasets [2, 3]. The goal of this project is to develop a general mathematical theory of the "neural collapse" phenomenon.
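The within-class variability collapse (NC1 in [1]) can be quantified with a few lines of code. The sketch below uses synthetic features standing in for a trained network's last-layer activations (the class means and noise scale are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "last-layer features": 3 classes, 100 samples each, drawn tightly
# around hypothetical class means, mimicking a collapsed configuration.
C, n, d = 3, 100, 5
means = 5.0 * rng.standard_normal((C, d))
features = np.concatenate(
    [means[c] + 0.1 * rng.standard_normal((n, d)) for c in range(C)]
)
labels = np.repeat(np.arange(C), n)

# Within-class variability relative to between-class variability:
# neural collapse drives this ratio towards zero during training.
global_mean = features.mean(axis=0)
class_means = np.stack([features[labels == c].mean(axis=0) for c in range(C)])
within = np.mean([(features[labels == c] - class_means[c]).var() for c in range(C)])
between = ((class_means - global_mean) ** 2).mean()
print("within/between ratio:", within / between)  # small -> collapse to class means
```

On features from a real network one would track this ratio over training epochs, also after the training error has reached zero.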

Type of project: 80% theory, 20% simulation
Prerequisites: Strong mathematical background and programming skills
Supervisor: Weigutian Ou
Professor: Helmut Bölcskei

[1] V. Papyan, X. Y. Han, and D. L. Donoho, “Prevalence of neural collapse during the terminal phase of deep learning training,” Proceedings of the National Academy of Sciences, vol. 117, no. 40, pp. 24652–24663, 2020. [Link to Document]

[2] D. Soudry, E. Hoffer, M. S. Nacson, S. Gunasekar, and N. Srebro, “The implicit bias of gradient descent on separable data,” The Journal of Machine Learning Research, vol. 19, no. 1, pp. 2822–2878, 2018. [Link to Document]

[3] Z. Ji and M. J. Telgarsky, “Gradient descent aligns the layers of deep linear networks,” Proceedings of the 7th International Conference on Learning Representations, ICLR, 2019. [Link to Document]

Dynamic prediction model discovery/development for recovery after stroke (MA/CP/SP)

With stroke being the third most frequent cause of disease burden worldwide [1], one major challenge is to understand poststroke recovery in order to provide patient-tailored rehabilitation. Many prediction models for stroke outcome are available [2], but dynamic models for the course of recovery are scarce [3]. These are, however, essential for designing stratified rehabilitation interventions and discharge planning.

In the Neurorehabilitation Clinic of the Luzerner Kantonsspital, the patients' ability to perform daily activities is assessed on a weekly basis using the Lucerne ICF Based Multidisciplinary Observation Scale (LIMOS) [4, 5]. This standardized scale consists of 45 items (score range 45-225), divided into seven domains: (1) Learning and applying knowledge, (2) General tasks and demands, (3) Communication, (4) Mobility, (5) Self-care, (6) Domestic life, and (7) Interpersonal interactions and relationships. The assessments are part of the clinic's internal database, which currently contains longitudinal data of about 1'300 stroke patients.

The main goal of this thesis project is to employ signal processing and machine learning methods to discover/develop a dynamic prediction model for recovery of daily activities as assessed by the LIMOS scale in subacute stroke patients, admitted to the rehabilitation clinic at the Luzerner Kantonsspital. The secondary aim is to investigate the recovery profiles within each of the seven domains and to understand how they relate to one another by using graphical network models from statistics [6].

The signal processing methods employed in this project range from multiple time-series analysis [7, 8] and clustering [9] to, potentially, recurrent neural networks for dynamic prediction model discovery [10].
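As a minimal baseline for what "dynamic prediction" means here, the sketch below fits a first-order autoregressive predictor to synthetic weekly scores. The recovery-curve model and all parameters are invented for illustration; the real project would work with the clinic's LIMOS database:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical weekly LIMOS-like total scores (range 45-225): noisy
# saturating recovery curves, as a stand-in for real longitudinal data.
weeks = np.arange(12)

def synthetic_patient():
    start = rng.uniform(60, 120)
    gain = rng.uniform(40, 90)
    scores = start + gain * (1 - np.exp(-weeks / 4)) + rng.normal(0, 3, weeks.size)
    return np.clip(scores, 45, 225)

patients = np.stack([synthetic_patient() for _ in range(200)])

# AR(1) predictor: score_{t+1} ~ a * score_t + b, fitted by least squares
# pooled over all patients and weeks.
x = patients[:, :-1].ravel()
y = patients[:, 1:].ravel()
A = np.stack([x, np.ones_like(x)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = np.sqrt(np.mean((a * x + b - y) ** 2))
print(f"AR(1) one-week-ahead RMSE: {rmse:.1f} LIMOS points")
```

The project would go well beyond such a pooled linear model, e.g. towards patient-specific dynamics, per-domain profiles, and graphical models relating the seven domains.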

This is a collaborative project between ETH and the neurorehabilitation expert team of the Luzerner Kantonsspital. It offers the student the opportunity to learn about graphical model discovery, to interact closely with clinicians and experts in the field of neurorehabilitation, and to gain insight into how data in the clinic are collected and used for stroke treatment.

Type of project: 20% theory, 60% implementation/programming, 20% model development
Prerequisites: Good background in signal processing and statistics, programming skills
Supervisor: Clemens Hutter, Dr. Janne Veerbeek, PD Dr. Tim Vanbellingen, Prof. Dr. med. Thomas Nyffeler
Professor: Helmut Bölcskei

[1] GBD 2013 DALYs and HALE Collaborators, C. Murray, R. Barber, K. Foreman, A. Abbasoglu Ozgoren, F. Abd-Allah, et al., "Global, regional, and national disability-adjusted life years (DALYs) for 306 diseases and injuries and healthy life expectancy (HALE) for 188 countries, 1990–2013: Quantifying the epidemiological transition," Lancet, vol. 386, no. 10009, pp. 2145–91, 2015, doi: 10.1016/S0140-6736(15)61340-X. [Link to Document]

[2] J. Veerbeek, G. Kwakkel, E. van Wegen, J. Ket, and M. Heymans, "Early prediction of outcome of activities of daily living after stroke: A systematic review," Stroke, vol. 42, no. 5, pp. 1482–8, 2011, doi: 10.1161/STROKEAHA.110.604090. [Link to Document]

[3] R. Selles, E. Andrinopoulou, R. Nijland, R. van der Vliet, J. Slaman, E. van Wegen, D. Rizopoulos, G. Ribbers, C. Meskers, and G. Kwakkel, "Computerised patient-specific prediction of the recovery profile of upper limb capacity within stroke services: The next step," J. Neurol. Neurosurg. Psychiatry, vol. jnnp-2020-324637, 2021, doi: 10.1136/jnnp-2020-324637. [Link to Document]

[4] B. Ottiger, T. Vanbellingen, C. Gabriel, E. Huberle, M. Koenig-Bruhin, T. Pflugshaupt, S. Bohlhalter, and T. Nyffeler, "Validation of the new Lucerne ICF based Multidisciplinary Observation Scale (LIMOS) for stroke patients," PLoS One, 2015;10(6):e0130925. doi: 10.1371/journal.pone.0130925. [Link to Document]

[5] T. Vanbellingen, B. Ottiger, T. Pflugshaupt, J. Mehrholz, S. Bohlhalter, T. Nef, and T. Nyffeler, "The responsiveness of the Lucerne ICF-Based Multidisciplinary Observation Scale: A comparison with the Functional Independence Measure and the Barthel Index," Front. Neurol., vol. 7, no. 152, 2016, doi: 10.3389/fneur.2016.00152. [Link to Document]

[6] P. L. Loh and P. Bühlmann, "High-dimensional learning of linear causal networks via inverse covariance estimation," The Journal of Machine Learning Research, pp. 3065–3105, Jan. 2014. [Link to Document]

[7] M. Eichler, "A frequency-domain based test for non-correlation between stationary time series," Metrika, vol. 65, no. 2, pp. 133–157, Feb. 2007. [Link to Document]

[8] A. Jung, R. Heckel, H. Bölcskei, and F. Hlawatsch, "Compressive nonparametric graphical model selection for time series," Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 769–773, May 2014. [Link to Document]

[9] M. Tschannen and H. Bölcskei, "Robust nonparametric nearest neighbor random process clustering," IEEE Transactions on Signal Processing, vol. 65, no. 22, pp. 6009–6023, Nov. 2017. [Link to Document]

[10] C. Hutter, R. Gül, and H. Bölcskei, "Metric entropy limits on recurrent neural network learning of linear dynamical systems," Applied and Computational Harmonic Analysis, Apr. 2021, submitted. [Link to Document]

Automatic synopsis generation from amendment proposals for German law (MA)

Changes to German law are proposed in the form of amendments, which contain natural language instructions on how to change individual words or sentences within the current law (see [1] for an example). For laypeople, it is difficult to infer from such proposals the text of the law after the amendment is accepted, thus reducing the ability of the general public to participate in the legislative process [2]. The goal of this project is to develop a machine learning algorithm that reads the current version of the law as well as the proposed amendment and then produces the new version of the law. This will make it possible to automatically generate a synopsis that compares the previous and proposed versions (see [3] for an example).

Recently, significant advances in machine translation and question answering have been made using transformer models pretrained on large unsupervised datasets [4, 5, 6]. Machine learning solutions for the specific task at hand have, however, not been studied previously, so significant new contributions will be required. In particular, the semi-structured nature of amendments might make it necessary to incorporate a copy mechanism [7, 8, 9]. In this project, you will have the opportunity both to make novel contributions to the field of natural language processing and to develop a working algorithm that can be deployed online and used by the general public.
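To make the task concrete, here is a purely rule-based toy baseline that applies a single substitution instruction to a sentence. The instruction pattern and the example sentences are invented; real amendments are far more varied, which is precisely why a learned model with a copy mechanism is of interest:

```python
import re

# Invented example text and instruction, mimicking the typical
# 'Das Wort "X" wird durch das Wort "Y" ersetzt.' substitution pattern.
law = 'In § 3 Absatz 1 wird die Frist auf zwei Wochen festgelegt.'
amendment = 'Das Wort "zwei" wird durch das Wort "vier" ersetzt.'

# Parse the instruction and apply it to the current text.
m = re.search(r'Das Wort "(.+?)" wird durch das Wort "(.+?)" ersetzt\.', amendment)
old, new = m.group(1), m.group(2)
amended = law.replace(old, new)
print(amended)
```

A hand-written rule set of this kind breaks down on free-form or compound instructions; the learned model would instead read both texts end-to-end and emit the amended version, copying unchanged spans verbatim.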

Type of project: 70% implementation/programming, 30% model development
Prerequisites: Experience with deep learning for NLP, knowledge of German
Supervisor: Clemens Hutter, Joseph Rumstadt
Professor: Helmut Bölcskei

[1] "Gesetz zur Modernisierung des notariellen Berufsrechts und zur Änderung weiterer Vorschriften." [Link to Document]

[2] F. Herbert, "Verfassungsblog: On matters constitutional," 2021, doi: 10.17176/20210305-033813-0. [Link to Document]

[3] "Synopse: Gesetz zur Modernisierung des notariellen Berufsrechts und zur Änderung weiterer Vorschriften." [Link to Document]

[4] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," Advances in Neural Information Processing Systems, pp. 5999–6009, 2017. [Link to Document]

[5] A. Radford, T. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," Preprint, pp. 1–12, 2018. [Link to Document]

[6] J. Devlin, M. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, vol. 1, pp. 4171–4186, 2019. [Link to Document]

[7] J. Gu, Z. Lu, H. Li, and V. Li, "Incorporating copying mechanism in sequence-to-sequence learning," 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, vol. 3, pp. 1631–1640, 2016, doi: 10.18653/v1/p16-1154. [Link to Document]

[8] A. See, P. Liu, and C. Manning, "Get to the point: Summarization with pointer-generator networks," ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), vol. 1, pp. 1073–1083, 2017, doi: 10.18653/v1/P17-1099. [Link to Document]

[9] B. McCann, N. Keskar, C. Xiong, and R. Socher, "The natural language decathlon: Multitask learning as question answering." [Link to Document]

Acoustic sensing and trajectory estimation of objects flying at supersonic speed (with industry) (SP)

In shooting sports, hunting, and law-enforcement applications, measuring the speed and trajectory of projectiles with high precision and reliability is an important technical challenge. For supersonic projectiles, these quantities are estimated from signals acquired by microphones placed at different locations. Recently, more powerful microprocessors have made it possible to employ more sophisticated algorithms.

The goal of this project is to investigate techniques such as the linearization of nonlinear systems of equations, least-squares fitting, and neural-network-based machine learning. Existing hardware and algorithms provide an ideal starting point for the project, which will be carried out in collaboration with the industry partner SIUS (located in Effretikon, Zurich). SIUS offers close supervision as well as the possibility to use real hardware and a test laboratory.
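The flavor of the estimation problem can be sketched with a simplified model: localizing a point acoustic source from arrival-time differences at several microphones via nonlinear least squares. All geometry and noise figures below are invented, and the shockwave of a supersonic projectile is actually a Mach cone rather than a point source, so this is only a conceptual warm-up:

```python
import numpy as np
from scipy.optimize import least_squares

c = 343.0  # speed of sound [m/s]
mics = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 2]])  # mic positions [m]
true_src = np.array([0.3, 0.7])  # ground-truth source position [m]

# Simulated arrival times with small timing jitter.
rng = np.random.default_rng(3)
t = np.linalg.norm(mics - true_src, axis=1) / c + rng.normal(0, 1e-6, len(mics))

# Least squares on time DIFFERENCES relative to microphone 0, which
# eliminates the unknown emission time of the acoustic event.
def residuals(p):
    d = np.linalg.norm(mics - p, axis=1) / c
    return (d - d[0]) - (t - t[0])

sol = least_squares(residuals, x0=np.array([0.5, 0.5]))
print("estimated source:", sol.x)
```

Linearizing the same equations around an initial guess yields the classical iterative (Gauss-Newton-type) solvers mentioned in the project description.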

About the industry partner: SIUS is the world’s leading manufacturer of electronic scoring systems in shooting sports. The company is specialized in producing high speed and high precision measurement equipment capable of measuring projectile position and trajectory and has been equipping the most important international competitions including the Olympic Games for decades.

Type of project: 20% literature research, 20% theory, 50% implementation/programming, 10% experiments
Prerequisites: Solid mathematical background, knowledge of SciPy, Matlab, or a similar toolset, ideally knowledge of (deep) neural networks
Supervisor: Michael Lerjen, Steven Müllener
Professor: Helmut Bölcskei

[1] SIUS Homepage [Link to Document]

Double descent curves in machine learning (MA)

Classical machine learning theory suggests that the generalization error follows a U-shaped curve as a function of the model complexity [1, Sec. 2.9]. When too few parameters are used to train the model, the generalization error is high due to underfitting; too many parameters result in overfitting and hence again in a large generalization error. There is a sweet spot at the bottom of the U-shaped curve. During the past few years, it was observed that increasing the model complexity beyond the so-called interpolation threshold leads to a generalization error that starts decreasing again [2]; the overall generalization error hence follows a so-called double descent curve. To date, there are only experimental results indicating the double descent behavior, and these experiments employ vastly different complexity measures and learning algorithms. The goal of this project is to first understand the experiments reported in the literature. You will then study the theory of metric entropy [3] and try to understand whether, and if so under which learning algorithms, a double descent curve appears when model complexity is measured in terms of metric entropy.

Type of project: 70% theory, 30% simulation
Prerequisites: Programming skills and knowledge in machine learning
Supervisor: Weigutian Ou
Professor: Helmut Bölcskei

[1] J. Friedman, T. Hastie, and R. Tibshirani, "The elements of statistical learning," Springer Series in Statistics, vol. 1, Springer, New York, 2001.

[2] M. Belkin, D. Hsu, S. Ma, and S. Mandal, "Reconciling modern machine-learning practice and the classical bias–variance trade-off," Proceedings of the National Academy of Sciences, 116(32):15849–15854, 2019. [Link to Document]

[3] D. Elbrächter, D. Perekrestenko, P. Grohs, and H. Bölcskei, "Deep neural network approximation theory," IEEE Transactions on Information Theory, vol. 67, no. 5, pp. 2581–2623, May 2021. [Link to Document]

Deep ReLU network approximation rates (MA)

The compositional nature of deep neural networks allows for a systematic constructive approach to establishing good approximation rates for a wide range of classically considered function classes [1, 2, 3].

The goal of this project is to understand the techniques used in [2] and [3] and to subsequently employ them to characterize approximation rates for Daubechies wavelets. (If so inclined, one could also choose some other interesting class of functions.)
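A cornerstone of the constructive approach in [1, 2] is the approximation of x² on [0, 1] by subtracting composed "sawtooth" functions, each realizable with a few ReLU layers. The sketch below verifies the construction numerically (the depth of 7 levels is an arbitrary illustrative choice):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def hat(x):
    # Triangle wave on [0, 1]: 2x on [0, 1/2], 2(1 - x) on [1/2, 1],
    # written as a linear combination of three ReLUs.
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1)

x = np.linspace(0, 1, 1001)
approx = x.copy()      # f_0(x) = x
g = x.copy()
for m in range(1, 8):  # g_m = hat(g_{m-1}): sawtooth with 2^(m-1) teeth
    g = hat(g)
    approx = approx - g / 4 ** m
err = np.max(np.abs(approx - x ** 2))
print("sup-norm error with 7 sawtooth levels:", err)
```

The error decays like 4^(-m) in the number of levels m, i.e., exponentially in network depth; extending such exponential-rate building blocks to Daubechies wavelets is the substance of the project.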

Type of project: 100% theory
Prerequisites: Strong mathematical background
Supervisor: Dennis Elbrächter
Professor: Helmut Bölcskei

[1] D. Yarotsky, "Error bounds for approximations with deep ReLU networks," Neural Networks, vol. 94, pp. 103–114, 2017. [Link to Document]

[2] D. Elbrächter, D. Perekrestenko, P. Grohs, and H. Bölcskei, "Deep neural network approximation theory," IEEE Transactions on Information Theory, vol. 67, no. 5, pp. 2581–2623, May 2021. [Link to Document]

[3] I. Daubechies, R. A. DeVore, N. Dym, S. Faigenbaum-Golovin, S. Z. Kovalsky, K.-C. Lin, J. Park, G. Petrova, and B. Sober, "Neural network approximation of refinable functions," arXiv:2105.12806, 2021. [Link to Document]

On the metric entropy of dynamical systems (MA/SP)

The aim of this project is to explore the metric complexity of dynamical systems, i.e., to identify how much information about a system's input-output behavior is needed to describe the system dynamics to within a prescribed accuracy. In particular, you will study the asymptotics of the ε-entropy in the Kolmogorov sense [1, 2] of a certain class of causal linear systems [3]. Based on these results, you will try to develop a general theory that encompasses wider classes of dynamical systems, including time-varying systems [4] and nonlinear systems [5].
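For reference, the central notion from [1, 2] can be stated as follows: for a compact subset K of a metric space (X, d), the covering number and the ε-entropy are

```latex
N_\varepsilon(K) = \min\left\{ n \in \mathbb{N} : \exists\, x_1, \dots, x_n \in X
\ \text{with}\ K \subseteq \bigcup_{i=1}^{n} \mathcal{B}(x_i, \varepsilon) \right\},
\qquad
H_\varepsilon(K) = \log_2 N_\varepsilon(K),
```

where B(x, ε) denotes the ball of radius ε centered at x. The project concerns the asymptotic behavior of H_ε(K) as ε → 0 when K is a class of input-output maps of dynamical systems.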

Type of project: 100% theory
Prerequisites: Strong mathematical background
Supervisor: Diyora Salimova
Professor: Helmut Bölcskei

[1] A. N. Kolmogorov, "On certain asymptotic characteristics of completely bounded metric spaces," Doklady Akademii Nauk SSSR, vol. 108, no. 3, pp. 385–389, 1956.

[2] A. N. Kolmogorov and V. M. Tikhomirov, "ε-entropy and ε-capacity of sets in functional spaces," in Uspekhi Matematicheskikh Nauk, vol. 14, no. 2, pp. 3–86, 1959.

[3] G. Zames, "On the metric complexity of causal linear systems: ε-entropy and ε-dimension for continuous time," IEEE Transactions on Automatic Control, vol. 24, no. 2, pp. 222–230, 1979. [Link to Document]

[4] G. Matz, H. Bölcskei, and F. Hlawatsch, "Time-frequency foundations of communications," IEEE Signal Processing Magazine, vol. 30, no. 6, pp. 87–96, 2013. [Link to Document]

[5] M. Schetzen, "Nonlinear system modeling based on the Wiener theory," Proceedings of the IEEE, vol. 69, no. 12, pp. 1557–1573, 1981. [Link to Document]

Concentration of measure phenomena in machine learning (MA)

Things tend to get weird in high dimensions [1, 2]. We would like to understand why and how.

One example of this weirdness is the observation that, with increasing dimension, Lipschitz-continuous functions become almost constant on inputs sampled according to so-called isoperimetric measures (e.g., the Gaussian distribution). In [3], the problem of fitting noisy data, consisting of n data points in d-dimensional space sampled according to an isoperimetric measure, to an error below the noise level is considered. It is shown that the number of parameters of stably parametrized and (Lipschitz-)robust solutions must scale at least like nd.

The goal of this project is to understand the arguments in [3] and to subsequently establish either a variation of these results or, ideally, a novel result based on a concentration of measure phenomenon [1].
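Gaussian concentration can be observed directly in simulation. The sketch below (parameters chosen for illustration) tracks the fluctuations of the normalized Euclidean norm, a (1/√d)-Lipschitz function of a standard Gaussian vector, across dimensions:

```python
import numpy as np

rng = np.random.default_rng(5)

# The normalized norm x -> ||x||_2 / sqrt(d) concentrates ever more tightly
# around 1 as the dimension d grows: Lipschitz functions of Gaussian inputs
# are "almost constant" in high dimension.
stds = []
for d in [10, 100, 1000]:
    X = rng.standard_normal((5000, d))
    f = np.linalg.norm(X, axis=1) / np.sqrt(d)
    stds.append(f.std())
    print(f"d = {d:4d}  mean = {f.mean():.3f}  std = {f.std():.4f}")
```

The standard deviation shrinks like 1/√d, which is a quantitative instance of the isoperimetric concentration underlying the lower bound of [3].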

Type of project: 100% theory
Prerequisites: Strong mathematical background
Supervisor: Dennis Elbrächter
Professor: Helmut Bölcskei

[1] A. S. Bandeira, A. Singer, and T. Strohmer, "Mathematics of data science (draft)." [Link to Document]

[2] R. Vershynin, “High-dimensional probability: An introduction with applications in data science" (Cambridge series in statistical and probabilistic mathematics), Cambridge University Press, 2018. [Link to Document]

[3] S. Bubeck and M. Sellke, "A universal law of robustness via isoperimetry," arXiv:2105.12806, 2021. [Link to Document]