Publication date:
2021
Citation:
ASPECTS OF DATA STRUCTURE IN MACHINE LEARNING / V. Erba ; supervisor: S. Caracciolo ; coordinator: M. Paris. Dipartimento di Fisica Aldo Pontremoli, 2021 Oct 21. 34th cycle, Academic Year 2021. [10.13130/erba-vittorio_phd2021-10-21].
Abstract:
It is widely believed that understanding data structure is a crucial ingredient in advancing our comprehension of how (and why) modern machine learning works.
Still, most of the theoretical results we have are obtained under strongly simplifying assumptions on the structure of the training data.
In this Thesis, I review some novel results on the problem of characterizing the geometric structure of datasets and the consequences that this structure has on learning algorithms.
I also provide pedagogical introductions to manifold learning, random geometric graph theory and supervised binary classification.
I focus on three different aspects of the problem.
First, I spend some time reviewing techniques to characterize the intrinsic dimensionality of datasets: this is the first "experimental" step towards proper theoretical modelling of data.
Then, I focus on the problem of finding null models of data in high dimension: does Euclidean structure survive as the dimensionality of the data grows larger and larger?
Finally, I study how geometric data structure alters the expressive potential of simple classifiers.
IRIS type:
Doctoral thesis
Author list:
V. Erba
Link to the full record: