K-Means Clustering in Dual Space for Unsupervised Feature Partitioning in Multi-view Learning

Conference proceedings contribution

Publication date:

2019

Citation:

K-Means Clustering in Dual Space for Unsupervised Feature Partitioning in Multi-view Learning / C. Mio, G. Gianini, E. Damiani - In: 2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS) / [edited by] G.S. DiBaja, L. Gallo, K. Yetongnon, A. Dipanda, M. CastrillonSantana, R. Chbeir. - [s.l] : IEEE, 2019. - ISBN 9781538693858. - pp. 1-8. Paper presented at the 14th International Conference on Signal Image Technology & Internet Based Systems (SITIS), held in Las Palmas de Gran Canaria in 2018 [10.1109/SITIS.2018.00012].

Abstract:

In contrast to single-view learning, multi-view learning simultaneously trains distinct algorithms on disjoint subsets of features (the views) and jointly optimizes them, so that they come to a consensus. Multi-view learning is typically used when the data are described by a large number of features. It aims at exploiting the different statistical properties of distinct views. A task to be performed before multi-view learning - in the case where the features have no natural groupings - is multi-view generation (MVG): it consists in partitioning the feature set into subsets (views) characterized by some desired properties. Given a dataset in the form of a table with a large number of columns, the desired solution of the MVG problem is a partition of the columns that optimizes an objective function encoding typical requirements. If the class labels are available, one wants to minimize the inter-view redundancy in target prediction and maximize consistency. If the class labels are not available, one simply wants to minimize inter-view redundancy (that is, minimize the information each view has about the others). In this work, we approach the MVG problem in the latter, unsupervised, setting. Our approach is based on the transposition of the data table: the original instance rows are mapped into columns (the 'pseudo-features'), while the original feature columns become rows (the 'pseudo-instances'). The latter can then be partitioned by any suitable standard instance-partitioning algorithm: the resulting groups can be considered as groups of the original features, i.e. views, and thus a solution of the MVG problem. We demonstrate the approach using k-means and the standard benchmark MNIST dataset of handwritten digits.
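The transposition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`transpose`, `kmeans`, `generate_views`), the plain Lloyd-style k-means, the random initialization, and the toy table are all assumptions made for illustration, and the sketch does not cover the consensus clustering or bagging mentioned in the keywords.

```python
import random


def transpose(table):
    """Turn instance rows into columns: each output row is one original feature."""
    return [list(column) for column in zip(*table)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd-style k-means (illustrative; any instance-partitioning method would do)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels


def generate_views(table, n_views, seed=0):
    """Partition feature (column) indices of `table` into `n_views` groups."""
    pseudo_instances = transpose(table)  # one row per original feature
    labels = kmeans(pseudo_instances, n_views, seed=seed)
    views = [[] for _ in range(n_views)]
    for feature_index, label in enumerate(labels):
        views[label].append(feature_index)
    return views


if __name__ == "__main__":
    # Hypothetical toy table: 5 instances (rows) x 4 features (columns).
    toy_table = [
        [0.0, 0.1, 10.0, 10.1],
        [0.0, 0.0, 10.0, 10.0],
        [0.0, 0.1, 10.0, 9.9],
        [0.0, 0.0, 10.0, 10.0],
        [0.0, 0.1, 10.0, 10.0],
    ]
    print(generate_views(toy_table, n_views=2))
```

Each returned group is a list of original column indices, that is, one candidate view. On a real dataset such as MNIST, the table would have one column per pixel, so the pseudo-instances being clustered would be the pixels rather than the images.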

IRIS type:

03 - Contribution in volume

Keywords:

Multi-view learning; k-means; dual space clustering; consensus clustering; bagging

Author list: