Publication date:
2007
Citation:
Combinatorial mixtures of multiparameter distributions / V. Edefonti, G. Parmigiani - In: ISI 2007 : 56. Session of the International Statistical Institute : 22-29 August 2007 Lisboa, Portugal : Book of Abstracts / [edited by] M.I. Gomez, D. Pestana, P. Silva. - Lisboa : CEAUL, 2007. - ISBN 978-972-8859-71-8. - pp. 279-280 (Paper presented at the 56th Session of the International Statistical Institute, held in Lisboa (Portugal) in 2007.)
Abstract:
Combinatorial mixtures are a flexible class of models for inference on mixture distributions whose components have multidimensional parameters. The underlying idea is to allow each element of a component-specific parameter vector to be shared by a subset of the other components. We develop Bayesian inference and computation approaches for this class of distributions. We define a general prior distribution structure in which positive probability is placed on every possible combination of sharing patterns, whence the name combinatorial mixtures. This partial sharing allows for greater generality and flexibility than traditional approaches to mixture modeling, while still allowing significant mass to be assigned to models that are more parsimonious than the general mixture case in which no sharing takes place. One implication of our setting is that, once a maximum number of components K∗ is specified, inference on the parameters and on the number of components, say K, is subsumed by inference on the combinatorial patterns.
We illustrate our combinatorial mixtures in an application based on the normal model. This work was originally motivated by the analysis of cancer subtypes: in terms of biological measures of interest, subtypes may be characterized by differences in location, scale, correlations, or any combination of these.
We use data on the molecular classification of lung cancer from the web-based information supporting the published manuscript Garber et al. (2001). In this context, the main goals of a mixture model analysis are to a) estimate the number of subgroups in a sample; b) make inferences about the assignment of samples to these subgroups; and c) generate hypotheses about which of the mechanisms above is likely to characterize the subgroups. Our paper adds a new tool to Bayesian mixture models, one that addresses all three of these questions.
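The notion of a sharing pattern can be made concrete with a small enumeration. The sketch below is an illustration only and is not taken from the abstract: it assumes a univariate normal model with two parameter elements (mean and variance), a maximum of K∗ = 3 components, and equal weight on every pattern. The abstract states only that each pattern receives positive prior probability, so the uniform weighting, like the names `set_partitions`, `labels`, `distinct_components`, `K_MAX` and `ELEMENTS`, is a hypothetical choice made here.

```python
from collections import Counter
from itertools import product


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def labels(partition, n):
    """Map each component index 0..n-1 to the index of its block."""
    out = [None] * n
    for block_id, block in enumerate(partition):
        for comp in block:
            out[comp] = block_id
    return tuple(out)


def distinct_components(pattern, n):
    """Count components whose full parameter vectors differ under `pattern`."""
    per_element = [labels(p, n) for p in pattern]
    # Two components coincide only if they share every parameter element.
    return len(set(zip(*per_element)))


K_MAX = 3                        # assumed maximum number of components
ELEMENTS = ("mean", "variance")  # assumed parameter elements

# One partition of the components per parameter element = one sharing pattern.
partitions = list(set_partitions(list(range(K_MAX))))
patterns = list(product(partitions, repeat=len(ELEMENTS)))

# Number of patterns that induce each value of K (uniform weighting assumed).
prior_on_K = Counter(distinct_components(p, K_MAX) for p in patterns)
```

Under these assumptions there are 5 partitions per parameter element and 25 patterns in total, of which 1, 9 and 15 induce K = 1, 2 and 3 distinct components respectively. This shows in miniature how a distribution over patterns also determines a distribution over K.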
IRIS type:
03 - Contribution in volume
Keywords:
Bayesian inference ; Markov Chain Monte Carlo ; Clustering
Author list:
V. Edefonti, G. Parmigiani
Book title:
ISI 2007 : 56. Session of the International Statistical Institute : 22-29 August 2007 Lisboa, Portugal : Book of Abstracts