Co-Clustering Multi-View Data Using the Latent Block Model
arXiv (2024)
Abstract
The Latent Block Model (LBM) is a prominent model-based co-clustering method,
returning parametric representations of each block cluster and allowing the use
of well-grounded model selection methods. The LBM, while adapted in the literature
to handle different feature types, cannot be applied to datasets consisting of
multiple disjoint sets of features, termed views, for a common set of
observations. In this work, we introduce the multi-view LBM, extending the LBM
method to multi-view data, where each view marginally follows an LBM. In the
case of two views, the dependence between them is captured by a cluster
membership matrix, and we aim to learn the structure of this matrix. We develop
a likelihood-based approach in which parameter estimation uses a stochastic EM
algorithm integrating a Gibbs sampler, and an ICL criterion is derived to
determine the number of row and column clusters in each view. To motivate the
application of multi-view methods, we extend recent work developing hypothesis
tests for the null hypothesis that clusters of observations in each view are
independent of each other. The testing procedure is integrated into the model
estimation strategy. Furthermore, we introduce a penalty scheme to generate
sparse row clusterings. We verify the performance of the developed algorithm
using synthetic datasets, and provide guidance for optimal parameter selection.
Finally, the multi-view co-clustering method is applied to a complex genomics
dataset, and is shown to provide new insights for high-dimensional multi-view
problems.
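
To make the two-view setup concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes Gaussian blocks with unit variance, two row clusters per view, and a hypothetical 2x2 joint membership matrix `pi`; none of these values come from the paper. Each view marginally follows an LBM, and the dependence between views enters only through `pi`. Under the null hypothesis of independent row clusterings, `pi` would equal the outer product of its row and column marginals.

```python
import numpy as np

def simulate_two_view_lbm(n=60, d1=20, d2=15, seed=0):
    """Simulate two views sharing n observations (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical joint row-cluster membership matrix: pi[k, l] is the
    # probability that an observation falls in row cluster k of view 1
    # and row cluster l of view 2. Mass on the diagonal means the two
    # row clusterings are dependent.
    pi = np.array([[0.4, 0.1],
                   [0.1, 0.4]])

    # Draw the pair of row labels jointly from pi.
    flat = rng.choice(pi.size, size=n, p=pi.ravel())
    z1, z2 = np.unravel_index(flat, pi.shape)

    # Column clusters are drawn separately within each view
    # (assumed: 2 column clusters in view 1, 3 in view 2).
    w1 = rng.integers(0, 2, size=d1)
    w2 = rng.integers(0, 3, size=d2)

    # Block means, one per (row cluster, column cluster) pair in each view.
    mu1 = rng.normal(0.0, 3.0, size=(pi.shape[0], 2))
    mu2 = rng.normal(0.0, 3.0, size=(pi.shape[1], 3))

    # Each cell is Gaussian around the mean of the block it belongs to.
    X1 = mu1[z1][:, w1] + rng.normal(size=(n, d1))
    X2 = mu2[z2][:, w2] + rng.normal(size=(n, d2))
    return X1, X2, z1, z2, w1, w2, pi

X1, X2, z1, z2, w1, w2, pi = simulate_two_view_lbm()

# Under independence, pi would factorise as the outer product of its marginals.
pi_independent = np.outer(pi.sum(axis=1), pi.sum(axis=0))
```

Comparing `pi` with `pi_independent` shows what the independence test targets: here the marginals are both (0.5, 0.5), so the independent matrix has 0.25 in every cell, whereas the simulated `pi` does not.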