A model-based clustering algorithm with covariates adjustment and its application to lung cancer stratification.

Journal of bioinformatics and computational biology(2023)

引用 0|浏览24
暂无评分
摘要
Usually, the clustering process is the first step in several data analyses. Clustering allows identify patterns we did not note before and helps raise new hypotheses. However, one challenge when analyzing empirical data is the presence of covariates, which may mask the obtained clustering structure. For example, suppose we are interested in clustering a set of individuals into controls and cancer patients. A clustering algorithm could group subjects into young and elderly in this case. It may happen because the age at diagnosis is associated with cancer. Thus, we developed CEM-Co, a model-based clustering algorithm that removes/minimizes undesirable covariates' effects during the clustering process. We applied CEM-Co on a gene expression dataset composed of 129 stage I non-small cell lung cancer patients. As a result, we identified a subgroup with a poorer prognosis, while standard clustering algorithms failed.
更多
查看译文
关键词
lung cancer stratification,clustering algorithm,lung cancer,covariates adjustment,model-based
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要