Hubness reduction improves clustering and trajectory inference in single-cell transcriptomic data

Bioinformatics (2022)

Abstract
Motivation: Single-cell RNA-seq (scRNAseq) datasets are characterized by large ambient dimensionality, and their analysis can be affected by various manifestations of the curse of dimensionality. One of these manifestations is the hubness phenomenon, i.e. the existence of data points with a surprisingly large incoming connectivity degree in the data-point neighbourhood graph. The conventional approach to dampening the unwanted effects of high dimensionality consists of applying drastic dimensionality reduction. It remains unexplored whether this step can be avoided, thus retaining more information than is contained in low-dimensional projections, by correcting hubness directly.

Results: We investigated hubness in scRNAseq data. We show that hub cells do not represent any visible technical or biological bias. We investigate the effect of various hubness reduction methods on clustering, trajectory inference and visualization tasks in scRNAseq datasets. We show that hubness reduction generates neighbourhood graphs with properties more suitable for applying machine learning methods, and that it outperforms other state-of-the-art methods for improving neighbourhood graphs. As a consequence, clustering, trajectory inference and visualization perform better, especially for datasets characterized by large intrinsic dimensionality. Hubness is an important phenomenon characterizing data-point neighbourhood graphs computed for various types of sequencing datasets. Reducing hubness can be beneficial for the analysis of scRNAseq data with large intrinsic dimensionality, in which case it can be an alternative to drastic dimensionality reduction.
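
The notion of hubness described in the abstract can be made concrete with a small sketch. It counts, for each data point, how many other points list it among their k nearest neighbours (its incoming degree, often called the k-occurrence), and summarizes the distribution of these counts by its skewness, a commonly used indicator of hubness. This is an illustration only, not the authors' pipeline: the Euclidean metric, the brute-force neighbour search, the random Gaussian data and the function names `k_occurrence`, `skewness` and `random_points` are all assumptions made for this example.

```python
import random


def k_occurrence(points, k):
    """For each point, count how many other points have it among their k nearest neighbours."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Squared Euclidean distance from point i to every other point
        # (assumed metric; ties are broken by the lower index).
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            counts[j] += 1  # point j gains one incoming edge
    return counts


def skewness(values):
    """Population skewness (third standardized moment); 0.0 for constant input."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def random_points(n, dim, seed):
    """Hypothetical toy data: n points drawn from a standard Gaussian in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    # Compare the in-degree distribution at low and high ambient dimension.
    for dim in (2, 100):
        counts = k_occurrence(random_points(200, dim, seed=0), k=10)
        print(f"dim={dim}: max in-degree={max(counts)}, skewness={skewness(counts):.2f}")
```

In a k-nearest-neighbour graph every point has exactly k outgoing edges, so the in-degrees always sum to n times k; hubness shows up as a few points absorbing a disproportionate share of those edges, which makes the in-degree distribution right-skewed. For real scRNAseq data the toy generator would be replaced by the expression matrix, and a library neighbour search would be preferable to the quadratic loop used here.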