A New Binary Biclustering Algorithm Based on Weight Adjacency Difference Matrix for Analyzing Gene Expression Data.

IEEE/ACM transactions on computational biology and bioinformatics(2023)

引用 1|浏览9
Biclustering algorithms are essential for processing gene expression data. However, to process the dataset, most biclustering algorithms require preprocessing the data matrix into a binary matrix. Regrettably, this type of preprocessing may introduce noise or cause information loss in the binary matrix, which would reduce the biclustering algorithm's ability to effectively obtain the optimal biclusters. In this paper, we propose a new preprocessing method named Mean-Standard Deviation (MSD) to resolve the problem. Additionally, we introduce a new biclustering algorithm called Weight Adjacency Difference Matrix Biclustering (W-AMBB) to effectively process datasets containing overlapping biclusters. The basic idea is to create a weighted adjacency difference matrix by applying weights to a binary matrix that is derived from the data matrix. This allows us to identify genes with significant associations in sample data by efficiently identifying similar genes that respond to specific conditions. Furthermore, the performance of the W-AMBB algorithm was tested on both synthetic and real datasets and compared with other classical biclustering methods. The experiment results demonstrate that the W-AMBB algorithm is significantly more robust than the compared biclustering methods on the synthetic dataset. Additionally, the results of the GO enrichment analysis show that the W-AMBB method possesses biological significance on real datasets.
Biclustering,gene expression data,weight matrix,binary matrix
AI 理解论文
Chat Paper