Breast Cancer Subtype By Imbalanced Omics Data Through A Deep Learning Fusion Model

PROCEEDINGS OF 2020 10TH INTERNATIONAL CONFERENCE ON BIOSCIENCE, BIOCHEMISTRY AND BIOINFORMATICS (ICBBB 2020)(2020)

Abstract
Breast cancer is a highly heterogeneous disease consisting of subtypes with distinct genetic features and clinical symptoms. Patients with different subtypes respond differently to therapies, so identifying molecular subtypes contributes greatly to precise diagnosis and personalized cancer treatment. PAM50 subtyping is a widely accepted standard in breast cancer classification. The large amount of multi-omics data in public databases such as TCGA contributes greatly to the study of breast cancer subtype identification. However, the imbalance of subtypes among the samples in existing databases makes it difficult to correctly identify subtypes with few samples. In this paper, we propose a novel method to accurately identify PAM50 subtypes from patients' omics profiles in the TCGA database. Based on integrated RNA-seq expression and Copy Number Alteration (CNA) profiles, the proposed method identifies subtype-related patterns with a multi-layer Convolutional Neural Network (CNN). A weighted loss function is applied to alleviate the effects of imbalanced samples, which contributes to accurate identification. We demonstrate that our method identifies the PAM50 subtypes of patients with high precision (90.02%) and outperforms two benchmark methods.
Keywords
Breast cancer subtypes, Classification, Subtype imbalance, Deep learning, Weighted loss function, Multi-omics data
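
The weighted loss function mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting scheme the authors used, so the inverse-frequency weights below are an assumption, as are the function names and the toy subtype labels; this is not the paper's implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight.

    probs is a list of dicts mapping each class to its predicted probability.
    """
    total_weight = sum(weights[y] for y in labels)
    loss = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return loss / total_weight


# Toy, hypothetical label set: three majority samples and one minority sample.
labels = ["LumA", "LumA", "LumA", "Basal"]
weights = inverse_frequency_weights(labels)

good = {"LumA": 0.9, "Basal": 0.1}  # confident "LumA" prediction
bad = {"LumA": 0.1, "Basal": 0.9}   # confident "Basal" prediction

# Same single mistake, once on the minority sample and once on a majority one.
miss_minority = weighted_cross_entropy([good, good, good, good], labels, weights)
miss_majority = weighted_cross_entropy([bad, good, good, bad], labels, weights)
```

With these weights, misclassifying the single minority sample is penalised more heavily than misclassifying one majority sample, which is the effect a class-weighted loss is meant to have on rare subtypes.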