Family-Specific Training Improves Linear B Cell Epitope Prediction for Emerging Viruses

Ran Liu, Ye-Fan Hu, Jin Du, Bao-Zhong Zhang, Thomas Yau, Xiaodan Fan, Jian-Dong Huang

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Abstract

The rational design of vaccines and antibody-based therapeutics against newly emerging viruses relies mainly on B cell epitopes. Several algorithms have been developed to predict the B cell epitopes of a novel virus. Most existing algorithms are trained on a dataset in which B cell epitopes are classified as 'Positive' or 'Negative'. However, we found that training on such data contaminates the target pattern of specific viruses, leading to inaccurate predictions in some cases. In this paper, we introduce a novel framework for predicting linear B cell epitopes of novel viruses using training data drawn exclusively from highly similar viruses. We employed kernel regression based on seropositive rates, that is, the percentage of seropositive samples in the population, to predict potential epitopes. To assess our method, we conducted simulations and used two real-world datasets. Our method significantly outperformed existing methods on the testing data of four viruses with seropositive rates. Our strategy also gave better predictions on a larger dataset from the IEDB. Thus, we establish a novel framework that provides better linear B cell epitope prediction for newly emerging viruses, which will benefit the rational design of vaccines and antibody-based therapeutics in the future.
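As a rough illustration of the kernel regression idea mentioned in the abstract, the sketch below computes a Nadaraya-Watson estimate of a peptide's seropositive rate as a kernel-weighted mean over training peptides from related viruses. The abstract does not specify the kernel, the features, or the bandwidth, so everything here is an assumption: the Gaussian kernel over Hamming distance, the `bandwidth` value, the function names, and the made-up peptides and rates are hypothetical and are not the authors' implementation.

```python
import math


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return sum(x != y for x, y in zip(a, b))


def predict_rate(query: str, train: list, bandwidth: float = 2.0) -> float:
    """Nadaraya-Watson estimate of the seropositive rate for `query`.

    `train` is a list of (peptide, seropositive_rate) pairs. The Gaussian
    kernel on Hamming distance is an assumed choice for illustration only.
    """
    weights = [
        math.exp(-((hamming(query, pep) / bandwidth) ** 2)) for pep, _ in train
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean of the rates.
        return sum(rate for _, rate in train) / len(train)
    return sum(w * rate for w, (_, rate) in zip(weights, train)) / total


# Made-up training peptides and rates standing in for a related virus family.
train = [
    ("ACDEFGHIKL", 0.80),
    ("ACDEFGHIKV", 0.70),
    ("WWWWWWWWWW", 0.05),
]

near = predict_rate("ACDEFGHIKL", train)  # close to the two similar peptides
far = predict_rate("WWWWWWWWWW", train)   # dominated by the dissimilar peptide
```

In this toy setup, restricting `train` to peptides from closely related viruses corresponds to the family-specific training strategy the abstract describes; peptides with a high predicted rate would be the candidate epitopes.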

Keywords

Coronaviruses, Computer viruses, Training, Prediction algorithms, Kernel, Training data, Vaccines, Epitope prediction, B cell epitope, family-specific prediction, training data selection
