COVER: conformational oversampling as data augmentation for molecules
Journal of Cheminformatics(2020)
摘要
Training neural networks with small and imbalanced datasets often leads to overfitting and disregard of the minority class. For predictive toxicology, however, models with a good balance between sensitivity and specificity are needed. In this paper we introduce conformational oversampling as a means to balance and oversample datasets for prediction of toxicity. Conformational oversampling enhances a dataset by generation of multiple conformations of a molecule. These conformations can be used to balance, as well as oversample a dataset, thereby increasing the dataset size without the need of artificial samples. We show that conformational oversampling facilitates training of neural networks and provides state-of-the-art results on the Tox21 dataset.
更多查看译文
关键词
Deep learning, Toxicity, Imbalanced learning, Upsampling
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要