
Semi-automated retrieval of chemical and phylogenetic information from natural products literature

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Natural products (NPs) are metabolites of great importance due to their fundamental biological role in performing specialized activities, ranging from basic cellular functions to complex ecological interactions. These metabolites have contributed to innovating fields such as agriculture and medicine due to their optimized biological activities, a consequence of evolution. A key factor in ensuring that isolated NPs are novel is to search scientific literature and compare pre-existing chemical entities with the new isolate. Unfortunately, articles are typically not machine-readable, a problem that hinders efficient searching and increases the chances of unintended rediscovery. In addition, the time required to add new compound discoveries to compound databases hinders computational studies on cell metabolism and Quantitative Structure-Activity Relationships (QSAR). Here, we present a modularized tool that uses text mining techniques to retrieve chemical entities and taxonomic mentions present in scientific literature, called NPMINE (Natural Products MINIng). We were able to analyze 55,382 scientific articles from some of the most important applied chemistry journals from Brazil and the world, consistently recovering the expected taxonomic and structural information. This processing resulted in 120,970 unique InChI Keys potentially associated with 21,526 unique species mentioned. Using the PubChem BioAssay database we show how QSAR models can be used to mine active leads. The results indicate that NPMINE not only facilitates natural products cataloging, but also assists in biological source assignment and structure-activity relationships, a time-consuming task, typically performed in low throughput.

Competing Interest Statement
The authors have declared no competing interest.
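To make the idea of linking chemical entities to taxonomic mentions concrete, here is a minimal sketch of document-level co-occurrence. It is not NPMINE's implementation: the regular expression only matches identifiers that are already written out as InChIKeys, and `KNOWN_SPECIES`, `extract_mentions`, and `associate` are hypothetical names introduced for illustration. A real pipeline would need proper chemical and taxonomic named-entity recognition.

```python
import re
from collections import defaultdict

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"\b[A-Z]{14}-[A-Z]{10}-[A-Z]\b")

# Hypothetical toy lexicon; an assumption standing in for a real
# taxonomic name recognizer.
KNOWN_SPECIES = ("Coffea arabica", "Camellia sinensis")


def extract_mentions(text):
    """Return (sorted unique InChIKeys, lexicon species found) for one text."""
    keys = sorted(set(INCHIKEY_RE.findall(text)))
    species = [name for name in KNOWN_SPECIES if name in text]
    return keys, species


def associate(documents):
    """Link each InChIKey to every species mentioned in the same document.

    Co-occurrence only suggests a potential association; it does not
    establish that the compound was isolated from that species.
    """
    links = defaultdict(set)
    for text in documents:
        keys, species = extract_mentions(text)
        for key in keys:
            links[key].update(species)
    return dict(links)
```

Calling `associate` on a list of article texts returns a dictionary from InChIKey to the set of species names found alongside it. Documents that mention a species but no InChIKey contribute nothing.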
Keywords
phylogenetic information, natural products, retrieval, chemical, semi-automated