MIDGET

Computer Methods and Programs in Biomedicine (2021)

Abstract
• Current state-of-the-art methods for detecting differentially expressed genes sometimes mispredict.
• There is no open source framework available to compare and assess the performance of methods for detecting differentially expressed genes.
• Gene-Bench provides a framework for comparing methods that detect differentially expressed genes in microarray data, using a set of “in vivo” and “in silico” data.
• MIDGET provides two high-accuracy machine learning algorithms that outperform the current state of the art.
• The new Gene-Bench Python package is easily extendable with new methods, metrics and data providers.

Background and Objective: Detecting differentially expressed genes is an important step in genome-wide analysis and expression profiling. A wide array of algorithms based on statistical approaches is used in today’s research. Even though the current algorithms work, they sometimes mispredict. There is no framework available for measuring the quality of current algorithms, and newer machine learning methods (such as gradient boosting and deep neural networks) have not been applied to this problem. The Gene-Bench open source Python package addresses these issues by providing an evaluation and data handling system for algorithms that detect differentially expressed genes in microarray data. We also provide MIDGET, a new group of algorithms based on state-of-the-art machine learning approaches.

Methods: The Gene-Bench package provides data collected from real experiments, consisting of 73 transcription-factor perturbation experiments (with validation data from ChIP-seq experiments) and 129 drug perturbation experiments; synthetic data generated with our own method; and three evaluation metrics (Kolmogorov, F1 and AUC/ROC). Besides the data and metrics, Gene-Bench also contains well-known algorithms and a new method to identify differentially expressed genes, called MIDGET: Machine learning Identification Differential Gene Expression Tool, which uses big-data and machine learning methods to identify differentially expressed genes. The two new groups of machine learning algorithms provided in our package use extreme gradient boosting and deep neural networks to achieve their results.

Results: The Gene-Bench package is highly flexible, allows fast prototyping and evaluation of new and existing algorithms, and provides multiple new machine learning algorithms (called MIDGET) that perform better on all evaluation metrics than all the other tested alternatives. Everything provided in Gene-Bench is algorithm independent, and although the package is written in Python, users can also use algorithms implemented in the R language.

Conclusions: The Gene-Bench package fills a gap in evaluating and benchmarking algorithms for detecting differentially expressed genes. It also provides machine learning methods that perform detection with higher accuracy on all tested metrics. It is available at https://github.com/raduangelescu/GeneBench/ and can be installed directly from the Python Package Index using: pip install genebench
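Two of the evaluation metrics named in the Methods, F1 and AUC/ROC, can be illustrated with a minimal sketch. This is not the Gene-Bench API: the function names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. Here a label of 1 marks a gene that the validation data says is differentially expressed, and each score is a hypothetical detector's confidence for that gene.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a random positive outscores a random negative,
    counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 6 genes, 3 of them truly differentially expressed.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

# Assumed threshold of 0.5 turns scores into binary calls for F1.
calls = [1 if s >= 0.5 else 0 for s in scores]

f1 = f1_score(labels, calls)
auc = roc_auc(labels, scores)
print(f"F1 = {f1:.3f}, AUC = {auc:.3f}")
```

F1 depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason a benchmark may report both.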
Keywords
Differentially expressed genes, Deep neural networks, Gradient boost, Metrics, Evaluation