High-Dimensional Variable Selection For Ordinal Outcomes With Error Control

Briefings in Bioinformatics (2021)

Abstract
Many high-throughput genomic applications involve a large set of potential covariates and a response that is frequently measured on an ordinal scale, and it is crucial to identify which variables are truly associated with the response. Effectively controlling the false discovery rate (FDR) without sacrificing power has been a major challenge in variable selection research. This study reviews two existing variable selection frameworks, model-X knockoffs and a modified version of reference distribution variable selection (RDVS), both of which use artificial variables as benchmarks for decision making. Model-X knockoffs constructs a 'knockoff' variable for each covariate to mimic the covariance structure, while RDVS generates only one null variable and forms a reference distribution by performing multiple runs of model fitting. Herein, we describe how different importance measures for ordinal responses can be constructed to fit into these two selection frameworks, using either penalized regression or machine learning techniques. We compared these measures in terms of FDR and power using simulated data. Moreover, we applied both frameworks to high-throughput methylation data to identify features associated with the progression from normal liver tissue to hepatocellular carcinoma, and to further compare and contrast their performance.
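To make the knockoff selection step concrete, the following is a minimal sketch of the standard knockoff+ thresholding rule, not the authors' implementation. It assumes an importance statistic W_j has already been computed for each covariate (for example, the difference between the absolute penalized coefficient of a covariate and that of its knockoff), with large positive values indicating evidence that the covariate matters. The function names, the example values, and the target level q are hypothetical and chosen only for illustration.

```python
import numpy as np

def knockoff_plus_threshold(W, q=0.1):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    W is a vector of knockoff importance statistics (assumed given);
    q is the target FDR level. Returns np.inf if no t qualifies.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        est_false = 1 + np.sum(W <= -t)      # estimate of false discoveries
        n_selected = max(1, np.sum(W >= t))  # variables selected at t
        if est_false / n_selected <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1):
    """Indices of covariates whose statistic meets the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    tau = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= tau)

# Hypothetical statistics for ten covariates.
W_example = [5, 4, 3, 2.5, 2, 1.5, -1, 0.5, 0.2, -0.1]
selected = knockoff_select(W_example, q=0.2)
print(selected)
```

The sketch covers only the decision rule. Constructing the knockoff variables and the ordinal-response importance measures is a separate step that this example does not attempt.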
Keywords
false discovery rate, ordinal regression, knockoff filter, L-1 penalization, boosting, ordinal forests