Gene Regulatory Network Inference in the Presence of Dropouts: a Causal View
arXiv (2024)
Abstract
Gene regulatory network inference (GRNI) is a challenging problem,
particularly owing to the presence of zeros in single-cell RNA sequencing data:
some are biological zeros representing no gene expression, while some others
are technical zeros arising from the sequencing procedure (aka dropouts), which
may bias GRNI by distorting the joint distribution of the measured gene
expressions. Existing approaches typically handle dropout error via imputation,
which may introduce spurious relations as the true joint distribution is
generally unidentifiable. To tackle this issue, we introduce a causal graphical
model to characterize the dropout mechanism, namely, Causal Dropout Model. We
provide a simple yet effective theoretical result: interestingly, the
conditional independence (CI) relations in the data with dropouts, after
deleting the samples with zero values (regardless of whether they are technical or not) for the
conditioned variables, are asymptotically identical to the CI relations in the
original data without dropouts. This particular test-wise deletion procedure,
in which we perform CI tests on the samples without zeros for the conditioned
variables, can be seamlessly integrated with existing structure learning
approaches including constraint-based and greedy score-based methods, thus
giving rise to a principled framework for GRNI in the presence of dropouts. We
further show that the causal dropout model can be validated from data, and many
existing statistical models to handle dropouts fit into our model as specific
parametric instances. Empirical evaluation on synthetic, curated, and
real-world experimental transcriptomic data comprehensively demonstrates the
efficacy of our method.