Kernel Distillation for Fast Gaussian Processes Prediction

Congzheng Song, Yiming Sun

arXiv: Machine Learning (2018)

Abstract
Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature. However, the use of GPs in real-world applications is limited by their high computational cost at inference time. In this paper, we introduce a new framework, \textit{kernel distillation}, to approximate a fully trained teacher GP model whose kernel matrix is of size $n \times n$ for $n$ training points. We combine the inducing points method with a sparse low-rank approximation in the distillation procedure. The distilled student GP model costs only $O(m^2)$ storage for $m$ inducing points, where $m \ll n$, and improves the inference time complexity. We demonstrate empirically that kernel distillation provides a better trade-off between prediction time and test performance than the alternatives.
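To make the $O(m^2)$ storage claim concrete, the sketch below illustrates the standard inducing-points / Nyström-style low-rank kernel approximation that the abstract builds on. It is a minimal illustration, not the paper's distillation procedure: the RBF kernel, the random choice of inducing points `Z`, and the subset-of-regressors predictive mean are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed kernel choice)."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

# Toy training data with n points and m << n inducing points (hypothetical setup).
rng = np.random.default_rng(0)
n, m, d = 2000, 50, 3
X = rng.normal(size=(n, d))                       # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)    # noisy targets
Z = X[rng.choice(n, size=m, replace=False)]       # inducing points (chosen at random here)

noise = 0.1
K_mm = rbf_kernel(Z, Z) + 1e-6 * np.eye(m)        # m x m kernel on inducing points
K_mn = rbf_kernel(Z, X)                           # m x n cross-kernel

# Nystrom-style low-rank surrogate: K ~= K_nm K_mm^{-1} K_mn.
# Only m x m matrices and m-vectors are stored for prediction, i.e. O(m^2),
# and the full n x n kernel matrix is never formed.
A_mat = K_mm + K_mn @ K_mn.T / noise**2           # m x m system matrix
alpha = np.linalg.solve(A_mat, K_mn @ y) / noise**2  # m-vector of weights

def predict_mean(X_star):
    """Approximate GP posterior mean at test inputs, using only the m inducing points."""
    K_sm = rbf_kernel(X_star, Z)                  # n_star x m
    return K_sm @ alpha

X_star = rng.normal(size=(5, d))
print(predict_mean(X_star))
```

Prediction with the low-rank surrogate touches only the $m$ inducing points, so its cost scales with $m$ rather than $n$; the paper's contribution is how to distill the teacher's full kernel into such a compact student while preserving test performance.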