Kernel Distillation for Fast Gaussian Processes Prediction

Congzheng Song, Yiming Sun

arXiv: Machine Learning (2018)

Abstract
Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature. However, the use of GPs in real-world applications is limited by their high computational cost at inference time. In this paper, we introduce a new framework, \textit{kernel distillation}, to approximate a fully trained teacher GP model whose kernel matrix is of size $n \times n$ for $n$ training points. We combine the inducing points method with a sparse low-rank approximation in the distillation procedure. The distilled student GP model costs only $O(m^2)$ storage for $m$ inducing points, where $m \ll n$, and improves the inference time complexity. We demonstrate empirically that kernel distillation provides a better trade-off between prediction time and test performance than the alternatives.
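To make the $O(m^2)$ storage claim concrete, the sketch below illustrates the standard inducing-points / Nyström-style low-rank kernel approximation that the abstract builds on. It is a minimal illustration, not the paper's distillation procedure: the RBF kernel, the random choice of inducing points `Z`, and the subset-of-regressors predictive mean are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed kernel choice)."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

# Toy training data with n points and m << n inducing points (hypothetical setup).
rng = np.random.default_rng(0)
n, m, d = 2000, 50, 3
X = rng.normal(size=(n, d))                       # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)    # noisy targets
Z = X[rng.choice(n, size=m, replace=False)]       # inducing points (chosen at random here)

noise = 0.1
K_mm = rbf_kernel(Z, Z) + 1e-6 * np.eye(m)        # m x m kernel on inducing points
K_mn = rbf_kernel(Z, X)                           # m x n cross-kernel

# Nystrom-style low-rank surrogate: K ~= K_nm K_mm^{-1} K_mn.
# Only m x m matrices and m-vectors are stored for prediction, i.e. O(m^2),
# and the full n x n kernel matrix is never formed.
A_mat = K_mm + K_mn @ K_mn.T / noise**2           # m x m system matrix
alpha = np.linalg.solve(A_mat, K_mn @ y) / noise**2  # m-vector of weights

def predict_mean(X_star):
    """Approximate GP posterior mean at test inputs, using only the m inducing points."""
    K_sm = rbf_kernel(X_star, Z)                  # n_star x m
    return K_sm @ alpha

X_star = rng.normal(size=(5, d))
print(predict_mean(X_star))
```

Prediction with the low-rank surrogate touches only the $m$ inducing points, so its cost scales with $m$ rather than $n$; the paper's contribution is how to distill the teacher's full kernel into such a compact student while preserving test performance.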