Distributed Source Coding for Parametric and Non-Parametric Regression
arXiv (2024)
Abstract
The design of communication systems dedicated to machine learning tasks is
one key aspect of goal-oriented communications. In this framework, this article
investigates the interplay between data reconstruction and learning from the
same compressed observations, particularly focusing on the regression problem.
We establish achievable rate-generalization error regions for both parametric
and non-parametric regression, where the generalization error measures the
regression performance on previously unseen data. The analysis covers both
asymptotic and finite block-length regimes, providing fundamental results and
practical insights for the design of coding schemes dedicated to regression.
The asymptotic analysis relies on conventional Wyner-Ziv coding schemes which
we extend to study the convergence of the generalization error. The
finite-length analysis uses the notions of information density and dispersion,
with an additional term for the generalization error. We further investigate the
trade-off between reconstruction and regression in both asymptotic and
non-asymptotic regimes. Contrary to the existing literature, which focused on
other learning tasks, our results show that in the case of regression there
is no trade-off between data reconstruction and regression in the asymptotic
regime. We also observe the same absence of a trade-off for the considered
achievable scheme in the finite-length regime, by analyzing the correlation
between distortion and generalization error.
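For reference, the asymptotic analysis builds on the classical Wyner-Ziv setting, whose rate-distortion function with side information Y available only at the decoder is the standard result below (restated here for context; the paper's contribution is the extension to rate-generalization error regions):

    \[
    R_{\mathrm{WZ}}(D) \;=\; \min_{\substack{P_{U\mid X}\,:\; U - X - Y,\\ \exists\, g:\ \mathbb{E}\left[d\!\left(X,\, g(U,Y)\right)\right] \,\le\, D}} I(X; U \mid Y)
    \]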
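As a rough numerical illustration of the reported absence of a trade-off, the sketch below quantizes observations at increasing rates and measures both the reconstruction distortion and the generalization error of a regression fit on the quantized data. The setup (scalar quantizer, one-dimensional least-squares fit, all parameter values) is a hypothetical toy example, not the paper's achievable scheme:

    # Toy sketch (illustrative assumptions only, not the paper's coding scheme):
    # quantize Gaussian observations at a given rate, fit least squares on the
    # quantized data, then compare distortion and generalization error.
    import numpy as np

    rng = np.random.default_rng(0)
    n, n_test = 1000, 1000
    beta = 2.0                                  # true regression coefficient
    x = rng.normal(size=n)
    y = beta * x + 0.5 * rng.normal(size=n)     # noisy training labels
    x_test = rng.normal(size=n_test)
    y_test = beta * x_test + 0.5 * rng.normal(size=n_test)

    for bits in range(1, 7):                    # rate of the scalar quantizer
        levels = 2 ** bits
        # quantile-based cell boundaries and per-cell reconstruction points
        edges = np.quantile(x, np.linspace(0, 1, levels + 1)[1:-1])
        idx = np.digitize(x, edges)
        centers = np.array([x[idx == k].mean() for k in range(levels)])
        x_q = centers[idx]                      # reconstructed observations
        distortion = np.mean((x - x_q) ** 2)
        beta_hat = np.sum(x_q * y) / np.sum(x_q ** 2)  # LS fit on quantized data
        gen_err = np.mean((y_test - beta_hat * x_test) ** 2)
        print(f"{bits} bits: distortion={distortion:.4f}, gen. error={gen_err:.4f}")

In this toy setting, both quantities typically decrease together as the rate grows, mirroring the claimed behavior: a finer quantizer improves the reconstruction and the regression estimate at the same time.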