Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes
Consider the problem of predicting the next symbol given a sample path of
length n, whose joint distribution belongs to a distribution class that may
have long-term memory. The goal is to compete with the conditional predictor
that knows the true model. For both hidden Markov models (HMMs) and renewal
processes, we determine the optimal prediction risk in Kullback-Leibler
divergence up to universal constant factors. Extending existing results on
finite-order Markov models [HJW23] and drawing on ideas from universal
compression, we propose an estimator whose prediction risk is bounded by the
redundancy of the distribution class and a memory term that accounts for the
long-range dependency of the model. Notably, for HMMs with bounded state and observation
spaces, a polynomial-time estimator based on dynamic programming is shown to
achieve the optimal prediction risk Θ((log n)/n); prior to this work, the
only known result of this type was O(1/log n), obtained using Markov
approximation [Sha+18]. Matching minimax lower bounds are obtained by making
connections to redundancy and mutual information via a reduction argument.
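
For concreteness, the benchmark mentioned above (the conditional predictor that knows the true model) can be computed for an HMM by the standard forward (filtering) recursion, which is itself a dynamic program. The following is a minimal sketch of that oracle only, not of the proposed estimator; the function name `oracle_next_symbol` and the parameter layout (`pi`, `A`, `B`) are assumptions made for illustration.

```python
def oracle_next_symbol(pi, A, B, obs):
    """Return P(X_{n+1} = x | X_1..X_n = obs) for an HMM with KNOWN parameters.

    Hypothetical layout (an assumption for this sketch):
      pi[i]   : probability that the first hidden state is i
      A[i][j] : probability of moving from hidden state i to j
      B[i][x] : probability that hidden state i emits symbol x
      obs     : observed symbols, each an integer index into the alphabet
    """
    k = len(pi)      # number of hidden states
    m = len(B[0])    # alphabet size

    # belief[i] = P(Z_t = i | x_1..x_{t-1}); starts as the prior on Z_1.
    belief = list(pi)
    for x in obs:
        # Condition on the current observation (Bayes update).
        weights = [belief[i] * B[i][x] for i in range(k)]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("observed sequence has zero probability")
        posterior = [w / total for w in weights]
        # Propagate one step through the transition matrix.
        belief = [sum(posterior[i] * A[i][j] for i in range(k))
                  for j in range(k)]

    # Mix the emission distributions under the predicted next state.
    return [sum(belief[j] * B[j][x] for j in range(k)) for x in range(m)]


# Usage: a two-state, binary-output HMM with "sticky" hidden states.
pred = oracle_next_symbol(
    pi=[0.5, 0.5],
    A=[[0.9, 0.1], [0.1, 0.9]],
    B=[[0.8, 0.2], [0.2, 0.8]],
    obs=[0, 0, 1],
)
```

The recursion costs O(n k^2) time for k hidden states. The prediction risk in the abstract measures how far, in Kullback-Leibler divergence, an estimator that does not know `pi`, `A`, and `B` falls from this oracle output.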