Output-feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization
arXiv (2024)
Abstract
In this paper, we consider direct policy optimization for the
linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been
recognized that the landscape of stabilizing output-feedback controllers of
relevance to LQG has an intricate geometry, particularly as it pertains to the
existence of spurious stationary points. To address such challenges,
we first adopt a Riemannian metric for the space of stabilizing
full-order minimal output-feedback controllers. We then proceed to prove that
the orbit of such controllers modulo coordinate transformation admits a
Riemannian quotient manifold structure. This geometric structure is then used
to develop a Riemannian gradient descent method for direct LQG policy
optimization. We prove a local convergence guarantee with a linear rate and show
that the proposed approach exhibits significantly faster and more robust numerical
performance than ordinary gradient descent for LQG. Subsequently,
we provide reasons for this observed behavior; in particular, we argue that
optimizing over the orbit space of controllers is the right theoretical and
computational setup for direct LQG policy optimization.
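
The "orbit modulo coordinate transformation" above can be made concrete with a small numerical check. The following is a minimal sketch, not taken from the paper: it assumes a controller with state-space parameters (A_K, B_K, C_K) and the standard similarity transformation (A_K, B_K, C_K) -> (T A_K T^-1, T B_K, C_K T^-1). The numeric values and the helper names (`matmul`, `inv2`, `transform`, `markov_parameters`) are hypothetical and chosen only for illustration. It shows that two controllers on the same orbit have different parameters but identical Markov parameters C_K A_K^k B_K, and therefore the same input-output behavior.

```python
def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(T):
    """Inverse of a 2x2 matrix (T must be nonsingular)."""
    (a, b), (c, d) = T
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform(A, B, C, T):
    """Apply the coordinate change (A, B, C) -> (T A T^-1, T B, C T^-1)."""
    Ti = inv2(T)
    return matmul(matmul(T, A), Ti), matmul(T, B), matmul(C, Ti)

def markov_parameters(A, B, C, n):
    """Return the scalars C A^k B for k = 0..n-1 (single input, single output)."""
    out, AkB = [], B
    for _ in range(n):
        out.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return out

# Hypothetical controller parameters, chosen only for illustration.
A_K = [[0.5, 1.0], [0.0, -0.25]]
B_K = [[1.0], [2.0]]
C_K = [[1.0, -1.0]]
T = [[2.0, 1.0], [1.0, 1.0]]  # any invertible 2x2 matrix works here

A2, B2, C2 = transform(A_K, B_K, C_K, T)
m1 = markov_parameters(A_K, B_K, C_K, 5)
m2 = markov_parameters(A2, B2, C2, 5)

# Largest difference between the two sequences of Markov parameters.
max_gap = max(abs(x - y) for x, y in zip(m1, m2))
print("parameters differ:", A2 != A_K)
print("max Markov-parameter gap:", max_gap)
```

Because the closed-loop cost depends on the controller only through its input-output behavior, every point on such an orbit yields the same cost, which is one way to read the abstract's argument for optimizing over the orbit space rather than over raw controller parameters.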