On the linear convergence of random search for discrete-time LQR
IEEE Control Systems Letters (2020)
Abstract
Model-free reinforcement learning techniques directly search over the parameter space of controllers. Although this often amounts to solving a nonconvex optimization problem, simple local search methods exhibit competitive performance on benchmark control problems. To understand this phenomenon, we study the discrete-time Linear Quadratic Regulator (LQR) problem with unknown state-space parameters. In spite of the lack of convexity, we establish that the random search method with two-point gradient estimates and a fixed number of roll-outs achieves ε-accuracy in O(log(1/ε)) iterations. This significantly improves existing results on the model-free LQR problem, which require O(1/ε) total roll-outs.
Keywords
Data-driven control, linear quadratic regulator, model-free control, nonconvex optimization, random search method, reinforcement learning, sample complexity
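To make the method named in the abstract concrete, here is a minimal sketch of random search with two-point gradient estimates for a state-feedback gain K. It is an illustration under stated assumptions, not the authors' implementation: the function names, the finite-horizon roll-out, the Gaussian initial states, and the step size and smoothing radius are all hypothetical choices. The matrices A and B appear only inside the simulator, standing in for the unknown system that a model-free method can query but not inspect.

```python
import numpy as np


def rollout_cost(A, B, Q, R, K, x0, horizon):
    """Finite-horizon quadratic cost of the feedback u_t = -K x_t from x0.

    Assumption: a truncated roll-out stands in for the infinite-horizon cost.
    """
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return float(cost)


def two_point_gradient(A, B, Q, R, K, radius, n_rollouts, horizon, rng):
    """Average of two-point estimates (f(K + rU) - f(K - rU)) / (2r) * U."""
    m, n = K.shape
    grad = np.zeros_like(K, dtype=float)
    for _ in range(n_rollouts):
        # Random direction, rescaled to Frobenius norm sqrt(m * n).
        U = rng.standard_normal((m, n))
        U *= np.sqrt(m * n) / np.linalg.norm(U)
        # Both perturbed gains are evaluated from the same initial state.
        x0 = rng.standard_normal(n)
        f_plus = rollout_cost(A, B, Q, R, K + radius * U, x0, horizon)
        f_minus = rollout_cost(A, B, Q, R, K - radius * U, x0, horizon)
        grad += (f_plus - f_minus) * U
    return grad / (2.0 * radius * n_rollouts)


def random_search_lqr(A, B, Q, R, K0, step, radius, n_rollouts,
                      horizon, iters, seed=0):
    """Gradient-descent-style updates driven by two-point estimates.

    Assumption: K0 is stabilizing and step/radius are small enough that
    the iterates stay stabilizing; this sketch does not check either.
    """
    rng = np.random.default_rng(seed)
    K = np.array(K0, dtype=float)
    for _ in range(iters):
        g = two_point_gradient(A, B, Q, R, K, radius, n_rollouts,
                               horizon, rng)
        K = K - step * g
    return K
```

Evaluating both perturbed gains from the same initial state is what distinguishes a two-point estimate from two independent one-point estimates: the difference of the two costs then approximates a directional derivative of the same per-sample cost. A call such as `random_search_lqr(A, B, Q, R, K0, step=0.005, radius=0.05, n_rollouts=10, horizon=30, iters=200)` on a small stable system is one way to try it; these parameter values are illustrative and are not taken from the paper.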