Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems
CoRR (2024)
Abstract
We consider the problem of efficiently routing jobs that arrive into a
central queue to a system of heterogeneous servers. Unlike in homogeneous systems,
a threshold policy that routes jobs to the slow server(s) when the queue
length exceeds a certain threshold is known to be optimal for the
one-fast-one-slow two-server system. But an optimal policy for the multi-server
system is unknown and non-trivial to find. While Reinforcement Learning (RL)
has been recognized to have great potential for learning policies in such
cases, our problem has an exponentially large state space, rendering
standard RL inefficient. In this work, we propose ACHQ, an efficient
policy-gradient-based algorithm with a low-dimensional soft threshold policy
parameterization that leverages the underlying queueing structure. We provide
stationary-point convergence guarantees for the general case and, despite the
low-dimensional parameterization, prove that ACHQ converges to an approximate
global optimum for the special case of two servers. Simulations demonstrate an
improvement in expected response time of up to 30% over a policy that
routes to the fastest available server.
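
To make the idea of a soft threshold concrete, the following is a minimal sketch for the two-server case. It is an illustration under stated assumptions, not the parameterization used by ACHQ: the sigmoid form, the `temperature` parameter, and the function names are all hypothetical. A hard threshold policy routes to the slow server exactly when the queue length exceeds the threshold; the soft version replaces that step with a smooth probability, so the threshold becomes a single differentiable parameter that a policy gradient method could adjust.

```python
import math
import random


def slow_server_probability(queue_length, threshold, temperature=1.0):
    """Probability of sending the next job to the idle slow server.

    Assumed form: a sigmoid of (queue_length - threshold) / temperature.
    This is a hypothetical choice made for illustration only.
    """
    z = (queue_length - threshold) / temperature
    # Numerically stable sigmoid: avoid exponentiating a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def route_to_slow(queue_length, threshold, temperature=1.0, rng=random):
    """Sample a routing decision: True means use the slow server now."""
    p = slow_server_probability(queue_length, threshold, temperature)
    return rng.random() < p
```

With a small `temperature`, the sketch behaves almost like the hard threshold rule; with a larger one, decisions near the threshold are randomized, which is what makes the policy smooth in its parameter.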