On Learning the cμ Rule: Single and Multiserver Settings.

arXiv: Performance (2018)

Abstract
We consider learning-based variants of the $c\mu$ rule, a classic and well-studied scheduling policy, in single- and multi-server settings for multi-class queueing systems. In the single-server setting, the $c\mu$ rule is known to minimize the expected holding cost (weighted queue lengths summed over both classes and time). We focus on the setting where the service rates $\mu$ are unknown, and are interested in the holding-cost regret: the difference between the expected holding cost induced by a learning-based rule (that learns $\mu$) and that of the $c\mu$ rule (which knows the service rates) over any fixed time horizon. We first show that empirically learning the service rates and then scheduling using these learned values results in a holding-cost regret that does not depend on the time horizon. The key insight that allows such a constant regret bound is that a work-conserving scheduling policy in this setting allows explore-free learning, where no penalty is incurred for exploring and learning server rates. We next consider the multi-server setting. We show that, in general, the $c\mu$ rule is not stabilizing (i.e., there are stabilizable arrival and service rate parameters for which the multi-server $c\mu$ rule results in unstable queues). We then characterize sufficient conditions for stability (and also concentrations on busy periods). Using these results, we show that learning-based variants of the $c\mu$ rule again result in a constant regret (i.e., one that does not depend on the time horizon). This result hinges on (i) the busy-period concentrations of the multi-server $c\mu$ rule, and (ii) the fact that our learning-based rule is designed to dynamically explore server rates, but in such a manner that it eventually satisfies an explore-free condition.
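
To make the single-server idea concrete, here is a minimal simulation sketch. It is not the paper's algorithm or model. It assumes a discrete-time queue with Bernoulli arrivals and Bernoulli service completions, and the names `c_mu_choice` and `simulate`, the optimistic initial estimate of 1.0, and all parameters are hypothetical choices made for illustration. The learning variant plugs empirical service rates into the same priority index and stays work-conserving, so every busy slot yields a service sample without any deliberate exploration.

```python
import random


def c_mu_choice(queues, costs, rates):
    """Return the nonempty class with the largest c_i * mu_i, or None if all are empty."""
    nonempty = [i for i, q in enumerate(queues) if q > 0]
    if not nonempty:
        return None
    return max(nonempty, key=lambda i: costs[i] * rates[i])


def simulate(arrival, mu, costs, horizon, learn, seed=0):
    """Total holding cost over `horizon` slots (illustrative discrete-time model).

    arrival[i]: per-slot arrival probability of class i (assumed Bernoulli)
    mu[i]:      per-slot service success probability of class i (assumed Bernoulli)
    learn:      if True, schedule with empirical rates instead of the true mu
    """
    arr_rng = random.Random(seed)      # arrival randomness
    svc_rng = random.Random(seed + 1)  # service randomness, kept on a separate stream
    n = len(mu)
    queues = [0] * n
    attempts = [0] * n
    successes = [0] * n
    total_cost = 0.0
    for _ in range(horizon):
        for i in range(n):
            if arr_rng.random() < arrival[i]:
                queues[i] += 1
        if learn:
            # Empirical mean of observed completions; classes never served
            # get an optimistic estimate of 1.0 (an assumption of this sketch).
            rates = [successes[i] / attempts[i] if attempts[i] else 1.0
                     for i in range(n)]
        else:
            rates = mu
        k = c_mu_choice(queues, costs, rates)  # work-conserving: idle only if empty
        if k is not None:
            attempts[k] += 1
            if svc_rng.random() < mu[k]:
                successes[k] += 1
                queues[k] -= 1
        # Holding cost charged on the queue lengths at the end of the slot.
        total_cost += sum(c * q for c, q in zip(costs, queues))
    return total_cost
```

Under these assumptions, averaging `simulate(..., learn=True) - simulate(..., learn=False)` over many seeds gives an empirical estimate of the holding-cost regret for a given horizon. Repeating this for increasing horizons is one way to check informally whether the gap levels off, as the constant-regret result suggests it should for stable parameters.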