Revisiting the Classics: Online RL in the Programmable Dataplane

NOMS 2022 - IEEE/IFIP Network Operations and Management Symposium (2022)

Abstract
Data-driven networking is becoming more capable and widely researched, driven in part by the efficacy of Deep Reinforcement Learning (DRL) algorithms. Yet the complexity of both DRL inference and learning forces these tasks to be pushed away from the dataplane to hosts, harming latency-sensitive applications. Online learning of such policies cannot occur in the dataplane, despite being a useful technique when problems evolve or are hard to model. We present OPaL (On Path Learning), the first work to bring online reinforcement learning to the dataplane. OPaL makes online learning possible in constrained SmartNIC hardware by returning to classical RL techniques, avoiding neural networks. Our design allows weak yet highly parallel SmartNIC NPUs to be competitive against commodity x86 hosts, despite having fewer features and slower cores. Compared to hosts, we achieve a 21× reduction in 99.99th-percentile tail inference times to 34 µs, and a 9.9× improvement in online throughput for real-world policy designs. In-NIC execution eliminates PCIe transfers, and our asynchronous compute model ensures minimal impact on traffic carried by a co-hosted P4 dataplane. OPaL's design scales with additional resources at compile-time to improve both decision latency and throughput, and is quickly reconfigurable at runtime compared to reinstalling device firmware.
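To illustrate the kind of classical, non-neural RL the abstract refers to, the sketch below shows an online tabular Q-learning update in C. It is a minimal illustration only, assuming a small discrete state/action space and fixed-point arithmetic as might suit a constrained NPU; the table sizes, learning rate, and scaling are hypothetical and do not reflect OPaL's actual implementation.

```c
/* Minimal sketch: online tabular Q-learning with Q8.8 fixed-point values.
 * All sizes and constants are illustrative assumptions, not OPaL's design. */
#include <stdint.h>

#define N_STATES   64
#define N_ACTIONS   4
#define FP_SHIFT    8                    /* Q8.8 fixed point: value * 256 */

static int32_t q[N_STATES][N_ACTIONS];   /* small Q-table, fits in fast memory */

/* alpha and gamma as fixed-point fractions of (1 << FP_SHIFT) */
static const int32_t ALPHA = 26;         /* approx. 0.1 */
static const int32_t GAMMA = 230;        /* approx. 0.9 */

static int32_t max_q(uint32_t s)
{
    int32_t best = q[s][0];
    for (uint32_t a = 1; a < N_ACTIONS; a++)
        if (q[s][a] > best)
            best = q[s][a];
    return best;
}

/* One online update after observing (s, a, reward, s') on the fast path. */
void q_update(uint32_t s, uint32_t a, int32_t reward, uint32_t s_next)
{
    int32_t target = reward + ((GAMMA * max_q(s_next)) >> FP_SHIFT);
    int32_t delta  = target - q[s][a];
    q[s][a] += (ALPHA * delta) >> FP_SHIFT;
}
```

Because each update touches only one table entry and uses integer arithmetic, this style of learner can run per-packet or per-flow without the matrix multiplications a neural policy would require, which is the intuition behind running such policies on weak but parallel NPU cores.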
Keywords
classical RL techniques, SmartNIC NPU, in-NIC execution, On Path Learning, data-driven networking, programmable dataplane, asynchronous compute model, real-world policy designs, online throughput, tail inference times, commodity x86 hosts, constrained SmartNIC hardware, online reinforcement learning, OPaL, latency-sensitive applications, DRL inference, deep reinforcement learning algorithms, 34 µs inference time