Performance improvement of reinforcement learning algorithms for online 3D bin packing using FPGA

Kavya Borra, Ashwin Krishnan, Harshad Khadilkar, Manoj Nambiar, Ansuma Basumatary, Rekha Singhal, Arijit Mukherjee

AIMLSystems (2023)

Abstract

Online 3D bin packing is a challenging real-time combinatorial optimisation problem in which parcels (typically rigid cuboids) arriving on a conveyor are packed into a larger bin for further shipment. Recent automation methods have introduced manipulator robots for packing, which need a processing algorithm to specify the location and orientation in which each parcel must be loaded. Value-based reinforcement learning (RL) algorithms such as DQN are capable of producing good solutions in the available computation time. However, their deployment on CPU-based systems relies on rule-based heuristics to reduce the search space, which may lead to sub-optimal solutions. In this paper, we use an FPGA as a hardware accelerator to reduce the inference time of DQN as well as its pre-/post-processing steps. This allows the optimised algorithm to cover the entire search space within the given time constraints. We present various optimisations, such as accelerating DQN model inference and fast checking of constraints. Further, we show that our proposed architecture achieves almost 15x computational speed-ups compared to an equivalent CPU implementation. Additionally, we show that, as a result of evaluating the entire search space, the DQN rewards obtained on complex data sets improve by 1%, which can translate into a significant reduction in enterprise operating costs.
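
To make "covering the entire search space" concrete, the following is a minimal sketch of exhaustive placement enumeration with constraint checks. It is not the paper's implementation: the abstract does not specify the state representation or the constraints, so the heightmap grid, the flat-support rule, the bin height limit, and the names `feasible_placements`, `best_placement`, and `score` are all illustrative assumptions. The `score` callable stands in for a learned value estimate such as a DQN output.

```python
from itertools import permutations


def feasible_placements(heightmap, parcel, bin_height):
    """Enumerate every (x, y, (l, w, h)) that passes the assumed constraints.

    Assumptions (not from the paper): the bin is a 2D grid of stack heights,
    a parcel must rest on a perfectly flat footprint, and its top must not
    exceed bin_height.
    """
    rows, cols = len(heightmap), len(heightmap[0])
    placements = []
    # All distinct axis-aligned orientations of the cuboid.
    for l, w, h in sorted(set(permutations(parcel))):
        for x in range(rows - l + 1):
            for y in range(cols - w + 1):
                cells = [heightmap[i][j]
                         for i in range(x, x + l)
                         for j in range(y, y + w)]
                base = cells[0]
                flat = all(c == base for c in cells)
                if flat and base + h <= bin_height:
                    placements.append((x, y, (l, w, h)))
    return placements


def best_placement(heightmap, parcel, bin_height, score):
    """Pick the feasible placement with the highest score, or None."""
    candidates = feasible_placements(heightmap, parcel, bin_height)
    if not candidates:
        return None
    # `score` is a hypothetical stand-in for a learned value function.
    return max(candidates, key=lambda p: score(heightmap, p))


if __name__ == "__main__":
    empty_bin = [[0] * 3 for _ in range(3)]

    def lowest_top(hm, p):
        # Toy scoring rule: prefer placements whose top surface is lowest.
        x, y, (_, _, h) = p
        return -(hm[x][y] + h)

    print(len(feasible_placements(empty_bin, (1, 2, 3), 3)))
    print(best_placement(empty_bin, (1, 2, 3), 3, lowest_top))
```

The nested loops visit every position and orientation, so the cost grows with the grid area times the number of orientations. That growth is the reason a time-constrained CPU deployment would prune candidates with heuristics, and it is the kind of work the abstract describes offloading to the FPGA.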
