Alibaba Cloud Quantum Development Kit: Large-Scale Classical Simulation of Quantum Circuits

Zhang Fang
Newman Michael
Cai Junjie
Yu Huanjun
Tian Zhengxiong
Yuan Bo
Xu Haihong
Wu Junyin
Gao Xun
Chen Jianxin
Abstract:

We report, in a sequence of notes, our work on the Alibaba Cloud Quantum Development Kit (AC-QDK). AC-QDK provides a set of tools for aiding the development of both quantum computing algorithms and quantum processors, and is powered by a large-scale classical simulator deployed on Alibaba Cloud. In this note, we report the computational...

Introduction
  • The computational engine of AC-QDP is at present the classical quantum circuit simulator Tai-Zhang, deployed on Alibaba Cloud.
  • Simulating these circuits will provide a direct comparison between quantum and classical implementations for the same task.
  • Like all simulations based on tensor contraction, the approach requires computational resources that scale exponentially in the treewidth of the quantum circuit, and is limited fundamentally in the size it can simulate.
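
The exponential scaling mentioned above can be made concrete with a rough memory estimate. The sketch below is not taken from the paper: it assumes every tensor index has dimension 2 and that entries are stored as 16-byte complex numbers, and the function name `tensor_bytes` is hypothetical. It only illustrates why each additional open index on the largest intermediate tensor, the quantity that treewidth governs, doubles the memory required.

```python
def tensor_bytes(open_indices: int, bytes_per_entry: int = 16) -> int:
    """Bytes needed to store a tensor with `open_indices` indices of dimension 2.

    Assumption: 16 bytes per entry (double-precision complex numbers).
    """
    return (2 ** open_indices) * bytes_per_entry


if __name__ == "__main__":
    # Each extra open index doubles the storage of the intermediate tensor.
    for k in (20, 30, 40):
        print(f"{k} open indices: {tensor_bytes(k) / 2**30:g} GiB")
```

With these assumptions, 30 open indices already require 16 GiB for a single intermediate tensor, which is why simulators of this kind split the work into many smaller subtasks.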
Highlights
  • The computational engine of Alibaba Cloud Quantum Development Platform (AC-QDP) is at present our classical quantum circuit simulator Tai-Zhang, deployed on Alibaba Cloud
  • In [4], we described Tai-Zhang’s algorithm and the computational experiments that deployed it on the computing facilities in Alibaba Group’s Data Infrastructure and Search Technology Division
  • 72-qubit random quantum circuits for Bristlecone with depth 1 + 32 + 1 have been benchmarked in [5], which reported a runtime of 14.1 minutes (846 seconds) to compute a single amplitude on 16,384 Sunway SW26010 260C nodes with 256 cores each
  • Several variables will affect the performance of our simulator:
  • We can calculate the execution time of those subtasks assigned to a single node from a cluster and predict the full execution time of the whole task on that cluster
Results
  • A series of papers address the simulation of these revised random quantum circuits [8, 9, 5, 6].
  • 72-qubit random quantum circuits for Bristlecone with depth 1 + 32 + 1 have been benchmarked in [5], which reported a runtime of 14.1 minutes (846 seconds) to compute a single amplitude on 16,384 Sunway SW26010 260C nodes with 256 cores each.
  • Bristlecone-70, the 70-qubit random quantum circuit family for that architecture, is equivalent for simulation purposes to Bristlecone-72, the 72-qubit version, since two qubits of the latter network can be contracted with their only neighbor.
  • The authors can compare the runtimes reported above with results using Bristlecone-70 at depth 1 + 32 + 1, even though the authors use fewer CPU cores than previously reported work.
  • The 70-qubit circuits that the authors benchmark in this paper, Bristlecone-70, correspond to circuit description files in Bris_11.tar.gz.
  • The authors use 1449 Alibaba Cloud Elastic Compute Service (ECS) instances, each with 88 virtual CPU cores and 160 GB of memory.
  • One variable that affects the simulator's performance is the number N_s of subtasks for calculating a single amplitude.
  • There is no shared data access across these subtasks, and so distributing subtasks among CPU cores or among nodes does not strongly affect the performance.
  • The authors can calculate the execution time of those subtasks assigned to a single node from a cluster and predict the full execution time of the whole task on that cluster.
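
The prediction step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: subtasks are treated as fully independent, each vCPU runs one subtask at a time, subtasks are started in list order on whichever vCPU frees up first, and the names `node_time` and `predict_cluster_time` are hypothetical. The per-subtask durations are made-up numbers.

```python
import heapq


def node_time(durations, vcpus):
    """Wall time for one node that runs its subtasks in order,
    starting each one on the vCPU that becomes free first."""
    free_at = [0.0] * vcpus  # time at which each vCPU becomes idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)


def predict_cluster_time(per_node_durations, vcpus):
    """Predicted time for the whole task: the slowest node determines it."""
    return max(node_time(d, vcpus) for d in per_node_durations)


if __name__ == "__main__":
    # Hypothetical subtask durations (seconds) for two nodes with 2 vCPUs each.
    nodes = [[3.0, 1.0, 2.0, 2.0], [4.0, 1.0, 1.0]]
    print(predict_cluster_time(nodes, vcpus=2))  # prints 5.0
```

Because the subtasks share no data, measuring one node's share is enough to estimate that node's finishing time, and taking the maximum over nodes gives the estimate for the cluster. For scale, 1449 instances with 88 vCPUs each amount to 1449 × 88 = 127,512 vCPUs.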
Conclusion
  • The authors calculate the execution time of (N_a N_s × #vCPU) / 127,512 subtasks for each node and choose the largest one as the predicted execution time of the whole calculation on a cluster with 127,512 vCPU cores and 2 × 127,512 gigabytes of memory.
  • The authors observe that, even when using the same total number of vCPU cores, execution time is slightly reduced when a larger number of smaller ECS instances is used.
  • Based on the 200,000 amplitudes the authors calculated for Bristlecone-70 circuits with depth 1 + 28 + 1, the authors plot the distribution of Np, which closely matches the Porter-Thomas form.
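
For readers unfamiliar with the Porter-Thomas form, the sketch below shows what such a check looks like on synthetic data. It does not use the paper's amplitudes: it assumes a random state built from normalized complex Gaussian amplitudes, takes N to be the state-space dimension 2^n and p an output probability, and the function name `scaled_probabilities` is hypothetical. Under the Porter-Thomas form the density of x = Np is exp(-x), so the fraction of outcomes with Np > 1 should be close to exp(-1).

```python
import math
import random


def scaled_probabilities(num_qubits, seed=0):
    """Return N*p for every outcome of a synthetic random state on `num_qubits` qubits.

    Assumption: amplitudes are complex Gaussians, normalized so probabilities sum to 1.
    """
    rng = random.Random(seed)
    dim = 2 ** num_qubits
    # |amplitude|^2 before normalization: sum of two squared real Gaussians.
    weights = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)]
    total = sum(weights)
    return [dim * w / total for w in weights]


if __name__ == "__main__":
    x = scaled_probabilities(14)
    frac_above_one = sum(1 for v in x if v > 1.0) / len(x)
    # The two printed numbers should be close if the Porter-Thomas form holds.
    print(frac_above_one, math.exp(-1.0))
```

A full comparison would histogram Np and overlay exp(-Np); the single-threshold check here is only the simplest version of that comparison.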
Tables
  • Table 1: Estimated Execution Time for Simulating Bristlecone-70 Circuits Using 127,512 CPU Cores
  • Table 2: Benchmarking Results for Simulating Bristlecone-70 Circuits on Alibaba Cloud
Funding
  • We would like to thank our colleagues from various teams in Alibaba Cloud Intelligence for supporting us in the numerical experiments presented in this paper.
References
  • [1] Random quantum circuits for circuit sampling with superconducting qubits, available at https://github.com/sboixo/GRCS.
  • [2] Sergio Boixo, Sergei V Isakov, Vadim N Smelyanskiy, Ryan Babbush, Nan Ding, Zhang Jiang, Michael J Bremner, John M Martinis, and Hartmut Neven. Characterizing quantum supremacy in near-term devices. Nature Physics, page 1, 2018.
  • [3] Sergio Boixo and Charles Neill. The question of quantum supremacy. https://ai.googleblog.com/2018/05/the-question-of-quantum-supremacy.html.
  • [4] Jianxin Chen, Fang Zhang, Cupjin Huang, Michael Newman, and Yaoyun Shi. Classical simulation of intermediate-size quantum circuits. arXiv preprint arXiv:1805.01450, 2018.
  • [5] Ming-Cheng Chen, Riling Li, Lin Gan, Xiaobo Zhu, Guangwen Yang, Chao-Yang Lu, and Jian-Wei Pan. Quantum teleportation-inspired algorithm for sampling large random quantum circuits. arXiv preprint arXiv:1901.05003, 2019.
  • [6] Chu Guo, Yong Liu, Min Xiong, Shichuan Xue, Xiang Fu, Anqi Huang, Xiaogang Qiang, Ping Xu, Junhua Liu, Shenggen Zheng, et al. General-purpose quantum circuit simulator with projected entangled-pair states and the quantum supremacy frontier. arXiv preprint arXiv:1905.08394, 2019.
  • [7] Julian Kelly. A preview of Bristlecone, Google’s new quantum processor. http://ai.googleblog.com/2018/03/a-preview-of-bristlecone-googles-new.html.
  • [8] Benjamin Villalonga, Sergio Boixo, Bron Nelson, Christopher Henze, Eleanor Rieffel, Rupak Biswas, and Salvatore Mandrà. A flexible high-performance simulator for the verification and benchmarking of quantum circuits implemented on real hardware. arXiv preprint arXiv:1811.09599, 2018.
  • [9] Benjamin Villalonga, Dmitry Lyakh, Sergio Boixo, Hartmut Neven, Travis S Humble, Rupak Biswas, Eleanor G Rieffel, Alan Ho, and Salvatore Mandrà. Establishing the quantum supremacy frontier with a 281 pflop/s simulation. arXiv preprint arXiv:1905.00444, 201