AnalyticDB: Real-time OLAP Database System at Alibaba Cloud
PVLDB, pp. 2059-2070, 2019.
EI
Weibo:
Abstract:
With data explosion in scale and variety, OLAP databases play an increasingly important role in serving real-time analysis with low latency (e.g., hundreds of milliseconds), especially when incoming queries are complex and ad hoc in nature. Moreover, these systems are expected to provide high query concurrency and write throughput, and su...More
Code:
Data:
Introduction
- AnalyticDB is an OLAP database system designed for high-concurrency, low-latency, and real-time analytical queries on PB scale, and has been running on 2000+ physical machines on Alibaba Cloud [1].
- It serves external clients on.
- Indexing is a straightforward way to improve query performance, building indexes on pre-specified columns is often no longer effective
Highlights
- AnalyticDB is an OLAP database system designed for high-concurrency, low-latency, and real-time analytical queries on PB scale, and has been running on 2000+ physical machines on Alibaba Cloud [1]
- To further improve query latency and concurrency, we enhance the optimizer and execution engine in AnalyticDB to fully utilize the advantages of our storage and indexes
- This paper presents AnalyticDB, a high-concurrent, lowlatency, and real-time OLAP database at Alibaba
- AnalyticDB extends hybrid row-column layout to support both structured and other complex-typed data that may be involved in complex queries
Methods
- The experiments are conducted on a cluster of eight physical machines, each with an Intel Xeon Platinum 8163 CPU (@2.50GHz), 300GB memory and 3TB SSD.
- The second table is called Orders Table, which uses order id as its primary key, and has 64 primary partitions and 10 secondary partitions.
- These two tables are associated with each other through user id.
- It is because Druid MUST have a timestamp column as the partition key, and queries without specifying the timestamp column are much slower [36]
Results
- The authors evaluate AnalyticDB in both real workloads and TPC-H benchmark [11] to demonstrate AnalyticDB’s performance under different types of queries and its write capability.
Conclusion
- This paper presents AnalyticDB, a high-concurrent, lowlatency, and real-time OLAP database at Alibaba.
- AnalyticDB has an efficient index engine to build index for all columns asynchronously, which helps improve query performance and hide index building overhead.
- AnalyticDB extends hybrid row-column layout to support both structured and other complex-typed data that may be involved in complex queries.
- To provide both highthroughput write and high-concurrent query, AnalyticDB follows a read/write decoupling architecture.
- The authors' experiments have showed that all these designs help AnalyticDB achieve better performance compared to stateof-art OLAP systems
Summary
Introduction:
AnalyticDB is an OLAP database system designed for high-concurrency, low-latency, and real-time analytical queries on PB scale, and has been running on 2000+ physical machines on Alibaba Cloud [1].- It serves external clients on.
- Indexing is a straightforward way to improve query performance, building indexes on pre-specified columns is often no longer effective
Methods:
The experiments are conducted on a cluster of eight physical machines, each with an Intel Xeon Platinum 8163 CPU (@2.50GHz), 300GB memory and 3TB SSD.- The second table is called Orders Table, which uses order id as its primary key, and has 64 primary partitions and 10 secondary partitions.
- These two tables are associated with each other through user id.
- It is because Druid MUST have a timestamp column as the partition key, and queries without specifying the timestamp column are much slower [36]
Results:
The authors evaluate AnalyticDB in both real workloads and TPC-H benchmark [11] to demonstrate AnalyticDB’s performance under different types of queries and its write capability.Conclusion:
This paper presents AnalyticDB, a high-concurrent, lowlatency, and real-time OLAP database at Alibaba.- AnalyticDB has an efficient index engine to build index for all columns asynchronously, which helps improve query performance and hide index building overhead.
- AnalyticDB extends hybrid row-column layout to support both structured and other complex-typed data that may be involved in complex queries.
- To provide both highthroughput write and high-concurrent query, AnalyticDB follows a read/write decoupling architecture.
- The authors' experiments have showed that all these designs help AnalyticDB achieve better performance compared to stateof-art OLAP systems
Tables
- Table1: Comparison between AnalyticDB (ADB)
- Table2: Three kinds of queries for evaluation. Query SELECT * FROM orders ORDER BY o trade time LIMIT 10 SELECT * FROM orders WHERE o trade time BETWEEN ’2018-11-13 15:15:21’ AND ’2018-11-13 16:15:21’ AND o trade prize BETWEEN 50 AND 60 AND o seller id=9999 LIMIT 1000 SELECT o seller id, SUM(o trade prize) AS c FROM orders JOIN user ON orders.o user id = user.u id WHERE u age=10 AND o trade time BETWEEN ’2018-11-13 15:15:21’ AND ’2018-11-13 16:15:21’ GROUP BY o seller id ORDER BY c DESC LIMIT 10
- Table3: Write throughput under different numbers of write nodes
Related work
- AnalyticDB is built from scratch for large-scale and realtime analysis on cloud platform. In this section, we compare AnalyticDB with other systems.
OLTP databases. OLTP databases such as MySQL [6] and PostgreSQL [8] are designed to support transactional queries, which can be considered as point lookups that involve one or several rows. Hence, storage engines in OLTP databases are row-oriented and build B+tree index [16] to speed up query performance. However, row-store does not fit for analytical queries as these queries only require a subset of columns, where row-store significantly amplifies I/Os. Moreover, OLTP databases usually actively update indexes in the write path, which is so expensive that affects both write throughput and query latency.
Funding
- Figure 4 shows this partition placement among read nodes, With the storage-aware optimizer (Section 5.1), this placement helps to save the cost of data redistribution by more than 80%, measured from our production service
- As we observe in our real-world AnalyticDB service, this overhead contributes to less than 5% of the overall query latency
- In addition, by consolidating the internal data representation between storage layer and execution engine, AnalyticDB is able to operate directly on serialized binary data rather than Java objects. This helps eliminate the overhead of serialization and de-serialization, which accounts for more than 20% of time when shuffling a large amount of data
- With careful designs, the all-column index just consumes 66% more storage
Study subjects and analysis
trillion rows of records: 100
AnalyticDB has been successfully deployed on Alibaba Cloud to serve numerous customers (both large and small). It is capable of holding 100 trillion rows of records, i.e., 10PB+ in size. At the same time, it is able to serve 10m+ writes and 100k+ queries per second, while completing complex queries within hundreds of milliseconds
Reference
- Alibaba Cloud. https://www.alibabacloud.com.
- ANTLR ASM. https://www.antlr.org.
- Apache ORC File. https://orc.apache.org/.
- Benchmarking Nearest Neighbours. https://github.com/erikbern/ann-benchmarks.
- Greenplum. https://greenplum.org/.
- MySQL. https://www.mysql.com/.
- Pangu. https://www.alibabacloud.com/blog/pangu—the-highperformance-distributed-file-system-by-alibaba-cloud 594059.
- PostgreSQL. https://www.postgresql.org/.
- Presto. https://prestodb.io/.
- Teradata Database. http://www.teradata.com.
- TPC-H Benchmark. http://www.tpc.org/tpch/.
- D. J. Abadi, S. R. Madden, and N. Hachem. Column-stores vs. row-stores: how different are they really? In SIGMOD, pages 967–980. ACM, 2008.
- M. Armbrust, R. S. Xin, C. Lian, Y. Huai, D. Liu, J. K. Bradley, X. Meng, T. Kaftan, M. J. Franklin, A. Ghodsi, et al. Spark sql: Relational data processing in spark. In SIGMOD, pages 1383–1394. ACM, 2015.
- J. Backus. Can programming be liberated from the von Neumann style?: a functional style and its algebra of programs. ACM, 2007.
- P. A. Bernstein and N. Goodman. Multiversion concurrency control-theory and algorithms. ACM Transactions on Database Systems (TODS), 8(4):465–483, 1983.
- D. Comer. Ubiquitous b-tree. ACM Computing Surveys (CSUR), 11(2):121–137, 1979.
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
- J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
- A. Eisenberg, J. Melton, K. Kulkarni, J.-E. Michels, and F. Zemke. Sql: 2003 has been published. ACM SIGMOD Record, 33(1):119–126, 2004.
- M. Grund, J. Kruger, H. Plattner, A. Zeier, P. Cudre-Mauroux, and S. Madden. Hyrise: a main memory hybrid storage engine. PVLDB, 4(2):105–116, 2010.
- A. Gupta, D. Agarwal, D. Tan, J. Kulesza, R. Pathak, S. Stefani, and V. Srinivasan. Amazon redshift and the case for simpler data warehouses. In SIGMOD, pages 1917–1923. ACM, 2015.
- K. Hajebi, Y. Abbasi-Yadkori, H. Shahbazi, and H. Zhang. Fast approximate nearest-neighbor search with k-nearest neighbor graph. In IJCAI, pages 1312–1317, 2011.
- S. Harizopoulos, V. Liang, D. J. Abadi, and S. Madden. Performance tradeoffs in read-optimized databases. In VLDB, pages 487–498. VLDB Endowment, 2006.
- P. Hunt, M. Konar, F. P. Junqueira, and B. Reed. Zookeeper: Wait-free coordination for internet-scale systems. In USENIX ATC, volume 8. Boston, MA, USA, 2010.
- J.-F. Im, K. Gopalakrishna, S. Subramaniam, M. Shrivastava, A. Tumbde, X. Jiang, J. Dai, S. Lee, N. Pawar, J. Li, et al. Pinot: Realtime olap for 530 million users. In SIGMOD, pages 583–594. ACM, 2018.
- H. Jegou, M. Douze, and C. Schmid. Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell., 33(1):117–128, 2011.
- F. V. Jensen. An introduction to Bayesian networks, volume 210. UCL press London, 1996.
- M. Kornacker, A. Behm, V. Bittorf, T. Bobrovytsky, C. Ching, A. Choi, J. Erickson, M. Grund, D. Hecht, M. Jacobs, et al. Impala: A modern, open-source sql engine for hadoop. In Cidr, volume 1, page 9, 2015.
- A. Lamb, M. Fuller, R. Varadarajan, N. Tran, B. Vandiver, L. Doshi, and C. Bear. The vertica analytic database: C-store 7 years later. PVLDB, 5(12):1790–1801, 2012.
- G. M. Lohman. Grammar-like functional rules for representing query optimization alternatives, volume 17. ACM, 1988.
- S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar, M. Tolton, and T. Vassilakis. Dremel: interactive analysis of web-scale datasets. PVLDB, 3(1-2):330–339, 2010.
- T. Neumann. Efficiently compiling efficient query plans for modern hardware. PVLDB, 4(9):539–550, 2011.
- K. Sato. An inside look at google bigquery.(2012). Retrieved Jan, 29:2018, 2012.
- M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A. Lin, S. Madden, E. O’Neil, et al. C-store: a column-oriented dbms. In VLDB, pages 553–564. VLDB Endowment, 2005.
- A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: a warehousing solution over a map-reduce framework. PVLDB, 2(2):1626–1629, 2009.
- F. Yang, E. Tschetter, X. Leaute, N. Ray, G. Merlino, and D. Ganguli. Druid: A real-time analytical data store. In SIGMOD, pages 157–168. ACM, 2014.
- M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, pages 2–2. USENIX Association, 2012.
- Z. Zhang, C. Li, Y. Tao, R. Yang, H. Tang, and J. Xu. Fuxi: a fault-tolerant resource management and job scheduling system at internet scale. PVLDB, 7(13):1393–1404, 2014.
- M. Zukowski, S. Heman, N. Nes, and P. Boncz. Super-scalar RAM-CPU cache compression. IEEE, 2006.
Full Text
Tags
Comments