Improving RDF Query Performance Using In-memory Virtual Columns in Oracle Database

2019 IEEE 35th International Conference on Data Engineering (ICDE)(2019)

引用 3|浏览45
暂无评分
摘要
Many RDF Knowledge Graph Stores use IDs to represent triples to save storage and for ease of maintenance. Oracle is no exception. While this design is good for a small footprint on disk, it incurs overhead in query processing, as it requires joins with the value table to return results or process aggregates/filter/order-by queries. It becomes especially problematic as the result size increases or the number of projected variables increases. Depending on queries, the value table join could take up most of the query processing time. In this paper, we propose to use in-memory virtual columns to avoid value table joins. It has advantages in that it does not increase the footprint on disk, and at the same time it avoids the value table joins by utilizing in-memory virtual columns. The idea is to materialize the values only in memory and utilize the compression and vector processing that come with Oracle Database In-Memory technology. Typically, the value table in RDF is small compared to the triples table. Therefore, its footprint is manageable in memory, especially with compression in columnar format. The same mechanism can be applied to any application where there exists a one-to-one mapping between an ID and its value, such as data warehousing or data marts. The mechanism has been implemented in Oracle 18c. Experimental results using the LUBM1000 benchmark show up to two orders of magnitude query performance improvement.
更多
查看译文
关键词
Resource description framework,Dictionaries,Query processing,Aggregates,Warehousing,Benchmark testing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要