Data Mining Using Light Weight Object Management In Clustered Computing Environments

Robert L. Grossman, Stuart Bailey, David Hanley

PERSISTENT OBJECT SYSTEMS: PRINCIPLES AND PRACTICE (1997)

Abstract

In this note, we describe the design, implementation and our initial experience with an object warehouse specifically designed for selecting, computing and filtering very large collections of objects, each of which has a large number of attributes. The object warehouse is built on top of a persistent object manager. We are especially interested in persistent object managers which are monotone, that is, designed for data which is read-mostly, occasionally appended, and infrequently updated. These operations and access patterns are common when data mining large data stores, which provides the main motivation for our current work. For object warehouses to prove useful, they must scale as the number of objects increases, as the selectivity of queries increases, and as the computational complexity of queries increases. We show that our implementation scales in each of these dimensions over three orders of magnitude: from queries taking seconds touching all the attributes on megabytes of data to queries taking hours touching a small fraction of the data on stores approaching one hundred gigabytes.

Keywords

persistent object stores, data mining, data warehouses, scientific computing, numerically intensive queries
