Overcoming The Curse Of Dimensionality When Clustering Multivariate Volume Data

VISIGRAPP 2018: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications / International Conference on Information Visualization Theory and Applications (IVAPP), Vol. 3 (2018)

Abstract
Visual analytics of multidimensional data suffers from the curse of dimensionality, i.e., even large numbers of data points are sparsely scattered in a high-dimensional space. The curse of dimensionality prohibits the proper use of clustering algorithms in the high-dimensional space. Projecting the space before clustering causes a loss of information and may mix clusters that were separated. We present an approach that overcomes the curse of dimensionality for a particular type of multidimensional data, namely attribute spaces of multivariate volume data. For multivariate volume data, it is possible to interpolate between the data points in the high-dimensional attribute space based on their spatial relationship in the volumetric domain (or physical space). We apply this idea to a histogram-based clustering algorithm. We create a uniform partition of the attribute space into multidimensional bins and compute a histogram indicating the number of data samples belonging to each bin. Only non-empty bins are stored for efficiency. Without interpolation, the analysis is highly sensitive to the bin size, yielding inaccurate clusterings for improper choices: large bins result in no cluster separation, while clusters fall apart for small bins. Using tri-linear interpolation in physical space, we can refine the data by generating additional samples. The refinement scheme can adapt to the data point distribution in attribute space and to the histogram's bin size. As a consequence, we obtain a density computation in which clusters stay connected even when using very small bin sizes. We exploit this result to create a robust hierarchical cluster tree. It can be visually explored using coordinated views with physical space visualizations and parallel coordinates plots. We apply our technique to several datasets and compare the results against those obtained without interpolation.
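The interplay between interpolation and binning described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`trilinear`, `bin_index`, `sparse_histogram`), the single-cell input layout, and the fixed subdivision factor `n` are assumptions made for illustration, whereas the abstract describes a refinement that adapts to the data distribution and the bin size. The sketch interpolates attribute vectors tri-linearly inside one volume cell, bins the resulting samples uniformly in attribute space, and stores only non-empty bins.

```python
from collections import Counter
from itertools import product
from math import floor


def trilinear(cell, u, v, w):
    """Tri-linearly interpolate attribute vectors inside one volume cell.

    cell[i][j][k] is the attribute tuple at corner (i, j, k), i, j, k in {0, 1};
    (u, v, w) are local coordinates in [0, 1]^3. (Hypothetical layout.)
    """
    dim = len(cell[0][0][0])
    out = [0.0] * dim
    for i, j, k in product((0, 1), repeat=3):
        weight = (u if i else 1 - u) * (v if j else 1 - v) * (w if k else 1 - w)
        for d in range(dim):
            out[d] += weight * cell[i][j][k][d]
    return tuple(out)


def bin_index(attrs, bin_size):
    """Map an attribute vector to its bin in a uniform partition."""
    return tuple(floor(a / bin_size) for a in attrs)


def sparse_histogram(cell, bin_size, n):
    """Sample the cell on an (n+1)^3 lattice and count samples per bin.

    A Counter keeps only non-empty bins. n = 1 uses the 8 corner samples only
    (no refinement); larger n adds interpolated samples. A fixed n is an
    assumption here; it stands in for the adaptive refinement.
    """
    hist = Counter()
    steps = [t / n for t in range(n + 1)]
    for u, v, w in product(steps, repeat=3):
        hist[bin_index(trilinear(cell, u, v, w), bin_size)] += 1
    return hist


# Toy cell with two attributes: corners at x = 0 carry (0, 0), corners at x = 1 carry (1, 1).
low, high = (0.0, 0.0), (1.0, 1.0)
cell = [[[low, low], [low, low]], [[high, high], [high, high]]]

coarse = sparse_histogram(cell, bin_size=0.25, n=1)  # corner samples only
fine = sparse_histogram(cell, bin_size=0.25, n=4)    # with interpolated samples

print(sorted(coarse))  # two bins far apart in attribute space
print(sorted(fine))    # a chain of neighbouring bins linking them
```

In this toy case the corner samples alone occupy two bins that are far apart, while the interpolated samples fill the bins in between, which is the effect that lets clusters stay connected at small bin sizes.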
Keywords
Multi-dimensional Data Visualization, Multi-field Data, Clustering