Reuse-centric k-means configuration

Lijun Zhang, Hui Guan, Yufei Ding, Xipeng Shen, Hamid Krim

Information Systems (2021)


Abstract

K-means configuration is the task of finding a configuration of k-means (e.g., the number of clusters or the feature set) that maximizes some objective. It is time-consuming because k-means is iterative. This paper proposes reuse-centric k-means configuration to accelerate the process. It rests on the observation that explorations of different configurations share many common or similar computations, so effectively reusing computations from prior trials of different configurations can substantially shorten configuration time. To realize the idea, the paper presents a set of novel techniques, including reuse-based filtering, center reuse, and a two-phase design, that capitalize on reuse opportunities at three levels: validation, number of clusters, and feature sets. Experiments on k-means-based data classification tasks show that reuse-centric k-means configuration speeds up a heuristic search-based configuration process by a factor of 5.8, and a uniform search-based attainment of classification error surfaces by a factor of 9.1. The paper also provides insights on how to apply the acceleration techniques effectively so as to realize their full potential.
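To make the notion of reusing computation across trials concrete, the following is a minimal sketch of one possible form of center reuse at the number-of-clusters level: the centers found for k clusters seed the run for k+1 clusters. This is an illustrative assumption, not the authors' algorithm; the abstract does not specify how centers are reused. The helper names (`lloyd`, `warm_start_centers`, `inertia`), the farthest-point rule for the extra center, and the synthetic data are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lloyd(points, centers, max_iter=100):
    """Plain Lloyd's k-means from the given initial centers.

    Returns (final_centers, iterations_used).
    """
    centers = [tuple(c) for c in centers]
    for it in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(c))
                ))
            else:
                new_centers.append(c)  # keep an empty cluster's center as is
        if new_centers == centers:
            return centers, it
        centers = new_centers
    return centers, max_iter

def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def warm_start_centers(points, prev_centers):
    """Hypothetical reuse rule: keep the k previous centers and add the
    point farthest from all of them as the (k+1)-th initial center."""
    farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in prev_centers))
    return list(prev_centers) + [tuple(farthest)]

# Synthetic data (assumed for illustration): three Gaussian blobs in 2-D.
random.seed(0)
points = [
    (cx + random.gauss(0, 0.5), cy + random.gauss(0, 0.5))
    for cx, cy in [(0, 0), (5, 5), (10, 0)]
    for _ in range(30)
]

# Trial 1: k = 2, started from two arbitrary data points.
centers2, iters2 = lloyd(points, [points[0], points[30]])

# Trial 2: k = 3, started from the k = 2 result instead of from scratch.
centers3, iters3 = lloyd(points, warm_start_centers(points, centers2))

print("k=2 inertia:", inertia(points, centers2), "iterations:", iters2)
print("k=3 inertia:", inertia(points, centers3), "iterations:", iters3)
```

Because the k+1 run starts from centers that already fit most of the data, it would typically need fewer iterations than a run from random initialization, though how much is saved depends on the data and is not something this sketch measures.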


Keywords

reuse-centric
