Explaining data with descriptions

Information Systems (2020)

Abstract
With the advent of Big Data, it is impossible for a human user to properly inspect and understand data at a glance. In this paper, we introduce the problem of generating data descriptions: sets of compact, readable and insightful formulas of boolean predicates that represent a set of data records. Unfortunately, finding the best description for a dataset is both NP-hard and task-specific. Therefore, we introduce a dynamic programming approach which, in concert with a set of heuristics, allows us not only to generate descriptions at interactive speed but also to accommodate diverse user needs, from anomaly detection to data exploration. Using real datasets, we evaluate our approach both quantitatively and qualitatively, and show that descriptions are a viable and powerful tool for supporting data enthusiasts and practitioners in gaining insights from data.
Keywords
Data explanation, Data exploration, Outlier analysis, Data profiling