Consistency of incomplete data

Patrick G. Clark, Jerzy W. Grzymala-Busse, Wojciech Rzasa

Information Sciences (2015)

Abstract

Consistency is well understood for completely specified data sets: such a data set is consistent when any two cases with the same attribute values belong to the same concept. In this paper we generalize the definition of consistency to incomplete data sets using rough set theory. We discuss two types of missing attribute values: lost values and “do not care” conditions. For incomplete data sets there are three definitions of approximations: singleton, subset and concept. Each approximation is either lower or upper, so we may define six types of consistency. We show that two pairs of these consistencies are equivalent, hence there are only four distinct consistencies of incomplete data. Additionally, we discuss probabilistic approximations and study the properties of the corresponding consistencies. We illustrate the idea of consistency for incomplete data sets with experiments on many incomplete data sets derived from eight benchmark data sets.
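
The notions named in the abstract can be illustrated with a minimal Python sketch. The consistency check for complete data follows the definition given above. The rest is an assumption: the abstract does not spell out how the approximations are built, so the sketch uses the characteristic-set formulation common in the rough set literature, with "?" encoding a lost value and "*" a "do not care" condition. All function names and the toy data are hypothetical, and the sketch covers neither the paper's consistency definitions for incomplete data nor probabilistic approximations.

```python
LOST, DONT_CARE = "?", "*"  # assumed encodings of the two kinds of missing values


def is_consistent_complete(cases, decisions):
    """Complete data: cases with equal attribute values must share a concept."""
    seen = {}
    for values, decision in zip(cases, decisions):
        if seen.setdefault(tuple(values), decision) != decision:
            return False
    return True


def characteristic_sets(cases):
    """K(x): indices of cases indistinguishable from x on x's specified values."""
    result = []
    for x in cases:
        k = set(range(len(cases)))
        for a, v in enumerate(x):
            if v in (LOST, DONT_CARE):
                continue  # an unspecified value of x places no restriction
            # Block [(a, v)]: cases with value v plus "do not care" cases.
            # Cases with a lost value on attribute a belong to no block of a.
            k &= {i for i, y in enumerate(cases) if y[a] in (v, DONT_CARE)}
        result.append(k)
    return result


def approximations(cases, concept):
    """Six approximations of `concept`, given as a set of case indices."""
    ks = characteristic_sets(cases)
    universe = range(len(cases))

    def union(sets):
        return set().union(*sets)

    return {
        "singleton_lower": {x for x in universe if ks[x] <= concept},
        "singleton_upper": {x for x in universe if ks[x] & concept},
        "subset_lower": union(ks[x] for x in universe if ks[x] <= concept),
        "subset_upper": union(ks[x] for x in universe if ks[x] & concept),
        "concept_lower": union(ks[x] for x in concept if ks[x] <= concept),
        "concept_upper": union(ks[x] for x in concept if ks[x] & concept),
    }


# Toy data (hypothetical): two attributes, four cases, concept = cases 2 and 3.
cases = [("high", "yes"), ("high", "?"), ("*", "no"), ("normal", "no")]
approx = approximations(cases, {2, 3})
# approx["singleton_upper"] == {1, 2, 3}
# approx["subset_upper"]    == {0, 1, 2, 3}
# approx["concept_upper"]   == {2, 3}
```

On this toy data the three upper approximations of the same concept all differ, which is why more than one notion of consistency can arise once attribute values are missing.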

Keywords

Consistency, Incomplete data, Missing attribute value, Rough set theory, Probabilistic approximation
