Improving Constrained Search Results By Data Melioration

2021 IEEE 37th International Conference on Data Engineering (ICDE 2021), 2021

Abstract
The problem of finding an item-set of maximal aggregated utility that satisfies a set of constraints is a cornerstone of many search applications. Its classical definition assumes that all the information needed to verify the constraints is explicitly given. However, in real-world databases, the data available on items is often partial. Hence, adequately answering constrained search queries requires completing this missing information. A common approach to completing missing data is to employ Machine Learning (ML)-based inference, but such methods are naturally error-prone. More accurate data can be obtained by asking humans to complete missing information. Because the number of items in the repository is vast, however, limiting human effort is crucial. To this end, we introduce the Probabilistic Constrained Search (PCS) problem, which identifies a bounded-size item-set whose data completion is likely to be highly beneficial, as these items are expected to belong to the result set of the constrained search queries in question. We prove PCS to be hard to approximate, and consequently propose a best-effort PTIME heuristic to solve it. We demonstrate the effectiveness and efficiency of our algorithm over real-world datasets and scenarios, showing that it significantly improves the result sets of constrained search queries in terms of both utility and constraint satisfaction probability.
Keywords
data completion, bounded-size item-set, Probabilistic Constrained Search problem, missing information, search applications, maximal aggregated utility, data melioration, improving constrained search, result set, constrained search queries