Cloudmatcher: a hands-off cloud/crowd service for entity matching

Hosted Content(2018)

引用 21|浏览91
暂无评分
摘要
AbstractAs data science applications proliferate, more and more lay users must perform data integration (DI) tasks, which used to be done by sophisticated CS developers. Thus, it is increasingly critical that we develop hands-off DI services, which lay users can use to perform such tasks without asking for help from developers. We propose to demonstrate such a service. Specifically, we will demonstrate CloudMatcher, a hands-off cloud/crowd service for entity matching (EM). To use CloudMatcher to match two tables, a lay user only needs to upload them to the CloudMatcher's Web page then iteratively label a set of tuple pairs as match/no-match. Alternatively, the user can enlist a crowd of workers to label the pairs. In either case, the lay user can easily perform EM end-to-end without having to involve any developers. Cloud-Matcher has been used in several domain science projects at UW-Madison and at several organizations, and is scheduled to be deployed in a large company in Summer 2018. In the demonstration we will show how easy it is for lay users to perform EM (either via interactive labeling or crowdsourcing), how users can easily create and experiment with a range of EM workflows, and how CloudMatcher can scale to many concurrent users and large datasets.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要