Bridging Quantities in Tables and Text

international conference on data engineering(2019)

引用 31|浏览118
暂无评分
摘要
There is a wealth of schema-free tables on the Web, holding valuable information about quantities on sales and costs, environmental footprint of cars, health data and more. Table content can only be properly interpreted in conjunction with the textual context that surrounds the tables. This paper introduces the quantity alignment problem: bidirectional linking between textual mentions of quantities and the corresponding table cells, in order to support advanced content summarization and faster navigation between explanations in text and details in tables. We present the BriQ system for computing such alignments. BriQ is designed to cope with the specific challenges of approximate quantities, aggregated quantities, and calculated quantities in text that are common but cannot be directly matched in table cells. We judiciously combine feature-based classification with joint inference by random walks over candidate alignment graphs. Experiments with a large collection of tables from the Common Crawl project demonstrate the viability of our methods.
更多
查看译文
关键词
Aggregates,Web pages,Knowledge based systems,Joining processes,Informatics,Automobiles,Ice
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要