ORCA - a Benchmark for Data Web Crawlers

2021 IEEE 15th International Conference on Semantic Computing (ICSC) (2021)

Abstract
The number of RDF knowledge graphs available on the Web grows constantly. Gathering these graphs at large scale for downstream applications hence requires the use of crawlers. Although Data Web crawlers exist, and general Web crawlers could be adapted to focus on the Data Web, there is currently no benchmark to fairly evaluate their performance. Our work closes this gap by presenting the Orca benchmark. Orca generates a synthetic Data Web, which is decoupled from the original Web and enables a fair and repeatable comparison of Data Web crawlers. Our evaluations show that Orca can be used to reveal the different advantages and disadvantages of existing crawlers. The benchmark is open-source and available at https://w3id.org/dice-research/orca.
Keywords
RDF knowledge graphs, general Web crawlers, Orca benchmark, original Web, synthetic Data Web crawlers