Visual Web Archive Quality Assessment

Linking Theory and Practice of Digital Libraries (TPDL 2022), 2022

Abstract
The large size of today's web archives makes it impossible to manually assess the quality of each archived web page, i.e., to check whether a page can be reproduced faithfully from an archive. For automated web archive quality assessment, previous work proposed measuring the pixel difference between a screenshot of the original page and a screenshot of the same page when reproduced from the archive. However, when categorizing types of reproduction errors (we introduce a corresponding taxonomy in this paper), one finds that some errors cause high pixel differences between the screenshots but lead to only a negligible degradation in the user experience of the reproduced web page. Therefore, we propose to visually align page segments in such cases before measuring the pixel differences. Since the diversity of reproduction error types precludes a one-size-fits-all solution for visual alignment, we focus on one common type (translated segments) and investigate the usefulness of video compression algorithms for this task.
Keywords
Web archiving, Automatic quality assessment, Visual web page alignment
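
The idea in the abstract, that a translated segment inflates the pixel difference until it is aligned, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`pixel_difference`, `shift_rows`, `best_vertical_alignment`), the list-of-rows grayscale image representation, and the white fill value are all assumptions made for illustration. The brute-force shift search stands in for the motion estimation used in video compression, and it handles only whole-image vertical translation.

```python
from typing import List, Tuple

# Assumed representation: a grayscale image as rows of intensities (0..255).
Image = List[List[int]]


def pixel_difference(a: Image, b: Image) -> float:
    """Return the fraction of pixel positions whose values differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def shift_rows(img: Image, dy: int, fill: int = 255) -> Image:
    """Translate an image vertically by dy rows (positive moves content down).

    Rows shifted in from outside the image are padded with `fill`.
    """
    height = len(img)
    width = len(img[0]) if img else 0
    out = []
    for y in range(height):
        src = y - dy  # row of the input that lands at row y of the output
        out.append(list(img[src]) if 0 <= src < height else [fill] * width)
    return out


def best_vertical_alignment(
    original: Image, archived: Image, max_shift: int
) -> Tuple[int, float]:
    """Try every vertical shift in [-max_shift, max_shift] of the archived
    screenshot and return (shift, pixel difference) for the lowest difference.
    """
    best = (0, pixel_difference(original, archived))
    for dy in range(-max_shift, max_shift + 1):
        diff = pixel_difference(original, shift_rows(archived, dy))
        if diff < best[1]:
            best = (dy, diff)
    return best
```

For two screenshots that are identical except for a band of content displaced by a couple of rows, `pixel_difference` reports a nonzero difference, while `best_vertical_alignment` finds the shift that brings the difference back down. That gap between the raw and the aligned difference is what the abstract describes.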