The Web Archives Workbench (WAW) Tool Suite: Taking an Archival Approach to the Preservation of Web Content

Library Trends (2009)

Abstract
The ECHO DEPository (also known as ECHO DEP, an abbreviation for Exploring Collaborations to Harvest Objects in a Digital Environment for Preservation) is an NDIIPP-partner project led by the University of Illinois at Urbana-Champaign in collaboration with OCLC and a consortium of partners, including five state libraries and archives. A core deliverable of the project's first phase was OCLC's development of the Web Archives Workbench (WAW), an open-source suite of Web archiving tools for identifying, describing, and harvesting Web-based content for ingestion into an external digital repository. Released in October 2007, the suite is designed to bridge the gap between manual selection and automated capture, based on the "Arizona Model," which applies a traditional aggregate-based archival approach to Web archiving. Aggregate-based archiving refers to archiving items by group or in series, rather than individually. Core functionality of the suite includes the ability to identify Web content of potential interest through crawls of "seed" URLs and the domains they link to; tools for creating and managing metadata for association with harvested objects; website structural analysis and visualization to aid human content selection decisions; and packaging using a PREMIS-based METS profile developed by the ECHO DEPository to support easier ingestion into multiple repositories. This article provides background on the Arizona Model; an overview of how the tools work and their technical implementation; and a brief summary of user feedback from testing and implementing the tools.
Keywords
structure analysis, dep, digital repository