Anne R. Kenney
Nancy Y. McGovern
Peter Botticelli
Richard Entlich
William R. Kehoe
Carl Lagoze
Sandra Payette
Preservation Risk Management
Increased reliance by research libraries on Web resources not owned or controlled
Need to monitor and evaluate resources
Identify risks to resources and appropriate responses
Technology introduces new threats, enables new solutions
The Research Agenda
See "Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell’s Project Prism," by Anne R. Kenney, Nancy Y. McGovern, Peter Botticelli, Richard Entlich, Carl Lagoze, and Sandra Payette, in D-Lib Magazine, January 2002.
in local context, considering the internal and external links
A Web site:
as a semantically coherent set of linked Web pages
as an entity in a broader technical and organizational context
Contextual Layers
Page-level Monitoring
Formatting: TIDY
Standards compliance
Document structure
Metadata:
HTTP headers
HTML headers
Changes
Content
Location
Links
Out-link structure
In-link structure
Intra-site
Hub
Volatility
Page provenance
URL parsing
Log analysis
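Several of the page-level checks listed above (content changes, out-links, intra-site links, URL parsing) can be sketched in a few lines of Python. This is only an illustration working on an HTML string already in hand: the function names, the example URL, and the use of a SHA-256 digest as a change fingerprint are assumptions, not part of Project Prism.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def content_fingerprint(html):
    """Hash the page text so a later capture can be compared for changes."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def classify_links(page_url, html):
    """Split a page's out-links into intra-site and external lists."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc
    intra, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).netloc == site_host:
            intra.append(absolute)
        else:
            external.append(absolute)
    return intra, external


# Hypothetical page used only for illustration.
PAGE_URL = "http://example.org/docs/index.html"
PAGE_HTML = (
    '<html><body><a href="about.html">About</a> '
    '<a href="http://other.example.com/">Other</a></body></html>'
)

intra_links, external_links = classify_links(PAGE_URL, PAGE_HTML)
```

Comparing the fingerprint of two captures of the same URL gives a crude "has the content changed" signal; a real monitor would likely normalize the markup first (for example with Tidy) so that cosmetic differences are not reported as changes.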
Site-level Monitoring
Graph analysis
Static site analysis and longitudinal study
Aggregate page analyses
Site maintenance indicators
Backup and archiving policies and procedures
Hardware and software environment
Network configuration and maintenance
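The graph analysis and longitudinal study items above can be illustrated with a small Python sketch. It assumes a site has already been crawled into a mapping from each page to the pages it links to; the function names, the hub threshold, and the sample snapshots are hypothetical.

```python
def link_degrees(site_graph):
    """Return (in_degree, out_degree) for a page -> [linked pages] mapping."""
    in_degree = {page: 0 for page in site_graph}
    out_degree = {}
    for page, targets in site_graph.items():
        unique_targets = set(targets)  # ignore repeated links on one page
        out_degree[page] = len(unique_targets)
        for target in unique_targets:
            in_degree[target] = in_degree.get(target, 0) + 1
    return in_degree, out_degree


def find_hubs(site_graph, min_out_links):
    """Pages with at least min_out_links distinct out-links."""
    _, out_degree = link_degrees(site_graph)
    return sorted(p for p, n in out_degree.items() if n >= min_out_links)


def edge_set(site_graph):
    """Flatten the mapping into a set of (source, target) pairs."""
    return {(p, t) for p, targets in site_graph.items() for t in targets}


def link_changes(old_graph, new_graph):
    """Links added and removed between two snapshots of the same site."""
    old_edges, new_edges = edge_set(old_graph), edge_set(new_graph)
    return new_edges - old_edges, old_edges - new_edges


# Two hypothetical snapshots of a four-page site.
SNAPSHOT_1 = {"/": ["/a", "/b", "/c"], "/a": ["/", "/b"], "/b": ["/"], "/c": []}
SNAPSHOT_2 = {"/": ["/a", "/b"], "/a": ["/", "/b"], "/b": ["/", "/a"], "/c": []}

in_deg, out_deg = link_degrees(SNAPSHOT_1)
hubs = find_hubs(SNAPSHOT_1, min_out_links=3)
added, removed = link_changes(SNAPSHOT_1, SNAPSHOT_2)
```

In the second snapshot the home page no longer links to "/c", leaving it orphaned; a pattern like this across repeated captures is the kind of signal a longitudinal study could surface.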
Appraisal
Enable portfolio management:
Hypothetical appraisal of a Web resource:
Scope: highly relevant
Value: high value, not essential; numerous links to page
Relationship: secondary archives; informal agreement
Maintenance: key indicators of good management
Redundancy: captured by more than one archive
Risk response: very responsive to risk notifications
Capture: complex structure; cyclical updates; formats
Size: medium-sized; 3-level crawl
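One way to hold appraisals like the hypothetical one above so that they can be managed as a portfolio is a simple record type. The sketch below is an assumption about how such a record might look in Python; the class name, field names, and the select helper are illustrative, and the values are copied from the slide.

```python
from dataclasses import dataclass


@dataclass
class Appraisal:
    """One appraised Web resource; fields mirror the slide's categories."""
    scope: str
    value: str
    relationship: str
    maintenance: str
    redundancy: str
    risk_response: str
    capture: str
    size: str


def select(portfolio, **criteria):
    """Return the appraisals whose fields equal every given criterion."""
    return [
        a for a in portfolio
        if all(getattr(a, field) == wanted for field, wanted in criteria.items())
    ]


example = Appraisal(
    scope="highly relevant",
    value="high value, not essential; numerous links to page",
    relationship="secondary archives; informal agreement",
    maintenance="key indicators of good management",
    redundancy="captured by more than one archive",
    risk_response="very responsive to risk notifications",
    capture="complex structure; cyclical updates; formats",
    size="medium-sized; 3-level crawl",
)

portfolio = [example]
highly_relevant = select(portfolio, scope="highly relevant")
```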
Portfolio Management
Strategy
Develop an organization-specific program:
Low Trust
High Trust
Low Control
no agreement; monitor and as-is metadata capture; no risk notification
informal agreement; monitor and metadata capture with permission; minimal risk notification