Emulation, Migration, and Long-Term Preservation of Electronic Records
Cal Lee
University of Michigan
School of Information
ECURE 2001: Preservation and Access for
Electronic College and University Records
October 13, 2001
1
Outline
Digital Preservation Problem
Base-Line Assumptions
Major Approaches: Migration and Emulation
Migration
Emulation
For Further Reference
2
The Digital Preservation Problem
3
Technological Dependency
Digital objects are useless if we can’t interact with them
Those interactions depend on numerous technical components.
4
Key Concept - Abstraction
“Computer science is largely a matter of abstraction: identifying a wide range of
applications that include some overlapping functionality, and then working to abstract out
that shared functionality into a distinct service layer (or module, or language, or whatever).
That new service layer then becomes a platform on top of which many other functionalities
can be built that had previously been impractical or even unimagined. How does this activity
of abstraction work as a practical matter? It’s technical work, of course, but it's also
social work. It is unlikely that any one computer scientist will be an expert in every one
of the important applications areas that may benefit from the abstract service. So
collaboration will be required.” (emphasis added)
— Phil Agre, Red Rock Eater, March 25, 2000
5
Oh so many layers
Physical medium — only layer yielding real consensus
Bit
Byte
Character encoding
Instruction set architecture
Physical organization of bytes
Logical organization of chunks
Reading hardware
Input/output hardware
Input/output software
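To make the bit, byte, and character-encoding layers concrete, here is a minimal sketch that reads one stored bit string through each of them. The sample bits and the choice of the cp037 (EBCDIC) and Latin-1 code pages are illustrative assumptions, not taken from the talk.

```python
# The same stored bits, read through three of the layers listed above.
# The sample bit string is an illustrative assumption.
bits = (
    "11001000"
    "10000101"
    "10010011"
    "10010011"
    "10010110"
)

# Byte layer: group the bits into 8-bit units.
raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

# Character-encoding layer: the same bytes under two different encodings.
as_ebcdic = raw.decode("cp037")    # an EBCDIC code page
as_latin1 = raw.decode("latin-1")  # an ASCII superset

print(raw.hex())
print(as_ebcdic)
print(repr(as_latin1))
```

The bytes are identical in both cases; only the assumed encoding layer changes, and with it the text a reader sees.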
6
But, wait, there’s more
Operating system kernel
Network operating system
Networking protocols
Desktop and windowing environment
Data syntax
Data structure
Data semantics
Data content
Data values
Contextual linking within and between objects
7
Obsolescence
“Those who forget the past are condemned to reload it.”
— Nick Montfort, July 2000
All layers undergo change over time, at varying rates.
8
Some Base-Line Assumptions
Several assumptions that I will take as given.
Making them explicit can help us to be more precise
about available options and their costs/benefits.
9
Assumption #1: Digital objects are instructions for future interaction
Only a small part of preservation work is about treating them like
physical artifacts.
Jeff Rothenberg takes this even further, contending that all digital
objects should be seen as programs.
10
Assumption #2: Bits will be Bits
Bit rot and advantages of newer media both call for
periodic refresh and reformatting.
Ensuring the integrity of the bit stream
in such transfers is extremely important.
See Charles Dollar’s 1999 book for an excellent explanation of
these processes.
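One common way to check bit-stream integrity during a refresh is to compare cryptographic digests before and after the copy. The sketch below assumes SHA-256 and uses hypothetical helper names (`sha256_of`, `refresh_copy`) and file names; it illustrates the idea and is not the procedure described in Dollar's book.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source, target):
    """Copy source to target and confirm that the two bit streams match."""
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"fixity mismatch copying {source} to {target}")
    return after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        old = Path(workdir) / "record_on_old_medium.bin"
        new = Path(workdir) / "record_on_new_medium.bin"
        old.write_bytes(b"minutes of the board, 1987")
        print(refresh_copy(old, new))
```

Storing the returned digest alongside the object lets later refresh cycles be checked against the same value.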
11
Assumption #3: Change Happens
Any long-term strategy must recognize that any underlying
technical platform will eventually be
abandoned by the industry and thereafter
increasingly difficult to support.
Ongoing preservation effort is assumed,
regardless of the strategy adopted.
Goal is to minimize (rather than eliminate) work and maximize the
benefits.
12
Assumption #4: Must identify what’s desirable and what’s possible
Best, most informed guess about how
objects will be used.
Characteristics that support such use.
Currently available technical approaches.
Whether using any given approach can cost-effectively
preserve those characteristics.
All of these decisions should be well documented
and revisited periodically.
13
Major Approaches: Migration and Emulation
14
Migration
Periodic transformation of the bits/bytes to run directly on newer platforms.
Used widely as an approach to actively managing legacy systems.
Work can be expensive and introduce errors of translation.
Since the resulting objects can run directly on newer platforms, layers of
technology can be minimized.
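As a rough illustration of migration and of how translation errors can be caught, the sketch below converts a fixed-width EBCDIC record to JSON and then rebuilds the original bytes to confirm nothing was lost. The record layout, field names, and the cp037 code page are hypothetical assumptions chosen for the example.

```python
import json

# Hypothetical legacy layout: (field name, start offset, end offset).
LEGACY_LAYOUT = [("student_id", 0, 6), ("surname", 6, 26), ("year", 26, 30)]
LEGACY_ENCODING = "cp037"  # assumed EBCDIC code page


def parse_legacy(record):
    """Decode a fixed-width legacy record into named fields."""
    text = record.decode(LEGACY_ENCODING)
    return {name: text[start:end].rstrip() for name, start, end in LEGACY_LAYOUT}


def migrate(record):
    """Transform the legacy record into a JSON string."""
    return json.dumps(parse_legacy(record), ensure_ascii=False)


def rebuild_legacy(fields):
    """Re-create the fixed-width bytes from migrated fields."""
    text = "".join(
        fields[name].ljust(end - start) for name, start, end in LEGACY_LAYOUT
    )
    return text.encode(LEGACY_ENCODING)


def migrate_checked(record):
    """Migrate, then fail loudly if the round trip does not reproduce the input."""
    migrated = migrate(record)
    if rebuild_legacy(json.loads(migrated)) != record:
        raise ValueError("migration did not round-trip; information was lost")
    return migrated


if __name__ == "__main__":
    sample = ("004217" + "Lee".ljust(20) + "2001").encode(LEGACY_ENCODING)
    print(migrate_checked(sample))
```

A round-trip check like this only covers the characteristics the layout describes; anything outside it (for example, significance attached to trailing padding) would need its own test.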
15
Emulation - Oxford English Dictionary, Second Edition
“To reproduce the action of or behave like (a different type of computer) with the aid of
hardware or software designed to effect this; to run (a program,
etc., written for another type of computer) by
this means.”
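The definition can be illustrated with a toy software emulator. The machine below is entirely hypothetical (an 8-bit accumulator design with four invented opcodes); the sketch only shows the general shape of a fetch-decode-execute loop that reproduces another computer's behavior in software.

```python
# Opcodes of a hypothetical 8-bit accumulator machine (invented for this sketch).
HALT, LOAD, ADD, OUT = 0x00, 0x01, 0x02, 0x03


def emulate(program):
    """Run a program written for the hypothetical machine; return its output."""
    acc = 0        # accumulator register
    pc = 0         # program counter
    output = []
    while pc < len(program):
        opcode = program[pc]
        if opcode == HALT:
            break
        if pc + 1 >= len(program):
            raise ValueError(f"missing operand at address {pc}")
        operand = program[pc + 1]
        if opcode == LOAD:       # load an immediate value
            acc = operand
        elif opcode == ADD:      # add an immediate value, wrapping at 8 bits
            acc = (acc + operand) & 0xFF
        elif opcode == OUT:      # emit the accumulator; operand is ignored
            output.append(acc)
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc}")
        pc += 2                  # every instruction is two bytes wide
    return output


if __name__ == "__main__":
    # LOAD 250; ADD 10; OUT; HALT
    legacy_program = bytes([LOAD, 250, ADD, 10, OUT, 0, HALT, 0])
    print(emulate(legacy_program))
```

The legacy program's bytes are never altered; only the emulator has to be rewritten when the host platform changes, which is the contrast with migration.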