Metadata:
An Introduction
By
Wendy Duff
October 13, 2001
ECURE
1
Metadata
The term “meta” comes from a Greek word that denotes
something of a higher or more fundamental nature. Metadata, then, is
data about other data.
The term refers to any data used to aid the identification,
description and location of networked electronic resources.
2
Defining Metadata
Does data about data mean anything?
Librarians equate it with a complete bibliographic record.
Information technologists equate it to database schema or
definitions of the data elements.
Archivists include context information, restrictions and access
terms, index terms,
etc.
3
Bibliographic Metadata
Providing a description of the information package along with other
information necessary for management and preservation.
Encoding.
Providing access to this description.
Predominantly discovery and retrieval.
4
Encoding
Surrogate records are encoded by assigning tags, letters, or
words.
Why encode?
For display
Provide access
Integration of surrogate
Management
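A minimal sketch of why encoding helps, assuming a hypothetical tag set (TITLE, AUTHOR, DATE) rather than any real encoding standard. The same tagged surrogate record serves both display and access:

```python
# Hypothetical tagged surrogate record: each field is a (tag, value) pair.
record = [
    ("TITLE", "Metadata: An Introduction"),
    ("AUTHOR", "Wendy Duff"),
    ("DATE", "2001-10-13"),
]

# Hypothetical display labels keyed by tag.
LABELS = {"TITLE": "Title", "AUTHOR": "Author", "DATE": "Date"}


def display(rec):
    """Render a record for a reader, using the tags to pick labels."""
    return "\n".join(f"{LABELS[tag]}: {value}" for tag, value in rec)


def build_index(records, tag):
    """Map each value found under `tag` to the positions of matching records."""
    index = {}
    for pos, rec in enumerate(records):
        for t, value in rec:
            if t == tag:
                index.setdefault(value.lower(), []).append(pos)
    return index


print(display(record))
print(build_index([record], "AUTHOR"))
```

Because every value carries a tag, software can label it for display or index it for access without guessing what the value means.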
5
Beyond Discovery and Retrieval
Gilliland-Swetland (1998) explains “metadata also documents
how that object behaves, its functions and use, relationship to other
objects and how it should be managed.”
6
Definition Proposed by Cunningham
Structured information that describes and/or allows us to find,
manage, control, understand, or preserve other information over
time.
7
Different Communities … Different Metadata
Developers of the Interoperability of Data in E-Commerce Systems
(INDECS)
identified metadata for protecting intellectual property rights of
creators and publishers.
The Research Libraries Group’s Working Group on Preservation
Issues of Metadata identified metadata for “digital master files
that have preservation-based intent.”
8
Metadata to Information Technologists
The data that define the data elements in a table.
Data that control or explain other data.
Something that is not part of the bit stream of a record but needed
to understand the data in the record.
One system’s metadata is another system’s data.
9
Source of Metadata
Automatically generated.
Supplied by creator of electronic resource.
Supplied by 3rd party.
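As an illustration of the first source, the sketch below gathers metadata that a system can generate without human input. The function name and the field names are assumptions chosen for this example, not part of any standard.

```python
import mimetypes
import os
from datetime import datetime, timezone


def automatic_metadata(path):
    """Collect metadata the file system can supply on its own."""
    info = os.stat(path)
    mime_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "identifier": os.path.abspath(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),
        "format": mime_type,  # guessed from the file extension; None if unknown
    }
```

Fields such as title, subject, or rights cannot be derived this way and would still have to come from the creator or a third party.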
10
Dublin Core
Metadata to improve information retrieval of Internet
resources.
Developed predominantly by the bibliographic community. Elements
similar to bibliographic surrogate.
11
Characteristics of Dublin Core
Simplicity
Semantic Interoperability
International Consensus
Extensibility
Metadata Modularity on the Web
12
Dublin Core Elements
Content
Coverage
Description
Type
Relation
Source
Subject
Title
Intellectual Property
Contributor
Creator
Publisher
Rights
Instantiation
Date
Format
Identifier
Language
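A minimal sketch of a Dublin Core description encoded in XML, using the fifteen elements listed above and the Dublin Core element namespace. The wrapper element name `metadata` and the helper function are assumptions for this example; the field values are taken from the title slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements from the slide, in lowercase element-name form.
DC_ELEMENTS = {
    "coverage", "description", "type", "relation", "source", "subject",
    "title", "contributor", "creator", "publisher", "rights", "date",
    "format", "identifier", "language",
}


def dublin_core_record(fields):
    """Build an XML element holding one child per Dublin Core field."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = dublin_core_record({
    "title": "Metadata: An Introduction",
    "creator": "Wendy Duff",
    "date": "2001-10-13",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Every element is optional and repeatable in Dublin Core, so the sketch only checks names and does not require any particular field.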
13-14
Lawsuits over Metatags
Playboy!
15
Resource Description Framework
(RDF)
RDF
provides interoperability between applications that exchange
machine-understandable information on the Web.
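RDF expresses metadata as statements with a subject, a predicate, and an object. The sketch below models such statements with plain Python tuples instead of an RDF library; the resource URI is hypothetical, and the predicates reuse Dublin Core element URIs.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

DC = "http://purl.org/dc/elements/1.1/"
RESOURCE = "http://example.org/slides/metadata-intro"  # hypothetical URI

statements = [
    Triple(RESOURCE, DC + "title", "Metadata: An Introduction"),
    Triple(RESOURCE, DC + "creator", "Wendy Duff"),
]


def objects(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [
        t.object
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects(statements, RESOURCE, DC + "creator"))
```

Because subjects and predicates are identified by URIs, two applications that agree on those URIs can exchange and merge statements without sharing a record format.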
16
Metadata and
XML
Provides a means of encoding and exchanging metadata.
EAD,
TEI,
VERS
17
XML Example
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE FAQ SYSTEM "FAQ.DTD">
<FAQ>
<INFO>
<SUBJECT> XML </SUBJECT>
<AUTHOR> Lars Marius Garshol </AUTHOR>
<EMAIL> larsga@ifi.uio.no </EMAIL>
<VERSION> 1.0 </VERSION>
<DATE> 20.jun.97 </DATE>
</INFO>
<PART NO="1">
<Q NO="1">
<QTEXT>What is XML?</QTEXT> <A>SGML light.</A>
</Q>
. . .
</PART>
</FAQ>
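A sketch of how a program might read the metadata out of this document with Python's standard library. The DOCTYPE line and the elided content are left out of the embedded copy so that it parses without FAQ.DTD; the variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

FAQ_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<FAQ>
<INFO>
<SUBJECT> XML </SUBJECT>
<AUTHOR> Lars Marius Garshol </AUTHOR>
<EMAIL> larsga@ifi.uio.no </EMAIL>
<VERSION> 1.0 </VERSION>
<DATE> 20.jun.97 </DATE>
</INFO>
<PART NO="1">
<Q NO="1">
<QTEXT>What is XML?</QTEXT> <A>SGML light.</A>
</Q>
</PART>
</FAQ>"""

root = ET.fromstring(FAQ_XML)

# The INFO block is metadata about the FAQ: one entry per child tag.
info = {child.tag: child.text.strip() for child in root.find("INFO")}

# The questions are the content itself, keyed by their NO attribute.
questions = [
    (q.get("NO"), q.findtext("QTEXT"), q.findtext("A"))
    for q in root.iter("Q")
]

print(info["AUTHOR"])
print(questions)
```

The tags let the program separate the description of the FAQ (INFO) from the FAQ's content (PART, Q) without any knowledge of layout.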
18
Electronic Records Metadata Project
Functional Requirements for Evidence in Recordkeeping
At the center of the image is a rectangle labeled BUSINESS, into
which flow three arrows.
The first arrow, labeled Govern, comes from an oval labeled
MANDATES. That oval has an arrow, labeled Establish Competencies
Of, leading out to a rectangle labeled PEOPLE [AGENTS], and
another arrow, labeled Account For Execution Of, leading in from a
rectangle labeled RECORDS.
The second arrow, labeled Are Responsible For, leads into
BUSINESS from the PEOPLE [AGENTS] rectangle.
The third arrow, labeled Are Evidence Of, leads into
BUSINESS from the RECORDS rectangle.
From PEOPLE [AGENTS] an arrow labeled Authenticate leads
to RECORDS, and another arrow, doubly labeled Function as
corporate & collective memory of and Provide authoritative
sources of information, leads back from RECORDS to PEOPLE
[AGENTS].
The image is composed of two rows. The first is formed by a single square
labeled Information Object. The second is formed by four
squares (labeled Content Information, Preservation
Description Information, Packaging Information, and
Descriptive Information) and an ellipsis. Lines
proceeding from the four lower squares and the ellipsis merge to form an arrow
entering the upper Information Object square.
The image is composed of two rows. The first is formed by a single
rectangle labeled Preservation Description Information. The second is
formed by four rectangles, labeled Reference Information,
Provenance Information, Context Information, and Fixity
Information. Arrows flow from each of the lower-row rectangles into
the upper rectangle.
Table 4-1: Examples of PDI Types
Content Information Type: Space Science Data
  Reference: Object identifier; Journal reference; Mission, instrument, title, attribute set
  Provenance: Instrument description; Processing history; Sensor description; Instrument; Instrument mode; Decommutation map; Software interface specification
  Context: Calibration history; Related data sets; Mission; Funding history
  Fixity: CRC; Checksum; Reed-Solomon coding
Content Information Type: Digital Library Collections
  Reference: Bibliographic description; Persistent identifier
  Provenance:
    For scanned collections: metadata about the digitisation process; pointer to master version
    For born-digital publications: pointer to the digital original
    Metadata about the preservation process: pointers to earlier versions of the collection item; change history
  Context: Pointers to related documents in original environment at the time of publication
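The Fixity entries in the table (CRC, checksum) can be illustrated with a short sketch. It assumes CRC-32 and SHA-256 as the concrete algorithms, which the table does not specify, and it does not cover Reed-Solomon coding. The function names are hypothetical.

```python
import hashlib
import zlib


def fixity(data: bytes):
    """Compute fixity values to store alongside the content."""
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def verify(data: bytes, recorded):
    """Return True if the content still matches its recorded fixity values."""
    return fixity(data) == recorded


original = b"digital master file"
recorded = fixity(original)
print(verify(original, recorded))
print(verify(b"digital master fil3", recorded))
```

Fixity values are recorded when an object is ingested and recomputed later; a mismatch signals that the bit stream has changed.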
The image consists of a set of rectangles interconnected by arrows. From
the first rectangle, labeled form and structure, an arrow leads to
records, which is joined by double-headed arrows to
provenance, original technical context, activities,
and data files.
The activities rectangle is connected to data files by an
additional double-headed arrow. Into activities leads an arrow from a
rectangle labeled Strategy, methods; into the latter rectangle leads
an arrow from a rectangle labeled requirements, rules.
Into the data files rectangle leads an arrow from a final
rectangle, labeled current technical context.
Metadata Facts to Remember
Metadata do not have to be digital.
Metadata relate to more than the description of an
object.
Metadata can come from a variety of sources.
Metadata continue to accrue during the life of an information
object or system.
One information object’s metadata can simultaneously be
another information object’s data. (Anne Gilliland-Swetland,
“Setting the Stage”)