STAFF PAPER
Preservation Metadata for Digital Collections
Presentation for workshop "Metadata for long term preservation in NEDLIB", Paris, Bibliothèque nationale de France, 25 February 2000. Prepared by Kevin Bradley and Deborah Woodyard.
1
Introduction
There have been a number of efforts to develop metadata specifications and sets to support preservation of a variety of digital resources. Because of its pressing business needs to manage both ‘born digital’ and ‘digital surrogate’ collections, the National Library of Australia has tried to find, or if necessary develop, metadata models to accommodate both.
The National Library of Australia, through its PANDORA Project (Preserving and Accessing Networked Documentary Resources of Australia), has been working at two levels in its efforts to ensure long-term access to Australian online publications. At a conceptual level, the Library has defined its business processes in a Business Process Model, and identified the data that will need to be collected for current and future management of each title in a Logical Data Model. In addition, in December 1998, the Library published its Digital Services Information Paper, which sets out requirements for a technical infrastructure to collect, store, provide access to, and manage its PANDORA Archive of Australian online publications, as well as to support the management of other digital and paper-based collections.
Concurrently, the Library has been working at a practical level, implementing the business principles by developing selection guidelines, liaising with publishers, and building a small archive of titles which, by February 2000, numbered over 400 and occupied approximately fifteen gigabytes of storage space.
Our purpose is to make Australia's cultural heritage available to future generations, as well as to today's scholars and researchers. Because to date there is little commercial publishing on the Internet in Australia, we have not yet had to deal with the complications of archiving subscription-only publications. We have, however, developed principles for managing commercial publications and have begun discussion with publishers on implementation.
Future preservation strategies for online publications will require detailed information about the nature of each item and how it has been treated over time. Future researchers may also want historical information about the items they are using: what format each item was originally in, and whether anything has been lost in the capture and preservation process. Day-to-day management of titles for the Archive also requires administrative information such as whether the publisher has given permission to archive.
To date we have no facility for recording the full complement of metadata required for each title as outlined in the Logical Data Model. We await the implementation of a full archive management system. In the meantime, to enable us to document the administrative history of the titles being archived, our IT Section created the PANDORA Archive Management System (PAMS) database. PAMS is rather a grandiose title for what is only a small metadata repository. Yet while it does not provide for all of the data elements that are required for long-term preservation, it does provide us with sufficient information about a title to manage archiving. Once an archive management system is available, the data from PAMS can be migrated to it.
At present PAMS does not record all the preservation metadata encompassed by the logical data model. In the absence of a satisfactory preservation metadata model that seems to achieve this objective, the NLA has invested in drafting its own model: a statement of the information it believes will be needed to manage the preservation of its digital collections. (1)
The draft Preservation Metadata Set draws on our corporate experience in a range of relevant fields:
• preservation, and preservation documentation, of library collections
• management of archives of online digital publications, physical format digital publications, and analogue and digital audio collections
• management of digitisation projects for text-based and image-based collections
• development of logical data models for a specific digital archiving implementation
• website database design.
This means that the draft Preservation Metadata Set is built on considerable relevant experience and thinking about the issues involved. However, we are very keen to subject the draft to critical scrutiny from specialists in all of these fields and others with an interest in managing digital collections over time, especially in a library context.
Research Libraries Group (RLG) PRESERV Working Group on Preservation Uses of Metadata(9), which mainly addressed digitisation projects. RLG invited us to adapt this set to describe a wider range of materials.
While we have learned a great deal from all these models, we accept responsibility for the metadata set we are proposing.
3
What the Preservation Metadata Set is
It is most important to realise that our proposed Preservation Metadata Set is intended to be a statement of the information we believe is needed to manage preservation of digital collections. It is meant to be a data output model, not a data input model. It indicates the information we want out of a metadata system, not necessarily what data should be entered, how it should be entered, by whom and at what time; nor does it concern itself with how the metadata should be associated with what it is describing. We believe this model should be applicable to many implementations that may decide to record this information in a variety of ways. This model simply says: ‘however you do it, this is what you have to deliver so we can manage preservation.’
It is also important to note that we are focusing solely on preservation requirements. The proposed metadata set does not attempt to deal with anything else. We recognise that in any implementation system there is likely to be an overlap between metadata recorded for different purposes. By focusing on the information we need out of the system to manage preservation, we put aside the question of whether particular elements may already be included in, say, other administrative or resource discovery metadata.
Different types of digital materials, and different archiving systems, will need different metadata support. There may be types of material and processes that are not adequately accommodated by our proposal despite our intentions, and we would welcome feedback.
4
Granularity
The metadata set is based on the need to manage and describe collections, objects, and sub-objects (which we have called "files"). We have tried to show where we expect the elements in the metadata set to be relevant to these different levels. We expect to make pragmatic decisions about the level at which records are needed, based on the level at which collections, objects and files are managed separately. This model assumes that the digital object is the primary focus of management and description. File and collection descriptions are created when appropriate.
5
Change history
Maintaining a history of what is being described is one of the essential objectives of any preservation documentation system. We looked at two options:
• maintaining a single record over time, which records all changes and processes applied to the item being described; or
• creating a new record for each new manifestation of the item being described.
We chose the latter approach. Managing digital objects and collections over time will mean creating and managing considerable amounts of information about them. We believe that the creation of a new record for each new manifestation will organise this information more clearly and conveniently.
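The record-per-manifestation approach can be illustrated with a minimal Python sketch. It is not part of the proposed set: the class name, the field names and the `obj-00x` identifiers are hypothetical, and the only idea taken from the text is that each manifestation has its own record, with a link back to the manifestation it was migrated from.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ManifestationRecord:
    """One metadata record per manifestation (hypothetical field names)."""
    persistent_id: str
    date_of_creation: str                 # eight digits, assumed YYYYMMDD
    migrated_from: Optional[str] = None   # persistent_id of the previous manifestation


def change_history(records: Dict[str, ManifestationRecord], persistent_id: str) -> List[str]:
    """Walk back from a manifestation to its source, newest first."""
    chain: List[str] = []
    current: Optional[str] = persistent_id
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle in migration links at {current}")
        chain.append(current)
        current = records[current].migrated_from
    return chain


# Invented sample data: a source object and two later migrations of it.
records = {
    r.persistent_id: r
    for r in [
        ManifestationRecord("obj-001", "19990301"),
        ManifestationRecord("obj-002", "20000115", migrated_from="obj-001"),
        ManifestationRecord("obj-003", "20010815", migrated_from="obj-002"),
    ]
}
print(change_history(records, "obj-003"))  # ['obj-003', 'obj-002', 'obj-001']
```

Under this arrangement no record is ever rewritten; the history of an item is recovered by following the links between records.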
6
Supporting alternative preservation strategies
It is impossible to determine unequivocally what we will need to know in order to manage digital preservation in the future, so our set of metadata elements necessarily reflects assumptions about our future requirements. Our aim with this proposed metadata set is to support both migration and emulation approaches. Just what is needed for these approaches will become clearer as we gain more collective experience with them.
7
Some key terms
To minimise confusion, we need to explain some of the terms we have used in the draft proposed Preservation Metadata Set:
• ‘work’, ‘manifestation’ – we have distinguished between a work, as a concept, and the physical or virtual manifestations that instance it. Most preservation processes involve managing manifestations. However, we found it useful to recognise that archiving decisions could be made for the work (eg ‘we will maintain this work in perpetuity’), with different archiving decisions applying to particular manifestations of it (eg ‘we do not need to keep this copy of it’).
• repeatability – because of the approach we have taken (a 1:1 relationship between each manifestation and its metadata record), our comments about the repeatability of information in any element do not refer to a sequence of changes, but to the possibility of multiple bits of information that may be true at the same time; for example, two agencies may collaborate in an archiving decision.
• obligation – we have avoided terms like ‘mandatory’, ‘conditional’, and ‘optional’, because they are so closely associated with data input models. Instead, we use the terms ‘essential’, ‘essential if applicable’, and ‘desirable’, in their common usages.
Essential information we believe will definitely be required. Some elements are more relevant to some materials or processes than others, so they may be essential if applicable. Desirable information will not be critical, but is expected to be helpful.
• examples – we have provided examples wherever they are applicable. In some cases we have found it more useful to give generic examples, which appear in square brackets.
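Because the set is a data output model, one way to read the obligation terms is as a check on what a system delivers rather than on what is keyed in. The following Python sketch shows that reading. The element keys and the function name are hypothetical; the obligation values for the three elements shown are those given for the object level later in this paper. Values are held in lists so that repeatable elements can carry several simultaneous values.

```python
from enum import Enum
from typing import Dict, List


class Obligation(Enum):
    ESSENTIAL = "essential"
    ESSENTIAL_IF_APPLICABLE = "essential if applicable"
    DESIRABLE = "desirable"


# Object-level obligations for three elements of the set (keys are hypothetical).
OBJECT_LEVEL: Dict[str, Obligation] = {
    "persistent_identifier": Obligation.ESSENTIAL,
    "date_of_creation": Obligation.ESSENTIAL,
    "storage_information": Obligation.DESIRABLE,
}


def missing_essential(output: Dict[str, List[str]]) -> List[str]:
    """Return the essential elements for which a system delivered no values."""
    return [
        name
        for name, obligation in OBJECT_LEVEL.items()
        if obligation is Obligation.ESSENTIAL and not output.get(name)
    ]


delivered = {"persistent_identifier": ["URN:NBN:fi-fe19981122"]}
print(missing_essential(delivered))  # ['date_of_creation']
```

An ‘essential if applicable’ element cannot be checked this mechanically, since whether it applies depends on the material being described.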
8
Comments
We invite comments on the draft Preservation Metadata Set. These may apply to the overall approaches we have taken, the details of any elements, the presentation, and any other issues.
Colin Webb
Director of Preservation National Library of Australia Canberra, ACT 2600
AUSTRALIA
Telephone + 61 2 6262 1662 Facsimile +61 2 6257 1703
9
References
(1) The NLA Preservation Metadata Working Group consists of: Margaret Phillips, Deborah Woodyard, Kevin Bradley, Colin Webb.
(2) Consultative Committee for Space Data Systems (CCSDS), CCSDS 650.0-R-1, May 1999. Reference Model for an Open Archival Information System (OAIS) Draft Recommendation for Space Data System Standards. Online. Available: http://www.ccsds.org/RP9905/RP9905.html. 7 October 1999.
(3) Koninklijke Bibliotheek. NEDLIB Networked European Deposit Library (home page). Online. Available: http://www.konbib.nl/nedlib/. 7 October 1999. Also see: van der Werf-Davelaar, Titia. "Long-term preservation of electronic publications: The NEDLIB Project", D-Lib Magazine. Volume 5 Number 9 (1999). Online. Available: http://www.dlib.org/dlib/september99/vanderwerf/09vanderwerf.html. 8 October 1999.
(4) National Library of Australia. PANDORA Project: Preserving and Accessing Networked DOcumentary Resources of Australia (home page). Online. Available: http://www.nla.gov.au/pandora/. 7 October 1999.
See also: National Library of Australia. Digital Services Project. Online. Available: http://www.nla.gov.au/dsp/. 8 October 1999.
(5) Carl Fleischhauer. Library of Congress-CNRI Experiment Project Proposed Metadata Set. 12 March 1999. Online. Available: http://lcweb2.loc.gov/ammem/award/docs/nisometa/NISOintr.html. 8 October 1999.
(6) The Making of America II Testbed Project White Paper. Version 2.0 (September 15, 1998). Online. Available: http://sunsite.berkeley.edu/MOA2/wp-v2.html. 8 October 1999.
(7) Day, Michael. Metadata for Preservation Cedars Project Document AIW01. CEDARS, 3 August 1998. Online. Available: http://www.ukoln.ac.uk/metadata/cedars/AIW01.html. 8 October 1999, and later papers.
(8) National Archives of Australia. Recordkeeping Metadata Standard for Commonwealth Agencies. Version 1.0. May 1999. Online. Available: http://www.naa.gov.au/govserv/techpub/rkms/intro.htm
(9) RLG Working Group on the Preservation Uses of Metadata. Final Report. May 1998. Online. Available: http://www.rlg.org./preserv/presmeta.html. 8 October 1999.
10
Recommended Elements
Element Name 1. Persistent Identifier - type and identifier
Definition An identifier or 'permanent name' for an object that identifies it uniquely and persistently, and enables links to different manifestations of it, to metadata about it, and to other objects related to it.
Rationale Each object described must have a persistent identifier to identify it uniquely, to discriminate between different manifestations of it and to link it with its metadata record.
LEVEL COLLECTION OBJECT FILE
Scope
Unique Identifier can be used to define collection if description exists at that level.
Unique Identifier must be used to define object.
Unique Identifier may be used to define file if this is different from the object. It is not necessary for an object with only one file.
Examples
1. Handle: loc.ndlp.amrlp/3a161162
2. URN:NBN:fi-fe19981122
Repeatable Yes Yes Yes
Obligation Essential Essential Essential
Remarks This metadata set permits any scheme for Unique Identifier in use by the agency
Element Name 2. Date of Creation
Rationale The date, in combination with other metadata elements, provides evidence of an object's authenticity and provenance.
LEVEL COLLECTION OBJECT FILE
Scope If applicable, date that this instance of a collection came into being. May be a start date or a range of dates.
Date that this manifestation of the object came into being.
Date that this manifestation of the file came into being.
Examples 20010815
Repeatable No No No
Obligation Essential Essential Essential
Remarks Other dates will be recorded under appropriate elements.
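The example value for this element is an eight-digit string. Assuming it is meant as YYYYMMDD (the set does not say so explicitly), a system delivering this element could check values with a sketch like the following. The function name is hypothetical.

```python
from datetime import date, datetime


def parse_creation_date(value: str) -> date:
    """Parse an eight-digit YYYYMMDD string; raise ValueError if it is not a real date."""
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return datetime.strptime(value, "%Y%m%d").date()


print(parse_creation_date("20010815").isoformat())  # 2001-08-15
```

A collection-level range of dates would need a separate convention, which this sketch does not attempt.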
Element Name 3. Structural Type
Definition The type of object or collection being described using one of the following categories: Image, Sound, Video, Text, Database, Software, or, where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.
Rationale Choice of appropriate preservation strategy depends on knowing structural type.
LEVEL COLLECTION OBJECT FILE
Scope Collection Structural Type describes the collection using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the documents in the collection comprise more than one form, Web Document, or Multi-media. This list is extensible to accommodate new formats.
Object Structural Type describes the object using one of the following categories: Image, Sound, Video, Text, Database, Software, or where the object comprises more than one form, Web Document or Multi-media. This list is extensible to accommodate new formats.
Examples Example 1: 52 images Example 2: various (where the collection contains documents in a number of different formats.)
Example 1: Video Example 2: Web Document
Repeatable No No
Obligation Essential Essential
Remarks Many complex documents will require multiple descriptions in 5. File Description.
Element Name 4. Technical Infrastructure of Complex Object
Definition The over-arching technical infrastructure of a complex object.
Rationale Managing preservation will require managing the structure of complex objects as well as their components.
LEVEL COLLECTION OBJECT FILE
Scope
Not relevant at collection level.
Describe the technical aspects of a complex object. This may include format of a Web page, or a CD-ROM. It will also include the total number of files and total of each type of file in the complex object. If the object comprises a single file, or a collection of files with no functional relationship beyond being described as a collection, then this field is not used.
Examples
Example 1: CD-ROM containing 22 files - 14 .gif image files, 3 .wav audio files, 3 .txt files and 2 .exe executables assembled in accordance with ISO 9660.
Example 2: Access database containing 1 .mdb file.
Repeatable No
Obligation Essential
Remarks
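The file totals asked for in this element could be derived automatically from a file listing. The Python sketch below counts files by extension; the function name and the file names are invented, and the counts are chosen only to resemble the CD-ROM example above. Counting by extension is an assumption of the sketch, not a method prescribed by the set.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable


def summarise_files(paths: Iterable[str]) -> str:
    """Return the total number of files and the number of each extension."""
    counts = Counter(PurePosixPath(p).suffix.lower() or "(no extension)" for p in paths)
    total = sum(counts.values())
    parts = ", ".join(f"{n} {ext}" for ext, n in sorted(counts.items()))
    return f"{total} files: {parts}"


# Invented listing for a complex object.
paths = (
    [f"img/{i}.gif" for i in range(14)]
    + [f"snd/{i}.wav" for i in range(3)]
    + [f"doc/{i}.txt" for i in range(3)]
    + ["setup.exe", "viewer.exe"]
)
print(summarise_files(paths))  # 22 files: 2 .exe, 14 .gif, 3 .txt, 3 .wav
```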
Element Name 5. File Description
Remarks We have not yet ascertained whether these headings will accommodate all components, and will continue work to test them. MIME types could be used to automatically populate the fields. We anticipate, however, that some files would be wrongly labelled by this approach. We welcome ideas both on the completeness of our descriptive fields and the processes by which they could be populated.
Obligation Essential if applicable
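The remark about MIME types can be made concrete with a short Python sketch using the standard `mimetypes` module, which guesses a type from the file name alone. The mapping from MIME major type to the structural types of element 3 is our own assumption for illustration, as are the function name and the sample file names.

```python
import mimetypes

# Assumed mapping from MIME major type to the structural types of element 3.
MAJOR_TYPE_TO_STRUCTURAL_TYPE = {
    "image": "Image",
    "audio": "Sound",
    "video": "Video",
    "text": "Text",
}


def guess_structural_type(filename: str) -> str:
    """Guess a structural type from the file name; 'Unknown' when no guess is possible."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return "Unknown"
    major = mime_type.split("/", 1)[0]
    return MAJOR_TYPE_TO_STRUCTURAL_TYPE.get(major, "Unknown")


for name in ("scan.tif", "interview.wav", "notes.txt", "mystery.zzz"):
    print(name, "->", guess_structural_type(name))
```

Because the guess rests only on the extension, a renamed or unusually named file is labelled wrongly or not at all, which is the risk anticipated in the remark. An HTML page, for instance, would be reported as Text rather than Web Document.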
The table below provides a comparison of sub-elements between file types.
5.1 Image
5.1.1 Image Format and Version
Definition: The file type and version.
Examples: TIFF v 4.0
5.1.2 Image Resolution
Definition: The spatial resolution of the image, expressed as pixels per inch or cm (ppi, p/cm) or dots per inch or cm (dpi, d/cm).
Examples: 600 dpi; 320 dpi, 1500 d/cm
5.1.3 Image Dimensions
Definition: The number of pixels along the vertical and horizontal dimensions
Examples: 4096 x 6144 pixels
5.1.4 Image Tonal Resolution
Definition: Bit depth of each pixel, and whether multiple bits convey grey tones or colour
Examples: 1-bit; 8-bit greyscale; 24-bit colour
5.1.6 Image Colour Space
Definition: The colour space used for the image.
Examples: CMYK; RGB
5.1.7 Image Colour Management
Definition: Any system used to improve consistency of colour across capture, display and output of image.
Examples: PhotoCD; OptiCal; Profile/80; Softproof (Photoshop plug-in)
5.1.8 Image Colour Lookup Table
Definition: Location and encoding for any CLUT used to map from low to high colour depth.
Examples: Resident (if CLUT inside image file), Base64 (if CLUT binary encoded)
5.1.9 Image Orientation
Examples: 000 (ie top of image is correctly oriented);
090 (ie top of image is 90 degrees clockwise from where it should be)
5.1.10 Compression Definition: The type and level of compression.
Examples: CCITT Group 4
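Image Dimensions and Image Tonal Resolution together determine the uncompressed size of an image, which is useful when recording storage size or memory requirements for an uncompressed file. A minimal Python sketch of the arithmetic follows, using the example values given above. The function name is hypothetical and the calculation ignores file headers and any padding a particular format may add.

```python
def uncompressed_image_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Raw pixel data size in bytes, rounded up to a whole byte."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


# 4096 x 6144 pixels at 24-bit colour and at 1-bit.
print(uncompressed_image_bytes(4096, 6144, 24))  # 75497472 (72 MiB)
print(uncompressed_image_bytes(4096, 6144, 1))   # 3145728 (3 MiB)
```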
5.2 Audio
5.2.1 Audio Format and Version
Definition: The file type and version.
Examples: AIFF interleaved
5.2.2 Audio Resolution
Definition: The sampling frequency in kHz
Examples: 44.1kHz; 96kHz
5.2.3 Duration Definition: The length of the audio recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20
5.2.4 Audio Bit Rate Definition: Word length used to encode the audio. Consequently an indication of dynamic range.
Examples: 16 bit, 24 bit.
5.2.5 Compression Definition: The type and level of compression (note: audio compression, or bit rate reduction, is a non-reversible, "lossy" process)
Examples: MPEG 3
5.2.6 Encapsulation Definition: The delivery format and version.
Examples: Real Audio II
5.2.7 Track Number and Type
Definition: The number of tracks and how they are related to each other.
Examples: 1. 2 track Stereo
2. Single Track
3. 5 channel surround
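The audio sub-elements above are sufficient to estimate the size of an uncompressed recording. The Python sketch below combines the example values (44.1 kHz, 16 bit, 2 track stereo, 67 minutes 12 seconds). It assumes linear PCM with every track sampled at the same rate and word length, and it ignores file headers; the function name is hypothetical.

```python
def uncompressed_audio_bytes(sample_rate_hz: int, bits_per_sample: int,
                             tracks: int, duration_s: int) -> int:
    """Raw audio data size in bytes for uncompressed (linear PCM) audio."""
    return sample_rate_hz * bits_per_sample * tracks * duration_s // 8


duration_s = 67 * 60 + 12  # 67 minutes 12 seconds
print(uncompressed_audio_bytes(44_100, 16, 2, duration_s))  # 711244800
```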
5.3 Video
5.3.1 Video File Format and Version
Definition: The file type and version
Examples: Quicktime version 1.1
5.3.2 Frame Dimensions
Definition: The resolution in pixels of a single still frame
Examples: 640 pixels x 480 pixels
5.3.3 Duration Definition: The length of the video recording in minutes and seconds, or minutes, seconds, 100ths of seconds, and frames.
Examples: 67 minutes 12 seconds; 03:12:24:20
5.3.4 Frame Rate Definition: The standard frame rate per second of the video material
Examples: 25 fps
5.3.5 Compression Definition: The type and level of compression (note: video compression, or bit rate reduction, is a non-reversible, "lossy" process)
Examples: MPEG 3
5.3.6 Video Encoding Structure
Definition: The type of encoding structure and version
Examples: Mpeg 3
Remark: It is possible for MPEG to be both encapsulation or delivery format and file type.
5.3.7 Video Sound Definition: The sound parameters where they are incorporated into a single video file structure. May include all fields specified in audio.
5.4 Text
5.4.1 Text Format and Version
Definition: The file type and version.
Examples: MS Word 97
5.4.2 Compression Definition: The type and level of compression.
Examples: .zip file
5.4.3 Text Character Set
Definition: The character set used in the document
5.4.4 Text Associated DTD
Definition: Name of the Document Type Definition applied to the structured text
Examples: EAD
5.4.5 Text Structural Divisions.
Definition: The logical divisions in a structured text file
Examples: TEI element DIVn used
5.5 Database
5.5.1 Database Format and Version
Definition: The file type and version.
Examples: MS Access 3.1
5.5.2 Compression Definition: The type and level of compression.
Examples: .zip file
5.5.3 Datatype and Representation category
Definition: Type of symbol, character or other designation used to represent a data element found in a database and the type of values used to represent it. May be general description of symbols or characters found in the database, or be specific to database elements.
Examples 1: Alphanumeric characters and graphical image. 2. The database element known as "xxx1" contains alphanumeric characters, the database element known as "xxx2" contains Graphical images.
5.5.4 Representation Form and Layout
Definition: Name or description of the form of representation for the data element and the layout of the characters that represent it (as appropriate). May be general description of form of representation found in the database, or be specific to database elements.
Examples 1. Text:Alphabetic, code:numeric, quantitative value:currency$$,$$$.99, date:yyyy:mm:dd.
2. The database element known as "xxx1" contains date:yyyy:mm:dd, the database element known as "xxx2" contains a quantitative value:numericNNNN.NN, the database element known as "xxx3" contains a quantitative value:currency$$,$$$.99
5.5.5 Maximum size of data element values
Definition: The maximum number of data units (eg characters) of the corresponding datatype.
maximum character count of 9.
5.5.6 Minimum size of data element values
Definition: The minimum number of data units (eg characters) of the corresponding datatype that would be present if data has been entered.
Examples: The database element known as "xxx1" (date) has a minimum character count of 8.
5.6 Executables
Remark: These are the executable components of a complex object, such as a CD-ROM or Web document. These executables perform certain operations within the digital object. They are not the software stated in system requirements, though they may be supported by it.
5.6.1 Code Type and Version
Definition: The code type used to compile the executable and version.
Examples: 1. Compiled using Intel code executable for Windows 95 environment
2. Compiled using Perl script
3. Java version 1.2
Element Name 6. Known System Requirements
Definition The system or software necessary to access the information in the object or to use it. May describe the range of systems on which the object will operate, or the earliest version if the object continues to be compatible with newer versions. May also describe system requirements or plug-ins for operation, or memory requirements for an uncompressed file. Should state whether the requirements are preferred or mandatory.
Rationale Needed to manage requirements for accessing and operating digital objects.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Describes the system or software necessary to access the information in the object or to use it.
If appropriate, may be described at this level.
better mandatory 2. PC, Windows 3.1 to Windows 98 3. Pentium 200 or better mandatory, Netscape Navigator v 4.0 with preferred. 4. Windows 95 Netscape Navigator v 4.0 with WinZip and 'x' plug-ins.
5. Java Virtual Machine.Real Audio G2 or better
Repeatable Yes Yes Yes
Obligation Desirable if useful Essential Essential if applicable
Remarks
Element Name 7. Installation Requirements
Definition Any specialised procedures needed to install an object.
Rationale To enable access to objects with special installation requirements.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Record any additional specific instructions on passwords, how to start the program, etc.
May be described at this level, eg an executable file in an object
Examples
Use password [xxxxxxxxx]
1. Copy files to A drive
2. Copy to C drive and click on icon
This file needs to be copied into a separate directory
Repeatable Yes Yes Yes
Remarks This information will be particularly useful when undertaking future migrations.
Element Name 8. Storage Information
Definition Storage capacity for objects and details of the storage system, including physical format.
Rationale May help in planning preservation action relevant to particular carriers and storage systems.
LEVEL COLLECTION OBJECT FILE
Scope Storage size and system/carrier for collection
Storage size and system/carrier for object
Storage size and system/carrier for file
Examples 1. 3.8 Gb on IBM digital library
1. 1.3 Mb on CD
1. 500kb on exabyte tape
Repeatable No No No
Obligation Desirable Desirable Desirable
Remarks May record compressed or uncompressed size, as applicable, and should indicate which.
Element Name 9. Access Inhibitors
Definition Any method used to inhibit access, which would impact on preservation procedures, such as encryption or watermarking.
Rationale Without this information, the object may not be able to be accessed, copied or migrated.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
If useful may be summarised at this level.
Examples
Use password [xxxxxxxx]
Associated dongle required.
1. Watermark by Digimarc Professional
2. Watermark by Invisible Ink for Images, embedded before acquisition.
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks Dongle may be more appropriately described under 6. Known System Requirements
Element Name 10. Finding and Searching Aids, and Access Facilitators
Definition Any system or method used to enhance access to information within the digital object, which needs to be maintained in successive generations.
Rationale To enable the aids and facilitators to be taken into account in any preservation process.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Describes systems or methods at object level.
Not described at this level.
Examples
1. CD type ID points linked to file
2. Video and text time code linked.
Repeatable Yes
Obligation Essential if applicable
Element Name 11. Preservation Action Permission
Definition A statement of whether or not permission is held to create copies of the object for preservation purposes.
Rationale To record information about whether permission to copy for preservation is held by the agency, to facilitate management of preservation action.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Describes whether permission is held. Where permission is held, records date and who gave permission. Where permission is not held, records detail of negotiation status and date.
Describe at this level if different from object.
Examples
Redhead Publications granted permission 19991005
Permission to copy URN:NBN:au:nla:nph-arch/1999/Q1999-Feb-1//http://www.lib.latrobe.edu.au/AHR/archive/Issue-December-1998/smith.html withheld by the author
Repeatable No No No
Obligation Desirable Desirable Desirable
Remarks The need for this information may be influenced by provisions in relevant legal deposit legislation.
Element Name 12. Validation
Rationale To verify authenticity and to provide information for decision making on preservation pathways.
LEVEL COLLECTION OBJECT FILE
Scope Describe at appropriate level.
Describe at appropriate level.
Describes validation mechanism
Examples
1. Standard Internet checksum applied by publisher
2. Roland checksum applied by NLA 19991912
Repeatable Yes Yes Yes
Obligation Desirable Desirable Desirable
Remarks We are not sure whether this should be recorded in a separate element or whether it should be recorded under 23. Process
This is a mechanism, usually consisting of a number, that allows one to verify that an electronically transmitted file is what it purports to be, ie, the file is what is described in the metadata. At the simplest level, such a key might consist of the number of lines in a file (similar to the way that one indicates the number of pages that are transmitted via fax.) Or it might consist of a checksum which is an algorithm based on a manipulation of the sum of the bits that make up a file to yield a number that serves as a unique identifier for that file.
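The validation mechanism described here can be sketched in Python. The sketch does not implement either of the checksums named in the examples; it substitutes a SHA-256 digest from the standard `hashlib` module purely to show the principle of recording a value when a file is archived and recomputing it later. The function names and the sample bytes are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded in the metadata."""
    return sha256_digest(data) == recorded_digest


original = b"archived copy of an online publication"
recorded = sha256_digest(original)             # stored with the metadata record

print(is_unchanged(original, recorded))        # True
print(is_unchanged(original + b"!", recorded)) # False
```

A matching value shows only that the bits have not changed since the value was recorded; it says nothing about whether the file was correct at that time. A simple checksum, unlike a cryptographic digest, may also give the same value for different files.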
Element Name 13. Relationships
Definition Relationships between this manifestation and other objects necessary for preservation management.
Rationale To enable an object to be linked to its metadata, to earlier or later manifestations of it, other forms of it, and other objects, including finding aids. It is essential to maintaining a history of the change of an object by linking to the metadata of earlier manifestations, including that of the source object.
COLLECTION Describes links relevant to a collection.
1. Linked to previous manifestation in a migration sequence, eg, was migrated from [Unique Identifier and unique identifier type]
2. Linked to following manifestation in a migration sequence, eg, was migrated to [Unique Identifier and Unique Identifier type]
3. Contains the lower component (must be repeatable) eg contains [Unique Identifier and Unique Identifier type]
4. Relation to the primary instance of the collection, eg. This is the 5th generation copy of [Unique Identifier and Unique Identifier type]
5. Link to Preservation Master (if it exists), eg. Linked to [Unique Identifier and unique identifier type of preservation master]
6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and Unique Identifier type of duplication master]
7. Link to finding aid, eg. Linked to [Unique Identifier and Unique Identifier type]
Repeatable Yes
Obligation Essential if applicable
OBJECT Describes links relevant to an object.
1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]
2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]
3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]
4. Contains the lower component (must be repeatable) eg contains [Unique Identifier and unique identifier type]
5. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].
6. Related to accompanying material, eg accompanied by book [call number]
7. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]
8. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]
9. Linked to a previous object in a sequence in a periodic capture process, eg sequential copies of a web page.
10. Linked to a previous object in a sequence related to content, eg page in a book
11. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.
12. Linked to a following object in a sequence related to content, eg page in a book
13. Number in sequence and number of total in the sequence eg, 3 of 54.
14. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].
15. Linked to a database specification in accordance with ISO 11179.
Repeatable Yes
Obligation Essential if applicable
FILE Describes links relevant to a file.
1. Linked to previous in a migration sequence, eg was migrated from [Unique Identifier and unique identifier type]
2. Linked to following in a migration sequence, eg was migrated to [Unique Identifier and unique identifier type]
3. Is a part of a higher aggregation, eg part of [collection unique Identifier and unique identifier type]
4. Relation to the primary instance of the collection, eg this is the 5th generation copy of [unique identifier of primary instance and unique identifier type].
5. Link to Preservation Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of preservation master]
6. Link to Duplication Master (if it exists), eg Linked to [Unique Identifier and unique identifier type of duplication master]
7. Linked to a previous object in a sequence in a periodic capture process, eg sequential copies of a web page.
8. Linked to a previous object in a sequence related to content, eg page in a book
9. Linked to a following object in a sequence in a periodic capture process, eg sequential copies of a web page.
10. Linked to a following object in a sequence related to content, eg page in a book
11. Number in sequence and number of total in the sequence eg, 3 of 54.
12. Linked to items derived from the same instance, eg high definition copy available at [Unique Identifier and unique identifier type].
13. Linked to a database specification in accordance with ISO 11179.
Repeatable Yes
Obligation Essential if applicable
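As an illustration only, the migration links listed above could be held as simple records and followed back to the earliest known instance. The sketch below is in Python; the class name, field names and identifier values are hypothetical and are not part of the element set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# An identifier is a (value, type) pair, mirroring
# "[Unique Identifier and unique identifier type]" in the examples.
Identifier = Tuple[str, str]


@dataclass
class FileRecord:
    identifier: Identifier
    # Link to the previous instance in a migration sequence ("was migrated from").
    migrated_from: Optional[Identifier] = None
    # Number in sequence and total, eg (3, 54) for "3 of 54".
    sequence: Optional[Tuple[int, int]] = None


def migration_history(records: Dict[Identifier, FileRecord],
                      start: Identifier) -> List[Identifier]:
    """Follow 'was migrated from' links from start back to the earliest known instance."""
    chain: List[Identifier] = []
    seen = set()
    current: Optional[Identifier] = start
    # The 'seen' set guards against a link loop in badly formed metadata.
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        record = records.get(current)
        current = record.migrated_from if record else None
    return chain
```

Storing only the backward link is a simplification; the element set also records the forward link ("was migrated to"), which could be derived from the same records.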
Element Name 14. Quirks
Definition Any characteristic that may appear as a loss in functionality or change in the look and feel of a collection, object or file. May describe quirks or provide links to quirks. Includes only descriptions of quirks that are relevant to the use of the current instance. Should include any relevant dates.
Rationale To help preservation managers assess the success or otherwise of preservation strategies, and to prevent time being spent trying to solve problems that were inherent in the object at the time the strategy was applied. This element documents changes that occur as a result of digitisation, duplication or migration, as well as those that might be inherent in the source document.
LEVEL COLLECTION OBJECT FILE
Scope If useful, quirks at the object or file levels may be summarised at collection level.
Describes quirks at the object level.
Describes quirks at the file level.
Examples
1. For all Web documents in the collection produced prior to HTML 4, the text format tag is no longer supported.
1. The Shockwave files could not be captured from the source document.
1. The text format tag is no longer supported by many browsers due to changes in HTML 4.
2. In the transfer from the previous format, the functionality of the mpeg video was impaired.
3. The original printed item contains high levels of bleed through, which degrades the image quality.
Repeatable Yes Yes Yes
Remarks
Element Name 15. Archiving Decision (work)
Definition The decision whether this work should be archived and the date of that decision. This field may also include a retention period or review date.
Rationale This information contributes to the preservation history of the work and facilitates future decision making.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples
Hansen Collection of digitised images to be archived. Date of Decision: 19990321 [yyyymmdd]
Australian Humanities Review to be archived. Date of Decision: 19991013 [yyyymmdd], Date of Review: 20011013 [yyyymmdd]
Repeatable No No
Obligation Essential Essential
Remarks
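The decision and review dates in the examples above use the yyyymmdd form. A minimal sketch, assuming that form is applied consistently, shows how such dates might be checked and how a review date could be tested against the current date. The class and function names are hypothetical and are offered only as an illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(text: str) -> date:
    """Parse an eight-digit yyyymmdd string; reject anything else."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected yyyymmdd, got {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()


@dataclass
class ArchivingDecision:
    work: str
    archive: bool                  # True if the work is to be archived
    decided: date                  # Date of Decision
    review: Optional[date] = None  # Date of Review, if one was set

    def review_due(self, today: date) -> bool:
        """True once the review date, if any, has been reached."""
        return self.review is not None and today >= self.review


# Values taken from the example for Australian Humanities Review.
decision = ArchivingDecision(
    work="Australian Humanities Review",
    archive=True,
    decided=parse_yyyymmdd("19991013"),
    review=parse_yyyymmdd("20011013"),
)
```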
Element Name 16. Decision Reason (work)
Definition Why the decision to archive the work (or not) was made.
Rationale This information contributes to the preservation history of the object and facilitates future decision making.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples
1. Source images on glass negative are very fragile and are not available for research purposes.
1. Conforms to PANDORA selection guidelines [version and date yyyymmdd]
2. National Archives of Australia Disposal Authority reference number
Repeatable No No
Obligation Essential Essential
Remarks
Element Name 17. Institution Responsible for Archiving Decision (work)
Definition The name of the agency responsible for the decision that this work should be archived.
Rationale In a distributed archiving model, the agency making the archiving decision may be different from the one actually archiving the object.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples State Library of Victoria
State Library of Victoria
Obligation Essential Essential
Remarks Responsibility at the manifestation level is described separately at Elements 18-20.
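Elements 15 to 17 are each marked Essential at the collection and object levels and are not described at the file level. One way such obligations might be checked when a record is entered is sketched below in Python. The dictionary keys and the function name are hypothetical; only the obligation values come from the tables above.

```python
from typing import Dict, List

# Obligation by level for the work-level decision elements (15 to 17).
# The file level is omitted because these elements are not described there.
OBLIGATIONS: Dict[str, Dict[str, str]] = {
    "archiving_decision_work": {"collection": "Essential", "object": "Essential"},
    "decision_reason_work": {"collection": "Essential", "object": "Essential"},
    "institution_responsible_work": {"collection": "Essential", "object": "Essential"},
}


def missing_essential(level: str, record: Dict[str, str]) -> List[str]:
    """Return the essential elements that are absent or empty at the given level."""
    return [
        name
        for name, by_level in OBLIGATIONS.items()
        if by_level.get(level) == "Essential" and not record.get(name)
    ]
```

A record that names the responsible institution but omits the decision and its reason would be reported as missing those two elements at the object level, and as missing nothing at the file level.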
Element Name 18. Archiving Decision (manifestation)
Definition The decision whether this manifestation should be archived/retained and date of that decision. This field may also include a retention period or review date.
Rationale This information facilitates decision-making about the particular manifestation, recognising that while some manifestations of a work may be retained indefinitely, other manifestations may not.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples
1. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Do not retain. Date of Review: 19991013 [yyyymmdd]
2. Decision: To be archived. Date of Decision: 19971013 [yyyymmdd]. Review Decision: Retain indefinitely. Date of Review: 19991013 [yyyymmdd]
Repeatable Yes
Remarks
Element Name 19. Decision Reason (manifestation)
Definition Why the decision to archive/retain the manifestation (or not) was made.
Rationale This information contributes to the preservation history of the work and facilitates future decision-making about the manifestation. Although the work itself may be required for permanent retention, a particular manifestation may be redundant in the archive.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples
1. Manifestation has hit a migration dead end. Future migrations will be done from an earlier manifestation.
2. Source manifestation - retain indefinitely
Repeatable Yes
Obligation Essential
Remarks
Element Name 20. Institution Responsible for Archiving Decision (manifestation)
Rationale In a distributed archiving model, the agency making the decision about archiving or retention may be different from the one actually archiving the object.
LEVEL COLLECTION OBJECT FILE
Scope Decision may be taken and described at this level, or may be summarised at this level.
Decision may be taken and described at this level, or may be summarised at this level.
Not described at this level.
Examples State Library of Victoria
State Library of Victoria
Repeatable No No
Obligation Essential Essential
Remarks
Element Name 21. Intention Type
Definition The intended use of a particular manifestation.
Rationale Provides information necessary to manage various copies of an object.
LEVEL COLLECTION OBJECT FILE
Scope
Not described at this level.
Describes the intended use of the particular manifestation.
Not described at this level.
Examples
1. Preservation master
2. Access copy
Repeatable No
Remarks
Element Name 22. Institution with preservation responsibility
Definition The name of the agency that has accepted responsibility for preservation. Should include date of commencement of acceptance of responsibility, or range of dates of responsibility.
Rationale Attributes responsibility and provides information for allocation of resources and prevention of unwanted duplication. May be different from the agency selecting and the agency actively carrying out processes.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be described or summarised at this level. Records the name of the agency responsible for preservation of this collection and the relevant dates.
Records the name of the agency responsible for the preservation of this object and the relevant dates.
Not described at this level.
Examples National Library of Australia, 1 July 2000 -
National Library of Australia, 1 July 2000 -
Repeatable Yes Yes
Obligation Essential Essential
Remarks Primary level of description is the object. If useful, may be described or summarised at collection level. Information about responsibility should be available at all levels, even if input only at object level.
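Because this element is repeatable and carries a date or range of dates, a system could answer the question "who was responsible on a given day?" directly from the records. The following Python sketch assumes an open-ended range is stored with no end date, as in the example "1 July 2000 -". The names used are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Responsibility:
    agency: str
    start: date                 # date responsibility was accepted
    end: Optional[date] = None  # None means the responsibility is continuing

    def covers(self, day: date) -> bool:
        """True if the agency held responsibility on the given day."""
        return self.start <= day and (self.end is None or day <= self.end)


def responsible_agencies(entries: List[Responsibility], day: date) -> List[str]:
    """List the agencies holding preservation responsibility on the given day."""
    return [entry.agency for entry in entries if entry.covers(day)]


# The example from the table: National Library of Australia, 1 July 2000 -
entries = [Responsibility("National Library of Australia", date(2000, 7, 1))]
```

Returning a list rather than a single name allows for overlapping periods, which may indicate the unwanted duplication mentioned in the rationale.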
Element Name 23. Process
responsible agencies or persons.
Rationale This element documents what has happened to a particular manifestation of an object. The series of linked records pertaining to manifestations of an object builds up a change history over time. This information is essential to document what preservation methods have been applied to the object and how the various manifestations might differ from each other.
LEVEL COLLECTION OBJECT FILE
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks The entire element, including sub-elements, must be repeatable.
Sub-elements 23.1 Description of Process
23.2 Name of the Agency Responsible for the Process
23.3 Critical Hardware Used in the Process
23.4 Critical Software Used in the Process
23.5 How Process was Carried Out
23.6 Guidelines Specified to Implement Process
23.7 Date and time
23.8 Result
23.9 Process Rationale
23.10 Changes
23.11 Other
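To make the structure concrete, the Process element and its sub-elements could be modelled as a repeatable record attached to each manifestation, from which a change history can be listed in date order. This is a sketch only, written in Python. The class and field names are hypothetical, and the date is assumed to be held as a yyyymmdd string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessRecord:
    name: str                                                # 23.1
    agencies: List[str] = field(default_factory=list)        # 23.2
    hardware: List[str] = field(default_factory=list)        # 23.3
    software: List[str] = field(default_factory=list)        # 23.4
    how: List[str] = field(default_factory=list)             # 23.5
    specifications: List[str] = field(default_factory=list)  # 23.6
    date_time: Optional[str] = None                          # 23.7, yyyymmdd assumed
    results: List[str] = field(default_factory=list)         # 23.8
    rationale: List[str] = field(default_factory=list)       # 23.9
    changes: List[str] = field(default_factory=list)         # 23.10
    other: List[str] = field(default_factory=list)           # 23.11


@dataclass
class Manifestation:
    identifier: str
    # The whole Process element is repeatable, so it is held as a list.
    processes: List[ProcessRecord] = field(default_factory=list)

    def change_history(self) -> List[str]:
        """Return dated processes in chronological order, oldest first."""
        dated = [p for p in self.processes if p.date_time]
        # yyyymmdd strings sort chronologically when compared as text.
        dated.sort(key=lambda p: p.date_time)
        return [f"{p.date_time}: {p.name}" for p in dated]
```

Sub-elements marked as repeatable in the tables are held as lists here; the name and the date, which are not repeatable, are single values.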
Sub-element Name 23.1 Name of the Process
Definition Name of the process applied.
Rationale To record what process was applied
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Describes the process applied.
Examples
1. Move from UNIX to Solaris platform
1. Copy from floppy disk to CD-R
2. Copy from publishers' Web site to archive
1. Conversion of .wav to .aiff
Repeatable No No No
Obligation Essential Essential Essential
Remarks
Sub-element Name 23.2 Agency
Definition The name of the agency responsible for the process.
Rationale Track responsibility for changes to the collection, object or file.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Records the name of the agency responsible for the process applied.
Records the name of the agency responsible for the process applied.
Examples 1. Migration Unlimited (commercial firm to whom migration has been outsourced)
1. Migration Unlimited (commercial firm to whom migration has been outsourced)
1. Migration Unlimited (commercial firm to whom migration has been outsourced)
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Sub-element Name 23.3 Critical Hardware
Definition Itemisation of critical hardware used in the process.
Rationale Track equipment used to make changes to collection, object or file.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Records the critical hardware used in the applied process.
Records the critical hardware used in the applied process.
Examples 1. [Particular brand and model of analogue to digital converter]
2. [Particular brand and model of digital camera]
1. [Particular brand and model of analogue to digital converter]
2. [Particular brand and model of digital camera]
1. [Particular brand and model of analogue to digital converter]
2. [Particular brand and model of digital camera]
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Sub-element Name 23.4 Critical Software
Definition Itemisation of critical software used in the process.
Rationale Track software used to make changes to collection, object and file.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Records the critical software used in the applied process.
Records the critical software used in the applied process
Examples 1. Gathered using Harvest version 2.2
1. Gathered using Harvest version 2.2
1. File save, using Netscape
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Sub-element Name 23.5 How Process was Carried Out
Definition Description of significant steps involved in the process.
Rationale To understand the details of the process.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Records how the process was carried out
Records how the process was carried out
Examples 1. The relevant files were identified and batch scanned with the OCR option turned off
2. [Image colour bar, Image technical targets, Image colour profile for scanner and/or image light source used]
1. The relevant files were identified and batch scanned with the OCR option turned off
2. [Image colour bar, Image technical targets, Image colour profile for scanner and/or image light source used]
1. File was scanned with batch 69 with OCR option turned on.
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Sub-element Name 23.6 Specifications
Definition Specification or guidelines used to implement the process.
applied.
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Describes or supplies guidelines specified to implement process.
Describes or supplies guidelines specified to implement process
Examples 1. The standards for transfer were specified in contract [reference number]
2. Scanned in accordance with [xyz] scanning specifications
1. The standards for transfer were specified in contract [reference number]
2. Scanned in accordance with [xyz] scanning specifications
1. The standards for transfer were specified in contract [reference number]
2. Scanned in accordance with [xyz] scanning specifications
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks Could include link to specifications.
Sub-element Name 23.7 Date and time
Definition Date and time of process
Rationale To identify sequence of processes and provide a record of dates significant to the history of the collection, object or file.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Records the date, or range of dates and time, if relevant, of process being carried out
Repeatable No No No
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks If other dates are required that cannot be expressed as a range, then a new process description is required.
Sub-element Name 23.8 Result
Definition Notes success or otherwise of process
Rationale To record the outcome of quality control assessment
LEVEL COLLECTION OBJECT FILE
Scope If useful, may be summarised at this level.
Records success or otherwise of the process
Records success or otherwise of the process
Examples 1. All files converted successfully
2. All files converted; however, data was lost from the title header.
1. All files converted successfully
2. All files converted; however, data was lost from the title header.
1. File successfully converted
2. File converted; however, data was lost from the header
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks There may be some overlap with 14. Quirks
Sub-element Name 23.9 Process Rationale
Definition The reason for applying the process.
Rationale To understand the objectives of the process.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Describes the objectives for the application of the process to the object.
Describes the objectives for the application of the process to the file.
Examples 1. The object was converted to PDF to provide online access
1. The object was converted to PDF to provide online access
1. The file was converted to PDF to provide online access
Repeatable Yes Yes Yes
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks
Sub-element Name 23.10 Changes
Definition The changes that were made to the collection, object or file by the process.
Rationale To record the changes for the preservation history of the collection, object or file.
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Describes the changes to the object that resulted from the process
Describes the changes to the file that resulted from the process
Examples
1. File names were modified to display through nph-arch program.
2. Data now stored in standard Word 97 format
1. File names were modified to display through nph-arch program.
2. Data now stored in standard Word 97 format
1. File names were modified to display through nph-arch program.
2. Data now stored in standard Word 97 format
Obligation Essential if applicable Essential if applicable Essential if applicable
Remarks We are not sure to what extent this element overlaps with other elements.
Sub-element Name 23.11 Other
Definition Any other information about the process that may be useful.
Rationale To cover anything that may not fit into other sub-elements
LEVEL COLLECTION OBJECT FILE
Scope
If useful, may be summarised at this level.
Records any other relevant information
Records any other relevant information
Examples
Repeatable Yes Yes Yes
Obligation Undesirable :) Undesirable :) Undesirable :)
Remarks
Element Name 24. Record Creator
Definition The name of the institution and the names of individuals who have contributed data to this record.
Rationale To record responsibility for the metadata.
LEVEL COLLECTION OBJECT FILE
Scope Records names of agency and individual
Records names of agency and individual
Records names of agency and individual
Examples 1. National Library of Australia
1. National Library of Australia
2. Colin Webb 2. Deb Woodyard 2. Margaret Phillips
Repeatable Yes Yes Yes
Obligation Desirable Desirable Desirable
Remarks System-generated log would be one way of recording this information.
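The remark above suggests a system-generated log. One possible reading, sketched below in Python, is that every update to a metadata record also notes the institution and individual who made it, so that the Record Creator element is filled in without separate data entry. The class and method names are hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MetadataRecord:
    identifier: str
    values: Dict[str, str] = field(default_factory=dict)
    # Each creator is an (institution, individual) pair, listed once.
    creators: List[Tuple[str, str]] = field(default_factory=list)

    def update(self, institution: str, person: str, element: str, value: str) -> None:
        """Store a value and log who contributed it."""
        self.values[element] = value
        entry = (institution, person)
        if entry not in self.creators:
            self.creators.append(entry)
```

Here repeated updates by the same person are logged once, which keeps the list of contributors short; a fuller log would also keep the date of each change.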
Element Name 25. Other
Definition Any other information relevant to the preservation of the collection, object or file.
Rationale To cover anything that may not fit into other elements.
LEVEL COLLECTION OBJECT FILE
Scope Records any relevant information about collection
Records any relevant information about object
Records any relevant information about file
Examples
Repeatable Yes Yes Yes
Obligation Undesirable :) Undesirable :) Undesirable :)