
The role of metadata

In the document Preserving Digital Materials (pages 100-104)

Metadata (‘structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource’ (NISO, 2004, p.1)) has captured the attention of many. A considerable amount of energy has been channeled into its development, and many words have been written about it. A subset of this activity is concerned with preservation metadata. Preservation metadata is now considered an integral part of the strategies required for long-term maintenance of and access to digital materials, although some (such as Chivers and Feather, 1998, and Lazinger, 2001) claim for it an even more dominant role, suggesting it to be almost synonymous with digital preservation. Day notes that metadata is an intrinsic part of the current key digital preservation strategies of emulation, migration, and encapsulation (Day, 2004, p.255). The UNESCO Guidelines reinforce the essential nature of preservation metadata, listing among their fundamental principles four relating to metadata:

17. Digital heritage materials must be uniquely identified, and described using appropriate metadata for resource discovery, management and preservation.

18. Taking the right action later depends on adequate documentation. It is easier to document the characteristics of digital resources close to their source than it is to build that documentation later.

19. Preservation programmes should use standardised metadata schemas as they become available, for interoperability between programmes.

20. The links between digital objects and their metadata must be securely maintained, and the metadata must be preserved (UNESCO, 2004, p.23).

The second of these principles is reinforced by the view of US experts that best practice for preservation metadata is to create it ‘at the information creation stage’ (Meeting of Experts on Digital Preservation, 2004).

The importance of metadata is corroborated by Australian digital preservation specialists interviewed in 2004. One was a self-confessed ‘metaphile’. Another considered that intellectual control was the key to effective digital preservation:

the only way to manage [digital preservation] is to have an intellectual control system that applies across all media, that is not media-specific . . . building systems that are not technologically or system dependent, [where] the intellectual control system is a sort of conceptual system above whatever technology you use.

From his experience of ‘10 or 15 years of work’ in the digital preservation environment, he claims that metadata works ‘really really well, you know . . . [If] you use that across all technologies or media or record types, then you don’t ever have a problem finding anything’.

Preservation metadata – defined on the PADI web site as ‘structured ways to describe and record information needed to manage the preservation of digital resources’ – should not be confused with descriptive metadata based on schemas such as Dublin Core or AGLS (the Australian Government Locator Service). These have the primary aim of resource discovery, that is, of enabling information resources to be identified and linked to users’ requests, and are best considered as ‘a set of signposts for digital surfers . . . there to guide people to resources’ (Wilson, 2003, p.33, citing Tom Baker). By comparison, preservation metadata stores ‘technical details on the format, structure and use of the digital content, the history of all actions performed on the resource including changes and decisions, the authenticity information such as technical features or custody history, and the responsibilities and rights information applicable to preservation actions’ (Preservation metadata, 2003).

Preservation metadata consists of two types of metadata: content information, and preservation description information (to use the terms of the OAIS reference model). Content information consists of ‘details about the technical nature of the object which tells the system how to re-present the data as specific data types and formats’. Preservation description information is ‘other information needed for long-term management and use of the object, including identifiers and bibliographic details, information on ownership and rights, provenance, history, context including relationships to other objects, and validation information’ (UNESCO, 2003, p.98). More specifically, preservation metadata

• Identifies the material for which a preservation programme has responsibility

• Communicates what is needed to maintain and protect data

• Communicates what is needed to re-present the intended object (or its defined essential elements) to a user when needed, regardless of changes in storage and access technologies

• Records the history and the effects of what happens to the object

• Documents the identity and integrity of the object as a basis for authenticity

• Allows a user and the preservation programme to understand the context of the object in storage and in use (UNESCO, 2003, p.97).
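The two-part division described above (content information and preservation description information) and the functions in the list can be pictured as a simple data structure. The following is only a minimal sketch under stated assumptions: the class names, field names, and the choice of SHA-256 as the validation value are hypothetical illustrations, not elements of the OAIS model or of any published scheme.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib


@dataclass
class ContentInformation:
    """Technical details needed to re-present the object (hypothetical fields)."""
    file_format: str
    format_version: str
    rendering_requirements: list[str] = field(default_factory=list)


@dataclass
class Event:
    """One action performed on the object, kept as part of its history."""
    action: str
    outcome: str
    timestamp: str


@dataclass
class PreservationDescription:
    """Identity, rights, context, history and validation information."""
    identifier: str
    rights: str
    sha256: str  # assumed validation value: a digest of the stored bytes
    related_objects: list[str] = field(default_factory=list)
    history: list[Event] = field(default_factory=list)


@dataclass
class PreservationRecord:
    content: ContentInformation
    description: PreservationDescription

    def record_event(self, action: str, outcome: str) -> None:
        """Append an action and its outcome to the object's history."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.description.history.append(Event(action, outcome, stamp))

    def verify(self, data: bytes) -> bool:
        """Compare the bytes against the stored digest and log the check."""
        ok = hashlib.sha256(data).hexdigest() == self.description.sha256
        self.record_event("fixity check", "passed" if ok else "failed")
        return ok


# Illustrative usage with made-up values.
data = b"example bitstream"
record = PreservationRecord(
    ContentInformation("PDF", "1.4"),
    PreservationDescription(
        "obj-0001", "in copyright", hashlib.sha256(data).hexdigest()
    ),
)
```

Calling `record.verify(data)` checks integrity and, at the same time, adds an entry to the history, which loosely mirrors how a single record can both document identity and integrity and record what happens to the object.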

There are, as yet, no commonly accepted standards for preservation metadata, although several contenders are in development. Perhaps the best-known schemes to date include those of the OCLC/RLG Working Group, the National Library of New Zealand, and the National Library of Australia; another Australian scheme is the National Archives of Australia’s recordkeeping metadata scheme. Early preservation metadata schemes were developed by NEDLIB and CEDARS, and the National Library of Australia.

The National Library of Australia’s metadata scheme was grounded in that Library’s experience with digital preservation, including the PANDORA web archiving project, and was issued in 1999 (Lupovici, 2001, pp.6–7; National Library of Australia, 1999b; Bradley and Woodyard, 2000). An exposure draft issued by the National Library of Australia in October 1999 outlined the information it believed was needed to manage the preservation of its digital collections.

This scheme was informed by other schemes and models, such as the OAIS Reference Model, the NEDLIB project, and activities relating to projects at the Library of Congress, CEDARS, the National Archives of Australia, and the Research Libraries Group, none of which was considered to be fully satisfactory for the National Library of Australia’s requirements at that time. The National Library of Australia model is ‘meant to be a data output model, not a data input model’; that is, it defines ‘the information we want out of a metadata system, not necessarily what should be entered, how it should be entered, by whom and at what time; nor does it concern itself with how the metadata should be associated with what it is describing’ (National Library of Australia, 1999b).

It consists of 25 elements:

1. Persistent Identifier
2. Date of Creation
3. Structural Type
4. Technical Infrastructure of Complex Object
5. File Description
6. Known System Requirements
7. Installation Requirements
8. Storage Information
9. Access Inhibitors
10. Finding and Searching Aids, and Access Facilitators
11. Preservation Action Permission
12. Validation
13. Relationships
14. Quirks
15. Archiving Decision (work)
16. Decision Reason (work)
17. Institution Responsible for Archiving Decision (work)
18. Archiving Decision (manifestation)
19. Decision Reason (manifestation)
20. Institution Responsible for Archiving Decision (manifestation)
21. Intention Type
22. Institution with preservation responsibility
23. Process
24. Record Creator
25. Other

Figure 5.1 Elements of the National Library of Australia Metadata Scheme

NEDLIB (the Networked European Deposit Library) published a metadata scheme in 2000, which was intended specifically to manage the preservation of digital materials received through national deposit. CEDARS immediately developed the NEDLIB metadata scheme further to cover administrative, technical and legal information (Lupovici, 2001, pp.6–7). These and other schemes were analysed by a Working Group on Preservation Metadata convened by OCLC and the Research Libraries Group (RLG), with representatives from the RLG’s international membership, including Australia. This Working Group released a recommended preservation metadata framework in 2002 (OCLC/RLG Working Group on Preservation Metadata, 2002). The framework is based on the OAIS reference model recommendations. Following its release OCLC and RLG convened the PREMIS Working Group to focus on how preservation metadata could be implemented in digital preservation systems. In 2004 PREMIS released a report on metadata practices and other digital preservation activities of respondents in 13 countries (OCLC/RLG PREMIS Working Group, 2004). Among other outcomes will be an implementable set of core preservation metadata elements and a data dictionary (Guenther, 2004; Caplan, 2004).

The National Library of New Zealand has been active in developing a metadata scheme for digital preservation as well as tools for extracting metadata automatically from digital materials. Its scheme is based on the OCLC/RLG metadata framework. Knight describes progress and expectations of the National Library of New Zealand’s preservation metadata scheme (Knight, 2003, p.18).

Its metadata standards framework is ‘a work in progress which, when complete, will provide a comprehensive statement of the standards environment within the National Library’. The scheme has 18 elements describing the logical object, 13 elements to record the history of actions performed on the object, nine for technical information about file types, and five to record the history of changes made to the preservation metadata (Searle and Thompson, 2003). A resource discovery metadata scheme was released in 2000, followed by a preservation metadata scheme in November 2002, which was revised in June 2003. A data model and its associated XML schema definitions, released in July 2003, describe how the metadata scheme can be implemented.

The metadata extraction tool developed by the National Library of New Zealand has attracted the attention of the international digital preservation community. It is designed to automatically extract preservation metadata from a range of file formats and output the extracted data in XML format so that it can be loaded into a preservation metadata repository. It is being developed to extract data from a number of common file formats, such as MS Word (various versions), WordPerfect, Open Office, MS Works, MS Excel, MS PowerPoint, TIFF, JPEG, WAV, MP3, HTML, PDF, GIF, and BMP. This tool ‘is designed for use by the wider digital preservation community and it is hoped that any future development will be informed by that community’, notes the National Library of New Zealand’s web site, describing the strongly collaborative approach taken by the international preservation metadata community. This metadata extraction tool, as well as information about it, is available free from the National Library of New Zealand’s web site. Its innovative nature and significance were acknowledged by its short-listing for the digital preservation award, introduced in 2004 as one of the Pilgrim Trust Conservation Awards.
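The general idea of reading technical details from a file and emitting them as XML can be illustrated with a short sketch. This is not the National Library of New Zealand’s tool and does not reproduce its output format: the function names and XML element names below are hypothetical, and the sketch records only generic properties (name, size, guessed media type, checksum, modification time) using the Python standard library rather than parsing format-specific headers.

```python
import hashlib
import mimetypes
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path


def extract_metadata(path: Path) -> ET.Element:
    """Collect basic technical metadata for one file as an XML element."""
    data = path.read_bytes()
    stat = path.stat()
    # Guess the media type from the file extension only (no content sniffing).
    mime, _ = mimetypes.guess_type(path.name)

    root = ET.Element("object")  # element names are illustrative only
    ET.SubElement(root, "fileName").text = path.name
    ET.SubElement(root, "sizeBytes").text = str(len(data))
    ET.SubElement(root, "mimeType").text = mime or "application/octet-stream"
    ET.SubElement(root, "sha256").text = hashlib.sha256(data).hexdigest()
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    ET.SubElement(root, "lastModified").text = modified.isoformat()
    return root


def to_xml(element: ET.Element) -> str:
    """Serialise the element to an XML string for loading elsewhere."""
    return ET.tostring(element, encoding="unicode")
```

A caller might run `to_xml(extract_metadata(Path("report.pdf")))` (a hypothetical file name) and pass the resulting string to whatever repository holds the preservation metadata. A real extraction tool would go further and read format-specific properties, such as image dimensions or document version, from inside each file.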

The National Archives of Australia’s recordkeeping metadata scheme is part of its ‘e-permanence’ initiative. It is designed to be used in conjunction with AGLS metadata, to uniquely identify records, authenticate them, document and preserve their content, context and structure over time, administer conditions of access and their disposal, track usage history and management, and simplify the transfer or migration of electronic records between computer systems (Robertson and Cunningham, 2000, p.196). Some of its elements are specific to preservation: no. 2, Rights Management, and no. 17, Preservation History (National Archives of Australia, 2001c).

Although implementing preservation metadata undoubtedly has the potential to improve digital preservation practice, its full potential will not be realized until there is widespread agreement on a standard set of elements. US experts concluded that 107 elements were needed in a preservation metadata scheme (Meeting of Experts on Digital Preservation, 2004), and the current lack of consensus is apparent even from the brief descriptions of preservation metadata schemes, above. There is potential for reducing the costs of developing preservation metadata schemes, as well as reducing the costs of creating metadata by using automated metadata creation software. Standardization also allows greater sharing of information, making it easier to move digital objects from one archive to another, and encourages the standardization of preservation processes (UNESCO, 2003, pp.92–93). Here, perhaps, is an analogy with the history of the MARC format in libraries: from a single format in the mid-1960s, it proliferated during the 1970s and 1980s, and only after several decades has it once again become a single internationally-adopted standard. The advantages of a limited number of standard schemes appear not to have yet been fully recognized by the preservation metadata community.

Preservation metadata schemes continue to evolve. There is still relatively little experience of the effective application of preservation metadata. The NSF-DELOS Working Group on Digital Archiving and Preservation notes that there has been ‘limited evaluation of the effectiveness or cost of metadata for managing digital entities over time’ and suggests that research is needed, for example to demonstrate its value for specific purposes and to determine how much metadata is needed. Tools are needed to ‘aid in the creation, authoring and management of metadata’, such as those which automate, or partly automate, the creation of metadata; these are being developed, an example being the National Library of New Zealand’s metadata extraction tool. Also required are tools that manage metadata schemes so that they are useful over time, such as those ‘to track the provenance of metadata schema, for version control, and to allow users to navigate from current metadata schema and ontologies to those used when the digital entity was created’. The value of metadata could be assessed relative to the costs of ‘extracting, creating and managing’ it, to provide a better understanding of the ‘minimum amount of metadata necessary for digital preservation’ (NSF-DELOS Working Group on Digital Archiving and Preservation, 2003, p.19).
