Since major projects exist in several countries (Australia, Germany, the Netherlands, the UK and the US) to create the required network and storage infrastructure, the current book positions preservation within this infrastructure without describing the infrastructure in detail. But as with any engineering discipline, "the devil is in the details." This motivates a typical engineering approach where a problem is broken down into separate manageable components.
Acknowledgements
As this book's manuscript neared completion, the final report and recommendations of the Warwick Workshop, Digital Curation and Preservation: Defining the research agenda for the next decade, appeared.3 European experts on the full spectrum of the digital life cycle mapped the current state of play and a future agenda. Encoding of documents and programs for interpretation, display and execution on computers whose architecture is not known when the information is corrected and archived.
Why We Need Long-term Digital Preservation
Information Object Structure
Introduction to Knowledge Theory
Distributed Content Management
Digital Object Architecture for the Long Term
Durable Bit-Strings and Catalogs
Preservation
What is Digital Information Preservation?
The Task Force sees digital information repositories as being held together in a national archival system primarily through the operation of two essential mechanisms. First of all, the task force report overlooks that the periodic migration of digital data involves two distinct notions.
What Would a Preservation Solution Provide?
Many institutions already have digital libraries, and will want to expand their services to durable content. They will want to achieve this without disruption, such as incompatible modification of their installed software.
Why Do Digital Data Seem to Present Difficulties?
Information producers will want to please consumers, and archive managers will want to please both producers and consumers. When a repository shares a holding with another repository—whatever the reason for the sharing—the recipient will want the distribution to include information closely related to that holding.
Characteristics of Preservation Solutions
1 to 2: Encode human output to create artifacts (typically on paper) that can be stored in conventional libraries and can also be posted.
1 to 3: Encode analog input to create digital representations using transformation rules that can be precisely described, along with their inevitable information losses, additions, and distortions.
Technical Objectives and Scope Limitations
Its technical measures naturally extend without modification to works digitized from their traditional predecessors, such as paper books. Other salient topics, such as intellectual property rights management and copyright compliance, are not significantly complicated by adding preservation to the other digital content management requirements,19 and are therefore treated only in passing.
Summary
The distinction is particularly evident in the program of the National Archives of Australia, which divides its system into three components that only share documents via transported storage media: a quarantine server, a preservation server and a digital repository.21 It discusses digital repository design only to the extent necessary to provide preservation context—the technical infrastructure into which preservation software must be integrated.
The Information Revolution
NDIIPP Plan, p.1 We are in the midst of widespread changes in how people interact with information, how it affects their lives and how information will be managed in a networked world.22 In the digital environment, computer programming is codifying ideas and principles that have historically been vague or subjective, or that are based on situational legal or social constructs.23
Economic and Technical Trends
- Digital Storage Devices
- Search Technology
31 The Museum of Media History (http://www.broom.org/epic/) provides a condensed history of the next decade in the form of a video presentation that begins: "The New York Times has gone offline. Since then, capacity increases have been the equivalent of new applications and the dollar size of the storage industry has increased.
Democratization of Information
Social Issues
As a preservation medium [it is perceived] as unstable, experimental, immature, unproven on a mass scale, and unreliable in the long term."49 This summary should not be accepted without careful examination of the context within which it was made. It is the traditional culture, to an extent remarkably little diminished by the rise of the scientific one, which governs the Western world." (p. 7).
Documents as Social Instruments
- Ironic?
- Future of the Research Libraries
- Cultural Chasm around Information Science
- Preservation Community and Technology Vendors
Compendium of File Formats The alleged statistic on which "ironic" is based involves an unreasonable comparison, viz. the fact that some old paper documents have survived compared to the fact that some digital documents may not survive. The real question for libraries is, what is the 'value proposition' they offer in a digital future?
Why So Slow Toward Practical Preservation?
The changing nature of digital information objects is one of the main obstacles to creating archives for their long-term storage and preservation. Misunderstandings contributed to a decade-long delay between the clear and widely recognized identification of the digital archiving challenge and its solution.
Selection Criteria: What is Worth Saving?
- Cultural Works
- Video History
- Bureaucratic Records
- Scientific Data
Speaking to the [Immigration and Naturalization Service], we are trying to see if [my son] qualifies for [US] citizenship based on the fact that I did the border crossing [between Canada and the US] for most of my natural life. Awareness of the size and complexity of potential repositories for digital science data has grown greatly in recent years.
Summary
Of the autopsychological object type we consider the experiences, their individual constituents, and the qualities (of impressions, emotions, volitions, etc.). Language problems in digital preservation are a small example of Wittgenstein's dictum, "Most of the propositions and questions found in philosophical works are not false, but nonsensical."
Conceptual Objects: Values and Patterns
"You should use rule set Y," we might say, "if you want result X with properties Z under circumstances W, consider the method defined by rule set Y." Here Z might be something like "with least human effort" or. Whatever the words 'real' and 'exist' mean, is it something different for "3" than for "the apple on my kitchen table" or for.
Ostensive Definition and Names
It's a different kind of value than 3—a kind of value important enough that we have a name for it. And if we carefully excluded definitional circularity, we would soon decide that we had no starting point for creating a dictionary.
Objective and Subjective: Not a Technological Issue
The speaker's claim includes a common error—the definite article suggesting that only one table is pertinent. He therefore infers something different (represented by the balloon above his head) than what the speaker implied.
Facts and Values: How Can We Distinguish?
How can we determine whether a sentence "X is the case" that is claimed to be about the world expresses a fact or a value? The sentence 'On Monday, John Doe said, "I believe that P"' is objective, even if it happens to be false.
Representation Theory: Signs and Sentence Meanings
'There's an apple on my kitchen table' has a meaning, even if no one knows if the statement is true. Ideally, I could lead any reader into my kitchen to see the apple on the table for themselves, an impractical proposition, of course.
Documents and Libraries: Collections, Sets, and Classes
It uses 'set' to denote what has been called extension above; more precisely, this book means by 'set' an entity of Zermelo-Fraenkel set theory.117 It uses 'class' to denote what has been called intension. By this definition, the inventory of a repository is a special type of collection, e.g. "all holdings of the University of California Library".
Syntax, Semantics, and Rules
An example of value occurs in one topic we might want to discuss—what a librarian calls "subject classification." Interpretations of the expressions and productions of this symbolic system—elaborations of the conceptual model—depend in part on tacit assumptions about the meanings of the symbols, the relevance of the axioms, and the production rules of the system.
Summary
In mathematical symbol systems, these produced strings would be sentences that can be associated with natural language statements that have been shown to be true. The other arrows show communication that can be limited to purely syntactic transformations, including some between analog and digital representations.
Intentional and Accidental Information
You could object that 'digital' is not an essential feature of what needs to be preserved. The points already made above are that what is essential and what is merely accidental is a speaker's subjective choice, and that choice is likely to be guided by what the speaker wants to convey.
Distinctions Sought and Avoided
Common English verbs like "read" and "print" and nouns like "report" are often too narrow. In contrast to the phrase "knowledge management", the phrase "knowledge worker" does not introduce any confusion, since each knowledge worker operates on the basis of knowledge that he has but cannot communicate - the knowledge of how.
Trusted and Trustworthy
Specifically, the trusted entity is software that is provided by a development team that works closely with designers of the hardware and firmware security components on which it depends. It is usually an operating system that interacts closely with a sealed computer logic and memory component, which is the TCB itself.
Relationships and Ontologies
The critical distinction is that for a TCB, the identity and logic of the trusted entity is known. The structure of a part of the world that constitutes a domain of discourse is the set of relations between its objects.
What Copyright Protection Teaches
The paper and archive exist to provide evidence for the conceptual abstraction - the pattern inherent in existing and potential replicas of the paper's contents. Much of the digital content whose owners claim copyright protection also qualifies for long-term preservation.
Summary
The save action is to create a symbol string associated with the original model to be saved. There seems to be a sense that digital information should be held to a higher standard for authenticity and integrity than printed information.
What Can We Trust?
Broader issues regarding the correctness of what the preserved documents assert are outside the curatorial scope and responsibilities. They also tend to be clearer, both in their intentions and their commitments to damages, than those between individuals.
What Do We Mean by ‘Authentic’?
We can describe the transformations that took place in each part of the transmission channel that was used for the case under consideration. The rest of the mathematical language defines what we mean by integrity, true origin, and authenticity.
Authenticity for Different Information Genres
- Digital Objects
- Material Artifacts
- Natural Objects
- Artistic Performances and Recipes
- Literature and Literary Commentary
The consumer can cause a binary image 3 in the producer's computer to be copied to an identical binary image 3' in the consumer's computer. However, we do speak of "an authentic Gucci bag" even though the object at hand is not authentic in the sense we require of a famous painting.
How Can We Preserve Dynamic Resources?
The potentially wide distribution of repurposed documents threatens the authenticity of the original materials, as well as their authors' moral rights. Its meaning is simpler for digital documents than for analog recordings or for live performances because digital conditions are static most of the time, while we think of real performances as continuous in time.
Summary
This must be the end user – the person for whom the document in question is held and who assumes the risk of using it. In reality, the table occurs only in combinations, as in the propositions: 'The table is in the room' or 'The carpenter made the table', etc.
Testable Archived Information
The storage layout is chosen and managed by a file system to provide better reliability, economy, performance, and flexibility than is likely to be provided by a simple, contiguous layout.
Syntax Specification with Formal Languages
- String Syntax Definition with Regular Expressions Regular expressions are a context-independent syntax that can represent a
- BNF for Program and File Format Specification
- ASN.1 Standards Definition Language
- Schema Definitions for XML
Non-developers need only know that an ASN.1 Abstract Syntax is typically a specification of a list of typed elements that are either primitive (such as integers or octet strings) or constructed (such as arrays and sequences of additional elements). ASN.1 is widely used to describe security protocols, interfaces, and service definitions, such as the X.500 Directory and X.400 Messaging systems, which include extensive security models.
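The split between primitive and constructed elements can be illustrated with a small sketch. It is not ASN.1 notation and does not use any ASN.1 library; the class names `Primitive`, `SequenceOf`, and `Sequence` and the function `conforms` are hypothetical, chosen only to show how a constructed type is assembled from typed elements and how a value can be checked against such a description.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    """A primitive element type; only two illustrative kinds are handled."""
    kind: str  # "INTEGER" or "OCTET STRING"


@dataclass(frozen=True)
class SequenceOf:
    """A constructed type: an array whose items all share one element type."""
    element: "Schema"


@dataclass(frozen=True)
class Sequence:
    """A constructed type: a fixed list of named, typed fields."""
    fields: Tuple[Tuple[str, "Schema"], ...]


Schema = Union[Primitive, SequenceOf, Sequence]


def conforms(value, schema) -> bool:
    """Return True if a plain Python value matches the type description."""
    if isinstance(schema, Primitive):
        if schema.kind == "INTEGER":
            # bool is a subclass of int in Python, so exclude it explicitly.
            return isinstance(value, int) and not isinstance(value, bool)
        if schema.kind == "OCTET STRING":
            return isinstance(value, bytes)
        return False
    if isinstance(schema, SequenceOf):
        return isinstance(value, list) and all(
            conforms(item, schema.element) for item in value
        )
    if isinstance(schema, Sequence):
        names = {name for name, _ in schema.fields}
        return (
            isinstance(value, dict)
            and set(value) == names
            and all(conforms(value[name], sub) for name, sub in schema.fields)
        )
    return False


# A hypothetical record type built from primitive and constructed parts.
RECORD = Sequence((
    ("serial", Primitive("INTEGER")),
    ("payload", Primitive("OCTET STRING")),
    ("history", SequenceOf(Primitive("INTEGER"))),
))
```

A real ASN.1 specification also fixes encoding rules for transmitting such values; the sketch covers only the abstract typing.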
Monographs and Collections
See also Archival Finding Aids at the Library of Congress, http://www.loc.gov/rr/ead/. The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using XML.
Digital Object Schema
- Relationships and Relations
- Names and Identifiers, References, Pointers, and Links
- Representing Value Sets
- XML “Glue”
185 Ternary relations are the core data of the RDF standard, perhaps for the same reasons that we find them the most convenient structuring primitives. It is either visible as part of the switch character string or otherwise explicitly known to the current process.
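As a minimal sketch of why ternary relations are convenient structuring primitives, the following stores facts as (subject, predicate, object) tuples and answers pattern queries over them. The identifiers (`doc:1`, `hasAuthor`, and so on) and the helper `match` are hypothetical illustrations; the sketch does not use RDF syntax or any RDF library.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the positions that are not None."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# Hypothetical catalog facts; every name here is made up for the example.
CATALOG = [
    ("doc:1", "hasAuthor", "person:a"),
    ("doc:1", "hasTitle", "Annual Report"),
    ("doc:2", "hasAuthor", "person:a"),
    ("doc:2", "isVersionOf", "doc:1"),
]

# All documents attributed to person:a.
by_author = [s for s, _, _ in match(CATALOG, predicate="hasAuthor", obj="person:a")]
```

Any relationship between two objects fits the same three-part shape, so new kinds of relations can be added without changing the data structure.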
From Ontology to Architecture and Design
- From the OAIS Reference Model to Architecture
- Languages for Describing Structure
- Semantic Interoperability
197 Shortcomings of [OAIS], http://www.ieee-tcdl.org/Bulletin/v2n2/egger/egger.html, reminds readers that the 2002 version of OAIS emphasizes: “This reference model does not specify a design or implementation.” 198 This subsection simply rewrites some of the reference model for an Open Archival Information System (OAIS).
Metadata
- Metadata Standards and Registries
- Dublin Core Metadata
- Metadata for Scholarly Works (METS)
- Archiving and Preservation Metadata
ISO/IEC 11179 specifies basic aspects of the composition of metadata elements for sharing between humans and machines.210 The NLNZ model is said to have had a major impact on younger schemes such as LMER of the Deutsche Bibliothek.
Summary
Phrases such as "the essential properties of the information object" are common in the literature. It depends on what the end users of the document want to achieve with its content.
Character Sets and Fonts
- Extended ASCII
- Unicode/UCS and UTF-8
235 Unicode currently defines nearly 100,000 characters; see http://www.unicode.org/charts/ and http://www.unicode.org/charts/charindex.html. The online edition of The Unicode Standard Version 3.0, http://www.unicode.org/book/u2.html, points to the latest versions.
File Formats
- File Format Identification, Validation, and Registries Identification, validation, and characterization of a file format are fre-
- Text and Office Documents
- Still Pictures: Images and Vector Graphics
- Audio-Visual Recordings
- Relational Databases
- Describing Computer Programs
- Multimedia Objects
239 For suggestions on how to deal with multiple file formats, see http://www.stack.com/file/extension/. Media Matters 2004, Dance Heritage Coalition Digital Video Preservation Report, http://www.danceheritage.org/preservation/Digital_Video_Preservation_Report.doc.
Perpetually Unique Resource Identifiers
- Equality of Digital Documents
- Requirements for UUIDs
- Identifier Syntax and Resolution
- A Digital Resource Identifier
- The “Info” URI
Furthermore, what one means by "same as" is subjective and may be different for different speakers, even for the same speaker at different times. A segment identifier provides granularity—either a location within the resource or a portion of the resource.
Summary
The most important things to note are that a prefix, "info:", has been registered as belonging to a new URI class, and that the next substring, namespace, identifies a pre-existing naming authority. The only exceptions seem to be inclusions of a few reserved punctuation characters, such as "#" and.
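The structure just described (the "info:" prefix, a namespace, an identifier, and an optional segment identifier after "#") can be sketched as a small parser. The function `parse_info_uri` and the `InfoURI` tuple are hypothetical names, the sample URIs are illustrative only, and the sketch checks nothing beyond the coarse shape: it does not validate permitted characters or consult any namespace registry.

```python
from typing import NamedTuple, Optional


class InfoURI(NamedTuple):
    namespace: str            # names the pre-existing naming authority
    identifier: str           # the authority's own identifier, kept verbatim
    fragment: Optional[str]   # segment identifier after "#", if any


def parse_info_uri(uri: str) -> InfoURI:
    """Split an "info" URI into namespace, identifier, and optional fragment."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "info":
        raise ValueError(f"not an info URI: {uri!r}")
    body, _, fragment = rest.partition("#")
    namespace, slash, identifier = body.partition("/")
    if not slash or not namespace or not identifier:
        raise ValueError(f"expected info:namespace/identifier, got {uri!r}")
    return InfoURI(namespace, identifier, fragment or None)
```

For example, `parse_info_uri("info:lccn/2002022641")` yields the namespace `lccn` and the identifier `2002022641`, with no fragment.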
Security
- PKCS Specification
- Audit Trail, Business Controls, and Evidence
- Authentication with Cryptographic Certificates
- Trust Structures and Key Management
- Time Stamp Evidence
- Access Control and Digital Rights Management
These two keys are chosen together in a way that makes it computationally infeasible to derive the signing key, which is kept private, from the value of the verification key, which is published. Second, a repository service can collect the same information for each accessioned record in an announced period (e.g., the last quarter of the year) to build and stamp a document containing this data.
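The second idea, building one document per announced period and stamping it, can be sketched as follows. The sketch assumes that the information collected for each record is a SHA-256 digest of its content, which the text does not state; the function names and the JSON layout are likewise hypothetical. It stops at producing the value that would be submitted to a time-stamping service and performs no signing or stamping itself.

```python
import hashlib
import json
from typing import Mapping


def build_period_manifest(period: str, records: Mapping[str, bytes]) -> bytes:
    """Serialize one entry per accessioned record into a canonical document.

    Entries are sorted by record identifier so that the same set of records
    always yields byte-identical output.
    """
    entries = [
        {"id": record_id, "sha256": hashlib.sha256(content).hexdigest()}
        for record_id, content in sorted(records.items())
    ]
    document = {"period": period, "entries": entries}
    return json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")


def manifest_digest(manifest: bytes) -> str:
    """Digest of the whole manifest: the single value to be time-stamped."""
    return hashlib.sha256(manifest).hexdigest()
```

Stamping one digest per period, rather than one per record, keeps the number of time-stamp requests small while still committing to every record listed in the manifest.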
Recordkeeping Standards
The ISAD(G) standard provides general guidelines for creating archival descriptions and says that “the purpose of archival description is to identify and explain the context and content of archival material to promote its accessibility [through] accurate and appropriate representations organized according to predetermined models.”327 ISAD(G) processes enable “the intellectual controls necessary to propel reliable, authentic, meaningful and accessible descriptive documents through time”.
Archival Best Practices
At least as long as an item is held by an archive, its ISAD(G) information "remains dynamic and may be subject to change in light of further knowledge of its content or the context of its creation." Its descriptions do not depend on the forms or media of the archival material. Nor does ISAD(G) provide guidance on the description of particular materials such as seals, sound recordings or geographical maps, because manuals on such subjects are available from other sources.
Repository Audit and Certification
However, faithfully implementing semi-human processes for decades would be difficult and expensive without the technology and business controls that few archival institutions can afford or manage well. Any differences would be reported and offline storage or mirrored repositories would be used to restore object integrity.
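A minimal sketch of such an automated integrity audit follows. It assumes that a SHA-256 digest was recorded for each object at accession and models the primary and mirrored stores as in-memory dictionaries; the names `audit_and_repair`, `primary`, `mirror`, and `recorded` are hypothetical. A production repository would read from real storage and log far more detail.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(
    primary: Dict[str, bytes],
    mirror: Dict[str, bytes],
    recorded: Dict[str, str],
) -> List[str]:
    """Compare each object with its recorded digest and repair from the mirror.

    Returns one report line per object that was missing or altered.
    The primary store is updated in place when an intact mirror copy exists.
    """
    report: List[str] = []
    for object_id, expected in recorded.items():
        data = primary.get(object_id)
        if data is not None and sha256_hex(data) == expected:
            continue  # object is intact, nothing to report
        replacement = mirror.get(object_id)
        if replacement is not None and sha256_hex(replacement) == expected:
            primary[object_id] = replacement
            report.append(f"{object_id}: restored from mirror")
        else:
            report.append(f"{object_id}: damaged, no intact copy available")
    return report
```

The mirror copy is itself checked against the recorded digest before use, so a corrupted mirror cannot silently overwrite the primary.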
Summary
For the first thirty years of digital preservation, archives managed their digital collections. The rapid growth of digital materials in both volume and complexity [and] rising expectations of archive users.
Software Layering
the administrative interfaces are likely to be separated, as suggested on the right side of Fig. 25. Applications will probably be implemented lower in the storage stack to manage storage processes—detecting system, network, and media degradation and failures.
A Model of Storage Stack Development