2.7 Selection Criteria: What is Worth Saving?

Patricia Battin, when President of the Commission on Preservation and Access and considering the deterioration of works on acid paper, observed that, “We faced very painful and wrenching choices—we had to accept the fact that we couldn't save it all, that we had to accept the inevitability of triage, that we had to change our focus from single-item salvation to a mass production process, and we had to create a comprehensive cooperative strategy. We had to move from the cottage industries in our individual library back rooms to a coordinated nationwide mass-production effort.”87

Contrast this with the NARA opinion that preservation by triage may not be an option. “NARA does not have discretion to refuse to preserve a format. It is inconceivable … that a court would approve of a decision not to preserve e-mail attachments, which often contain the main substance of the communication, because it is not in a format NARA chose to preserve.”88

Perhaps a great deal of information will be lost. However, the information available will be much greater, much more accessible, and much easier to find than has historically ever been the case. It seems less likely that information wanted will have been irretrievably lost than that it will still exist somewhere in digital form, but will nevertheless not be found.89

87 Battin 1992, Substitution: the American Experience, quoted in http://www.clir.org/pubs/reports/pub82/pub82text.html. See also A Framework of Guidance for Building Good Digital Collections, http://www.niso.org/framework/Framework2.html.

88 Quoted in Talbot 2005, The Fading Memory of the State, http://www.technologyreview.com/articles/05/07/issue/feature_memory.asp.

2.7.1 Cultural Works

One theme is the understandable reluctance of scholars to make choices because of the unpredictability of research needs. Scholars are loath to say, “this book will be more useful for future research than that one,” because the history of their fields shows that writers and subjects that seem inconsequential to scholars in one era may become of great interest in the next, and vice versa. Moreover, discovery and serendipity may lead to lines of inquiry unforeseen.

George 1995, Difficult Choices

Between 1988 and 1994, scholarly advisory committees considered content selection for History, Renaissance Studies, Philosophy, Mediaeval Studies, Modern Language and Literature, and Art History. Although this work was mostly done before digital capture was practical, its insights seem applicable today.

Selection seems to be difficult, but is not a challenge in the sense of being hampered by technical research issues. If the technical and organizational challenges are overcome, digital preservation is likely to become a routine activity with priorities set by each institution’s resource allocation process. The funding challenges are likely to continue, because more content than research libraries can save will forever be generated. Copyright issues mostly involve conflicting interests that will not be quickly resolved.

Today’s selection costs are exacerbated by the accelerating transformation from information scarcity to information overflow. It might today be neither possible nor desirable to save everything. Decisions will occur, either by default or with varying degrees of care and insight. For governments and ordinary folk, Lysakowski suggests looming disaster for office files in popular formats.90

Selection is much less challenging for old documents than for modern content. Writing and dissemination were relatively rare and relatively slow in earlier centuries. For instance, at the time of the American Revolution the British Departments of State had about 50 clerks; these clerks wrote longhand with quill pens. Their letters to North America took six to ten weeks to deliver and as long again to receive responses. Compare these circumstances to those of today’s bureaucracies and to the tools the latter use to create and disseminate information. For old content, we benefit from de facto selection at the source: little was written.

89 See MegaNet’s Online Backup Market Research at http://www.meganet.net/pdfs/onlinebkresearch.pdf.

90 Lysakowski 2000, Titanic 2020: A Call to Action, http://www.censa.org/html/Publications/Titanic2020_bookmarks_Jan-21-2000.pdf.

In a 2003 posting to a Michigan State University discussion group frequented by fellow historians,88 Eduard Mark wrote, “It will be impossible to write the history of recent diplomatic and military history as we have written about World War II and the early Cold War. Too many records are gone. Think of Villon's haunting refrain, ‘Ou sont les neiges d'antan?’ and weep. ... History as we have known it is dying, and with it the public accountability of government and rational public administration. … The federal system for maintaining records has in many agencies—indeed in every agency with which I am familiar—collapsed utterly.”

About the 1989 U.S. invasion of Panama, in which U.S. forces removed Manuel Noriega, Mark wrote that he could not secure many basic records of the invasion, because a number were electronic and had not been kept.

Even if this assessment is accurate and the prospect is discouraging for some scholars, other communities will be easily satisfied. Students' wants are easier to satisfy than scholars’, because the secondary school student or college undergraduate assigned a term paper will choose the first pertinent and interesting material that he encounters. Digitization can provide more interesting material than has been commonly available.

It is becoming realistic for teachers to require students to find and work from original sources rather than from secondary opinions and other people's selections. The top issues for content usage by students are not availability or selection, but rather accuracy, authenticity, and balance of viewpoints.

2.7.2 Video History

Even for scholars, the prospect of permanently lost historical information is not nearly as worrisome as Eduard Mark’s comments might suggest.

For the facts about what actually happened, news reports surely provide as accurate an account as the missing government memoranda might have done. What the press does not report about the Panama crisis, the decision processes that led to the invasion, might be nice to have. However, information about modern events much exceeds that about similar and, arguably, more significant earlier historical events. It would be an interesting exercise to compare information available about the Panama episode, especially after some digging into private records and other nongovernment sources that will eventually become available, with information about the events and decisions leading to Napoleon’s defeat at Waterloo.

To some extent, Mark’s concern is with the difference between what will be the case and what might have been the case had records retention circumstances been somewhat more favorable, rather than with what is needed by historical scholars. What is needed will often be a troublesome question. Shakespeare has the mad King Lear argue, “O, reason not the need! Our basest beggars are in the poorest thing superfluous.”

Multimedia information representations appeared only recently, but still earlier than the period for which doomsayers suggest records will be forever lost. Not only do the major television networks have immense archives that are being converted to digital formats,91 but consumers have acquired large numbers of digital and video cameras, computers, HDDs, and writeable optical disks. And roughly half of the U.S. local police departments routinely use video cameras. Surely government agencies and private citizens are squirreling away a historian’s treasure trove that, years from now, will be mined for what is of broad interest.

Only with fifty years’ perspective does it become clear whose personal history might be worth saving in public records. For instance, Leonard Bernstein’s childhood letters were acquired by the Library of Congress in the 1990s after a bidding war with the Indiana University Library.92 In 2005, copies of many Dorothea Lange photographs were discovered in a neglected personal collection. They had been retrieved from a San Jose Chamber of Commerce dumpster about 40 years earlier. Found when a daughter was clearing her deceased parents’ house, the collection fetched a fortune in a Sotheby’s auction.

2.7.3 Bureaucratic Records

The future archives of the U.S. government will undoubtedly be one of the largest and most complicated digital collections ever. “We operate on the premise that somewhere in the government they are using every software program that has ever been sold, and some that were never sold because they were developed for the government. The scope of the problem is … open-ended, because the formats keep changing.”93 Numbers that suggest the scale and complexity are approximately 40 million e-mail messages from the Clinton White House, approximately 600 million TIFF files from the 2000 census, and up to a million pages in a single patent application that might include 3D protein molecule models or aircraft CAD drawings.

91 Williams 2002, Preserving TV and Broadcast Archives, http://www.dpconline.org/graphics/events/presentations/pdf/DPCJune5th.pdf.

92 See the Bernstein collection at http://memory.loc.gov/ammem/lbhtml/lbhome.html.

93 Ken Thibodeau, director of the NARA electronic records program, quoted in Talbot 2005, The Fading Memory of the State.

Some business records will eventually be used by academic historians, in ways suggested by current use of sixteenth century galleon bills of lading held in Sevilla’s Archivo General de Indias.94

In the public sector, the visible efforts toward preservation of “born-digital stuff” are focused on cultural content, on scientific data,95 and on records of national significance. The literature makes few allusions to smaller political units, to educational priorities other than those of research scholars, to judicial systems, to health delivery systems, or to administrative collections of interest to ordinary citizens. (Among political issues, which include international terrorism, global warming, hunger and illness in Africa, and world trade rivalries, it would be naïve to expect most taxpayers to know or care much about their personal risks associated with disappearing documents.)

For about thirty years, some physicians have dreamed of the “longitudinal patient record”: a medical history accompanying each individual from birth to grave. Since the useful lifetime of uncurated digital records is much less than human lifetimes, preservation technology would be needed to fulfill this dream. (However, digital preservation is not the biggest challenge to realization of lifelong health records. Patient privacy, information standards, and medical system infrastructure are more challenging.)

A personal letter from a schoolmate illustrates other needs:

Speaking of the [Immigration and Naturalization Service], we are trying to see if [my son] qualifies for [U.S.] citizenship on the basis of the fact that I did the border shuffle [between Canada and the U.S.] for most of my natural life. Now it is a question of proving I exist, it seems.

[I am] trying to unearth papers to prove to the lawyers that I actually spent about half my time on either side of the border from birth until I married! Did you know that anyone who attended high school in the 1950s is clearly so far back in the Dark Ages as to be almost a non-person? Welcome to the real world. The schools in [city], where I attended the first three grades, tell me they have no records of any students born between 1931 and 1942; so much for that!

The school board in [city] says [XYZ] school no longer exists. “If we had records, they would have been forwarded to the school you went to.” The school fortunately had registered me as having come in from [XYZ] school, but kept no transcripts … And we haven't even been bombed or anything.

No wonder half of people who lose their papers die of despair. Bureaucracy is immovable! Yet, a front page story about a restaurant on my block starts out with how the executive chef came to this country as an illegal immigrant from Mexico!

94 Archivo General de Indias, http://en.wikipedia.org/wiki/Archivo_General_de_Indias.

95 Examples include the 2-Micron All Sky Survey (10 terabytes of data, five million images) and the NSF Digital Library (preservation of curricula modules). The projects are driven by the research communities that use the data.

What might preservation priorities be if the public understood its risks?

2.7.4 Scientific Data

Awareness of the size and complexity of potential repositories for the digital records of science has increased greatly in recent years. Over the next ten years science projects will produce more data than have been collected in all human history. Some European research organizations are already each generating approximately 1,000 gigabytes of data annually. Approximately 15,000 academic periodicals exist; many of these are moving toward electronic versions. These circumstances are stimulating community efforts that are likely to replace fragmentary projects by individual research teams and institutional repositories, including the creation of a European Task Force to drive things forward quickly.96 Potential economies of scale constitute a key incentive for creating a European infrastructure for permanent access.97