• Tidak ada hasil yang ditemukan

Knowledge Representation

Dalam dokumen Knowledge Management - SILO.PUB (Halaman 130-148)

representation The action or fact of exhibiting in some visible image or form.

1483 CAXTON Cato Aiijb, Thymages of sayntes . . . gyue us memorye and make representation of the sayntes that ben in heuen.

In ThIs ChapTeR

Key ConCepts

Cognition, Artificial Intelligence, and Design Representation and Cognition

Representation and Artificial Intelligence Representation in Knowledge Bases

Representation and the World Wide Web: The Semantic Web

Key papers

Vannevar Bush, “As We May Think”

William A. Woods, “What’s Important about Knowledge Representation?”

Terrence A. Brooks, “Where Is Meaning When Form Is Gone? Knowledge Representa- tion on the Web”

CognITIon, aRTIfICIal InTellIgenCe, and desIgn

Knowledge representation is a concept primarily of concern in three areas: (1) cogni­

tive science, (2) artificial intelligence, and (3) design for the World Wide Web. Although 123

there are meaningful differences in the ways in which representation is understood in these three domains, there are also important similarities.

RepResenTaTIon and CognITIon

In cognitive science, knowledge representation has primarily to do with the ways in which people form and use mental representations of knowledge. The concept and nature of mental representation have been explored from many points of view, resulting in many approaches to and models for understanding mental representation of know­

ledge. There is no one model agreed upon by psychologists and cognitive scientists.

Markman, in his extensive exploration of approaches to representing knowledge, con­

tended that “mental representation is a critical part of psychological explanation” and identified four fundamental characteristics or components of a successful definition of mental representation:

1. A represented world that is described or explained through representation. The repre­

sented world presumably has some internal integrity, some sense of reality, that makes it an appropriate subject for representation. A key element of a mental or cognitive representation is that it exists in the mind of a human. The represented world may exist outside the mind or inside the mind, or it may be characterized by some combination of external and internal. The represented world may itself be a set of representations, as would be the case when the represented world is a collection of works of art or a library of publications.

2. A representing world that consists of the content of a system or set of representations.

There are many possible approaches to representing any domain, including, but not limited to, images, sounds, words, and numbers. Some representations, such as images or sounds, are analogs of the representing world. Others, such as letters or numbers, are symbols that have no inherent relationship to the representing world. The expres­

siveness of a representation is in part a function of how the representing world reflects the represented world. The process of representation inevitably results in the loss of some information about the represented world—it is never possible for the representing world to completely encapsulate the represented world.

3. Representing rules that determine how representations can be used to connect the rep­

resenting world and the represented world. The link between the represented world and the representing world is achieved primarily through the application of such rules. Just as there are many different possible approaches to defining the nature of the represent­

ing world, there are many possible rule sets and rule structures that can be used to link the representing world to the represented world. The meaning or interpretation of a representation is as much a function of the nature of the representing rules as of the nature of the represented world or the nature of the representing world.

4. A process that uses the representation to accomplish some goal or achieve some purpose.

Knowledge representation doesn’t occur in a vacuum or at random. A representation is formulated for some reason; the reason defines the process, which may to some extent define the rules and influence the nature of the representing world, thereby determining the extent to which the represented world is properly represented.1

It is the combination of these components that defines the knowledge representation, allows for its use, and defines its usefulness.

According to Markman, there are three essential ways in which representational for­

mats differ: (1) “the duration of representational states,” (2) “the presence of discrete symbols,” and (3) “the abstractness of representations.”2 Markman examines six catego­

ries of knowledge representation addressed in the literature of cognition: (1) spatial rep­

resentations, (2) featural representations, (3) associative representations, (4) structural representations, (5) conceptual relationships, and (6) scripts and schemata.

Cognitive science deals with the roles of meaning and interpretation in knowledge representation primarily by ignoring them. Elements of epistemology and semantics and the problems of variability in interpreting meaning are “avoided . . . by assuming that the cognitive system is a computational device.”3 Issues of meaning are basically assigned to philosophers. This is not to imply that meaning is unimportant. The relationship of knowledge representation to meaning is central to the arguments of Polanyi and Popper regarding the roles of perception and description in knowledge representation.

RepResenTaTIon and aRTIfICIal InTellIgenCe

From an artificial intelligence perspective, knowledge representation is closely linked to reasoning. According to Brachman and Levesque, artificial intelligence is “the stud of intelligent behavior achieved through computational means,” and the combination of knowledge representation and reasoning “is the study of thinking as a computational process.”4 Drawing from epistemology, artificial intelligence defines knowledge in terms of propositions that can be either true or false, propositional attitudes that define “rela­

tionships between agents and propositions,” and beliefs.5 Overall, the specific nature of knowledge is of limited interest in artificial intelligence.

As is the case with cognitive science, artificial intelligence defines knowledge repre­

sentation as being fundamentally about “a relationship between two domains, where the first is meant to ‘stand for’ or take the place of the second.”6 Brachman and Levesque used the term “representor” as an equivalent of Markman’s “representing world.” Prop­

ositions are stated in terms of symbols, which are the representors of central importance to artificial intelligence. Within this context, “knowledge representation . . . is the field of study concerned with using formal symbols to represent a collection of propositions believed by some putative agent.”7

Smith presented the nature and importance of knowledge management for artificial intelligence in terms of a “knowledge representation hypothesis”:

Any mechanically embodied intelligent process will be comprised of structural ingredients that a) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and b) independent of such external semantic attribution, play a formal but causal and essential role in engendering the behavior that manifests the knowledge.8

Smith’s hypothesis echoes cognitive science’s avoidance of meaning, implying that derivation or interpretation of meaning is not an essential component in a “mechani­

cally embodied intelligent process.”

Knowledge representation is essential to developing knowledge­based systems, which is perhaps the most significant area of work in the field of artificial intelligence. Although

the most obvious kind of knowledge­based system is the expert system, the principles of knowledge­based systems are also found in areas such as game playing, diagnostic systems, machine learning environments, robotics, and language analysis.

RepResenTaTIon In Knowledge Bases

A fundamental requirement for any knowledge­based system is the encapsulation of knowledge representations in a knowledge base. A knowledge base includes “diverse kinds of knowledge. These include, but are not limited to, knowledge about objects, knowledge about processes, and hard­to­represent commonsense knowledge about goals, motivation, causality, time, actions, etc.”9 A knowledge base differs from a data­

base in that the knowledge base includes context as well as content. A knowledge­based system typically employs a much more complex logical model than a database system—

rather than relying on the binary logic of Boolean operators, a knowledge­based system makes use of some sort of propositional logic that takes into account the propositional role of belief. A knowledge­based system is an attempt to create an environment in which action depends “on what the system believes about the world, as opposed to just what the system has explicitly represented.10 In other words, a knowledge­based system is designed to achieve or at least emulate reasoning.

Representing knowledge in a knowledge base requires generation of a specific envi­

ronment that has four basic characteristics:

1. A domain of interest that is at least minimally understandable and can be described in terms of entities and relationships among entities

2. A vocabulary that can be used to describe the domain

3. A syntax that provides structure for application of the vocabulary to description of the domain and makes it possible to identify and manipulate relationships

4. A system for applying pragmatic meaning to the entities and relationships described by the vocabulary and syntax11

RepResenTaTIon and The woRld wIde weB:

The semanTIC weB

The World Wide Web Consortium has defined the World Wide Web “as a freeform, decentralised knowledge representation system.”12 Berners­Lee, Hendler, and Lassila, however, noted that “knowledge representation . . . is currently in a state comparable to that of hypertext before the advent of the Web: it is clearly a good idea, and some very nice demonstrations exist, but it has not yet changed the world.”13 They described the current state of the World Wide Web as one in which “content is designed for humans to read, not for computers to manipulate meaningfully.”14 They proposed a new structure for Web content known as the Semantic Web, which “will bring struc­

ture to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.”15 Berners­Lee, Hendler, and Lassila essentially rejected the potential of artificial intel­

ligence and knowledge­based systems for providing semantic access to the Web,

contending that it should be possible to provide access through relatively simple markup processes requiring no particular technical expertise.

Van Harmelen and Fensel surveyed the potential for applying existing and emerging Web technologies to semantic processing. They identified four fundamental problems related to the large bodies of semi­structured information that characterize the World Wide Web:

Searching information: Existing keyword­based search retrieves irrelevant information that uses a certain word in a different meaning or it may miss information where differ­

ent words about the desired content are used.

Extracting information: Currently human browsing and reading is required to extract relevant information from information sources since automatic agents miss all common sense knowledge required to extract such information from textual representations, and they fail to integrate information spread over different sources.

Maintaining weakly structured text sources [is a] difficult and time consuming activity when such sources become large. Keeping such collections consistent, correct, and up­

to­date requires mechanized representation of semantics and constraints that help to detect anomalies.

Automatic document generation: . . . Generating . . . semi­structured information pre­

sentations from semi­structured data requires a machine­accessible representation of the semantics of these information sources.16

Van Harmelen and Fensel described two categories of solutions to these problems:

(1) a declarative process in which semantic markup is employed to enrich Web content and (2) a procedural process in which the semantic content of Web pages is automati­

cally extracted. The Semantic Web, as envisioned by Berners­Lee, Hendler, and Lassila, is based on the principles of the first category and relies primarily on the Extensible Markup Language (XML) and the Resource Description Framework (RDF) to add semantic content to the markup of Web pages. Van Harmelen and Fensel’s analysis, however, found that XML and RDF did not provide significant advantages over HTML with regard to semantic markup; they were particularly critical of RDF’s poor repre­

sentation of “basic lessons in language design.”17 None of the conventional approaches examined by Van Harmelen and Fensel supported any meaningful approach to inferen­

tial processing. They concluded that further artificial intelligence research is necessary to build a true semantic approach to using Web content.

Horrocks and Patel­Schneider were also critical of RDF as an approach to implement­

ing the Semantic Web, writing that “more expressive power is clearly both necessary and desirable in order to describe resources in sufficient detail.”18 They suggested that a better solution could be found in the first­order logic approach of knowledge­based systems. McCool stated that “Because it’s a complex format and requires users to sacri­

fice expressivity and pay enormous cost in translation and maintenance, the Semantic Web will never achieve widespread public adoption.”19 Singh’s analysis suggested that

“the best hope for the semantic web is to encourage the emergence of communities of interest and practice that develop their own consensus knowledge on the basis of which they will standardize their representations.”20 Schoop, De Moor, and Dietz presented a “manifesto” for the Pragmatic Web, a model that recognizes the problems of apply­

ing the tools of the Semantic Web in a universal manner and proposes an alternative

model in which the basic principles of the Semantic Web are applied within specific contexts.21

Vannevar Bush, “As We May Think,” Atlantic Monthly, July 1945, 101–8.

Vannevar Bush (1890–1974) was educated at Tufts University and in 1916 earned a joint PhD in engineering from Harvard University and the Massachusetts Institute of Technology. During World War I he taught at Tufts and independently developed a submarine detection device. Following the war, he joined the engineering faculty at MIT, where he remained for 20 years. During that time he worked to improve vacuum tubes, becoming a cofounder of the Raytheon corporation in 1922, and was the primary force behind the development of the Differential Analyzer, a protocomputer. In 1939 he became president of the Carnegie Institution, and during World War II he served as an advisor to President Roosevelt. After his retirement from the Carnegie Institu­

tion, he served as president of the Merck Corporation. His works include the 1945 presidential report “Science: The Endless Frontier,” Scientists Face the World of 1942 (1842), Modern Arms and Free Men (1949), and Endless Horizons (1975). Although “As We May Think” has only generated 750 citation entries in the Social Sciences Citation Index and Science Citation Index, the core concepts introduced in the work have fun­

damentally affected modern life. The editor of the Atlantic Monthly equated the impor­

tance of “As We May Think” to that of Emerson’s “The American Scholar,” describing Bush’s essay as calling for “a new relationship between thinking man and the sum of our knowledge.”22

managIng The InfoRmaTIon explosIon

Writing at the conclusion of World War II, Bush was primarily concerned with the vast amounts of scientific information generated by the infusion of scientists into the war effort. He noted the advancing ability to communicate and preserve information but was concerned that “methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose.”23 Furthermore, Bush contends that the problems presented by the “growing mountain of research . . . [seem] to be, not so much that we publish unduly in view of the extent and variety of present­day interests, but rather that publication has been extended beyond our pres­

ent ability to make real use of the record.”24 Citing the failure of historically important devices such as the Leibniz calculator and Babbage’s Arithmetic Engine, Bush proposes that the solution lies in developing new technological approaches to managing recorded information.

In some ways the technological solutions that Bush explores seem quaint from the perspective of the 60 years that have passed since the publication of “As We May Think.” He focuses on photography, mentioning dry photography, the core technology of present­day photocopiers, laser printers, and faxes, but concentrating on micropho­

tography. He also mentions the potential for voice­input typewriters and predicts the computer revolution. Ultimately, Bush forecasts a future in which scientists will con­

centrate on higher­level reasoning and analytical processes and will entrust calculation and similar repetitive tasks to machines.

Bush anticipates that the future of scientific and business machines will require reex­

amination of the logic of data manipulation. “A new symbolism, probably positional, must apparently precede the reduction of mathematical transformations to machine processes.”25 Although Bush doesn’t directly predict the binary logic that underlies digi­

tal media, he is confident that “we may some day click off arguments on a machine with the same assurance that we now enter sales on a cash register.”26

ReThInKIng InfoRmaTIon ReTRIeval

Although Bush is abundantly confident that machine­based approaches to informa­

tion storage and manipulation will emerge and make it possible to “enormously extend the record,” he also suggests that being able to do so not only does not fully solve the problem but in fact exacerbates it.27 “This is a much larger matter than merely the extraction of data for the purposes of scientific research; it involves the entire process by which man profits by his inheritance of acquired knowledge. The prime action of use is selection, and here we are halting indeed.”28 Bush is critical of the “artificiality” of traditional systems in which data “are filed alphabetically or numerically, and informa­

tion is found (when it is) by tracing it down from subclass to subclass.”29 Such systems are inherently inefficient in that an item “can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it; and the rule are cumbersome.

Having found one item, moreover, one has to emerge from the system and re­enter on a new path.”30

To Bush, the structure of traditional retrieval systems is at odds with the basic func­

tion of the human mind, which, he asserts, “operates by association.”31 Although Bush recognizes that it is probably not possible to replicate human mental information pro­

cessing, he asserts that it should be possible to learn from the associative model. “Selec­

tion by association, rather than by indexing, may yet be mechanized.”32 This notion of associative indexing is the foundation for a seemingly fanciful individualized infor­

mation system, “a sort of mechanized private file and library,” which Bush terms the memex.33 Although Bush suggests that the term “memex” was chosen “at random,” it is clear that he put substantial thought into the choice of the term.

The memex

“A memex is a device in which an individual stores all his books, records, and com­

munications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”34 The memex is a desktop information environment that includes a display screen for viewing informa­

tion, a keyboard and “sets of buttons and levers” for input and control, and a compo­

nent “devoted to storage” that is sufficient in capacity that the user “can be profligate and enter material freely.”35 Books, journal articles, news, pictures, and other sources of information can be purchased and stored in the memex. Alternatively, the user has the ability to create new sources of information either by using the keyboard or by scan­

ning materials not available for purchase. The scanner is capable of handling any kind of graphical information. Business correspondence and other forms of communication

Dalam dokumen Knowledge Management - SILO.PUB (Halaman 130-148)

Dokumen terkait