
KDD as Information Foraging

From the document Knowledge Management to Wisdom (pages 164-168)

In this section, we have chosen to apply Pirolli and Card’s (1999) information foraging theory (IFT) to assist in the development of a descriptive conceptual model of the KDD process. IFT seeks to understand how strategies and technologies for information seeking, gathering, and consumption are adapted to the flux of information within the environment. An example of information foraging is the use of a search engine to locate relevant information on the Internet. IFT provides rather compelling evidence that inquiring organizations should not overlook the central role of the analyst in KDD. We contend that the process an analyst follows in KDD is, in a number of ways, analogous to the concept of information foraging.

Pirolli and Card argue that people will modify their information search strategies, or the structure of the task environment, to maximize their rate of gaining valuable information.

Improved information foraging returns translate into a greater yield of relevant information per unit of effort needed to obtain it. IFT maintains that cognitive strategies that result in greater yields without additional effort will, over time, replace cognitive strategies limited by lesser yields. Arguably, from an evolutionary perspective, this choice of behavior is optimal as it maximizes the chances of survival within an environment where survival is contingent upon an organism’s ability to balance the cost of information search with expected returns.
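The idea of yield per unit of effort can be illustrated with a minimal sketch. The strategy names, item counts, and effort figures below are hypothetical values chosen for illustration; they do not come from Pirolli and Card.

```python
def yield_per_effort(relevant_items: float, effort: float) -> float:
    """Relevant information obtained per unit of effort spent obtaining it."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return relevant_items / effort


# Hypothetical strategies: (relevant items found, effort spent in hours).
strategies = {
    "keyword_search": (12, 4.0),
    "manual_browsing": (12, 10.0),
}

# Both strategies find the same amount of information, but one needs
# less effort, so it has the higher foraging return.
best = max(strategies, key=lambda name: yield_per_effort(*strategies[name]))
print(best)  # keyword_search
```

Under these assumed numbers, the strategy with the same yield at lower effort is the one IFT would expect to displace the other over time.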

The roots of IFT lie in Optimal Foraging Theory (Stephens & Krebs, 1986). This theory accounts for how organisms adapt their behavior and biological structure to environmental changes in the context of searching for food. Important to this theory is the notion that food environments are often patchy: food is distributed throughout an environment in clusters or patches, with some patches being richer food sources than others. In this kind of environment, the organism must strategize about how much of its resources should be devoted to harvesting food within a patch (within-patch activities), in contrast to seeking out new patches (between-patch activities). Within-patch foraging will continue as long as the organism perceives the effort needed to harvest the food supply in the patch to be less than the energy needed to seek out a new patch. When the energy required to continue to extract food from the current patch is perceived to be more than that required to move to (and begin harvesting) a new patch, the organism’s best strategy is to move to a different patch and begin foraging.
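The patch-leaving rule described above can be sketched in a few lines. This is a simplified illustration, not a model taken from Stephens and Krebs: the function name, the per-item harvest costs, and the single fixed cost of moving are all assumptions.

```python
def items_before_leaving(harvest_costs, move_cost):
    """Count the items harvested before the next one costs more than moving on.

    harvest_costs: perceived effort for each successive item in the current
        patch (assumed to rise as the patch is depleted).
    move_cost: perceived effort to reach a new patch and begin harvesting.
    """
    harvested = 0
    for cost in harvest_costs:
        if cost > move_cost:
            break  # staying is now costlier than moving: leave the patch
        harvested += 1
    return harvested


# Hypothetical patch: each further item is harder to extract than the last.
costs = [1.0, 1.5, 2.5, 4.0, 7.0]
print(items_before_leaving(costs, move_cost=3.0))  # 3
```

With these assumed costs the forager takes three items and then moves, because the fourth item would cost more than relocating.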

Optimal Foraging Theory maintains that species that develop superior foraging strategies will have an evolutionary advantage over those with inferior strategies. Such species will be able to acquire an adequate food supply more efficiently (with less energy expenditure) than competitors for the same food sources, thereby better positioning themselves for survival in tough times or harsh environments. Survivors are likely to have better cognitive strategies regarding within-patch and between-patch activities than nonsurvivors.

Optimal Foraging Theory is most relevant to situations where many types of foods are available, but these foods vary with respect to profitability (i.e., the food’s nutritional value minus the resources needed by the organism to obtain and handle it). In general, an organism will make decisions about what kinds of foods to pursue and do so by considering which foods are most profitable in the current environment.
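Profitability as defined here (value minus the resources needed to obtain and handle the item) lends itself to a short worked example. The food names and numbers are hypothetical, and the `Food` class is introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Food:
    name: str
    value: float  # nutritional value gained
    cost: float   # resources spent obtaining and handling the food

    @property
    def profitability(self) -> float:
        # Definition used in the text: value minus cost.
        return self.value - self.cost


def rank_by_profitability(foods):
    """Return foods ordered from most to least profitable."""
    return sorted(foods, key=lambda f: f.profitability, reverse=True)


# Hypothetical food types available in the current environment.
foods = [
    Food("nuts", value=8.0, cost=5.0),     # profitability 3.0
    Food("berries", value=5.0, cost=1.0),  # profitability 4.0
    Food("roots", value=6.0, cost=6.0),    # profitability 0.0
]
print([f.name for f in rank_by_profitability(foods)])
# ['berries', 'nuts', 'roots']
```

Note that the most nutritious item (nuts) is not the most profitable one once handling cost is subtracted, which is the distinction the definition is meant to capture.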

It is important to note that some organisms leverage technologies in their foraging activities, which may contribute to an improved ability to harvest food more efficiently within a particular patch. The use of such tools in foraging may contribute to the survival potential of a species relative to other species competing for the same food sources (Pianka, 1997). Sea otters living off the coast of northern California and in other habitats, for example, commonly manipulate rocks in order to facilitate access to shellfish. In addition, tool use by brown capuchins (New World monkeys) in food foraging is hypothesized to provide this primate species with an evolutionary advantage in its habitats (Boinski, Quatrone & Swartz, 2000). Tool use also suggests that some species may possess more cognitively complex foraging strategies than others.

Pirolli and Card (1999) noted multiple parallels between food foraging behavior and information foraging on the Internet. For example, like many food environments, information on the Web appears in a patchy environment: valuable information is unevenly distributed throughout Web-based locations, and the information seeker must decide whether to put forth resources to harvest the known information in one location or risk obtaining an unknown amount of information by following the perceived scent of potential information in other locations. Furthermore, like many food environments, information on the Internet varies enormously in its profitability. Some information is highly relevant to the seeker’s goal, whereas other information is largely irrelevant.

Pirolli and Card created an artificial information forager that was designed to seek out particular documents on the Internet. This forager, which was inspired by Anderson’s (1993) ACT-R model of human cognition, was endowed with mental characteristics that have been argued to describe the way humans solve problems and retrieve and use information from memory. The researchers had previously tracked the minimum number of steps (links on the Internet) needed to obtain the target documents, and their artificial forager was found to take a similar (and highly efficient) route. For comparison, the researchers had human volunteers search for the same documents and found human performance to be highly similar to that of the artificial forager, lending support to the notion that human information foraging behavior typically strives to optimize information yield. Such behavior is less likely to be consistent with the notion of satisficing.

The parallels between food foraging and information foraging can also be extended to KDD. Given that a data analyst using OLAP is faced constantly with the decision as to what data will be viewed (e.g., the choice of navigation path through the data set), and how long the current data set will be viewed for (i.e., the choice whether to move to a different data view immediately, or at a later time), there are arguably a number of parallels between OLAP use and information foraging. Moreover, KDD usually takes place in a patchy environment, where an analyst must make decisions pertaining to where to engage deeper drilling efforts (within-patch searching), as opposed to looking elsewhere in the database for other potentially useful subsets of information (between-patch searching). Furthermore, like many food environments, the information of interest in KDD varies enormously in its profitability. That is, information contained within database subsets varies in relevance to the analyst’s overall question/task and in the amount of effort required by the analyst to obtain and understand that information. In addition, KDD tools (analysis, visualization, etc.) are used by analysts to determine whether a particular database subset (data patch) is a fruitful source of new information about the target population or if it should be abandoned in favor of another subset with greater information yield potential. Effective tool use (and well-designed tools) can better position an analyst and employer for KDD success, thereby enhancing the organization’s survival potential in today’s brutally competitive markets.

Using IFT as a descriptive conceptual model from which to understand KDD is limited, in part, by the scant research directed at investigating the rationality and behaviors of information foraging. These matters remain very much open research questions.

Notwithstanding the immaturity of research into information foraging, we contend that IFT is still useful as a basis for making predictions about expected KDD behavior, identifying factors analysts may be sensitive to during the KDD process, and for expressing hypothesized tests of rationality. For example, taking into consideration IFT, we would expect the following behavior to be exhibited by analysts during the KDD process:

• Analysts will strive to obtain the most useful (i.e., profitable) information possible for the current task;

• Analysts will perceive the most profitable information as information that overlaps highly with what is desired but involves minimal costs in obtaining and understanding it;

• Analysts will often modify their strategies (i.e., change what they are doing) in order to efficiently obtain profitable information; they will also react to changing perceptions in profitability; and

• Analysts’ search strategies that result in more profitable information will, over time, be implemented more often than strategies that result in less profitable information yields.
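The last prediction, that more profitable strategies come to be used more often over time, can be illustrated with a toy update rule. The rule below (usage shares reweighted in proportion to yield, then renormalized) is an assumption made for illustration and is not a mechanism specified by IFT; the strategy names and yields are hypothetical.

```python
def update_usage(shares, yields, rounds):
    """Shift usage toward higher-yield strategies over a number of rounds.

    shares: strategy name -> current fraction of use (fractions sum to 1).
    yields: strategy name -> information gained per unit of effort
        (assumed positive for at least one strategy in use).
    """
    for _ in range(rounds):
        weighted = {name: shares[name] * yields[name] for name in shares}
        total = sum(weighted.values())
        shares = {name: weight / total for name, weight in weighted.items()}
    return shares


# Hypothetical analyst who starts out using two strategies equally often.
start = {"drill_down": 0.5, "broad_browse": 0.5}
yields = {"drill_down": 2.0, "broad_browse": 1.0}

after = update_usage(start, yields, rounds=3)
print(round(after["drill_down"], 3))  # 0.889
```

Under this assumed rule, a strategy with twice the yield grows from half of the analyst's activity to roughly 89 percent after three rounds.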

Other predictions of analyst behavior could address: data set navigation strategies, hypothesis development, hypothesis testing, the effect of learning and expertise, and the rationality of causal reasoning.

As noted in our description of the KDD process, the analyst often searches for information that meets a priori defined criteria. Pirolli and Card (1999) would label this search pattern a diet model, where information is sought that meets predefined criteria.

However, there are other types of search strategies that may also accurately depict KDD analysts’ behavior. For example, scent models search for information on the basis of obvious proximal cues (e.g., keywords) or statistical correlations. Alternatively, patch models are driven by the search for pre-existing clusters of information; visualization tools might assist analysts that employ patch models. It is possible that different search strategies are closely tied to specific forms of KDD (data mining). Further exploration of the extent to which alternative information foraging models best depict the KDD process may provide valuable insights into knowledge creation processes in inquiring organizations.
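A scent-based search driven by keyword cues can be sketched as follows. The scoring function is a deliberately simple keyword-overlap measure assumed for illustration; it is not the scent computation used by Pirolli and Card, and the goal terms and data view labels are hypothetical.

```python
def scent(goal_terms, cue_text):
    """Fraction of goal terms that appear among the words of a proximal cue."""
    goal = {term.lower() for term in goal_terms}
    if not goal:
        return 0.0
    cue_words = set(cue_text.lower().split())
    return len(goal & cue_words) / len(goal)


def follow_strongest_scent(goal_terms, cues):
    """Pick the cue (e.g., a data view label) with the highest scent score."""
    return max(cues, key=lambda cue: scent(goal_terms, cue))


# Hypothetical labels of data views available to an analyst.
views = [
    "regional sales by quarter",
    "employee parking permits",
    "customer churn by region",
]
print(follow_strongest_scent(["sales", "quarter"], views))
# regional sales by quarter
```

A diet model, by contrast, would filter views against fixed a priori criteria rather than rank them by cue strength.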

As animals do in their search for food, the analyst will hold critical assumptions that undoubtedly influence the way the task is perceived and executed. Decision assumptions comprise the analyst’s definition of the task at hand; currency assumptions pertain to ideas about the value of specific types of information; and constraint assumptions involve known limitations of the task itself, the technology used to execute the task, or of the user. Like decision makers’ mental models that play a central role in the new decision-making paradigm for DSS described by Courtney (2001), these assumptions play a critical role in IFT. Investigating how they pertain to KDD activities should enhance our understanding of how analysts obtain, interact with, and interpret data during the KDD process.

The information foraging approach is also quite useful because it embraces aspects of the KDD task that are problematic to the basic KDD model. In particular, most KDD analysts start with some rudimentary or skeletal hypothesis, but that hypothesis is quickly modified as extracted data sets are examined and analyzed. As the analyst interacts or becomes familiar with the data, specific ideas about potential outcomes become more refined (Brachman & Anand, 1996). In addition, the process is dialectic, with the analyst asking different questions while becoming familiar with the client’s needs, products, and company structure, and while interacting with task-relevant data. IFT provides insight into fundamental evolutionary forces that may be driving such inquiry and knowledge creation processes.

There are limitations of this approach to understanding KDD as well. Most notably, the information foraging approach may be viewed as being overly focused on the individual. It generally fails to take into account the role of communication and cooperation with others that could assist the data analyst in completing the KDD task (Mantovani, 2001). Such communication/cooperation could be especially valuable to the analyst during data analysis and interpretation (e.g., Simoudis, Livezey & Kerber, 1996). Today’s organizational ecology enables individual information foragers to tap into online communities to obtain insights and to identify strategies beyond their own immediate capabilities. As such, it represents an important departure from foraging strategies developed by nonhuman animals. Also, whereas much of the previous information foraging research has focused on the productive search for information, KDD must be viewed as a process that encompasses much more than the mere search for information. The heart of KDD rests in the examination, consideration, and interpretation of the information that is obtained. As mentioned earlier, the analyst is not just a data miner but is more similar to a data archeologist who classifies and interprets this information (Brachman & Anand, 1996) or to a member of an inquiring organization that is actively engaged in knowledge creation processes.

Potential Research Avenues Blending
