encourage its proliferation throughout the enterprise. Knowledge-management tools enable the collection, coordination, and distribution of information and knowledge so that team members can collaborate effectively in pursuit of a common goal. Although some analysts dismiss the term knowledge management as hyperbole, the underpinning technologies of workflow systems, conferencing tools, and electronic meeting support systems, harnessed by corporate intranet and extranet systems, have already taken root within many organizations. Only through the creation of networks of knowledge, both within and between companies, can organizations remove barriers of distance and time between distributed groups, increase quality and productivity, and facilitate competitiveness within the expanding global marketplace.
The major KM tools and techniques can be thought of in different ways, and it is important at all times not to lose sight of either their impacts or their technological capabilities. It is useful to think of the major KM tools and techniques in terms of their social and community role in the organization, be it in (1) the facilitation of knowledge sharing and socialization of knowledge (production of organizational knowledge); (2) the conversion of information into knowledge through easy access and opportunities for internalization and learning (supported by the right work environment and culture); or (3) the conversion of tacit knowledge into “explicit knowledge” or information, for purposes of efficient and systematic storage, retrieval, wider sharing, and application. Doing so brings to the forefront their role in specific aspects and components within the knowledge management layer of the generic KM architecture in Figure 2. Another useful approach is to group KM tools and techniques into those that capture and codify knowledge and those that share and distribute knowledge. Such a conceptualization tends to emphasize their underlying technical capabilities.
KM Tools and Technologies that Capture and Codify Knowledge
There are various tools that can be used to capture and codify knowledge. These include databases and various types of artificial intelligence systems, including expert systems, neural networks, fuzzy logic, genetic algorithms, and intelligent or software agents.
Databases
Databases store structured information and assist in the storing and sharing of knowledge. Knowledge can be acquired from the relationships that exist among different tables in a database. For example, the relationship that might exist between a customer table and a product table could show those products that are producing adequate margins, providing decision-makers with strategic marketing knowledge. Many different relations can exist, limited only by the human imagination. These relational databases help users to make informed, reliable decisions, which is a goal of knowledge management. Discrete, structured information is still managed best by a database management system. However, the quest for a universal user interface has led to the requirement for access to existing database information through a Web browser.
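As a minimal sketch of how a relationship between tables can yield this kind of knowledge, the following Python example uses the standard sqlite3 module with an in-memory database. The table names, columns, sample rows, and the 20% margin threshold are all hypothetical, and an orders table is assumed as the link between customers and products.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, cost REAL);
    CREATE TABLE orders   (customer_id INTEGER REFERENCES customer(id),
                           product_id  INTEGER REFERENCES product(id),
                           quantity    INTEGER);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO product  VALUES (1, 'Widget', 10.0, 6.0),
                                (2, 'Gadget', 20.0, 19.0);
    INSERT INTO orders   VALUES (1, 1, 5), (2, 1, 3), (2, 2, 4);
""")


def products_with_adequate_margin(connection, min_margin):
    """Return (product, customers, units sold, margin) for products whose
    relative margin, (price - cost) / price, is at least min_margin."""
    query = """
        SELECT p.name,
               COUNT(DISTINCT c.id),
               SUM(o.quantity),
               (p.price - p.cost) / p.price
        FROM product AS p
        JOIN orders   AS o ON o.product_id = p.id
        JOIN customer AS c ON c.id = o.customer_id
        WHERE (p.price - p.cost) / p.price >= ?
        GROUP BY p.id, p.name, p.price, p.cost
    """
    return connection.execute(query, (min_margin,)).fetchall()


rows = products_with_adequate_margin(conn, 0.20)
```

The join across the three tables is what turns separate facts (who bought what, and what each product costs) into a single answer a decision-maker can act on.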
Data Mining Techniques
The exponential increase in information, primarily due to the electronic capture of data and its storage in vast data warehouses, has created a demand for analyzing the large amount of data generated by today’s organizations so that enterprises can respond quickly to fast-changing markets. These applications involve not only the analysis of the data but also sophisticated tools for that analysis. Knowledge discovery technologies are the new technologies that help to analyze data, find relationships in the data, and find the reasons behind observable patterns. Such new discoveries can have a profound impact on the design of business strategies. Thus, data mining techniques, together with the newer techniques of business intelligence and business analytics (which basically combine the major data mining techniques with the key business objectives, drivers, and outcomes critical to the generation of knowledge from data assets), dominate the technical tools for deriving knowledge from an organization’s data assets. Four key data mining techniques are (Fadlalla & Wickramasinghe, 2005):
1. Decision tree;
2. Clustering;
3. Neural networks; and
4. Association rule.
While we acknowledge there are numerous data mining techniques, we focus on these four since they are among the major techniques currently used in most data mining initiatives; clustering and association rules are used for exploratory data mining, while decision trees and neural networks are used for predictive data mining (Figure 5).
We will first outline the major steps involved in data mining in order to achieve the final goal of knowledge creation, before we describe each of the previous data mining techniques.
Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns from data (Fayyad, Piatetsky-Shapiro, Smyth, & Uthurusamy, 1996). Data mining algorithms are used on databases for model building, or for finding patterns in data. When these patterns are new, useful, and understandable, we say that this is knowledge discovery. How to manage such discovered knowledge and other organizational knowledge is the realm of knowledge management.
As discussed in Chapter III, data mining is a step in the broader context of the knowledge discovery process that transforms data into knowledge. Figure 6 shows the knowledge discovery process, the evolution of knowledge from data through information to knowledge (Fayyad et al., 1996), as well as the types of data mining (exploratory and predictive) and their interrelationships. Data issues that data mining helps us wrestle with include huge volumes of data, dynamic data, incomplete data, imprecise data, noisy data, missing attribute values, redundant data, and inconsistent data. Furthermore, data mining offers a wide variety of models to capture the characteristics of data and to help knowledge discovery, including summarization, clustering/segmentation, regression, classification, neural networks, rough sets, association analysis, sequence analysis, prediction, exploratory analysis, and visualization.
Figure 5. Major techniques of data mining
[Figure: data mining branches into exploratory techniques (association rules, clustering) and predictive techniques (neural networks, decision trees), which lead to knowledge.]
As can be seen in Figure 6, the data extracted from the large pool of data is not the final outcome intended. It typically contains a significant amount of erroneous information that should be excluded before the correct data is input to the data mining algorithms; thus, data goes through the following process steps before being used for any decision-making or before any of the previous techniques can be utilized effectively (Fadlalla & Wickramasinghe, 2005; Fayyad et al., 1996).
1. Selection: Selecting the data according to some criteria (e.g., all those people who have suffered a heart attack).
2. Preprocessing: This is the data cleansing stage, where unwanted information that may slow down queries is removed.
3. Transformation: The data is not merely transferred across but transformed, in that overlays may be added.
4. Data mining: This stage is concerned with the extraction of patterns from the data. It includes choosing a data mining algorithm that is appropriate for searching for a particular pattern in the data.
Figure 6. Data mining and the KDD process
[Figure: the steps in knowledge discovery (selection, preprocessing, transformation, data mining, interpretation/evaluation) carry data through target data, preprocessed data, transformed data, and patterns to knowledge. The figure also shows the knowledge evolution from data through information to knowledge, and the two types of data mining, exploratory and predictive.]
5. Interpretation and evaluation: The patterns identified by the system are interpreted into knowledge by removing redundant or irrelevant patterns, and translating the useful patterns into terms that can be understood by users.
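The five steps can be sketched as a chain of small Python functions. This is only an illustration under stated assumptions: the patient records, the field names, the selection criterion, and the toy mining step (a simple frequency count rather than a real data mining algorithm) are all hypothetical.

```python
from collections import Counter

# Hypothetical patient records; None marks a missing attribute value.
records = [
    {"id": 1, "condition": "heart attack", "age": 67, "smoker": True},
    {"id": 2, "condition": "heart attack", "age": None, "smoker": True},
    {"id": 3, "condition": "fracture", "age": 30, "smoker": False},
    {"id": 4, "condition": "heart attack", "age": 54, "smoker": False},
    {"id": 5, "condition": "heart attack", "age": 71, "smoker": True},
]


def select(data, condition):
    """1. Selection: keep the records that match a criterion."""
    return [r for r in data if r["condition"] == condition]


def preprocess(data):
    """2. Preprocessing: drop records with missing values."""
    return [r for r in data if all(v is not None for v in r.values())]


def transform(data):
    """3. Transformation: add a derived 'age_band' attribute (an overlay)."""
    return [
        {**r, "age_band": "65+" if r["age"] >= 65 else "under 65"}
        for r in data
    ]


def mine(data):
    """4. Data mining: a toy pattern search that counts how often each
    (age_band, smoker) combination occurs."""
    return Counter((r["age_band"], r["smoker"]) for r in data)


def interpret(patterns, min_count):
    """5. Interpretation and evaluation: keep the frequent patterns and
    phrase them in terms a user can read."""
    return [
        f"{count} patients aged {band} are "
        f"{'smokers' if smoker else 'non-smokers'}"
        for (band, smoker), count in patterns.items()
        if count >= min_count
    ]


target = select(records, "heart attack")
patterns = mine(transform(preprocess(target)))
findings = interpret(patterns, min_count=2)
```

Each function takes the output of the previous step, which mirrors the way target data, preprocessed data, transformed data, and patterns follow one another in Figure 6.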
Let us now examine more closely the four key data mining techniques.
Decision Tree (Fadlalla & Wickramasinghe, 2005; Fayyad et al., 1996)
In critical decision situations, mistakes are undesirable and costly. Thus, data mining techniques are used to find the most pertinent information and data to facilitate superior decision making. The decision tree technique achieves this by closing the gap between the facts and real understanding. A decision tree represents the knowledge or the available information in tree-like form, from which a course of action (for example, a method of treatment) is selected. The decision is usually made on the basis of the possible outcomes. Decision trees are built through recursive partitioning, which is splitting the data into partitions and subsequently splitting each partition further (refer to Figure 7). All the information is first used to determine the structure of the tree. The critical aspect in decision tree making is the location of the initial split. This split is binary: it divides the data into two partitions on a single field. The first splitter is chosen so that a single class predominates in each resulting partition; that is, it is the split that most reduces diversity.
Figure 7. Data mining resulting in the decision tree: each path from the tree root down represents a rule (i.e., a type of pattern) (Adapted from Fadlalla & Wickramasinghe, 2005)
[Figure: a tree whose root splits into Partition 1 and Partition 2, with Partition 1 split further into Partitions 1.1, 1.2, and 1.3.]
The diversity in a partition reflects the probability of a given outcome (for example, a certain symptom) occurring within that set. If there are just two possible outcomes, the situation is at its simplest, since the probability of one is one minus the probability of the other. Each partition produced by the first splitter is then split again, depending upon the choices available after the initial node. This continues until a partition can no longer be subdivided, at which point it is labeled as a leaf node. If there is just one decision to be made, such a node is removed. The best splitter is determined at each step until a leaf node is reached (Figure 7 depicts this). The second part of decision tree making is the pruning of excess and unnecessary nodes. Excess leaves make decision tree analysis less efficient. The pruning method allows the tree to grow deep and then finds ways to prune off the branches that fail to generalize. Pruning is important to ensure the highest quality outputs at all times.
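Recursive partitioning can be sketched in a few lines of Python. The sketch below makes several assumptions that the text does not: it measures diversity with the Gini index (one common choice), it considers only binary splits on numeric fields, it stops at a hypothetical maximum depth instead of pruning, and the patient data is invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini diversity: 0 when a single class fills the partition."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(rows, labels, field, threshold):
    """Binary split of (row, label) pairs on one field."""
    pairs = list(zip(rows, labels))
    left = [(r, l) for r, l in pairs if r[field] <= threshold]
    right = [(r, l) for r, l in pairs if r[field] > threshold]
    return left, right


def best_split(rows, labels):
    """Find the (field, threshold) that most reduces diversity."""
    best = None  # (weighted diversity, field index, threshold)
    for field in range(len(rows[0])):
        for threshold in sorted({r[field] for r in rows}):
            left, right = partition(rows, labels, field, threshold)
            if not left or not right:
                continue
            score = (
                len(left) * gini([l for _, l in left])
                + len(right) * gini([l for _, l in right])
            ) / len(labels)
            if best is None or score < best[0]:
                best = (score, field, threshold)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursive partitioning: split, then split each partition again."""
    split = None
    if gini(labels) > 0.0 and depth < max_depth:
        split = best_split(rows, labels)
    if split is None:
        # Leaf node: label it with the predominant class.
        return Counter(labels).most_common(1)[0][0]
    _, field, threshold = split
    left, right = partition(rows, labels, field, threshold)
    return {
        "field": field,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [l for _, l in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [l for _, l in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow one path from the root down to a leaf (one rule)."""
    while isinstance(tree, dict):
        side = "left" if row[tree["field"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree


# Hypothetical data: (age, systolic blood pressure) -> risk label.
rows = [(35, 120), (42, 135), (50, 150), (61, 160), (68, 128), (72, 155)]
labels = ["low", "low", "high", "high", "low", "high"]
tree = build_tree(rows, labels)
```

On this invented data the first splitter falls on blood pressure rather than age, because that single split leaves one class predominating in each partition.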
Consequences of Choosing the Decision Tree Technique
Each partition in the decision tree is a test of a single variable. The interrelation between the variables can never be found from the decision tree.
Advantages of the Decision Tree Technique
• The decision tree as a whole is a graphical representation; the tree structure itself visually supports locating the correct split or an accurate decision.
• The model of the decision tree helps in reasoning and can be used to examine and justify a decision choice.
Disadvantages of the Decision Tree Technique
Decision trees as described here cannot predict a continuous response variable. In addition, all the splits are dependent on the previous splits; hence, the model has high-order interactions. The decision tree also cannot discover a single rule based on the ratio of two values; rather, a new variable has to be defined to specify such a simple rule.
A further shortcoming of the decision tree is the way it handles numeric input variables: it first groups the values and then categorizes them, which may lead to loss of information.
Clustering (Fadlalla & Wickramasinghe, 2005; Fayyad et al., 1996)
In the study and treatment of chromosomal and DNA-related problems, for example, the clustering technique is important. This technique is a type of undirected data mining. The purpose of undirected data mining is to find the structure of the data as a whole (i.e., there are no target variables to be predicted; instead, clusters are formed and grouped together), after which the decision is made using decision tree or neural network techniques. Clustering is thus a relatively preliminary tool, typically used merely to study all the possible conditions before more refined and confirmatory analysis takes place.
Clustering is an unsupervised classification technique that enables us to find specific factors, such as the likelihood of a patient recovering from cancer. Clustering can also be used for maintaining patient records in terms of height, weight, or other historic variables. The most frequently used method for clustering is k-means. This is a geometrical method that uses the average location of all the members of a particular cluster. Each field is converted into a number, and these numbers are then normalized. The value of each field is interpreted as the distance from the origin along the corresponding axis. The centers are initially defined and then adjusted using predefined algorithms. To start a clustering session, a random set of centers is chosen; these are then adjusted, with centers added or removed, during the analysis process. The clustering technique depends on two main criteria:
1. The cluster must be homogeneous (i.e., membership within a group must be as similar as possible).
2. Each group or cluster must be mutually exclusive (i.e., two groups should be as distinct as possible).
In most cases, clusters are mutually exclusive, but in some instances they may be overlapping, probabilistic, or have hierarchical structures. In k-means, a data point is assigned to the cluster that has the nearest centroid (i.e., the nearest mean).
Clustering requires the data to be in numeric form since it works by assigning points to clusters according to distance. This process of assigning points to clusters continues until points stop changing clusters (i.e., cluster hopping stops) and the cluster boundaries become stable.
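The k-means procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the sample points, the fixed seed, and the iteration cap are hypothetical, the number of centers is kept fixed rather than having centers added or removed, and Euclidean distance is assumed.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: math.dist(point, centers[i]))


def k_means(points, k, seed=0, max_iter=100):
    """Assign each point to the nearest centroid, recompute each centroid
    as the mean of its members, and repeat until no point changes cluster."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)        # random initial centers
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centers) for p in points]
        if new_assignment == assignment:   # no cluster hopping: stable
            break
        assignment = new_assignment
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:                    # keep the old center if empty
                centers[i] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return centers, assignment


# Hypothetical, already normalized (height, weight) values.
points = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
          (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
centers, assignment = k_means(points, k=2)
```

Because the starting centers are random, the numeric label given to each cluster can differ between runs with different seeds, but on well-separated data such as this the grouping itself is stable.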
Consequences of Choosing the Clustering Technique
Clustering is one of the less rigorous data mining techniques used in exploratory data mining and is often classified as undirected data mining (Kudyba & Hoptroff, 2001). This is because the analyst seeks to discover hidden relationships in the data without directing the analysis. The approach combines computer algorithm techniques and statistical measures to identify like groups or clusters and enables the analyst to mine quickly and accurately through large volumes of data.
Advantages of Clustering
• The main strength of clustering is that it is an undirected knowledge discovery technique.
• The clustering can be used as a preparatory technique for other data mining techniques such as decision trees or neural networks.
• The outcome of clustering can be visually represented and hence easily understood.
• Creating clusters reduces the complexity of the problem by sub-dividing the problem space into more manageable partitions.
• The more separable the data points the more effective the clustering effort becomes.
Disadvantages of Using Clustering
• Clustering represents a snapshot of the data at a certain point in time and thus may not be as useful in highly dynamic situations.
• Sometimes the clusters generated may not even have a practical meaning.
• It is sometimes possible to miss a cluster, since the analyst does not know in advance what to look for.
• Clustering can be computationally expensive.
Neural Networks (Fadlalla & Wickramasinghe, 2005; Fayyad et al., 1996)
The technique of neural networks is modeled after the human brain and normally consists of many input nodes, one or more hidden (middle) layer nodes, and one or more output nodes. The input and output nodes relate to each other through the hidden layer. The input layer represents the raw information that is fed into the network. The hidden layer represents a computational layer that transforms the inputs coming from the input layer into inputs to the output layer. The behavior of the output layer depends on the activity of the hidden layer, where the weights between the hidden and output layers are used as a reconciliation mechanism to help minimize the difference between the actual and desired outputs.
The outcome of a neural network is improved through the minimization of an error function (i.e., the difference between a desired output and an actual output value). The most widely used algorithm for minimizing this error function is known as back propagation. Each input pattern is evaluated individually, and if its value exceeds a predetermined threshold, then a pre-specified rule fires (i.e., is activated), whereby its outcome is fed forward to the next layer. The firing rule is an important concept in neural networks and accounts for the high flexibility of the technique, since the rule determines how one calculates whether a subsequent neuron (node) should fire for any given input pattern.
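To make the feed-forward and back propagation steps concrete, here is a small sketch of a network with two inputs, two hidden nodes, and one output node. Everything specific in it is an assumption made for illustration: the sigmoid activation (a smooth stand-in for a hard firing threshold), the starting weights, the learning rate, the number of epochs, and the toy training patterns (the logical OR of two inputs).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_output):
    """Feed one input pattern forward: inputs -> hidden layer -> output.
    The last weight in every weight list is a bias term."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
              for ws in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_output, hidden + [1.0])))
    return hidden, output


def mean_squared_error(patterns, w_hidden, w_output):
    """Error function: mean squared gap between desired and actual output."""
    errors = [(t - forward(x, w_hidden, w_output)[1]) ** 2
              for x, t in patterns]
    return sum(errors) / len(errors)


def train(patterns, w_hidden, w_output, rate=0.5, epochs=2000):
    """Back propagation: after each pattern, push the output error back
    through the network and nudge every weight to reduce it."""
    for _ in range(epochs):
        for x, target in patterns:
            hidden, output = forward(x, w_hidden, w_output)
            delta_out = (target - output) * output * (1.0 - output)
            delta_hidden = [delta_out * w_output[j] * h * (1.0 - h)
                            for j, h in enumerate(hidden)]
            for j, v in enumerate(hidden + [1.0]):
                w_output[j] += rate * delta_out * v
            for j, ws in enumerate(w_hidden):
                for i, v in enumerate(x + [1.0]):
                    ws[i] += rate * delta_hidden[j] * v


def fires(value, threshold=0.5):
    """Firing rule: a node is activated when its value exceeds a threshold."""
    return value > threshold


# Hypothetical training patterns: (inputs, desired output).
patterns = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.2, -0.1]]   # arbitrary start
w_output = [0.3, -0.2, 0.05]

error_before = mean_squared_error(patterns, w_hidden, w_output)
train(patterns, w_hidden, w_output)
error_after = mean_squared_error(patterns, w_hidden, w_output)
decisions = [fires(forward(x, w_hidden, w_output)[1]) for x, _ in patterns]
```

Comparing error_before with error_after shows the effect of training; the weights themselves carry no readable rule, which is the interpretability issue raised later in this section.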
The most important application of neural networks is pattern recognition. The network is trained to associate specific output patterns with input patterns. The power of neural networks comes into play in their predictive abilities (i.e., associating an input pattern that has not previously been classified with a specific output pattern). In such cases, the network will most likely give the output that corresponds to a pre-classified input pattern that is least different from the new input pattern.
In the medical sciences, neural networks are mainly used to recognise disease types from various scans, such as MRI or CT scans. Neural networks learn by example, and therefore the more examples we feed into a neural network, the more accurate its predictive capabilities tend to become. Neural networks can process a large number of medical records, each of which includes information on symptoms, diagnoses, and treatments for a particular case. The use of neural networks as a potential tool in medical science is exemplified by their use in the study of mammograms. In breast cancer detection, the primary task is the detection of tumorous cells in the early stages. The best probability of a successful cure of this disease lies in its early detection. Therefore, the power of neural networks lies in the fact that they could be used to detect minute changes in tissue patterns (a key indicator of the existence of malignant cells) that are often difficult to detect with the human eye.
Advantages of Neural Networks
• Neural networks are good classification and prediction techniques when the results of the model are more important than the understanding of how the model works.
• Neural networks are very robust in that they can be used to model any type of relationship implied by the input patterns.
• Neural networks can easily be implemented to take advantage of the power of parallel computers with each processor simultaneously doing its own calculations.
• Neural networks are very robust in situations where the data is noisy.
Disadvantages of Neural Networks
• The key problem with neural networks is the difficulty of explaining their outcomes. Unlike decision trees, neural networks use complex non-linear modeling that does not produce rules, and hence it is hard to justify one’s decision.
• Significant preprocessing and preparation of the data is required.
• Neural networks will tend to over-fit the data unless implemented carefully. This is because neural networks have a large number of parameters, which can be fitted to almost any arbitrary data set.
• Neural networks require extensive training time unless the problem is small.
Association Rule Mining (Fadlalla & Wickramasinghe, 2005; Fayyad et al., 1996)
Association rules are used to discover relationships between attribute sets for a given input pattern. Such relationships do not necessarily imply causation; they are only associations. For example, an association rule derived from medical data could be that 80% of the cases that display a given symptom are diagnosed with a similar condition; such a rule can improve diagnostic capabilities. These patterns (associations) are not easily discovered using other data mining techniques. The support of an association rule is the percentage of all cases that include both the antecedent and the consequent of the rule, while the confidence of the rule is the percentage of the cases including the antecedent that also include the consequent. Only rules whose support and confidence exceed predetermined thresholds are considered useful. The classic algorithm used to generate these rules is the apriori algorithm.
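As a small worked example, the sketch below evaluates one candidate rule against a set of cases. It uses the standard definitions (support: the fraction of all cases containing both the antecedent and the consequent; confidence: the fraction of cases containing the antecedent that also contain the consequent). The cases, the rule, and the two thresholds are hypothetical, and the code only scores a single rule; it does not implement the apriori algorithm, which searches for such rules.

```python
def support(cases, items):
    """Fraction of all cases that contain every item in items."""
    items = set(items)
    return sum(1 for case in cases if items <= case) / len(cases)


def confidence(cases, antecedent, consequent):
    """Of the cases containing the antecedent, the fraction that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(cases, both) / support(cases, antecedent)


# Hypothetical cases: the set of attributes recorded for each patient.
cases = [
    {"chest pain", "high blood pressure", "heart disease"},
    {"chest pain", "heart disease"},
    {"chest pain", "high blood pressure", "heart disease"},
    {"chest pain", "heart disease", "smoker"},
    {"chest pain", "anxiety"},
    {"headache", "high blood pressure"},
    {"headache"},
    {"smoker", "high blood pressure"},
    {"anxiety"},
    {"smoker"},
]

# Candidate rule: chest pain -> heart disease.
antecedent, consequent = {"chest pain"}, {"heart disease"}
rule_support = support(cases, antecedent | consequent)
rule_confidence = confidence(cases, antecedent, consequent)

# Hypothetical thresholds a rule must reach to be considered useful.
is_useful = rule_support >= 0.3 and rule_confidence >= 0.7
```

Here four of the ten cases show both attributes and four of the five chest pain cases also show heart disease, so the rule has a support of 40% and a confidence of 80%.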
Advantages of Association Rule Mining
• The association rules are readily understandable.
• Association rules are best suited for categorical data analysis.
• Association rules are widely used in hospitals to maintain patients’ records.
• The outcomes are easy to interpret and explain and thus easy to use in the aiding of decision-making.
Disadvantages of Association Rule Mining
• The technique can generate too many rules, some of which are trivial.
• The association rules are not expressions of cause and effect; rather, they are descriptive relationships in particular databases, so there is no formal testing to increase the predictive power of these rules.
• Insight, analysis, and explanation by healthcare professionals are usually required to identify the new and useful rules and thereby achieve the full benefits from such association rules.