
Chapter 2. Literature Review

2.2 Information, Data, and Knowledge / Information Theory

This section discusses the relationship between information, data and knowledge, and aspects of information theory that are relevant to the dissertation. Various aspects of this section have been published in van Niekerk & Maharaj (2010a).

Hutchinson and Warren (2001) describe the relationships between knowledge, information, and data as follows: data is associated with a 'thing' (such as an event or object), in that it describes the different states of the object or event and consists of the attributes of that object or event. Knowledge is likewise associated with an 'agent'; it is analogous to a set of interacting mind models which influence the interpretation of the data, but which can also be influenced by the data, during an event.

Knowledge can only be possessed by an intelligent being: usually a human, but it could also be possessed by an animal or an intelligent machine. Information is data that has been filtered by the agent through personal biases formed by experience and perception. Figure 2.2 shows these relationships.

Figure 2.2: The Relationship between Data, Information and Knowledge, adapted from Hutchinson (2002)

Waltz (1998) describes the data fusion model, shown in Figure 2.3. Data, in the form of measurements and observations, is processed to create information by placing it in context and by indexing and organising it. Knowledge is then developed by detecting patterns and relationships amongst the information; this allows the information and data to be understood, explained, and modelled. These models may also be used to predict the future behaviour of the entity or process being observed. Wisdom can then be considered as the effective application of knowledge to implement planning and actions to achieve objectives.

Figure 2.3: Data Fusion Model, adapted from Waltz (1998)

An extended model may be derived from the two models discussed above: data is collected, processed, and filtered by a priori knowledge (the agent in Figure 2.2), in the form of experience and perception, to produce information. The information is analysed and understood to create additional (a posteriori) knowledge, which provides wisdom when applied effectively and which can serve as a priori knowledge in future analyses. The model also accounts for external influences, such as media and peer opinions, that may affect perception and hence the processing of data and information. These relationships are shown in Figure 2.4.

Another interpretation is that data comprises the bits and bytes that form the basis of digital communication systems, whereas information is data presented in a format understandable to humans, such as images, text, or video. Knowledge is then the modelling of information in order to understand trends and possibly predict the behaviour of certain systems (Waltz, 1998).

Shannon (1948) developed a mathematical theory of communication; the theory indicates that the data-carrying capacity of a communications channel is limited by the bandwidth of the channel and by the interference, known as noise, that is present on the channel. The following equation was derived to calculate the capacity of a communications channel with noise:

C = W \log_2(1 + SNR)    (2.1)

The capacity, C, is measured in bits per second; W is the bandwidth of the communications channel in hertz; and SNR is the ratio of the signal power to the noise power.
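As a brief illustration of Equation 2.1, the following sketch computes the capacity of a hypothetical voice-grade channel; the 3.1 kHz bandwidth and 30 dB signal-to-noise ratio are assumed values chosen for illustration, not figures from the sources cited above.

```python
import math

def channel_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity (Equation 2.1): C = W * log2(1 + SNR).

    The SNR is supplied in decibels and converted to a linear power
    ratio; the result is the channel capacity in bits per second.
    """
    snr_linear = 10 ** (snr_db / 10)  # decibels to linear power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical voice-grade channel: 3.1 kHz bandwidth, 30 dB SNR.
print(f"C = {channel_capacity(3100, 30):.0f} bits per second")  # approx. 30 898
```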

This mathematical theory goes on to derive the entropy of information; Taub and Schilling (1991) describe the entropy as the average amount of information transmitted over a message interval. The equation for entropy is (Shannon, 1948; Taub & Schilling, 1991):

H = -\sum_{k=1}^{M} p_k \log_2 p_k    (2.2)

The entropy is denoted by H, M is the total number of possible messages, and p_k is the probability that message m_k is transmitted (Shannon, 1948; Taub & Schilling, 1991). Borden (1999) uses the following conceptual model to illustrate entropy: Paul Revere considered an attack from the sea to be equally as probable as an attack by land; therefore p_land = p_sea = 0.5. Calculating the entropy of this situation using Equation 2.2 gives H = 1, which Borden equates to one bit of uncertainty. A lookout was told to show two lanterns if the attack came by sea, or one lantern if the attack came by land. When he showed two lanterns, p_sea = 1 and p_land = 0, and Equation 2.2 gives H = 0: the uncertainty has been resolved, and Borden equates this reduction in entropy to one bit of information being received. In this case the signal given by the lanterns was the data, and the decoding of the data using knowledge resulted in the information that the attack was by sea.
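The lantern example can be checked numerically; the sketch below simply evaluates Equation 2.2 before and after the signal, using the convention that a zero-probability message contributes nothing to the sum.

```python
import math

def entropy(probabilities):
    """Shannon entropy (Equation 2.2): H = -sum(p_k * log2(p_k)) in bits.

    Messages with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Before the signal: attacks by land and by sea are equally probable.
print(entropy([0.5, 0.5]))  # 1.0 -> one bit of uncertainty

# After two lanterns are shown: the attack is certainly by sea.
print(entropy([1.0, 0.0]))  # 0.0 -> the uncertainty is resolved

# Information received = reduction in entropy = 1.0 - 0.0 = 1 bit.
```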

Figure 2.4: The Extended Model for Information Relationships

Entropy is related to noise in a communications channel through the concept of mutual information (Waltz, 1998), which expresses the degree to which the information received corresponds to the information that was transmitted. In digital communications, binary bits or other symbols are transmitted as electrical signals. The noise interferes with these signals, which results in some of the bits or symbols being incorrectly interpreted by the receiver; this is equivalent to a message different from the one transmitted being received. The more bits or symbols received in error, the lower the probability that the transmitted message will be correctly received, and therefore the lower the mutual information. In an IW environment, an attacker may intentionally interfere with the transmitted information, thereby reducing the mutual information, to deny the recipient the information.
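One way to see how symbol errors erode mutual information is the binary symmetric channel, a standard textbook abstraction (not taken from the sources cited above) in which each transmitted bit is flipped with probability p_e; for equally likely input bits, the mutual information is I(X;Y) = 1 - H(p_e) bits, where H(·) is the binary entropy function. The sketch below evaluates this relationship.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def bsc_mutual_information(p_e: float) -> float:
    """Mutual information of a binary symmetric channel with equally
    likely input bits: I(X;Y) = 1 - H(p_e) bits per transmitted bit."""
    return 1.0 - binary_entropy(p_e)

# As the probability of a bit being received in error rises,
# the mutual information between sender and receiver falls:
for p_e in (0.0, 0.01, 0.1, 0.5):
    print(f"p_e = {p_e:5.2f}: I(X;Y) = {bsc_mutual_information(p_e):.3f} bits")
# At p_e = 0.5 the mutual information is zero: the receiver learns
# nothing, which is the effect an IW attacker's interference aims to approach.
```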

As the world has become information-centric, there has been a drive to understand how to determine the value of information, and how to manage information; from this the field of Knowledge Management arose (Prusak, 2001). A method of calculating the value of information based on its capital utility is presented in Waltz (1998):

I_v = [(A_t - A_n) - (L_t - L_n)] / I_n    (2.3)

where:

I_v is the information value;
A_t represents the assets derived from the information at the time of arrival;
A_n represents the assets should the information not have arrived;
L_t represents the liabilities derived from the information at the time of arrival;
L_n represents the liabilities should the information not have arrived; and
I_n is the total cost of the information, given by I_n = I_1 + I_2 + ... + I_7, where:

I_1 is the cost of generating the information;
I_2 is the cost of formatting the information;
I_3 is the cost of reformatting the information;
I_4 is the cost of duplicating the information;
I_5 is the cost of disseminating the information;
I_6 is the cost of storing the information; and
I_7 is the cost of retrieving and using the information.
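A worked sketch of Equation 2.3 follows; the monetary figures are entirely hypothetical, and the code assumes the ratio form of the equation as reconstructed above.

```python
def information_value(assets_with, assets_without,
                      liabilities_with, liabilities_without, costs):
    """Information value (Equation 2.3): the net gain in assets, less the
    net gain in liabilities, relative to the total cost of the information.

    `costs` holds the seven component costs I1..I7; their sum is In.
    """
    total_cost = sum(costs)  # In = I1 + I2 + ... + I7
    net_gain = ((assets_with - assets_without)
                - (liabilities_with - liabilities_without))
    return net_gain / total_cost

# Hypothetical figures: the information yields 120 000 in assets against
# 100 000 without it, 5 000 in liabilities against 3 000 without it, and
# seven component costs that sum to 4 000.
iv = information_value(120_000, 100_000, 5_000, 3_000,
                       [1_000, 500, 300, 200, 800, 700, 500])
print(f"Iv = {iv:.2f}")  # (20 000 - 2 000) / 4 000 = 4.50
```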

As information assets have value to the owner, competitors will attempt to maximise the value for their own objectives, and possibly minimise the value of the information assets for other actors (Denning, 1999). This competition surrounding information and its use leads to the concept of IW, discussed in Section 2.3. The value of the information may also be used in calculations of risk, which are discussed in Section 2.7.
