WHAT IS MEASUREMENT?

From the document BUSINESS STATISTICS DEMYSTIFIED - MEC (pages 39-43)

The first and most fundamental concept in statistics is the concept of measurement. Measurement is the process by which we examine the world and end up with a description (usually a number) of some aspect of the world. The results of measurement are specific descriptions of the world.

They are the first step in doing statistics, which results in general descriptions of the world.

Measurement is a formalized version of observation, which is how we all find out about the world every day. Measurement is different from ordinary day-to-day observation because the procedures we use to observe and record the results are specified so that the observation can be repeated the same way over and over again.

When we measure someone’s height, we take a look at a person; apply a specific procedure involving (perhaps) a measuring tape, a pencil, and a part of the wall; and record the number that results. Let’s suppose that we measure Judy’s height and that Judy is ‘‘five foot two.’’ We record the number 62, measured in inches. That number does not tell us a lot about Judy. It just tells us about one aspect of Judy, her height. In fact, it just tells us about her height on that one occasion. (A few years earlier, she might have been shorter.)

Statistics uses the algebraic devices of variables and values to deal with measurements mathematically. In statistics, a variable matches up to some aspect of the thing being measured. In the example above, the variable is height. The value is the particular number resulting from the measurement on this occasion. In this case, the value is 62. The person who is the subject of the measurement has many attributes we could measure and many others we cannot. Statisticians like to think of subjects (whether they are persons or companies or business transactions) as being composed of many variables, but we need to remember that there is always more to the thing being measured than the measurements taken. A person is more than her height, weight, intelligence, education level, occupation, hair color, salary, and so forth. Most importantly, not every variable is important to every purpose on every occasion. There are always more attributes than there are measurable variables, and there are always lots more variables that can be measured than we will measure.

KEY POINT

Vital to any statistical analysis will be determining which variables are relevant to the business decision at hand. The easiest things to measure are often not the most useful, and the most important things to know about are often the hardest to measure. The hardest part of all is to determine what variables will make a difference in making our business decision.

TIPS ON TERMS

Subject. The individual thing (object or event) being measured. Ordinarily, the subject has many attributes, some of which are measurable features. A subject may be a single person, object, or event, or some unified group or institution. So long as a single act of measuring can be applied to it, it can be considered a single subject. Also called the ‘‘unit of analysis’’ (not to be confused with the unit of measurement, below).

Occasion. The particular occurrence of the particular act of measurement, usually identified by the combination of the subject and the time the measurement is taken.

Situation. The circumstances surrounding the subject at the time the measurement is taken. Very often, when multiple measurements of a subject are taken on a single occasion, measurements characterizing the situation are also taken.

Value. The result of the particular act of measurement. Ordinarily, values are numbers, but they can also be names or other types of identifiers. Each value usually describes one aspect or feature of the subject on the occasion of the measurement.

Variable. A mathematical abstraction that can take on multiple values. In statistics, each variable usually corresponds to some measurable feature of the subject. Each measurement usually results in one value of that variable.

Unit. (Short for unit of measurement. Not to be confused with unit of analysis in the definition of Subject, above.) For some types of measurement, the particular standard measure used to define the meaning of the number one. For instance, inches, grams, dollars, and minutes are all units of measurement. When we say something weighs two and a half pounds, we mean that it weighs two and a half times as much as a standard pound measure.

Data. The collection of values resulting from a group of measurements. Usually, each value is labeled by variable and subject, with a timestamp to identify the occasion.
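The terms above can be pictured as the fields of a single record. The following is only a minimal sketch of that idea in Python, not a standard data format: the class name `DataValue`, the helper `values_for`, the timestamp, and Judy's hair color are all assumptions made up for illustration. Only Judy's height of 62 inches comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Union


@dataclass(frozen=True)
class DataValue:
    """One data value: the result of one act of measurement (hypothetical layout)."""
    subject: str                    # the thing measured (the unit of analysis)
    variable: str                   # the measurable feature, e.g. "height"
    value: Union[int, float, str]   # usually a number, but names are allowed too
    unit: Optional[str]             # unit of measurement, if the value has one
    occasion: datetime              # timestamp identifying the occasion


# The timestamp is invented for illustration; the text gives no date.
measured_at = datetime(2024, 1, 15, 9, 30)

# Judy's height from the text: "five foot two", recorded as 62 inches.
judy_height = DataValue("Judy", "height", 62, "inches", measured_at)

# A second, non-numerical value for the same subject (hair color is assumed).
judy_hair = DataValue("Judy", "hair color", "brown", None, measured_at)

# Data: the collection of values from a group of measurements.
data = [judy_height, judy_hair]


def values_for(records: List[DataValue], variable: str) -> list:
    """Collect every recorded value of one variable."""
    return [r.value for r in records if r.variable == variable]


print(values_for(data, "height"))
```

Keeping the subject, variable, unit, and occasion alongside each value is what lets individual measurements be grouped later, for example by pulling out every value of one variable.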

Values that aren’t numbers

In statistics, measurement doesn’t always result in numbers, at least not numbers in the usual sense. Suppose we are doing an inventory of cars in a car lot. We want to make a record of the important features of each car: make, model, year, and color. (Afterwards, we may want to do some statistics, but that can wait for a later chapter.) Statisticians would refer to the process of collecting and recording the make, model, year, and color of each car in the lot as measurement, even though it’s not much like using a tape measure or a scale, and only in the case of the year does it result in a number. The reason for this is that, just like measuring height or weight, recording the color of an automobile results in a description of one feature of that particular car on that particular occasion. From a statistical point of view, the important thing is not whether the result is a number, but whether the results, each of which is a specific description of the world, can be combined to create general descriptions of the world. In the next section, Levels of Measurement, we will see how statisticians deal with non-numerical values.

TIPS ON TERMS

Categorical data. Data recorded in non-numerical terms. It is called categorical because each different value (such as car model or job title) places the subject in a different category.

Numerical data. Data recorded in numerical terms. There are different types of numerical data depending upon what numbers the values can be. (See Levels of Measurement below.)
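As a rough illustration of the car-lot inventory, the sketch below records make, model, year, and color for a few cars and then sorts the variables into categorical and numerical. The cars themselves and the helper `is_numerical` are hypothetical. Treating any variable whose values are all numbers as numerical is a simplifying assumption, not a rule from the text.

```python
from collections import Counter

# Hypothetical inventory: each dict is one subject (a car) measured on one occasion.
cars = [
    {"make": "Ford", "model": "Focus", "year": 2015, "color": "blue"},
    {"make": "Toyota", "model": "Corolla", "year": 2018, "color": "red"},
    {"make": "Ford", "model": "Fiesta", "year": 2012, "color": "blue"},
]


def is_numerical(value) -> bool:
    """True for ordinary numbers (bool is excluded because it is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Label each variable by the kind of values recorded for it.
kinds = {
    variable: "numerical" if all(is_numerical(car[variable]) for car in cars)
    else "categorical"
    for variable in cars[0]
}

# Categorical values can still be combined into a general description:
# here, how many cars fall into each color category.
color_counts = Counter(car["color"] for car in cars)

print(kinds)
print(color_counts)
```

Only year comes out as numerical, matching the observation that it is the one feature in the inventory that results in a number. The color counts show that non-numerical values can still be combined into a general description of the lot.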

What is data?

In Chapter 1 ‘‘Statistics for Business,’’ we didn’t bother too much about specific definitions. Now, in Chapter 2 ‘‘What is Statistics?’’ we are starting to concern ourselves with more exact terminology. Throughout the remainder of the book, we will try to be as consistent as possible with our wording, in order to keep things clear. This does not mean that statisticians and others who use statistics are always as precise in their wording as we should be. There is a great deal of confusion about certain terms. Among these are the notorious terms data and information.

The values recorded as the result of measurement are data. In order to distinguish them from other sorts of values, we will use the term data values.

Data are not the facts of the world that were measured. Data are descriptions, not the things described. Data are not the statistical measures calculated from the data values, no matter how simple. Often, statisticians will distinguish between ‘‘raw’’ data and ‘‘cleaned’’ data. The raw data are the values as originally recorded, before they are examined and edited. As we will see later on, cleaning data may involve changing it, but does not involve summarizing it or making inferences from it.
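The distinction between raw data, cleaned data, and a statistic calculated from them can be sketched as follows. This is one plausible reading, not the book's own procedure: the raw entries, the function `clean_value`, and the choice to mark unreadable entries as missing are all assumptions for illustration.

```python
from statistics import mean
from typing import List, Optional

# Raw data: hypothetical height entries (in inches) exactly as first recorded.
raw_heights = [" 62", "64 ", "sixty", "66"]


def clean_value(entry: str) -> Optional[int]:
    """Edit one recorded value: trim stray spaces, mark unreadable entries as missing."""
    text = entry.strip()
    return int(text) if text.isdigit() else None


# Cleaned data: still one value per measurement. Nothing has been summarized.
cleaned_heights: List[Optional[int]] = [clean_value(e) for e in raw_heights]

# A statistic: a general description calculated from the data values.
# It is not itself data in the sense used above.
usable = [h for h in cleaned_heights if h is not None]
average_height = mean(usable)

print(cleaned_heights)
print(average_height)
```

The cleaned list has the same number of entries as the raw list, which reflects the point that cleaning changes values without summarizing them. Only the final average is a statistical measure.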

QUICK QUOTE

The map is not the territory. Alfred Korzybski

KEY POINT

Data are specific descriptions. Statistics are general descriptions.

A lot of data is used only indirectly, in support of various statistical techniques. And data are always subject to error. To the degree that data contain error, they cannot inform. So data, even though they are information in the informal computer science sense, contain both information and error in the more technical, theoretical sense. In statistics, as in information theory, it is this latter, more technical sense that is most important. Because we will be using data to make business decisions, we must not forget that data contain error, and that this error can result in bad decisions. We will have to work hard to control the error in order to allow the data to inform us and help us make our decisions.

FUN FACTS

Facts. You may have noticed that we haven’t defined the term fact. This is not an accident. Statisticians rarely use the term in any technical sense. They consider it a philosopher’s term.

You may have heard the expression, ‘‘It’s a statistical fact!’’ but you probably didn’t hear that from a statistician. The meaning of this expression is unclear. It could mean that a statistical description is free from error, which is never the case. It could mean that the results of a statistical inference are certain, which is never the case. It probably means that a statistical conclusion is good enough to base our decisions on, but statisticians prefer to state things more cautiously.

As we mentioned earlier, statistics allows us to say how good our statistical conclusions are. Statisticians prefer to say how good, rather than just to say, ‘‘good enough.’’

Some philosophers say that facts are the things we can measure, even if we don’t measure them. Judy is some height or other, even if we don’t know what that height is. Other (smarter) philosophers say that facts are the results we would get if our measurements could be free of error, which they can never be. This sort of dispute seems to be an excellent reason to leave facts to the philosophers.
