RATIOS, ALSO KNOWN AS PROPORTIONS

From the document BUSINESS STATISTICS DEMYSTIFIED - MEC (pages 194-198)

When we use N as a measure of the amount of information in our sample, we call it the degrees of freedom.

Now, suppose we calculate a statistic from our sample. That gives us one piece of information about the world, taken from the sample. As it turns out, there is an important sense in which each number we calculate is worth the same as each number we collect. So, when we calculate a statistic from the sample, we have one less piece of information in the sample.

At first, this may seem odd. After all, we still have all N numbers from our sample. We still know what they are. What has been lost? The answer is that we may very well want to calculate more statistics from the same sample.

How many times can we use the same N numbers, our data, to calculate statistics and still be finding out about the world, instead of just spinning our wheels? The answer is that we can calculate N statistics from N numbers before we run out of information. As we examine various statistical measures throughout this chapter, we will take a few more looks at degrees of freedom.

Degrees of freedom will also be very important when we get to Part Three and talk about statistical techniques.

we can never know for certain. We will hear a good deal more about estimates as we go forward.

Calculating the ratio

While there are other uses for ratios, the ratio statistic is most often used to estimate probabilities by comparing the count of some subgroup of interest to the count of a larger group to which that subgroup belongs. (It does not matter here whether we think of these counts as counts of things or counts of numbers.) This is not as complicated as it sounds. When we have a sample, that is a group. If we have a categorical variable, any value of that variable forms a subgroup. Our example of this is our flock of sheep (a group) and the variable, color. The value, black, of the variable, color, defines a subgroup.

The ratio of the count of black sheep to the total count of sheep gives us the ratio or proportion of black sheep. In general, if the value is represented by x, the equation defining the proportion is:

$$p = \frac{N_x}{N} \tag{8-2}$$

All we have to do in Equation 8-2 is divide the count of the individual units having a value of x by the count of all the units in the group. And that is our second statistical equation. See how easy that was!
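As a minimal sketch of Equation 8-2, the calculation can be written in a few lines of Python. The function name `proportion` and the flock data are made up for illustration; they are not from the book.

```python
from collections import Counter


def proportion(values, x):
    """Return N_x / N: the count of items equal to x divided by the count of all items."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute a proportion for an empty group")
    # Counter returns 0 for a value that never occurs, so a missing x gives 0.0.
    return Counter(values)[x] / n


# Hypothetical flock: 9 white sheep and 3 black sheep (illustrative data only).
flock = ["white"] * 9 + ["black"] * 3
p_black = proportion(flock, "black")
print(p_black)  # 3 / 12 = 0.25
```

Here the group is the flock (N = 12) and the subgroup is the sheep whose color is black (N_x = 3).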

It is important to note here that we can calculate proportions for other types of variables as well as categorical ones. Remember that numerical variables (either interval or ratio) contain more information than do categorical variables. It is a common trick in statistics to redefine a numerical variable as a categorical variable in order to use it for categorical things, including calculating proportions.

The trick to creating categories from a numerical variable uses ranges of numbers. While we have not yet discussed the range as a statistic, we already know about ranges from algebra. (Refer to Appendix A for a refresher on the number line if you need to.) Ranges of numbers are things like ‘‘less than or equal to ten,’’ ‘‘greater than 8.34,’’ or ‘‘greater than or equal to 8 and less than 11.’’

For any numerical variable, we can define a category as when the value of a numerical variable falls within a particular range. (The complementary category is when the value falls outside that range.) We saw this trick used in Chapter 3 where the probability of a value was defined using the normal curve.

If we redefine x to be a range for a numerical variable instead of a value for a categorical variable, we get the definition of the ratio for values of numerical variables, without even having to use a new equation!
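One way to see that no new equation is needed is to let x be a condition rather than a single value. The sketch below assumes a hypothetical helper, `proportion_where`, and made-up measurements; the range used is the ‘‘greater than or equal to 8 and less than 11’’ example from above.

```python
def proportion_where(values, condition):
    """Return N_x / N, where N_x counts the items for which condition(item) is true."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute a proportion for an empty group")
    n_x = sum(1 for v in values if condition(v))
    return n_x / n


# Hypothetical numerical sample (illustrative data only).
measurements = [7.5, 8.0, 8.34, 9.2, 10.0, 10.9, 11.0, 12.3]

# The category: value is greater than or equal to 8 and less than 11.
p_in_range = proportion_where(measurements, lambda v: 8 <= v < 11)

# The complementary category: value falls outside that range.
p_outside = proportion_where(measurements, lambda v: not (8 <= v < 11))

print(p_in_range, p_outside)  # 5 of 8 inside, 3 of 8 outside
```

The same function also handles a categorical value: passing `lambda v: v == "black"` as the condition gives the proportion for a single category.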

FUN FACTS

In probability theory, mathematicians use the concept of a set to define x so that they only have to use one equation for both types of variables. Even the mathematicians like to keep some equations to a minimum.

Estimating population values

Finally, we need to understand the critical importance of the ratio to the statistical process called estimation. By estimation, statisticians mean taking a statistical measure of a sample and inferring the value of that same measure for the entire population. Very often, the value of a statistic for the population is a terrific help in making a business decision. For example, knowing how many people in the general public might be interested in buying our product would be very helpful in our marketing plan. But we can’t very well ask every person in the country what they think. We can, however, ask a sample of the population of the country what they think. If we can use those data to estimate how many folks in the whole population feel the same way, we will be better able to make our marketing decisions. As we move through this chapter, we will see how each of the various statistics can be valuable in making business decisions, but usually only if we can use them to speak about their value for more than just the sample we have taken.

Recall from high school algebra that ratios always mean the same thing, no matter what the size of what is measured. If we have the same number of plain chocolate bars as almond chocolate bars, then the total number of almond chocolate bars depends on how many chocolate bars we have in total. If we have six chocolate bars, then three of them are almond. If we have 50,000 chocolate bars, then 25,000 of them are almond. It is the ratio called a ‘‘half’’ that describes what is the same about these two batches of chocolate bars that differ so much in size. No matter how many or how few chocolate bars we have, a half is always a half.

HANDY HINTS

One way to think about what ratios do in statistics is that they, in effect, erase the information about the size of the sample, by dividing the count of objects of interest by N, the sample size.

The importance of this feature of ratios to estimation is that, in the end, the size of our sample is unimportant in terms of what we want to know about the whole population. (Make no mistake, the size of the sample is vitally important to calculating the statistics we need to help make our business decisions, but it is not a piece of information about the world, only about our study, and thus is only important as a part of the process.) If we talk to a random sample of the population and find that one-third of the people we talk to like our product, then our best estimate is that one-third of the general population will like our product as well. Just as in arithmetic, one-third is one-third is one-third.
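The chocolate bar and product examples can be checked with exact fractions. This is only an illustrative sketch: the sample of 150 people and the population of 3,000,000 are invented numbers, and `sample_proportion` is a hypothetical helper name.

```python
from fractions import Fraction


def sample_proportion(count_of_interest, group_size):
    """Return the ratio as an exact fraction, which is reduced to lowest terms."""
    return Fraction(count_of_interest, group_size)


# Two batches of very different sizes share the same ratio: a half is a half.
small_batch = sample_proportion(3, 6)
large_batch = sample_proportion(25_000, 50_000)
print(small_batch == large_batch)  # True: both reduce to 1/2

# Hypothetical survey: 50 of 150 randomly sampled people like our product.
p_like = sample_proportion(50, 150)  # reduces to 1/3

# Best estimate for an assumed population of 3,000,000 people.
population_size = 3_000_000
estimated_fans = p_like * population_size
print(p_like, int(estimated_fans))
```

Because the fraction is reduced, the sample size no longer appears in the result, which is the sense in which the ratio erases it. The estimate is only a best guess; it says nothing yet about how much sampling error to expect.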

The ratio is the key to making samples make sense. The value of the ratio always falls between zero and one. Its value doesn’t change with the size of the total group, whether the total group is the sample we happened to take, or the group is the total population we want to know about. If we can express an important value as a proportion of the sample using ratios, statistical theory tells us that that same value will apply (more or less, given error due to sampling) to the entire population, and can even tell us how much error to expect. That is how the process of estimation works.

CRITICAL CAUTION

Sample Size Does Matter

When we make an inference from a ratio calculated on a sample to a proportion of the population, how do we know our inference is valid? Here, sample size is a key factor. The larger the sample size, the smaller the possible error of our estimate.

If our sample size is too small, then we can’t make any reliable estimates about the whole population.
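This excerpt does not give a formula for the size of the error. As an assumption brought in from standard statistics, the sketch below uses the usual standard error of a sample proportion, the square root of p(1 - p)/N, to show how the expected error shrinks as N grows. The sample sizes and the proportion of one-third are illustrative.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion p from a sample of size n.

    Assumption: the textbook formula sqrt(p * (1 - p) / n), which is not
    stated in this excerpt and presumes a simple random sample.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= p <= 1:
        raise ValueError("a proportion must lie between zero and one")
    return math.sqrt(p * (1 - p) / n)


# Same observed proportion, three hypothetical sample sizes.
for n in (30, 300, 3000):
    print(n, round(standard_error(1 / 3, n), 4))
```

Under this formula the error falls with the square root of the sample size, so a hundred times as many respondents gives an error about one-tenth as large.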

Descriptive Statistics: Characterizing Distributions

In statistics, our sample is our only source of information about the world.

Statistics we calculate from the sample will be used to estimate population values. Recall from our discussion of samples in Chapter 3 ‘‘What Is Probability?’’ that, for each variable, there are N values that make up our sample distribution of that variable. The most basic statistics that provide values that are estimates of population values are called descriptive statistics.

Descriptive statistics describe features of the distribution. Each sample descriptive statistic describes a feature of the sample distribution. If the sample is large enough, the distribution of the sample will resemble the population distribution. Because of this, descriptions of the sample distribution will also describe (via estimation) the population distribution. Descriptions of the population can be the facts we need to help with our business decisions.

HANDY HINTS

While some descriptive statistics can be defined for ordinal variables, the theory of estimation for descriptive statistics assumes that the variables are numeric. For the remainder of this chapter, we will only be dealing with numeric variables.

(Estimation of categorical variables usually involves only proportions, as estimated by ratios, which we learned about in the previous section.)
