SIMPLE LINEAR REGRESSION - Introduction to Distribution Logistics

The forecasting models we have analyzed so far are widely adopted. However, in these models demand is just a function of time. On the contrary, in many real-life situations demand might depend on a variety of variables, including advertising expenditures, weather, price, number of stores carrying the item, state of the economy, etc. A forecasting model that tries to capture these effects can be fairly complex for several reasons.

• First, demand might depend on many variables. For example, in grocery supermarkets, demand might depend on traffic (number of customers visiting the store), weather, the price of the item, promotions of the item at stake and/or substitute ones, religious events such as Easter or Christmas, and sport events such as the Soccer World Cup or the Olympic Games.

• Second, the relationships between the independent (i.e., explanatory) variables and demand can be complex and nonlinear. Let us assume that we have cut price by 50% and gained a 100-unit increase in demand. If we cut price by 100% and give the product away for free, we definitely should not expect a 200-unit lift.

Though the problem can be fairly complex in real-life situations, in this section we address a relatively trivial situation where demand depends on a single independent variable. In other words, we illustrate the application of simple linear regression (see section A.10). The more general case of multivariate regression, i.e., situations where we explain and forecast demand through more than one independent variable, is dealt with in the web sections W.A.11 and W.3.11.

In section A.10 we describe in full detail the assumptions and properties of simple linear regression. This statistical method estimates, through empirical data, the linear relationship between a variable $X$ we call independent and a variable $Y$ we assume to depend on the first one. In our case, demand is the dependent variable.

As such, linear regression is just a tool to investigate the relationship between two variables. Thus it might be used to analyze the relationship between, say, demand for ice cream and temperature, demand for cars and Gross Domestic Product (GDP), or demand for fashion products and advertising. Once such a relationship is estimated, we can use it to forecast future demand, if we know the future values of the independent variable (or they can be predicted accurately). Indeed, a "perfect analysis" of the relationship between the demand for cars and the GDP does not help to forecast the demand for cars if we have no idea about the future of the economy. In the remainder of this section we assume the future values of $X$ to be known with certainty (e.g., think about the prices the company sets). In the final part of the section we briefly discuss the consequences of uncertainty on the future value(s) of $X$ (e.g., when estimates about the future GDP are available, though they are affected by some sort of error).

We assume that demand observations are drawn from a random linear process:

$$Y_i = \alpha + \beta x_i + \epsilon_i, \tag{3.40}$$

where:

• $i$ is the index that identifies the $i$-th observation of demand and of the variable that influences it;

• $\alpha$ and $\beta$ are unknown parameters that influence the demand process; these parameters have to be estimated;

• $\epsilon_i$ is a normally distributed random variable with expected value zero and standard deviation $\sigma_\epsilon$ (additional assumptions concerning statistical independence are pointed out in section A.10).
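To make these assumptions concrete, the following minimal Python sketch draws observations from such a random linear process. The function name `simulate_demand` and all numerical values (intercept 20, slope 1.5, noise standard deviation 4, and the values of x) are illustrative assumptions, not taken from the text.

```python
import random

def simulate_demand(xs, alpha, beta, sigma_eps, seed=0):
    """Draw one demand observation y_i = alpha + beta * x_i + eps_i per x_i.

    eps_i is normally distributed with mean 0 and standard deviation
    sigma_eps, and the draws are independent of each other.
    """
    rng = random.Random(seed)
    return [alpha + beta * x + rng.gauss(0.0, sigma_eps) for x in xs]

# Hypothetical values chosen only for illustration.
xs = [5, 10, 15, 20, 25, 30]
ys = simulate_demand(xs, alpha=20.0, beta=1.5, sigma_eps=4.0)
for x, y in zip(xs, ys):
    print(f"x = {x:2d}  demand = {y:6.2f}")
```

Setting `sigma_eps` to zero puts every observation exactly on the ideal line, which is the noise-free case the text uses later to isolate the effect of estimation errors.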

Also, we assume that we have an estimate of the relationship between $Y$ and $x$:

$$Y = a + b x. \tag{3.41}$$

Section A.10 shows how this relationship can be estimated, based on past observations of $Y$ and $x$. The point forecast of $Y$ (e.g., demand for cars) corresponding to $x = x_0$ is

$$\hat{Y}_0 = a + b x_0. \tag{3.42}$$
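As an illustration, the sketch below estimates the line with the standard least-squares formulas and then computes the point forecast. The text defers the estimation details to section A.10, so `fit_line` should be read as the usual textbook estimator rather than as that section's exact presentation. The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates (a, b) of intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar  # the fitted line passes through (x_bar, y_bar)
    return a, b

def point_forecast(a, b, x0):
    """Point forecast of demand at x0: a + b * x0."""
    return a + b * x0

# Hypothetical past observations (e.g., temperature and ice cream demand).
xs = [10, 15, 20, 25, 30]
ys = [32, 41, 52, 58, 72]
a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, forecast at x0 = 22: {point_forecast(a, b, 22):.2f}")
```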

It is easy to show that $\hat{Y}_0$ is an unbiased prediction²⁸ of the future level of demand, since the estimates $a$ and $b$ are unbiased. However, as discussed in the first section of this chapter, a point forecast is often meaningless, especially in the case of continuous variables such as $Y$ ($\epsilon$ is continuous and thus $Y$ is continuous as well).

So we should not only look at the expected level of future demand; we should also investigate the standard deviation of the estimate, that is, the square root of the expected squared difference between the forecast $\hat{Y}_0$ and the actual demand $Y_0$:

$$Y_0 - \hat{Y}_0 = \alpha + \beta x_0 + \epsilon_0 - (a + b x_0) = (\alpha - a) + (\beta - b) x_0 + \epsilon_0. \tag{3.43}$$

On the one hand, this can be interpreted as the forecasting error we shall expect. On the other hand, we can read the output of the forecasting process as a distribution of demand rather than a point forecast. The mean of the distribution is $\hat{Y}_0$ and the standard deviation is $\mathrm{See}(Y_0)$.

²⁸ Note that in section A.10 we are mainly concerned with the estimate of the unknown numbers $\alpha$ and $\beta$. The demand $Y_0$, corresponding to a value $x_0$ of the independent variable, is a random variable that we are trying to predict. This is conceptually different and, for instance, we should talk about prediction intervals rather than confidence intervals. In this chapter we are a bit sloppy at times. For the sake of simplicity, we use the term "standard error of estimate" when referring to $\mathrm{See}(Y_0)$, which is conceptually not quite correct.

In section A.10 we show that

$$\mathrm{See}_b = \frac{\sigma_\epsilon}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}.$$

Similarly, one can show that $\mathrm{See}(Y_0)$ is given by

$$\mathrm{See}(Y_0) = \sqrt{\sigma_\epsilon^2 + \frac{\sigma_\epsilon^2}{n} + \frac{\sigma_\epsilon^2 (x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}. \tag{3.44}$$

We can read equation (3.44) and make sense of it. First, as $n$ tends to infinity, the second and third terms under the square root tend to zero (respectively, $n$ and the number of terms in the summation grow), while the first one remains unchanged. Unlike the case of the estimates $a$ and $b$ of the parameters $\alpha$ and $\beta$, the prediction error does not go to zero as $n$ tends to infinity. Actually, as $n$ tends to infinity the forecasting error tends to $\sigma_\epsilon$. Indeed, with an infinite number $n$ of past observations we can perfectly estimate the relationship between $Y$ and $x$, so we face no error in the estimates of $\alpha$ and $\beta$. However, this is just not enough to generate an error-free forecast.
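This limiting behavior can be checked numerically. The sketch below assumes the usual form of the standard error of a regression forecast, with three terms under the square root: the noise variance, the noise variance divided by n, and the noise variance times the squared distance of x0 from the mean of the past x values, divided by the sum of squared deviations of those values. The function name, the evenly spaced x values and the numbers are illustrative assumptions.

```python
import math

def see_forecast(xs, x0, sigma_eps):
    """Standard error of the forecast at x0, given past x values xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    noise = sigma_eps ** 2                             # randomness of demand
    position = sigma_eps ** 2 / n                      # vertical position of the line
    slope = sigma_eps ** 2 * (x0 - x_bar) ** 2 / s_xx  # slope error times distance
    return math.sqrt(noise + position + slope)

sigma_eps = 4.0  # hypothetical value
x0 = 40.0
for n in (5, 50, 500, 5000):
    # Hypothetical design: n values of x spread evenly over [0, 30].
    xs = [30.0 * i / (n - 1) for i in range(n)]
    print(n, round(see_forecast(xs, x0, sigma_eps), 4))
```

As n grows, the printed values decrease toward `sigma_eps` but never fall below it, which mirrors the argument that a perfect estimate of the line still leaves the randomness of the demand process.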

Indeed, a perfect estimate of the parameters leads to a perfect estimate of the expected level ($\alpha + \beta x_0$) of the demand $Y_0$, that is, the nonstochastic part of the demand process. However, the random part of the process, $\epsilon_0$, still creates random fluctuations we cannot predict. Thus, it leads to forecasting errors, as figure 3.23 shows. By now the first term in equation (3.44) shall be clear, and we can devote our attention to the second and third ones. They show the impact of errors of estimate of $a$ and $b$. To clearly tell the contribution of these errors we shall assume that $\epsilon_0$ is zero. In other words, we assume that the random noise is zero and the new demand observation lies on the line $Y = \alpha + \beta x$. Therefore, any error is due to the wrong estimate of the parameters, rather than to the randomness of the demand process.

Fig. 3.23 The forecasting error due to the variability of the demand process.

The second term, $\sigma_\epsilon^2/n$, is just the variance of the average of the $n$ draws of $\epsilon$ we have studied to estimate the regression line $Y = a + bx$. When the $n$ observations tend to lie above (below) the ideal line $Y = \alpha + \beta x$ (i.e., when the average of the $n$ draws of $\epsilon$ is greater, or lower, than zero), the estimated regression line tends to lie above (below) the ideal one. Thus the estimate $a$ tends to be larger (smaller) than the actual parameter $\alpha$. The error in the estimate of $\alpha$ leads to an error of estimate of $Y_0$ (see figure 3.24).²⁹

Fig. 3.24 Forecasting error due to a wrong estimate of $\alpha$.

Finally, the last term in equation (3.44) can be interpreted as the impact of errors of estimate of $\beta$ on the accuracy of the demand forecast $\hat{Y}_0$. To isolate this effect, we set to zero the sources of errors we have discussed so far. More formally we assume that:

• $\sum_{i=1}^{n} \epsilon_i = 0$, i.e., we assume that the average of the $n$ random draws is zero and thus the estimated line lies neither above nor below the ideal line;

• $\epsilon_0 = 0$, i.e., we assume that demand $Y_0$ is exactly on the ideal line $Y = \alpha + \beta x$.

²⁹ Notice that this is not the only source of error in the estimate of $\alpha$. Indeed, even when the average noise $\sum_{i=1}^{n} \epsilon_i / n$ is zero, we might still face an error of estimate of $\alpha$. In this case the draws are on the average neither above nor below the ideal line $Y = \alpha + \beta x$. This means that the estimated line lies neither above nor below the ideal one. Still, the estimate $b$ of the slope might be wrong and (in case of $\bar{x} \neq 0$) this can lead to errors in the estimate of $\alpha$ (see section A.10). So one might more properly say that this term shows the impact of errors in the vertical position of the estimated regression line, rather than errors of estimate of $\alpha$ per se.

The third term in equation (3.44) can be interpreted as the error of estimate of $\beta$ [see equation (3.44)] times $(x_0 - \bar{x})$. Why does the forecast accuracy depend on $\mathrm{See}_b$ and on the distance between $x_0$ and $\bar{x}$? The definition of $a$ shows that the estimated line goes through the barycenter of the demand observations $(\bar{x}, \bar{Y})$. Also, since we assume that $\sum_{i=1}^{n} \epsilon_i = 0$, the ideal line $Y = \alpha + \beta x$ intersects the estimated one in the barycenter of demand observations. Thus the errors in the estimate of the slope ($\mathrm{See}_b$) have no effect on the accuracy of the demand forecast when $x_0 = \bar{x}$. On the contrary, the error in the estimate of the slope generates large errors when $x_0$ is far from the point $\bar{x}$ where the two lines intersect (see figure 3.25).
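The role of the distance from the barycenter can be seen in a few lines of Python. The sketch assumes the usual standard error of the slope, i.e., the noise standard deviation divided by the square root of the sum of squared deviations of the past x values from their mean. The function name and the data are hypothetical.

```python
import math

def slope_error_contribution(xs, x0, sigma_eps):
    """Part of the forecast standard error due to the slope estimate alone."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    see_b = sigma_eps / math.sqrt(s_xx)  # usual standard error of the slope b
    return see_b * abs(x0 - x_bar)

xs = [5, 10, 15, 20, 25]  # hypothetical past values of x; barycenter at 15
for x0 in (15, 20, 30, 45):
    print(x0, round(slope_error_contribution(xs, x0, sigma_eps=4.0), 3))
```

The contribution is zero at the barycenter and grows linearly with the distance from it, so forecasts far outside the range of past observations are the ones most exposed to a wrong slope.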

Concept 3.4 The error of estimate is due to the randomness of the demand process, and the errors in the estimate of the intercept ($a$ vs. $\alpha$) and slope ($b$ vs. $\beta$) of the regression line.

Equation (3.44) shows the standard forecast error and thus enables us to build confidence intervals of demand $Y$. The analysis above shows that the standard error of estimate reaches a minimum when $x_0 = \bar{x}$, since $\bar{x}$ is the barycenter of past observations and thus is the point about which we have the most information. This relative abundance of information reduces the forecasting error.

Hence, the width of the prediction interval is affected by the distance between the barycenter of past observations $\bar{x}$ and the point $x_0$ for which we want to forecast demand. Figure 3.26 shows the confidence intervals (with confidence level 90%) for various $x_0$. This figure shows that they are wider for $x_0$ significantly to the left or to the right of $\bar{x}$.

Fig. 3.25 Forecasting errors due to errors in the estimate of the slope of the regression line.

Fig. 3.26 Forecasting error as a function of $x_0$.

Fig. 3.27 Effects of a partial knowledge of $x_0$.
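A 90% interval of this kind can be sketched as follows. The code treats the noise standard deviation as known and therefore uses a normal quantile. When it has to be estimated from the same data, a Student's t quantile with n - 2 degrees of freedom would typically be used instead. The function name, the data and the value of `sigma_eps` are illustrative assumptions.

```python
import math
from statistics import NormalDist

def prediction_interval(xs, ys, x0, sigma_eps, level=0.90):
    """Interval for demand at x0, treating sigma_eps as known."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    forecast = a + b * x0
    see = sigma_eps * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return forecast - z * see, forecast + z * see

# Hypothetical past observations; the barycenter of x is 20.
xs = [10, 15, 20, 25, 30]
ys = [32, 41, 52, 58, 72]
for x0 in (5, 20, 35):
    low, high = prediction_interval(xs, ys, x0, sigma_eps=4.0)
    print(f"x0 = {x0:2d}: [{low:6.2f}, {high:6.2f}]  width = {high - low:5.2f}")
```

The interval is narrowest at the barycenter and equally wide at points that are equally far from it on either side.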

The above findings hold only under the tight assumptions of the regression model (see section A.10). When we study real-life data the assumptions are typically not fully met. For example, the relationship between $x$ and $Y$ might be linear only within a given range of $x$. Outside this range it might be nonlinear, and thus we should expect biased forecasts and far larger errors than the linear regression model predicts.

Finally, we investigate the effect of less-than-perfect information on $X_0$.³⁰ For example, consider a model where the demand for a given kind of food depends on the temperature. If we want to use this model for forecasting purposes, we should know the future temperature. However, the future temperature is uncertain and it is known, at best, in terms of a probability distribution (or confidence intervals). Hence, when we use temperature to estimate future demand, we have an additional source of uncertainty. Geometrically, partial information on $X_0$ means that we do not know exactly where, on the $X$ axis, we shall read the relationship between $Y$ and $X$. Hence, the confidence interval on $Y$ is a sort of area on the $(X, Y)$ plane rather than a simple segment, as we face uncertainty on both $X$ and $Y$. We do not know the right point on the $X$ axis, and still, for a given point on the $X$ axis, we only have distributional information on $Y_0$.

³⁰ Notice that in this case we use $X_0$ instead of $x_0$, as we do not know the future level of $X$ and thus it can be interpreted as a random variable rather than a number.
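A small Monte Carlo experiment gives a feel for this additional source of uncertainty. The sketch assumes that the parameters of the line are perfectly known (to isolate the effect of the uncertain independent variable) and that its future value is normally distributed and independent of the demand noise. All names and numbers are hypothetical.

```python
import random
import statistics

def simulate_future_demand(alpha, beta, sigma_eps, x_mean, x_sd, draws=20000, seed=1):
    """Sample future demand when the future value of X is itself random."""
    rng = random.Random(seed)
    demand = []
    for _ in range(draws):
        x0 = rng.gauss(x_mean, x_sd)      # uncertain future value of X
        eps0 = rng.gauss(0.0, sigma_eps)  # randomness of the demand process
        demand.append(alpha + beta * x0 + eps0)
    return demand

# Hypothetical parameters: same process, with and without uncertainty on X.
known = simulate_future_demand(20.0, 1.5, 4.0, x_mean=25.0, x_sd=0.0)
uncertain = simulate_future_demand(20.0, 1.5, 4.0, x_mean=25.0, x_sd=3.0)
print("std of demand, X known:    ", round(statistics.stdev(known), 2))
print("std of demand, X uncertain:", round(statistics.stdev(uncertain), 2))
```

Under these assumptions the variance of demand is the noise variance plus the squared slope times the variance of the independent variable, so the spread of the second sample is clearly larger than that of the first.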


3.10.1 Setting up data for regression

As we discuss in section A.10, linear regression relies on a rather wide range of assumptions on the demand process (more generally, a random process). Often, demand data hardly meet these assumptions and thus cannot be used directly for linear regression.

Example 3.17 When we want to investigate the relationship between temperature and demand of a food product, or demand elasticity, i.e., the increase in demand due to promotions and/or price cuts, we might not be able to use straight demand data, as they might be affected by a significant seasonality that might lead to erroneous conclusions.

Let us consider a retail company that, during the weekends, cuts by 20% the price of a product that on average sells 100 units/day. When we look at demand and price data we might be led to ascribe the whole increase in demand to price elasticity. On the contrary, it is, at least partially, due to weekly fluctuations that make demand increase on Saturdays and Sundays. So we might overestimate the price sensitivity of demand, as we attribute both the seasonal fluctuation and the increase due to the price cut to the price elasticity of demand. For a more detailed discussion we refer to [7]. □

This is why we might want to "clean" the data before we apply linear regression, to make them fit the assumptions of the model and make sure the analysis is reliable. In the case of example 3.17 above, we should first remove the seasonality of demand from the dataset and then analyze price sensitivity to understand the relationship between price and demand. A second option is to use multiple regression, which tries to estimate both effects at once.
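The first option can be illustrated with a deliberately simple, noise-free sketch. It assumes that day-of-week seasonal indices can be estimated from a week without any price cut, and it then compares weekend and weekday demand before and after dividing by those indices. The data, the 20% true lift and the function name are hypothetical and chosen only to make the arithmetic transparent.

```python
def seasonal_indices(baseline_week):
    """Day-of-week indices: each day's demand divided by the weekly average."""
    average = sum(baseline_week) / len(baseline_week)
    return [d / average for d in baseline_week]

# Hypothetical data, Monday to Sunday. The baseline week has no price cut;
# the promotion week has a price cut on Saturday and Sunday only.
baseline_week = [88, 88, 88, 88, 88, 130, 130]  # averages 100 units/day
promo_week = [88, 88, 88, 88, 88, 156, 156]

indices = seasonal_indices(baseline_week)
deseasonalized = [d / s for d, s in zip(promo_week, indices)]

# Naive reading: weekend versus weekday demand in the promotion week.
naive_lift = (sum(promo_week[5:]) / 2) / (sum(promo_week[:5]) / 5) - 1
# Cleaned reading: the same comparison on deseasonalized demand.
clean_lift = (sum(deseasonalized[5:]) / 2) / (sum(deseasonalized[:5]) / 5) - 1

print(f"naive lift: {naive_lift:.0%}, lift after removing seasonality: {clean_lift:.0%}")
```

In this constructed example the naive comparison attributes the weekend seasonality to the price cut and therefore overstates its effect, while the deseasonalized data recover the lift that was built into the numbers.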

W.3.11 FORECASTING MODELS BASED ON MULTIPLE REGRESSION
