6. Multiple Regression Analysis:
Quadratics, Interaction Terms and Model Selection
Read Wooldridge (2013), Chapter 6.2-6.3
6. Quadratics, Interaction Terms and Model Selection . Quantitative Methods of Economic Analysis . 2949605 . Chairat Aemkulwat
Outline
I. Quadratic Function II. Interaction Terms III. Adjusted R‐Squared IV. AIC and SIC
V. Selection of Regressors
I. Models with Quadratics
• Consider a model
y: wage; x: exper
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$
• Quadratic functions are used to capture decreasing or increasing marginal effects.
$\beta_2 < 0$ (with $\beta_1 > 0$): decreasing marginal effect
$\beta_2 > 0$ (with $\beta_1 < 0$): increasing marginal effect
• Interpretation: slope coefficients.
$\Delta \hat{y} \approx (\hat{\beta}_1 + 2\hat{\beta}_2 x)\,\Delta x$
Questions: 1) What is the marginal effect of x on y?
2) What is ?
Example: Quadratic equation of wages.
$\widehat{wage} = 3.73 + 0.298\,exper - 0.0061\,exper^2$
s.e.    (0.35)   (0.041)  (0.0009)
t-stat  [10.77]  [7.28]   [-6.79]
$n = 526$, $R^2 = 0.093$
Interpretation:
1. exper has a diminishing effect on wage ($\hat{\beta}_2 < 0$).
2. The marginal effect of exper on wage: the return to the second year of experience is less than the return to the first.
Compare the 1st year (.298), the 2nd year (.286) and the 11th year (.176).
3. The marginal effect of exper on wage eventually becomes negative: for the 26th year (going from 25 to 26 years) it is -.007.
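The year-by-year comparison above can be reproduced with a few lines of Python. This is only a sketch using the rounded slide estimates; the function name `marginal_effect` is an illustrative choice, not something from the lecture.

```python
# Rounded estimates from the quadratic wage equation above.
B1 = 0.298     # coefficient on exper
B2 = -0.0061   # coefficient on exper^2

def marginal_effect(exper: float, b1: float = B1, b2: float = B2) -> float:
    """Approximate change in wage from one more year, starting at `exper` years."""
    return b1 + 2.0 * b2 * exper

# Starting points 0, 1, 10 and 25 correspond to the 1st, 2nd, 11th and 26th year.
for start in (0, 1, 10, 25):
    print(f"year {start + 1:>2}: {marginal_effect(start):+.3f}")
```

The approximation evaluates the slope at the starting level of experience, which is how the slide obtains .286 for the second year and -.007 for the 26th.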
Regress wage on exper and exper^2
Dependent Variable: WAGE
Sample: 1 526    Included observations: 526

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              3.725406     0.345939      10.76896   0
EXPER          0.2981       0.040966       7.27685   0
EXPER^2       -0.00613      0.000903      -6.79199   0

R-squared             0.092769    Mean dependent var       5.896103
Adjusted R-squared    0.0893      S.D. dependent var       3.693086
S.E. of regression    3.524334    Akaike info criterion    5.362947
Sum squared resid     6496.147    Schwarz criterion        5.387274
Log likelihood        -1407.46    F-statistic              26.73982
Durbin-Watson stat    1.801688    Prob(F-statistic)        0
Quadratic has a parabolic shape.
The turning point
• The turning point is the absolute value of the coefficient on x divided by twice the coefficient on $x^2$.
x* = .298/[2(.0061)] = 24.4 years.
• Is it true that the return to "exper" is negative after 24 years of experience?
Tabulation of EXPER
Date: 05/20/03   Time: 11:49
Sample: 1 526   Included observations: 526   Number of categories: 6

Value       Count   Cumulative Count
[0, 10)       206                206
[10, 20)      129                335
[20, 30)       80                415
[30, 40)       66                481
[40, 50)       43                524
[50, 60)        2                526
Total         526                526
$x^* = \left| \hat{\beta}_1 / (2\hat{\beta}_2) \right| = 24.4$ years
In the "Series" window, choose View/One Way Tabulations.
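To judge whether the downward-sloping part of the curve matters in this sample, the turning point can be set against the tabulation of EXPER. The sketch below uses the rounded slide estimates and the bin counts from the tabulation. Because the data are binned, it can only give a lower bound on the number of workers past the turning point; the variable names are illustrative.

```python
b1, b2 = 0.298, -0.0061            # slide estimates for exper and exper^2
turning_point = abs(b1 / (2.0 * b2))

# Counts per experience bin, copied from the tabulation of EXPER.
bins = {(0, 10): 206, (10, 20): 129, (20, 30): 80,
        (30, 40): 66, (40, 50): 43, (50, 60): 2}
n = sum(bins.values())

# Bins lying entirely above the turning point give a lower bound on the
# number of workers on the downward-sloping part of the curve.
above = sum(count for (low, _), count in bins.items() if low >= turning_point)

print(f"turning point = {turning_point:.1f} years")
print(f"at least {above} of {n} workers ({above / n:.0%}) are past it")
```

Some of the 80 workers in the [20, 30) bin are also past 24.4 years, so the true share is larger than this bound. That is too many observations to dismiss the negative region as irrelevant.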
Evaluation: Is x* = 24.4 years realistic? Maybe not.
(1) The estimated effect of exper on wage may be biased, perhaps because we control for too few other factors.
(2) The functional relationship between wage and exper may be misspecified.
To find the marginal effect of x on y, we often evaluate it at the average value of x. For example, in the wage equation,
$\bar{x} = 17.01$
$\Delta \hat{y} / \Delta x \approx .2981 - 2(.006)(17.01) = .09398$
Sample: 1 526
EXPER WAGE
Mean 17.01711 5.896103
Median 13.5 4.65
Maximum 51 24.98
Minimum 1 0.53
Std. Dev. 13.57216 3.693086
Skewness 0.706865 2.007325
Kurtosis 2.357318 7.970083
Jarque-Bera 52.85587 894.6195
Probability 0 0
Observations 526 526
Sample Averages: View/Descriptive Statistic/Common Sample
Quadratic
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$
1) So far we have studied the quadratic that captures a decreasing effect of x on y:
$\beta_1 > 0$; $\beta_2 < 0$
2) Increasing effect of x on y:
$\beta_1 < 0$; $\beta_2 > 0$
– See Example 6.2, Effect of Pollution on Housing Prices
Quadratic has a U-shape.
No turning points
• 3) Increases in x always have a positive and increasing effect on y (for x ≥ 0):
$\beta_1 > 0$; $\beta_2 > 0$
• 4) Increases in x always have a negative effect on y that grows in magnitude (for x ≥ 0):
$\beta_1 < 0$; $\beta_2 < 0$
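The four sign combinations for the quadratic can be collected in one small helper. This is a sketch, assuming x is nonnegative as in the experience example; the function name `quadratic_shape` and its wording are illustrative, not part of the lecture.

```python
def quadratic_shape(b1: float, b2: float) -> str:
    """Describe y = b0 + b1*x + b2*x^2 over x >= 0 from the signs of b1 and b2."""
    if b2 == 0:
        return "linear: constant marginal effect"
    turning_point = -b1 / (2.0 * b2)
    if turning_point <= 0:
        # b1 and b2 share a sign, so the slope never changes sign for x > 0.
        direction = "positive" if b2 > 0 else "negative"
        return f"no turning point for x > 0: {direction} effect growing in magnitude"
    shape = "inverted U (rises, then falls)" if b2 < 0 else "U-shape (falls, then rises)"
    return f"{shape}, turning point at x = {turning_point:.1f}"

print(quadratic_shape(0.298, -0.0061))   # wage equation estimates
print(quadratic_shape(-1.0, 1.0))        # hypothetical coefficients
print(quadratic_shape(1.0, 1.0))         # hypothetical coefficients
```

Only the wage-equation call uses estimates from the slides; the other two calls use made-up coefficients to exercise the remaining cases.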
II. Model with interaction terms
• Consider a model
$\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,tenure + \beta_3\,educ \cdot tenure + u$
$\log(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u$
• What is the partial effect of educ on log(wage)?
The semi-elasticity of wages with respect to education is
$\Delta \log(y) / \Delta x_1 = \beta_1 + \beta_3 x_2$
• Suppose that $\beta_1 > 0$ and $\beta_3 > 0$.
This implies that an additional year of education yields a larger percentage increase in wages for workers with more years of tenure at the firm.
Example: Wage equation with interaction terms
$\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,tenure + \beta_3\,educ \cdot tenure + u$
$\widehat{\log(wage)} = .514 + .078\,educ + .0088\,tenure + .0014\,educ \cdot tenure$
s.e.    (.113)  (.00884)  (.010685)  (.000857)
t-stat  [4.53]  [8.78]    [0.83]     [1.64]
$n = 526$, $R^2 = 0.312065$
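Because of the interaction term, the estimated return to education depends on tenure. The sketch below evaluates it at a few tenure values, using the unrounded coefficients from the EViews output for this regression (.07763 and .001405) and the sample mean of tenure (5.104563). The function name is illustrative.

```python
B_EDUC = 0.07763        # coefficient on educ
B_INTERACT = 0.001405   # coefficient on educ*tenure

def return_to_educ(tenure: float) -> float:
    """Estimated change in log(wage) from one more year of education."""
    return B_EDUC + B_INTERACT * tenure

# 5.104563 is the sample mean of tenure; the other values are arbitrary.
for tenure in (0.0, 5.104563, 10.0, 20.0):
    print(f"tenure = {tenure:9.6f}: {100 * return_to_educ(tenure):.2f}% per year of educ")
```

At zero tenure the return is just the coefficient on educ, which is why that coefficient alone is not "the" return to education in this model.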
• What is the return to education?
At the mean of tenure, $\Delta \log(y) / \Delta x_1 = \hat{\beta}_1 + \hat{\beta}_3 \bar{x}_2 = .07763 + .001405(5.104563) = .085$
• How can we test, using EViews, whether the return to education at the mean value of tenure is statistically significant?
$\Delta \log(y) / \Delta x_1 = \hat{\beta}_1 + \hat{\beta}_3 \bar{x}_2$
Sample: 1 526
        WAGE       EDUC       TENURE
Mean    5.896103   12.56274   5.104563
EViews equation specification: log(wage) c educ tenure educ*tenure
Dependent Variable: LOG(WAGE)   Method: Least Squares
Sample: 1 526   Included observations: 526

Variable       Coefficient   Std. Error   t-Statistic   Prob.
C                 0.514352     0.113442      4.534045   0
EDUC              0.07763      0.00884       8.781627   0
TENURE            0.008848     0.010685      0.828035   0.408
EDUC*TENURE       0.001405     0.000857      1.640133   0.1016

R-squared             0.312065    Mean dependent var       1.623268
Adjusted R-squared    0.308111    S.D. dependent var       0.531538
S.E. of regression    0.442133    Akaike info criterion    1.213162
Sum squared resid     102.0413    Schwarz criterion        1.245598
Log likelihood        -315.062    F-statistic              78.9308
Durbin-Watson stat    1.777168    Prob(F-statistic)        0
Sample: 1 526
WAGE EDUC TENURE
Mean 5.896103 12.56274 5.104563
Median 4.65 12 2
Maximum 24.98 18 44
Minimum 0.53 0 0
Std. Dev. 3.693086 2.769022 7.224462
Skewness 2.007325 -0.61957 2.110273
Kurtosis 7.970083 4.884245 7.658076
Jarque-Bera 894.6195 111.4653 865.9427
Probability 0 0 0
Observations 526 526 526
Sample Averages: View/Descriptive Statistic/Common Sample
Dependent Variable: LOG(WAGE) Sample: 1 526
Included observations: 526
Variable Coefficient Std. Error t-Statistic Prob.
C 0.514352 0.113442 4.534045 0
EDUC 0.084794 0.007059 12.01189 0
TENURE 0.008848 0.010685 0.828035 0.408
EDUC*(TENURE-5.1) 0.001405 0.000857 1.640133 0.1016
R-squared 0.312065 Mean dependent var 1.623268
Adjusted R-squared 0.308111 S.D. dependent var 0.531538
S.E. of regression 0.442133 Akaike info criterion 1.213162
Sum squared resid 102.0413 Schwarz criterion 1.245598
Log likelihood -315.062 F-statistic 78.9308
Durbin-Watson stat 1.777168 Prob(F-statistic) 0
$H_0: \beta_1 + \beta_3 \bar{x}_2 = 0$, with $\bar{x}_2 = 5.1$ (mean tenure)
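Why the regression with EDUC*(TENURE-5.1) tests this hypothesis can be seen by rewriting the model. The short derivation below uses $\theta$ for the return to education at mean tenure; that symbol is introduced here for convenience and does not appear in the slides.

```latex
\begin{align*}
\theta &\equiv \beta_1 + \beta_3 \bar{x}_2
  \quad\Longrightarrow\quad \beta_1 = \theta - \beta_3 \bar{x}_2, \\
\log(y) &= \beta_0 + (\theta - \beta_3 \bar{x}_2)\, x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u \\
        &= \beta_0 + \theta\, x_1 + \beta_2 x_2 + \beta_3 x_1 (x_2 - \bar{x}_2) + u .
\end{align*}
```

The coefficient on EDUC in the reparameterized regression therefore estimates $\theta$ directly (.084794), and its reported t-statistic (12.01) is the test statistic for $H_0$. The other coefficients and the fit statistics are unchanged, as the output shows.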
III. $R^2$ and Adjusted R-Squared
• $R^2$ measures the proportion of the sample variation in y that is explained by $x_1, x_2, \dots, x_k$.
• Cautions in using $R^2$
1) Choosing $x_1, x_2, \dots, x_k$ to maximize $R^2$ can lead to a nonsensical model.
2) A small $R^2$ does not imply that the model is useless.
3) $R^2$ can never fall when a new x is added to the model.
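$R^2$ and Adjusted R-Squared
• R-squared is defined as $R^2 = SSE/SST = 1 - SSR/SST$, where SSE is the explained sum of squares and SSR is the residual sum of squares.
• Population R-squared
$R^2 = 1 - (SSR/n)/(SST/n)$ estimates $1 - \sigma_u^2 / \sigma_y^2$
SSR/n is a (biased) estimator of $\sigma_u^2$, the population variance of $u_i$.
SST/n is a (biased) estimator of $\sigma_y^2$, the population variance of $y_i$.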
Adjusted R-Squared defined
• Note that
$\hat{\sigma}^2 = SSR/(n-k-1)$ is the unbiased estimator of $\sigma_u^2$
$SST/(n-1)$ is the unbiased estimator of $\sigma_y^2$
• Adjusted $R^2$ is defined as
$\bar{R}^2 = 1 - [SSR/(n-k-1)] / [SST/(n-1)]$
Terms: corrected R-squared, adjusted R-squared
• In terms of unbiasedness, adjusted $R^2$ is not a better estimator of the population R-squared than $R^2$: the ratio of two unbiased estimators is not itself unbiased.
$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k-1}$
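The second form of the definition is convenient for checking reported values. A minimal sketch, applied to the quadratic wage regression (R-squared 0.092769, n = 526, k = 2); the function name `adjusted_r2` is illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k slope coefficients plus an intercept."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Quadratic wage equation: regressors are exper and exper^2, so k = 2.
wage_adj_r2 = adjusted_r2(0.092769, n=526, k=2)
print(f"{wage_adj_r2:.4f}")
```

The result agrees with the "Adjusted R-squared" entry (0.0893) in the EViews output for that regression.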
Properties of Adj. $R^2$
$\bar{R}^2$ imposes a penalty for adding additional regressors to the model (k increases).
In summary, when an additional regressor is added, $R^2$ rises (or stays the same) while $n-k-1$ falls, which is the penalty. $\bar{R}^2$ can therefore rise or fall.
$\bar{R}^2$ helps to choose a model.
• t-, F-statistics and $\bar{R}^2$
$\bar{R}^2$ increases when
– the t-statistic on the new variable is greater than one in absolute value;
– the F-statistic for joint significance of a group of new variables is greater than one.
$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k-1}$
Choosing nonnested models based on $\bar{R}^2$
$\widehat{\log(salary)} = 11.10 + .068\,years + .016\,gamesyr + .0014\,bavg + .0359\,hrunsyr$
t-stat  [41.48]  [5.59]  [10.08]  [1.33]  [4.96]
$n = 353$; $R^2 = .625388$; $\bar{R}^2 = .621082$
$\widehat{\log(salary)} = 11.27 + .070\,years + .011\,gamesyr + .00074\,bavg + .0165\,rbisyr$
t-stat  [41.2]  [5.78]  [5.20]  [0.69]  [5.12]
$n = 353$; $R^2 = .626937$; $\bar{R}^2 = .622649$
Choosing nested models based on $\bar{R}^2$
$\widehat{\log(wage)} = .514 + .078\,educ + .0088\,tenure + .0014\,educ \cdot tenure$
t-stat  [4.53]  [8.78]  [0.83]  [1.64]
$n = 526$, $R^2 = 0.312065$, $\bar{R}^2 = .308111$
$\widehat{\log(wage)} = .404 + .087\,educ + .0258\,tenure$
t-stat  [4.41]  [12.38]  [9.63]
$n = 526$, $R^2 = 0.308520$, $\bar{R}^2 = .305875$
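The nested pair above illustrates the rule that adjusted R-squared rises when the added variable has a t-statistic above one in absolute value. The sketch below recomputes both adjusted R-squared values from the reported R-squared values. It also backs out the t-statistic on educ*tenure from the change in R-squared, using the standard F-test for one exclusion restriction (where F = t²). Variable names are illustrative.

```python
import math

def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k slope coefficients plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

n = 526
r2_restricted, r2_unrestricted = 0.308520, 0.312065

restricted = adjusted_r2(r2_restricted, n, k=2)      # educ, tenure
unrestricted = adjusted_r2(r2_unrestricted, n, k=3)  # adds educ*tenure

# F-statistic for excluding the single added regressor; its square root is |t|.
f_stat = (r2_unrestricted - r2_restricted) / ((1.0 - r2_unrestricted) / (n - 3 - 1))
t_implied = math.sqrt(f_stat)

print(f"restricted:   {restricted:.6f}")
print(f"unrestricted: {unrestricted:.6f}")
print(f"implied |t| on educ*tenure: {t_implied:.2f}")
```

The implied |t| matches the reported 1.64, which exceeds one, so the model with the interaction term has the higher adjusted R-squared even though the interaction is not significant at the 10% level.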
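$R^2$ and $\bar{R}^2$ cannot be used to choose between models with different dependent variables
• Example: CEO salary of 177 firms
salary   1990 compensation, thousands of dollars
sales    1990 firm sales, millions
mktval   market value, end 1990, millions
ceoten   years as CEO with company
$\widehat{\log(salary)} = 4.504 + .163\,\log(sales) + .109\,\log(mktval) + .0117\,ceoten$
t-stat  [17.51]  [4.15]  [2.20]  [2.20]
$n = 177$, $R^2 = 0.31815$, $\bar{R}^2 = .306327$
$\widehat{salary} = 613.43 + .019\,sales + .0234\,mktval + 12.703\,ceoten$
t-stat  [9.40]  [1.89]  [2.47]  [2.26]
$n = 177$, $R^2 = 0.201274$, $\bar{R}^2 = .187424$
• Which model is preferred? The two $R^2$ values are not directly comparable, because they measure explained variation in different dependent variables (log(salary) versus salary).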
IV. AIC and SIC
• The Akaike Information Criterion (AIC) and the Schwarz Information Criterion (SIC) are defined as follows.
$AIC = e^{2k/n}\,\dfrac{\sum \hat{u}_i^2}{n}$
$SIC = n^{k/n}\,\dfrac{\sum \hat{u}_i^2}{n}$
where k is the number of parameters and $\sum \hat{u}_i^2$ is the sum of squared residuals.
• In comparing two or more models, the model with the lowest values of AIC and SIC is preferred.
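The AIC and SIC values printed in the EViews outputs in these notes (for example 5.362947 and 5.387274 for the quadratic wage regression) are not on the scale of the formulas above. They appear to come from a per-observation log-likelihood form. That is an inference from the numbers, not something the slides state. The sketch below computes both forms from the sum of squared residuals, assuming a Gaussian log likelihood and counting the intercept in k; the function names are illustrative.

```python
import math

def aic_sic_levels(ssr: float, n: int, k: int) -> tuple[float, float]:
    """AIC and SIC in the multiplicative form given above."""
    s2 = ssr / n
    return math.exp(2 * k / n) * s2, n ** (k / n) * s2

def aic_sic_loglik(ssr: float, n: int, k: int) -> tuple[float, float]:
    """Per-observation AIC and SIC based on the Gaussian log likelihood (assumed form)."""
    loglik = -0.5 * n * (1.0 + math.log(2.0 * math.pi) + math.log(ssr / n))
    return (-2.0 * loglik + 2.0 * k) / n, (-2.0 * loglik + k * math.log(n)) / n

# Quadratic wage regression: SSR = 6496.147, n = 526, k = 3 (including the intercept).
aic_lvl, sic_lvl = aic_sic_levels(6496.147, 526, 3)
aic_ll, sic_ll = aic_sic_loglik(6496.147, 526, 3)
print(f"level form:          AIC = {aic_lvl:.4f}, SIC = {sic_lvl:.4f}")
print(f"log-likelihood form: AIC = {aic_ll:.6f}, SIC = {sic_ll:.6f}")
```

Under this assumption the second form equals 1 + ln(2π) plus the log of the first. Both forms therefore rank models with the same dependent variable and sample identically.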
Example: Baseball player's salary revisited.
$\widehat{\log(salary)} = 11.10 + .068\,years + .016\,gamesyr + .0014\,bavg + .0359\,hrunsyr$
$n = 353$; $R^2 = .625388$; $\bar{R}^2 = .621082$
AIC = 2.216709   SIC = 2.271474
$\widehat{\log(salary)} = 11.27 + .070\,years + .011\,gamesyr + .00074\,bavg + .0165\,rbisyr$
$n = 353$; $R^2 = .626937$; $\bar{R}^2 = .622649$
AIC = 2.212566   SIC = 2.267332
Example: Wage models in linear and quadratic functions
$\widehat{wage} = 3.73 + 0.298\,exper - 0.0061\,exper^2$
$n = 526$; $R^2 = 0.092769$; $\bar{R}^2 = .089300$
AIC = 5.362947   SIC = 5.387274
$\widehat{wage} = 5.37 + 0.0307\,exper$
$n = 526$; $R^2 = .012747$; $\bar{R}^2 = .010863$
AIC = 5.443674   SIC = 5.45989
V. Selection of Regressors
Controlling for too many factors
• Suppose we want to study the effect of project investment on employment of state enterprises in the energy sector. Variables are as follows.
empnum    number of employees employed (persons)
invsize   amount of money invested (millions of baht)
product   output produced by the project (millions)
years     duration of the project
$\widehat{empnum} = 65.07 - .045\,invsize + .0802\,product - 26.43\,years$
t-stat  [.623]  [-2.27]  [5.91]  [-1.23]
$n = 25$, $R^2 = .802802$, $\bar{R}^2 = .774630$
$\widehat{empnum} = 717.7 + .118\,invsize - 148.69\,years$
t-stat  [3.72]  [6.29]  [-4.41]
$n = 25$, $R^2 = .601889$, $\bar{R}^2 = .565697$
Tradeoff when adding Regressors
• Tradeoff when the new variable is correlated with the existing regressors:
– Adding a new independent variable may exacerbate the multicollinearity problem.
– But adding a regressor generally reduces the error variance.
• Add a regressor:
We should add a regressor that affects y but is uncorrelated with all of the independent variables of interest.
Example: Effect of wine consumption in fifty states
• Wine Equation
$wine = \beta_0 + \beta_1\,price + \beta_2\,income + u$
• We may want to include individual characteristics in the regression to better explain the variation in y (e.g., age and level of education).
$wine = \beta_0 + \beta_1\,price + \beta_2\,income + \beta_3\,age + \beta_4\,educ + u$
Functional Form
• We’ve seen that a linear regression can really fit nonlinear relationships
• Can use logs on RHS, LHS or both
• Can use quadratic forms of x’s
• Can use interactions of x’s
• How do we know if we’ve gotten the right functional form for our model?
Functional Form
• First, use economic theory to guide you.
• Think about the interpretation.
• Does it make more sense for x to affect y in percentage (use logs) or absolute terms?
• Does it make more sense for the derivative of y with respect to $x_1$ to vary with $x_1$ (quadratic), to vary with $x_2$ (interactions), or to be fixed?
Recap of MLR:
Quadratic and Interaction
Quadratic Function
Interaction Terms
Adjusted R‐Squared
AIC and SIC
Selection of Regressors