CFA 2018 Level 2 Quantitative

Quantitative Methods

Level 2 -- 2017

Instructor: Feng

Brief Introduction

Topic weight:

Study Session 1-2 Ethics & Professional Standards 10-15%

Study Session 3 Quantitative Methods 5-10%

Study Session 4 Economics 5-10%

Study Session 5-6 Financial Reporting and Analysis 15-20%

Study Session 7-8 Corporate Finance 5-15%

Study Session 9-11 Equity Investment 15-25%

Study Session 12-13 Fixed Income 10-20%

Study Session 14 Derivatives 5-15%

Study Session 15 Alternative Investments 5-10%

Study Session 16-17 Portfolio Management 5-10%

Weights: 100%

Brief Introduction

Content:

Ø Study session 3: Quantitative Methods for Valuation

• Reading 9: Correlation and Regression

• Reading 10: Multiple Regression and Issues in

Regression Analysis

• Reading 11: Time-Series Analysis

• Reading 12: Excerpt from “Probabilistic approaches: scenario analysis, decision trees, and simulations”

Brief Introduction

Syllabus comparison:


Brief Introduction

Recommended reading:

Ø Quantitative Investment Analysis (Chinese edition: 定量投资分析)

• Richard A. DeFusco, Dennis W. McLeavey, Jerald E. Pinto, David E. Runkle

• ISBN: 978-7-111-38802-9

• China Machine Press

Brief Introduction

Study advice:

Ø This course builds on itself step by step; make sure you understand each knowledge point before moving on;

Ø Combine the lectures with practice problems, but simply grinding through question banks is not recommended;

Ø Most importantly, listen to the lectures carefully and attentively.

Happiness is having someone to love, something to do, something to believe in, and something to look forward to!

Correlation Analysis

Tasks:

Ø Calculate and interpret a sample covariance and a

sample correlation coefficient;

Ø Formulate a hypothesis test of the population correlation coefficient;


Correlation Analysis

Scatter plots

Ø A graph that shows the relationship between the observations for two data series in two dimensions.

[Scatter plots comparing paired observations for Japan, Switzerland, U.S., U.K., Australia, and South Korea]

Sample covariance

Ø A statistical measure of the degree to which two variables move together; it captures the linear relationship between two variables.

Ø Ranges of Cov(X,Y): -∞ < Cov(X,Y) < +∞.

üCov(X,Y) > 0: the two variables tend to move together;

üCov(X,Y) < 0: the two variables tend to move in opposite direction.

 

$$\mathrm{Cov}(X,Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$

Correlation Analysis

Sample correlation coefficient

Ø A measure of the direction and extent of linear association between two variables.

Ø Range of $r_{XY}$: $-1 \le r_{XY} \le +1$.

$$r_{XY} = \frac{\mathrm{Cov}(X,Y)}{s_X s_Y}$$
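Both formulas are easy to check numerically. A minimal Python sketch (data invented for illustration) that computes the sample covariance and correlation from first principles and cross-checks them against numpy:

```python
import numpy as np

# Illustrative data, invented for this example
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
# Sample covariance: sum of cross-deviations divided by (n - 1)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
# Sample correlation: covariance scaled by the two sample standard deviations
r_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(cov_xy, r_xy)
# Cross-check against numpy's built-ins
assert np.isclose(cov_xy, np.cov(x, y, ddof=1)[0, 1])
assert np.isclose(r_xy, np.corrcoef(x, y)[0, 1])
```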

Correlation Analysis

Sample correlation coefficient (Cont.)

r = +1 (perfect positive linear correlation)

r = -1 (perfect negative linear correlation)

Correlation Analysis

Sample correlation coefficient (Cont.)

0 < r < 1 (positive linear correlation)

-1 < r < 0 (negative linear correlation)

Correlation Analysis

Sample correlation coefficient (Cont.)

r = 0 (no linear correlation)

Correlation Analysis

Steps of hypothesis testing (Review of Level 1)

Ø Step 1: stating the hypotheses: relation to be tested;

Ø Step 2: identifying the appropriate test statistic and its probability distribution;

Ø Step 3: specifying the significance level;

Ø Step 4: stating the decision rule;

Ø Step 5: collecting the data and calculating the test statistic;

Ø Step 6: making the statistical decision;

Ø Step 7: making the economic or investment decision.

Hypothesis testing of correlation

Ø Test whether the correlation coefficient between two variables is equal to zero.

üH0: ρ=0, Ha: ρ≠0;

üt-test: df=n-2;

üTwo-tailed test;

üDecision rule: reject H0 if t > +tcritical, or t < -tcritical.

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$


Example:

An analyst wants to test the correlation between variable X and variable Y. The sample size is 20, and he finds that the covariance between X and Y is 16. The standard deviation of X is 4 and the standard deviation of Y is 8. At a 5% significance level, test the significance of the correlation coefficient between X and Y.

Correlation Analysis

Answer:

Ø H0: ρ=0, Ha: ρ≠0;

Ø Sample correlation coefficient r = 16/(4×8) = 0.5;

Ø t-statistic: $t = 0.5 \times \sqrt{\dfrac{20-2}{1-0.25}} = 2.45$;

Ø The critical value of two-tailed t-test with df=18 and significance level of 5% is 2.101;

Ø Since 2.45 is larger than 2.101, the null hypothesis can be rejected, and we can say the correlation coefficient between X and Y is significantly different from zero.
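A minimal sketch reproducing this example's arithmetic in Python; scipy is used only to look up the two-tailed critical value (all inputs are the numbers given above):

```python
from math import sqrt
from scipy import stats

n, cov_xy, s_x, s_y, alpha = 20, 16.0, 4.0, 8.0, 0.05

r = cov_xy / (s_x * s_y)                       # 16 / (4*8) = 0.5
t = r * sqrt(n - 2) / sqrt(1 - r**2)           # test statistic, df = n - 2
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # two-tailed critical value

print(f"r = {r:.2f}, t = {t:.2f}, critical = {t_crit:.3f}")
# t = 2.449 > 2.101, so reject H0: the correlation differs from zero
```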

Correlation Analysis

Limitations of correlation analysis

Ø Outliers: may result in false statistical significance of a linear relationship.

Correlation Analysis

Limitations of correlation analysis (Cont.)

Ø Spurious correlation: statistically significant correlation exists when in fact there is no relation (no economic explanation).


Limitations of correlation analysis (Cont.)

Ø Nonlinear relationships: two variables can have a strong nonlinear relation and still have a very low correlation.

Correlation Analysis

Summary

Ø Importance: ☆☆

Ø Content:

• Covariance and correlation coefficient;

• Hypothesis testing of the correlation coefficient;

• Limitations of correlation analysis.

Ø Exam tips:

• This part is the foundation for the rest of the course; it is tested often and in flexible formats.

Simple Linear Regression

Tasks:

Ø Describe the assumptions underlying linear regression;

Ø Calculate and interpret the predicted value and

confidence interval for the dependent variable;

Ø Interpret regression coefficients, formulate hypothesis tests for them, and calculate and interpret their confidence intervals.

Dependent variable (Y)

Ø The variable that you are seeking to explain;

Ø Also referred to as explained variable or predicted variable.

Independent variable (X)

Ø The variable(s) that you are using to explain changes in the dependent variable.

Ø Also referred to as explanatory variable or predicting variable.


Linear regression

Ø Use linear regression model to explain the dependent variable using the independent variable(s).

Simple Linear Regression

Simple linear regression model

Ø $Y_i = b_0 + b_1X_i + \varepsilon_i$, $i = 1, \dots, n$

where:

Yi = ith observation of the dependent variable, Y;

Xi = ith observation of the independent variable, X;

b0 = intercept;

b1 = slope coefficient;

εi = error term for the ith observation (also referred to as the residual or disturbance term).

Simple Linear Regression

Assumptions of simple linear regression model

Ø The relationship between the dependent variable (Y) and the independent variable (X) is linear;

Ø The independent variable (X) is not random;

Ø The expected value of the error term is 0: E(ε)=0;

Ø The variance of the error term is the same for all observations (homoscedasticity): $E(\varepsilon_i^2) = \sigma_\varepsilon^2$, $i = 1, \dots, n$;

Ø The error term is uncorrelated (independent) across observations: E(εiεj)=0 for all i ≠ j;

Ø The error term (ε) is normally distributed.

Simple Linear Regression


The regression line (the line of best fit)

Ø Ordinary least squares (OLS) regression: chooses values for the intercept (estimated intercept coefficient, $\hat{b}_0$) and slope (estimated slope coefficient, $\hat{b}_1$) to minimize the sum of squared errors (SSE).


The regression line

Ø Estimated slope coefficient ($\hat{b}_1$)

üCalculation: $\hat{b}_1 = \dfrac{\mathrm{Cov}(X,Y)}{s_X^2} = \dfrac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2}$

üInterpretation: the sensitivity of Y to a change in X.

• The change in Y for a 1-unit change in X.

Ø Estimated intercept coefficient ($\hat{b}_0$)

üCalculation: $\hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X}$

üInterpretation: the value of Y when X is equal to zero.
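A minimal sketch (invented data) verifying the two OLS formulas above against numpy's built-in degree-1 polynomial fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # invented sample
y = np.array([2.0, 2.8, 4.1, 4.9, 6.2, 6.8])

# Slope: Cov(X, Y) / Var(X); intercept forces the line through (x-bar, y-bar)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
assert np.allclose(np.polyfit(x, y, 1), [b1, b0])  # degree-1 fit: [slope, intercept]
```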

Predicted value of dependent variable

Ø The values that are predicted by the regression equation, given an estimate of the independent variable:

$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$$

where: $\hat{Y}_i$ = predicted value of the dependent variable; $\hat{b}_0$, $\hat{b}_1$ = estimated regression coefficients.

Predicted value of dependent variable (Cont.)

Ø The confidence interval for a predicted value of the dependent variable is: $\hat{Y} \pm (t_c \times s_f)$, where $s_f$ is the standard error of the forecast.

Significance test for a regression coefficient

Ø H0: b1= hypothesized value; Ha: b1≠ hypothesized value; üTypically, H0: b1= 0; Ha: b1≠ 0, which means to test

whether an independent variable explains the variation in the dependent variable.

Ø Test statistic: $t = \dfrac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}$, df = n-2;

Ø Decision rule: reject H0 if t > +tcritical, or t < -tcritical;

Ø Rejection of the null hypothesis means the regression coefficient is significantly different from the hypothesized value.


Confidence interval for a regression coefficient

Ø The confidence interval for a regression coefficient is:

$$\hat{b}_1 \pm (t_c \times s_{\hat{b}_1}), \quad \text{or} \quad \hat{b}_1 - t_c s_{\hat{b}_1} \le b_1 \le \hat{b}_1 + t_c s_{\hat{b}_1}$$

where:

$t_c$ = two-tailed critical t-value with df = n-2;

$s_{\hat{b}_1}$ = standard error of the regression coefficient.

Ø Can be applied to significance test for a regression coefficient.

üIf the confidence interval does not include zero, the null hypothesis (H0: b1=0) is rejected, and the coefficient is

said to be statistically significantly different from zero.

Simple Linear Regression


Summary

Ø Importance: ☆☆☆

Ø Content:

• Underlying assumptions of linear regression;

• Prediction of the dependent variable;

• Interpretation of hypothesis testing for regression coefficients.

Ø Exam tips:

• Frequently tested point 1: underlying assumptions (conceptual questions);

• Frequently tested point 2: predicted value of the dependent variable (calculation questions).

ANOVA Analysis (1)

Tasks:

Ø Describe and interpret ANOVA;

Ø Calculate and interpret SEE, R², and the F-statistic;

Ø Describe limitations of regression analysis.

Analysis of variance (ANOVA)

Ø A statistical procedure for dividing the total variability of a variable into components that can be attributed to different sources.

üTotal variation = explained variation + unexplained variation

• Total sum of squares (SST) = Regression sum of squares (RSS) + Sum of squared errors (SSE)


Analysis of variance (Cont.)

Ø A graphic explanation of the components of total variation (figure omitted): each observed deviation $Y_i - \bar{Y}$ splits into an explained part $\hat{Y}_i - \bar{Y}$ and an unexplained part $Y_i - \hat{Y}_i$.

Simple Linear Regression

Analysis of variance (Cont.)

Ø Total sum of squares (SST): measures the total variation in the dependent variable.

$$SST = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$$

Ø Regression sum of squares (RSS): measures the variation in the dependent variable that is explained by the independent variable.

$$RSS = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$$

Simple Linear Regression

Analysis of variance (Cont.)

Ø Sum of squared errors (SSE): measures the unexplained variation in the dependent variable.

• Also known as the sum of squared residuals or the residual sum of squares.

$$SSE = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$$

Simple Linear Regression

Analysis of variance (Cont.)

Ø ANOVA table

üMSR: mean regression sum of squares;

üMSE: mean squared error.

| Source | df | Sum of Squares (SS) | Mean Sum of Squares (MS) |
| Regression (explained) | 1 | RSS | MSR = RSS/1 |
| Error (unexplained) | n-2 | SSE | MSE = SSE/(n-2) |
| Total | n-1 | SST | |


Standard error of estimate (SEE)

Ø The standard deviation of error terms in the regression.

Ø Measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation;

üGauges the "fit" of the regression line. The smaller the SEE, the better the fit.

Simple Linear Regression

$$SEE = \sqrt{\frac{SSE}{n-2}} = \sqrt{MSE}$$

Coefficient of determination (R²)

Ø The percentage of the total variation that is explained by the regression.

üFor simple linear regression, R² is equal to the squared correlation coefficient: R² = r².

Simple Linear Regression

$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{RSS}{SST} = \frac{SST - SSE}{SST}$$

F-statistic

Ø An F-statistic assesses how well the independent variables, as a group, explain the variation in the dependent variable; it can also be used to test whether at least one independent variable explains a significant portion of the variation of the dependent variable.

üNote: this is always a one-tailed test.

$$F = \frac{MSR}{MSE} = \frac{RSS/k}{SSE/(n-k-1)}$$

Simple Linear Regression

F-statistic (Cont.)

Ø For simple linear regression, the F-test duplicates the t-test for the significance of the slope coefficient.

üH0: b1= 0; Ha: b1≠ 0;

ü df(numerator) = 1; df(denominator) = n-2;

üDecision rule: reject H0 if F > Fc.

$$F = \frac{MSR}{MSE} = \frac{RSS/1}{SSE/(n-2)}$$
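The ANOVA quantities fit together in a few lines. A minimal sketch with invented data that computes SST, RSS, and SSE for a simple regression and derives SEE, R², and the F-statistic from them:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])  # invented data
y = np.array([1.8, 3.1, 3.9, 5.2, 5.8, 7.1, 8.2, 8.8])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)       # total variation
rss = np.sum((y_hat - y.mean()) ** 2)   # explained variation
sse = np.sum((y - y_hat) ** 2)          # unexplained variation

see = np.sqrt(sse / (n - 2))            # standard error of estimate
r2 = rss / sst                          # coefficient of determination
f = (rss / 1) / (sse / (n - 2))         # F = MSR / MSE, df = 1 and n - 2

print(f"SST = {sst:.3f}, RSS + SSE = {rss + sse:.3f}")  # identical for OLS
print(f"SEE = {see:.4f}, R2 = {r2:.4f}, F = {f:.2f}")
```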


Limitations of regression analysis

Ø Regression relations can change over time (parameter instability).

Ø In investment contexts, public knowledge of regression relationships may negate their future usefulness.

Ø If the regression assumptions are violated, hypothesis tests and predictions based on linear regression will not be valid.

Simple Linear Regression

Summary

Ø Importance: ☆☆☆

Ø Content:

• ANOVA;

• SEE, R², and the F-statistic.

Ø Exam tips:

• Frequently tested point 1: given an ANOVA table, compute a missing cell;

• Frequently tested point 2: calculation and interpretation of R²; both calculation and conceptual questions are possible.

Multiple Regression

Tasks:

Ø Formulate a multiple regression and explain the assumptions of a multiple regression model;

Ø Interpret estimated regression coefficients, formulate

hypothesis tests for them and interpret the results.

Ø Calculate and interpret the predicted value for the dependent variable.

Multiple Regression

Multiple regression

Ø Regression analysis with more than one independent variable.

üMultiple linear regression model:

$$Y_i = b_0 + b_1X_{1i} + b_2X_{2i} + \dots + b_kX_{ki} + \varepsilon_i$$

where:

Yi = the ith observation of the dependent variable Y

Xji = the ith observation of the jth independent variable Xj


Assumptions of multiple linear regression

Ø The relationship between the dependent variable and the independent variables is linear;

Ø The independent variables are not random. Also, no exact linear relation exists between two or more of the independent variables;

Ø The expected value of the error term, conditioned on the independent variables, is 0: E(ε | X1, X2, …, Xk) = 0;

Multiple Regression

Assumptions of multiple linear regression (Cont.)

Ø The variance of the error term is the same for all observations (homoscedasticity): $E(\varepsilon_i^2) = \sigma_\varepsilon^2$;

Ø The error term is uncorrelated across observations: E(εiε

j)=0 for all i≠j;

Ø The error term is normally distributed.


Intercept term (b0)

Ø The value of the dependent variable when the independent variables are all equal to zero.

Slope coefficient (bj)

Ø The expected increase in the dependent variable for a 1-unit increase in that independent variable, holding the other independent variables constant.

üAlso called partial slope coefficients.

Multiple Regression

Hypothesis testing of regression coefficients

Ø Hypothesis: H0: bj = hypothesized value; Ha: bj ≠ hypothesized value;

Ø Test statistic: $t = \dfrac{\hat{b}_j - b_j}{s_{\hat{b}_j}}$

üdf = n-k-1, k = number of independent variables

Ø Decision rule: reject H0 if t > +tc, or t < -tc.


Statistical significance of independent variable

Ø Hypothesis: H0: bj = 0; Ha: bj ≠ 0;

Ø Test statistic: $t = \dfrac{\hat{b}_j}{s_{\hat{b}_j}}$

üdf = n-k-1, k = number of independent variables

Ø Decision rule: reject H0 if

üt > +tcritical, or t < -tcritical;

üp-value < significance level (α).

Interpret the testing results

Ø Rejection of null hypothesis means the regression coefficient is different from/greater than/less than the hypothesized value given a level of significance (α).

Ø For significance testing, rejection of null hypothesis means the regression coefficient is different from zero, or the independent variable explains some variation of the dependent variable.

Multiple Regression

Multiple Regression

Confidence interval for a regression coefficient

Ø The confidence interval for a regression coefficient is:

$$\hat{b}_j \pm (t_c \times s_{\hat{b}_j})$$

where:

$t_c$ = two-tailed critical t-value with df = n-k-1;

$s_{\hat{b}_j}$ = standard error of the regression coefficient.

Ø Can be applied to significance test for a regression coefficient.

üIf the confidence interval does not include zero, the null hypothesis (H0: bj=0) is rejected, and the coefficient is

said to be statistically significantly different from zero.


Predicting the dependent variable

Ø The regression equation can be used to predict the value of the dependent variable based on assumed values of the independent variables:

$$\hat{Y} = \hat{b}_0 + \hat{b}_1X_1 + \hat{b}_2X_2 + \dots + \hat{b}_kX_k$$
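A minimal sketch of a two-variable regression with invented data: it estimates the coefficients by OLS, computes each coefficient's standard error, t-statistic, and p-value (df = n-k-1), and predicts the dependent variable for assumed values of the independent variables.

```python
import numpy as np
from scipy import stats

# Invented data: two independent variables, one dependent variable
rng = np.random.default_rng(0)
n, k = 40, 2
X = rng.normal(size=(n, k))
y = 1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=n)

Xd = np.column_stack([np.ones(n), X])            # design matrix with intercept
b = np.linalg.lstsq(Xd, y, rcond=None)[0]        # OLS estimates
resid = y - Xd @ b
mse = resid @ resid / (n - k - 1)
se_b = np.sqrt(np.diag(mse * np.linalg.inv(Xd.T @ Xd)))  # coefficient std. errors

t_stats = b / se_b                               # H0: b_j = 0, df = n - k - 1
p_vals = 2 * stats.t.sf(np.abs(t_stats), df=n - k - 1)
print(np.round(b, 3), np.round(t_stats, 2), np.round(p_vals, 4))

# Predicted value for assumed X1 = 1.0, X2 = 2.0
print(b @ np.array([1.0, 1.0, 2.0]))
```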


Summary

Ø Importance: ☆☆

Ø Content:

• Assumptions of multiple linear regression;

• Interpretation and hypothesis testing of regression coefficients;

• Prediction of the dependent variable.

Ø Exam tips:

• Frequently tested point: hypothesis testing of regression coefficients; questions are flexible, including calculating the test statistic and judging and interpreting the test results.

ANOVA Analysis (2)

Tasks:

Ø Describe and interpret ANOVA table;

Ø Calculate and interpret F-statistic, and describe how it is used in regression analysis;

Ø Distinguish between and interpret R2 and adjusted R2.

ANOVA table of multiple regression

Ø R² = RSS/SST

Ø F = MSR/MSE, with df = k and n-k-1

Ø $SEE = \sqrt{MSE}$

| Source | df | SS | MSS |
| Regression | k | RSS | MSR = RSS/k |
| Error | n-k-1 | SSE | MSE = SSE/(n-k-1) |
| Total | n-1 | SST | |

Multiple Regression

F-statistics

Ø Test whether all regression coefficients are simultaneously equal to zero; or test whether the independent variables, as a group, help explain the dependent variable; or assess the effectiveness of the model, as a whole, in explaining the dependent variable.

üH0: b1= b2= … = bk=0; Ha: at least one bj≠0 (j = 1 to k)

ü $F = \dfrac{MSR}{MSE} = \dfrac{RSS/k}{SSE/(n-k-1)}$, with df = k and n-k-1.

Multiple Regression

F-statistics (Cont.)

Ø Decision rule: reject H0 if F-statistic > F-critical value.

üThe F-test here is always a one-tailed test.

Ø Rejection of H0 means at least one regression coefficient is significantly different from zero; thus, at least one independent variable makes a significant contribution to the explanation of the dependent variable.

Multiple Regression

R² (Coefficient of determination)

Ø $R^2 = \dfrac{RSS}{SST}$

Ø Test the overall effectiveness (goodness of fit) of the entire set of independent variables (regression model) in explaining the dependent variable.

üFor example, an R² of 0.7 indicates that the model, as a whole, explains 70% of the variation in the dependent variable.

Ø For multiple regression, however, R² will increase simply by adding independent variables that explain even a slight amount of the previously unexplained variation.

üEven if the added independent variable is not statistically significant, R² will increase.

Multiple Regression

Adjusted R²

Ø $R_{adj}^2 = 1 - \left[\left(\dfrac{n-1}{n-k-1}\right) \times (1 - R^2)\right]$

where: n = number of observations; k = number of independent variables.

Ø Adjusted R² ≤ R², and it may be less than zero if R² is low enough.

Ø Adding a new independent variable will increase R², but may either increase or decrease the adjusted R².

üIf the new variable has only a small effect on R², the value of adjusted R² may decrease.
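A minimal sketch (invented data) illustrating the contrast: adding an irrelevant regressor nudges R² up but typically pulls adjusted R² down. The helper function r2_and_adj is defined here just for the example.

```python
import numpy as np

def r2_and_adj(X, y):
    """Return R2 and adjusted R2 for an OLS fit with intercept."""
    n, k = X.shape
    Xd = np.column_stack([np.ones(n), X])
    b = np.linalg.lstsq(Xd, y, rcond=None)[0]
    resid = y - Xd @ b
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)
    return r2, adj

rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
noise = rng.normal(size=n)                 # an irrelevant regressor
y = 2.0 + 1.5 * x1 + rng.normal(scale=0.5, size=n)

print(r2_and_adj(x1.reshape(-1, 1), y))            # one useful variable
print(r2_and_adj(np.column_stack([x1, noise]), y)) # add a useless one:
# R2 never decreases; adjusted R2 typically falls when the new variable is weak
```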


Interpretation of regression model

Ø Interpretation generally focuses on the regression coefficients;

Ø It is possible to identify a relationship that has statistical significance without any economic significance.

Multiple Regression

Summary

Ø Importance: ☆☆☆

Ø Content:

• ANOVA table;

• Calculation and interpretation of the F-statistic;

• R² and adjusted R².

Ø Exam tips:

• Frequently tested point 1: interpretation of the F-statistic (conceptual questions);

• Frequently tested point 2: comparison of R² and adjusted R² (conceptual questions).

Violations of Assumptions

Tasks:

Ø Explain the types of heteroskedasticity and how

heteroskedasticity and serial correlation affect statistical inference;

Ø Describe multicollinearity and explain its causes and effects in regression analysis.

Heteroskedasticity

Definition of heteroskedasticity

Ø The variance of the errors differs across observations (i.e., the error terms are not homoskedastic).

üUnconditional heteroskedasticity: the heteroskedasticity of the error variance is not correlated with the independent variables.

• Creates no serious problems for statistical inference.


Heteroskedasticity

Definition of heteroskedasticity (Cont.)

üConditional heteroskedasticity: heteroskedasticity of the error variance is correlated with (conditional on) the values of the independent variables.

• Does create significant problems for statistical inference.

Effects of heteroskedasticity

Ø The coefficient estimates ($\hat{b}_j$) aren't affected.

Ø The standard errors of the coefficients ($s_{\hat{b}_j}$) are usually unreliable.

üWith financial data, the standard errors are most likely underestimated, so the t-statistics ($t = \hat{b}_j / s_{\hat{b}_j}$) will be inflated and tend to find significant relationships where none actually exist (Type I error);

Ø The F-test is also unreliable.

Heteroskedasticity

Testing for heteroskedasticity

Ø Examining scatter plots of the residuals (against the independent variable);

Testing for heteroskedasticity (Cont.)

Ø Breusch-Pagan χ² test.

üH0: no heteroskedasticity;

üBP χ² = n × R²resid, with df = k (the number of independent variables), one-tailed test;

• n = the number of observations;

• R²resid = R² of a second regression of the squared residuals from the first regression on the independent variables.

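The two-regression logic of the Breusch-Pagan test can be sketched directly (invented data built to be conditionally heteroskedastic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(1, 5, size=n)
# Invented data with conditional heteroskedasticity: error variance grows with x
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x, size=n)

# First regression: y on x; keep the residuals
Xd = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(Xd, y, rcond=None)[0]
resid = y - Xd @ b

# Second regression: squared residuals on the independent variable(s)
g = np.linalg.lstsq(Xd, resid**2, rcond=None)[0]
fitted = Xd @ g
r2_resid = 1 - np.sum((resid**2 - fitted) ** 2) / np.sum((resid**2 - (resid**2).mean()) ** 2)

bp = n * r2_resid                      # BP = n * R2, chi-square with df = k
p = stats.chi2.sf(bp, df=1)            # one-tailed test, k = 1 regressor
print(f"BP = {bp:.2f}, p = {p:.4f}")   # small p => reject 'no heteroskedasticity'
```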

Correcting for heteroskedasticity

Ø Use robust standard errors to recalculate the t-statistics;

üAlso called White-corrected standard errors.

Ø Use generalized least squares, rather than ordinary least squares, to build the regression model.

Heteroskedasticity

Definition of serial correlation (autocorrelation)

Ø The residuals (error terms) are correlated with one another, and typically arises in time-series regressions.

üPositive serial correlation: a positive/negative error for one observation increases the chance of a positive/negative error for another observation.

üNegative serial correlation: a positive/negative error for one observation increases the chance of a negative/positive error for another observation.

Serial Correlation (Autocorrelation)

Effects of serial correlation

Ø The coefficient estimates aren't affected.

Ø The standard errors of coefficient are usually unreliable.

üPositive serial correlation: standard errors

underestimated and t-statistics inflated, suggesting significance when there is none (type I error);

üNegative serial correlation: vice versa (type II error).

Ø The F-test is also unreliable.

Serial Correlation

Testing for serial correlation

Ø Residual scatter plots

Serial Correlation

[Residual-vs-time plots illustrating positive and negative serial correlation]


Testing for serial correlation (Cont.)

Ø The Durbin-Watson test

üH0: No serial correlation

üDW ≈ 2×(1−r), if sample size is very large

• r = correlation coefficient between residuals from one period and those from the previous period.

üDecision rule:

Serial Correlation

ü0 ≤ DW < dL (r near +1): reject H0, positive serial correlation;

üdL ≤ DW ≤ dU: inconclusive;

üdU < DW < 4-dU: fail to reject H0;

ü4-dU ≤ DW ≤ 4-dL: inconclusive;

ü4-dL < DW ≤ 4 (r near -1): reject H0, negative serial correlation.
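A minimal sketch of the Durbin-Watson statistic on an invented residual series with positive serial correlation (r = 0.7), confirming the DW ≈ 2×(1−r) approximation:

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared first differences of the residuals / SSE."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(3)
# Invented residual series with positive serial correlation (AR(1), r = 0.7)
e = np.zeros(500)
for t in range(1, 500):
    e[t] = 0.7 * e[t - 1] + rng.normal()

dw = durbin_watson(e)
print(f"DW = {dw:.2f}, approx 2*(1-r) = {2 * (1 - 0.7):.2f}")  # both near 0.6
```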

Correcting for serial correlation

Ø Adjust the coefficient standard errors (recommended);

• E.g., the Hansen method, which also corrects for conditional heteroskedasticity;

• Adjusted standard errors, or Hansen-White standard errors.

Ø Modify the regression equation itself.

Serial Correlation

Definition of multicollinearity

Ø Two or more independent variables (or combinations of independent variables) are highly (but not perfectly) correlated with each other.

Multicollinearity

Effects of multicollinearity

Ø Estimates of regression coefficients become extremely imprecise and unreliable;

Ø Standard errors of regression coefficients will be inflated, so t-tests on the coefficients will have little power (more Type II errors).

üGreater probability we will incorrectly conclude that a variable is not statistically significant.


Testing for multicollinearity

Ø The t-tests indicate that none of the regression coefficients is significant, while R² is high and F-test indicates overall significance;

Ø The absolute value of the sample correlation between any two independent variables is greater than 0.7 (not recommended).

Correcting for multicollinearity

Ø Excluding one or more of the correlated independent variables.

Multicollinearity

Summary of Assumption Violations

| Violation | Effects | Testing |
| Conditional heteroskedasticity | Type I error | Residual scatter plots; Breusch-Pagan χ²-test: BP = n×R² |
| Positive serial correlation | Type I error | Residual scatter plots; Durbin-Watson test: DW ≈ 2×(1-r) |
| Negative serial correlation | Type II error | Residual scatter plots; Durbin-Watson test |
| Multicollinearity | Type II error | t-tests indicate no significance while the F-test indicates overall significance and R² is high |

Summary

Ø Importance: ☆☆☆

Ø Content:

• Definition, effects, testing, and correction of heteroskedasticity, serial correlation, and multicollinearity.

Ø Exam tips:

• Frequently tested point: effects of heteroskedasticity and serial correlation (conceptual questions).

Other Issues in Regression Analysis

Tasks:

Ø Formulate a multiple regression with dummy variables

and interpret the coefficients;

Ø Describe effects of model misspecification and

avoidance of its common forms;


Dummy variable

Ø Qualitative variables may be used as independent

variables in a regression.

üDummy variable is one type of qualitative variable, and takes on a value of “0” or “1”.

Ø If we want to distinguish among n categories, we need n−1 dummy variables.

Dummy Variable

Example

Ø Yi = b0 + b1X1i + b2X2i + b3X3i + ɛi

where: Yi = quarterly value of EPS of a stock

| Y | X1 | X2 | X3 |
| Q1 EPS | 1 | 0 | 0 |
| Q2 EPS | 0 | 1 | 0 |
| Q3 EPS | 0 | 0 | 1 |
| Q4 EPS (omitted category) | 0 | 0 | 0 |

Dummy Variable

Interpretation of coefficient

Ø Intercept coefficient (b0): the average value of

dependent variable for the omitted category.

Ø Regression coefficient (bj): the difference in dependent

variable (on average) between the category represented by the dummy variable and the omitted category.
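A minimal sketch of the quarterly-dummy regression with invented EPS numbers: the fitted intercept equals the Q4 (omitted-category) average, and each dummy coefficient equals that quarter's average difference from Q4.

```python
import numpy as np

# Invented quarterly EPS with a seasonal pattern (3 years of data)
eps = np.array([1.2, 0.9, 1.0, 1.5,
                1.3, 0.8, 1.1, 1.6,
                1.1, 1.0, 0.9, 1.4])
quarter = np.tile([1, 2, 3, 4], 3)

# Three dummies; Q4 is the omitted category
X = np.column_stack([np.ones(12),
                     (quarter == 1).astype(float),
                     (quarter == 2).astype(float),
                     (quarter == 3).astype(float)])
b0, b1, b2, b3 = np.linalg.lstsq(X, eps, rcond=None)[0]

print(f"b0 = {b0:.3f} (Q4 average = {eps[quarter == 4].mean():.3f})")
print(f"b1 = {b1:.3f} (Q1 avg - Q4 avg = {eps[quarter == 1].mean() - eps[quarter == 4].mean():.3f})")
```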

Dummy Variable

Definition of model misspecification

Ø The set of variables included in the regression and the regression equation’s functional form.


Categories of model misspecification

Ø Misspecified functional form

üImportant variables omitted

üVariables need to be transformed

üPools data incorrectly

Ø Independent variables correlated with the error term

üLagged dependent variables as independent variables

üIncorrect dating of variables

üIndependent variables are measured with error

Ø Other types of time-series misspecification

Model Misspecification

Effects of model misspecification

Ø Regression coefficients are often biased and inconsistent, leading to unreliable hypothesis testing and inaccurate predictions.

Model Misspecification

Avoiding model misspecification

Ø The model should be grounded in cogent economic reasoning;

Ø The functional form chosen for the variables should be appropriate given the nature of the variables;

Ø The model should be parsimonious;

Ø The model should be examined for violations of regression assumptions before being accepted;

Ø The model should be tested and be found useful out of sample before being accepted.

Model Misspecification

Qualitative dependent variable

Ø Dummy variables used as dependent variables instead of as independent variables.

üProbit and logit model

üDiscriminant models


Summary

Ø Importance: ☆

Ø Content:

• Dummy variables;

• Model misspecification;

• Qualitative dependent variables.

Ø Exam tips:

• Not a key exam topic.

Trend Models

Tasks:

Ø Calculate and evaluate the predicted trend value for a time series;

Ø Describe factors to determine trend model selection;

Ø Evaluate limitations of trend models.

Linear trend models

Ø Work well in fitting time series that change by a constant amount with time:

$$y_t = b_0 + b_1t + \varepsilon_t$$

Ø Predicted value: $\hat{y}_t = \hat{b}_0 + \hat{b}_1t$

Trend Models

Log-linear trend models

Ø Work well in fitting time series that have constant growth rate with time (exponential growth).

$$\ln(y_t) = b_0 + b_1t + \varepsilon_t \quad \Leftrightarrow \quad y_t = e^{b_0 + b_1t + \varepsilon_t}$$

Trend Models


Linear trend model vs. log-linear trend model

Ø If data plots with a linear shape (constant change amount), a linear trend model may be appropriate.

Ø If data plots with a non-linear (curved) shape (constant growth rate), a log-linear model may be more suitable.

Trend Models

Limitations of trend models

Ø The trend model is not appropriate for time series when data exhibit serial correlation.

üUse the Durbin-Watson statistic to detect serial correlation.

Trend Models

Summary

Ø Importance: ☆

Ø Content:

• Linear trend model & log-linear trend model;

• Limitations of trend models.

Ø Exam tips:

• Not a key exam topic.

Autoregressive Models (AR)

Tasks:

Ø Describe the structure of an AR model, explain the testing of autocorrelations of the residuals;

Ø Calculate one- and two-period-ahead forecasts given

the estimated coefficients of an AR model;

Ø Explain mean reversion and calculate a mean-reverting level.

Covariance stationary

Ø A key assumption for AR time series model to be valid based on ordinary least squares (OLS) estimates.

Ø A covariance stationary series must satisfy three principal requirements:

üConstant and finite expected value in all periods;

üConstant and finite variance in all periods;

üConstant and finite covariance with itself for a fixed number of periods in the past or future in all periods.

Autoregressive Model (AR)

Autoregressive model

Ø Uses past values of dependent variables as independent variables.

ü AR(1), first-order autoregressive model: $x_t = b_0 + b_1x_{t-1} + \varepsilon_t$

ü AR(p), p-order autoregressive model: $x_t = b_0 + b_1x_{t-1} + b_2x_{t-2} + \dots + b_px_{t-p} + \varepsilon_t$

• where p indicates the number of lagged values that the autoregressive model includes as independent variables.

Chain rule of forecasting

Ø A one-period-ahead forecast for an AR(1) model: $\hat{x}_{t+1} = \hat{b}_0 + \hat{b}_1x_t$

Ø A two-period-ahead forecast for an AR(1) model: $\hat{x}_{t+2} = \hat{b}_0 + \hat{b}_1\hat{x}_{t+1}$
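The chain rule is just two lines of arithmetic. A minimal sketch with invented coefficient estimates and a current observation:

```python
# Chain rule of forecasting with estimated AR(1) coefficients
# (b0_hat, b1_hat, and x_t are invented numbers for illustration)
b0_hat, b1_hat = 0.5, 0.8
x_t = 2.0

x_t1 = b0_hat + b1_hat * x_t    # one-period-ahead forecast
x_t2 = b0_hat + b1_hat * x_t1   # two-period-ahead: reuse the forecast, not x_t
print(x_t1, x_t2)               # 2.1 and 2.18
```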

Detecting autocorrelation

Ø Step 1: Estimate the AR(1) model using linear regression: $x_t = b_0 + b_1x_{t-1} + \varepsilon_t$;

Ø Step 2: Compute the autocorrelations ($\rho_{\varepsilon_t,\varepsilon_{t-k}}$) of the residuals;

üAutocorrelation: the correlations of a time series with its own past values;

üThe order of the correlation is given by k, where k represents the number of periods lagged.

Autoregressive Model



Detecting autocorrelation (Cont.)

Ø Step 3: Test if the autocorrelations are significantly different from zero.

üt-test: $t = \dfrac{\hat{\rho}_{\varepsilon_t,\varepsilon_{t-k}}}{1/\sqrt{T}}$

• T: the number of observations in the time series;

• Degree of freedom: T-2.

üIf the residual autocorrelations differ significantly from 0, the model is not correctly specified and needs to be modified.

Seasonality

Ø A time series that shows regular patterns of movement within the year.

Ø Testing for seasonality: test whether the seasonal autocorrelation of the residuals differs significantly from 0.

üThe 4th autocorrelation in the case of quarterly data;

üThe 12th autocorrelation in the case of monthly data.

Autoregressive Model

Seasonality (Cont.)

Ø Correcting for seasonality: include a seasonal lag in the AR model:

üQuarterly data: $x_t = b_0 + b_1x_{t-1} + b_2x_{t-4} + \varepsilon_t$

üMonthly data: $x_t = b_0 + b_1x_{t-1} + b_2x_{t-12} + \varepsilon_t$

Ø Forecasting using an AR model with a seasonal lag:

üQuarterly data: $\hat{x}_t = \hat{b}_0 + \hat{b}_1x_{t-1} + \hat{b}_2x_{t-4}$

üMonthly data: $\hat{x}_t = \hat{b}_0 + \hat{b}_1x_{t-1} + \hat{b}_2x_{t-12}$

Mean reversion

Ø A time series shows mean reversion if it has a tendency to move towards its mean.

üIt tends to fall when it is above its mean and rise when it is below its mean.

Ø Mean-reverting level for an AR(1) model: $x^* = \dfrac{b_0}{1-b_1}$

üCovariance stationary → finite mean-reverting level;
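A one-line check of the mean-reverting level for invented AR(1) coefficient estimates:

```python
# Mean-reverting level of an AR(1): x* = b0 / (1 - b1), defined only if b1 != 1
b0_hat, b1_hat = 0.5, 0.8   # invented estimates
x_star = b0_hat / (1 - b1_hat)
print(x_star)  # 2.5: the series tends to fall above 2.5 and rise below it
```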


Summary

Ø Importance: ☆☆

Ø Content:

• Covariance stationarity and AR models;

• Autocorrelation and seasonality;

• Mean reversion.

Ø Exam tips:

• Frequently tested point: calculation of the mean-reverting level.

Random Walk

Tasks:

Ø Describe characteristics of random walk processes;

Ø Describe unit roots for time-series analysis and the steps of the unit root test for nonstationarity;

Ø Demonstrate how a random walk can be transformed

to be stationary.

Random walk (simple random walk)

Ø A time series in which the value of the series in one period is the value of the series in the previous period plus an unpredictable random error:

$$x_t = x_{t-1} + \varepsilon_t$$

üA special AR(1) model with b0 = 0 and b1 = 1;

üThe best forecast of xt is xt-1.

Random Walk

Random walk with a drift

Ø A random walk with an intercept term that is not equal to zero (b0 ≠ 0):

$$x_t = b_0 + x_{t-1} + \varepsilon_t$$

üIncreases or decreases by a constant amount (b0) in each period.

Random Walk


Random walk Vs. Covariance stationary

Ø A random walk will not exhibit covariance stationary.

üA time series must have a finite mean reverting level to be covariance stationary;

üA random walk has an undefined mean reverting level.

Ø The least squares regression method doesn't work to estimate an AR(1) model on a time series that is actually a random walk.

Unit root

Ø A time series has a unit root if the lag coefficient is equal to one (b1 = 1); such a series will follow a random walk process.

üTesting for unit root can be used to test for

nonstationarity since a random walk is not covariance stationary;

•But a t-test of the hypothesis that b1 = 1 in the AR model is invalid for testing the unit root;

Random Walk

Unit root (Cont.)

üTesting of AR model can determine if a time series is

covariance stationary.

•If autocorrelations at all lags are statistically indistinguishable from zero, the time series is stationary.

Random Walk

Dickey-Fuller test for unit root

Ø Step 1: start with an AR(1) model: $x_t = b_0 + b_1x_{t-1} + \varepsilon_t$;

Ø Step 2: subtract $x_{t-1}$ from both sides and define g = b1 - 1, so that $x_t - x_{t-1} = b_0 + g\,x_{t-1} + \varepsilon_t$;

Ø Step 3: test H0: g = 0 (i.e., b1 = 1, a unit root) against Ha: g < 0;

üCalculate the t-statistic and use revised (Dickey-Fuller) critical values;

üIf we fail to reject H0, there is a unit root and the time series is non-stationary.


First differencing

Ø A random walk (i.e., has a unit root) can be transformed to a covariance stationary time series by first

differencing.

üSubtract xt-1 from both sides of the random walk model: $x_t - x_{t-1} = \varepsilon_t$;

üDefine $y_t = x_t - x_{t-1}$, so $y_t = \varepsilon_t$; equivalently, $y_t = b_0 + b_1y_{t-1} + \varepsilon_t$ with b0 = b1 = 0;

üThen yt is a covariance stationary variable with a finite mean-reverting level of 0/(1-0) = 0.
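A minimal sketch showing first differencing in action: an invented random walk is built by cumulating white noise, and its first difference recovers the stationary error series.

```python
import numpy as np

rng = np.random.default_rng(4)
eps = rng.normal(size=1000)
x = np.cumsum(eps)          # a random walk: x_t = x_{t-1} + eps_t
y = np.diff(x)              # first difference: y_t = x_t - x_{t-1} = eps_t

# The levels wander (variance depends on the sample window);
# the differences behave like stationary noise
print(np.var(x[:500]), np.var(x[500:]))  # typically very different
print(np.var(y[:500]), np.var(y[500:]))  # both close to 1
assert np.allclose(y, eps[1:])
```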

Random Walk

Summary

Ø Importance: ☆☆☆

Ø Content:

• Random walk;

• Testing for unit roots;

• First differencing.

Ø Exam tips:

• Frequently tested points: how to test for unit roots, how to interpret the test results, and how to transform a random walk into a stationary series (first differencing).

Model Evaluation

Tasks:

Ø Contrast in-sample and out-of-sample forecasts;

Ø Explain ARCH model;

Ø Determine and justify an appropriate time-series

model.

Comparing forecasting model performance

Ø In-sample forecast errors: the residuals within the sample period used to estimate the model;

Ø Out-of-sample forecast errors: the residuals outside the sample period used to estimate the model.

Ø Root mean squared error (RMSE) criterion: the model with the smallest RMSE for the out-of-sample data is typically judged most accurate.


Instability of regression coefficients

Ø Financial and economic relationships are inherently dynamic, so the estimates of regression coefficients of the time-series model can change substantially across different sample periods.

Ø There is a tradeoff between reliability and stability.

• Models estimated with shorter time series are usually more stable but less reliable.

Model Evaluation

Autoregressive Conditional Heteroskedasticity (ARCH)

Ø Review of conditional heteroskedasticity:

heteroskedasticity of the error variance is correlated with (conditional on) the values of the independent variables.

Ø ARCH: conditional heteroskedasticity in AR models.

üWhen ARCH exists, the standard errors of the regression coefficients in AR models are incorrect, and the hypothesis tests of these coefficients are invalid.

Model Evaluation

ARCH(1) model

Ø The variance of the error in a particular time-series model in one period depends on the variance of the error in previous periods:

$$\hat{\varepsilon}_t^2 = a_0 + a_1\hat{\varepsilon}_{t-1}^2 + u_t$$

where ut is the error term.

üIf the coefficient a1 is statistically significantly different from 0, the time series is ARCH(1).

üIf a time series model has ARCH(1) errors, generalized least squares must be used to develop a predictive model.

Model Evaluation

Predicting variance with ARCH models

Ø If a time-series model has ARCH(1) errors, the ARCH model can be used to predict the variance of the residuals in future periods.

üˆt21 aˆ0aˆ1ˆt2


Model Evaluation

Steps in time series forecasting

Ø Does the series have a trend? (determine by plotting)

üIf yes, fit a trend model: a linear trend for a constant amount of change, or an exponential trend for a constant growth rate;

üRun a DW test on the trend model's residuals: if there is no serial correlation, use the trend model; if there is serial correlation, use an AR model.

Model Evaluation

Steps in time series forecasting (Cont.)

Ø Before estimating an AR model, check that the series is covariance stationary (first-difference it if not);

Ø Test the AR model's residuals for serial correlation: if present, keep adding lags;

Ø Test for ARCH: if ARCH errors exist, use generalized least squares.

Regression with two time series

Ø When running regression with two time series, either or both could be subject to nonstationarity.

Ø Dickey-Fuller tests can be used to detect unit root:

üIf none of the time series has a unit root, linear regression can be safely used;

üIf only one time series has a unit root, linear regression cannot be used;

Regression With Two Time Series

Regression with two time series (Cont.)

üIf both time series have a unit root:

• If the two series are cointegrated, linear regression can be used;

• If the two series are not cointegrated, linear regression cannot be used.

üCointegration: two time series have long-term financial or economic relationship so that they do not diverge from each other without bound in the long run.


Summary

Ø Importance: ☆

Ø Content:

• In-sample and out-of-sample forecasting;

• ARCH model;

• Regression with two time series.

Ø Exam tips:

• Not a key exam topic.

Simulation

Tasks:

Ø Describe steps of simulation and treatment of

correlation;

Ø Describe advantages, constraints, and issues of

simulation;

Ø Compare scenario analysis, decision trees, and

simulations.

Steps in running a simulation

Ø Determine “probabilistic” variables;

Ø Define probability distributions for these variables;

Ø Check for correlation across variables;

Ø Run the simulation.

Simulation

Define probability distributions for variables

Ø Historical data

Ø Cross sectional data

Ø Statistical distribution and parameters


Treatment of correlation across variables

Ø When there is strong correlation, positive or negative, across inputs, we have two choices:

üPick only the one input that has the bigger impact on value;

üBuild the correlation explicitly into the simulation.

Simulation

Advantages of using simulations

Ø Better input estimation;

Ø It yields a distribution of expected values rather than a point estimate.

Simulation

Constraints on simulations

Ø Book value constraints;

Ø Earnings and cash flow constraints;

Ø Market value constraints.

Simulation

Issues in using simulations in risk assessment

Ø Garbage in, garbage out;

Ø Real data may not fit distributions;

Ø Non-stationary distributions;

Ø Changing correlation across inputs.


Comparing probabilistic approaches

Ø How to choose among probabilistic approaches: scenario analysis, decision trees, and simulation:

üSelective vs. full risk analysis;

üType of risk;

•Discrete vs. continuous.

üCorrelations across risks;

üQuality of information.

Simulation

Summary

Ø Importance: ☆

Ø Content:

• Steps of simulation and ways to define probability distributions;

• Advantages, constraints, and issues of simulation;

• Comparison of scenario analysis, decision trees, and simulation.

Ø Exam tips:

• Not a key exam topic.
