Almost all of the examples in this book use datasets available in R so readers can reproduce the results. For readers who want to use R, the bibliographic notes at the end of each chapter list books that cover R programming, and the book's website contains examples of the R and WinBUGS code used to produce this book.
Notation
If A is some statement, then I{A} is called the indicator function of A and equals 1 if A is true and equals 0 if A is false.
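In display form,

$$I\{A\} = \begin{cases} 1, & \text{if } A \text{ is true}, \\ 0, & \text{if } A \text{ is false}. \end{cases}$$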
Introduction
Bibliographic Notes
The dictum that "All models are false, but some models are useful" is from Box (1976).
Returns
- Introduction
 - Net Returns
 - Log Returns
 - Adjustment for Dividends
 - The Random Walk Model
 - Random Walks
 - Geometric Random Walks
 - Are Log Prices a Lognormal Geometric Random Walk?
 - Bibliographic Notes
 - R Lab
 - Data Analysis
 - Simulations
 - Simulating a Geometric Random Walk
 - Let’s Look at McDonald’s Stock
 - Exercises
 
The quick answer is "no." The lognormal geometric random walk model makes two assumptions: (1) the log returns are normally distributed and (2) the log returns are independent of each other. Suppose that a stock's daily log returns are independent and normally distributed with mean 0.001 and standard deviation 0.015.
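As an illustration of these two assumptions, here is a minimal R sketch that simulates one price path from this model. The mean 0.001 and standard deviation 0.015 come from the text above; the 253 trading days, the initial price of 100, and the seed are illustrative choices, not values from the text.

```r
# Simulate a lognormal geometric random walk: i.i.d. normal daily log returns.
set.seed(2015)
n <- 253                                    # one year of trading days (assumed)
logr <- rnorm(n, mean = 0.001, sd = 0.015)  # daily log returns
price <- 100 * exp(cumsum(logr))            # P_t = P_0 * exp(r_1 + ... + r_t)
plot(price, type = "l", xlab = "day", ylab = "price")
```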
Fixed Income Securities
Introduction
Zero-Coupon Bonds
- Price and Returns Fluctuate with the Interest Rate
 
If the interest rate remained unchanged at 6%, the price of the bond could be computed directly from the present-value formula. If the interest rate changes, however, the annual return of 6% is guaranteed only if you keep the bond until maturity.
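The present-value computation behind this statement can be sketched in one line of R, assuming annual compounding; PAR = 1000, r = 6%, and T = 20 years are illustrative inputs, not values from the text.

```r
# Zero-coupon bond price: present value of PAR received T years from now.
zero_price <- function(par, r, T) par / (1 + r)^T
zero_price(par = 1000, r = 0.06, T = 20)   # about 311.80
```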
Coupon Bonds
- A General Formula
 
Yield to Maturity
- General Method for Yield to Maturity
 - Spot Rates
 
The yield to maturity of a zero-coupon bond with a maturity of n years is called the n-year spot rate and is denoted by yn. A coupon bond is a bundle of zero-coupon bonds, so its yield to maturity is a complex "average" of the spot rates of the zeros in this bundle.
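A sketch of the general numerical method: the yield to maturity solves price(y) = market price, which R's uniroot() can do. The cash-flow values below (semiannual coupon 30, 40 half-year periods, PAR 1000, market price 1200) are illustrative, not from the text.

```r
# Price of a coupon bond at per-period yield y: an annuity of coupons C
# plus PAR discounted from the final period.
bond_price <- function(y, C, T2, par) {
  C / y * (1 - (1 + y)^(-T2)) + par * (1 + y)^(-T2)
}
# Yield to maturity: the root of bond_price(y) = observed price.
uniroot(function(y) bond_price(y, C = 30, T2 = 40, par = 1000) - 1200,
        interval = c(1e-6, 1))$root
```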
Term Structure
- Introduction: Interest Rates Depend Upon Maturity
 - Describing the Term Structure
 
In this example, we first find the yields to maturity from the prices derived in Example 3.2 using the interest rates from Table 3.1. Equations (3.12) and (3.13) give the yields to maturity in terms of the bond prices and the forward rates, respectively.
Continuous Compounding
Continuous Forward Rates
The discount function D(T) and the forward rate function r(t) in formula (3.22) depend on the current time, which is equal to zero in this formula. However, we might be interested in how the discount function and the forward interest rate change over time.
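For reference, the standard relationship between the two, with r(t) denoting the forward-rate curve at the current time zero (stated here from the standard definition, consistent with the text's description of formula (3.22)):

$$D(T) = \exp\left\{-\int_0^T r(t)\,dt\right\}.$$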
Sensitivity of Price to Yield
- Duration of a Coupon Bond
 
When this definition is extended to derivatives, the duration has nothing to do with the maturities of the underlying securities. Unfortunately, the underlying assumption behind (3.31) that all yields change by the same amount is unrealistic, so duration analysis is falling out of favor, and value-at-risk is replacing it as a method of assessing interest-rate risk. Value-at-risk measures and other risk measures are covered in Chapter 19.
Bibliographic Notes
Now assume that all yields change by a constant amount δ, that is, yT changes to yT + δ for all T. Because of this assumption, Eq. (3.30) applies to each of these cash flows, and averaging them with these weights shows that, for a coupon bond, the relative change in price is approximately −DUR × δ, where DUR is the bond's duration.
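A short R sketch of this computation; the inputs (semiannual coupon 30, 40 half-year periods, 3% per-period yield, PAR 1000) are illustrative values, not from the text.

```r
# Macaulay duration of a coupon bond: the present-value-weighted average
# time of its cash flows.
duration <- function(C, par, T2, y) {
  t  <- 1:T2                          # payment periods
  cf <- c(rep(C, T2 - 1), C + par)    # coupons, then final coupon + principal
  pv <- cf / (1 + y)^t                # present value of each cash flow
  sum(t * pv) / sum(pv)               # duration, in periods
}
dur <- duration(C = 30, par = 1000, T2 = 40, y = 0.03)
delta <- 0.001
# Approximate relative price change under a parallel shift delta; dividing
# by (1 + y) gives the modified duration needed with discrete compounding
# (with continuous compounding the division is unnecessary).
-dur * delta / (1 + 0.03)
```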
R Lab
- Computing Yield to Maturity
 - Graphing Yield Curves
 
Run the code above; then, to zoom in on the short end of the curves, run it again with the maturities limited to 0 to 3 years by using the xlim argument of the plot() function. The estimated forward rates found by numerically differentiating an interpolating spline are "wobbly." The wobbles can be removed, or at least reduced, by using a penalized spline instead of an interpolating spline.
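The contrast can be sketched in a few lines. The forward rate is recovered as the derivative of T·y(T), once with an interpolating spline (splinefun()) and once with a smoothing (penalized) spline (smooth.spline()). The yield-curve values below are made up for illustration; they are not the lab's data.

```r
# Made-up yield-curve data for illustration.
maturity <- c(0.25, 0.5, 1, 2, 3, 5, 7, 10)
yield    <- c(0.010, 0.012, 0.015, 0.019, 0.022, 0.026, 0.028, 0.030)
grid <- seq(0.25, 3, length.out = 200)          # short end, as in the text
# Forward rate = d/dT [T * y(T)].
f_interp <- splinefun(maturity, maturity * yield)
fwd_wobbly <- f_interp(grid, deriv = 1)         # interpolating spline: wobbly
fit <- smooth.spline(maturity, maturity * yield, df = 5)
fwd_smooth <- predict(fit, grid, deriv = 1)$y   # penalized spline: smoother
plot(grid, fwd_wobbly, type = "l", xlim = c(0, 3),
     xlab = "maturity (years)", ylab = "forward rate")
lines(grid, fwd_smooth, lty = 2)
```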
Exercises
If you bought the bond for the original price of $828 and sold it a year later for the price calculated in part (b), what is the net yield? a) Suppose the yield for this bond is 4% per annum, compounded semiannually. The investor plans to sell the bond at the end of one year and wants the highest yield for the year.
Exploratory Data Analysis
Introduction
We also see volatility clustering, as there are periods of higher, and of lower, variation within each range. Volatility clustering does not indicate a lack of stationarity, but rather can be viewed as a type of dependence in the conditional variance of each series.
Histograms and Kernel Density Estimation
Figure: kernel density estimates (solid) of the daily log returns on the S&P 500 compared to normal densities (dashed); in panel (a) the normal density uses the sample mean and standard deviation.

We have just seen a problem with using a KDE to suggest a good model for the distribution of the data in a sample: the parameters in the model must be estimated correctly.
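A sketch of such a comparison, using the DAX series from R's EuStockMarkets data as a stand-in for the S&P 500 returns used in the book.

```r
# KDE of daily log returns (solid) versus a normal density with the
# sample mean and standard deviation (dashed).
logret <- as.numeric(diff(log(EuStockMarkets[, "DAX"])))
plot(density(logret), main = "")
x <- seq(min(logret), max(logret), length.out = 400)
lines(x, dnorm(x, mean = mean(logret), sd = sd(logret)), lty = 2)
```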
Order Statistics, the Sample CDF, and Sample Quantiles
- The Central Limit Theorem for Sample Quantiles
 - Normal Probability Plots
 - Half-Normal Plots
 - Quantile–Quantile Plots
 
A half-normal plot is a variant of the normal plot used to detect unusual data rather than to check for a normal distribution. More specifically, a half-normal plot is a scatterplot of the order statistics of the absolute values of the data against Φ−1{(n+i)/(2n+1)}, i = 1, …, n.
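A minimal sketch of this construction on simulated data with one planted outlier; the quantile formula is the one given above.

```r
set.seed(1)
x <- c(rnorm(100), 6)                # simulated data plus one outlier
n <- length(x)
q <- qnorm((n + 1:n) / (2 * n + 1))  # Phi^{-1}{(n+i)/(2n+1)}, i = 1,...,n
plot(q, sort(abs(x)), xlab = "half-normal quantiles",
     ylab = "ordered |data|")        # the outlier stands out at the top right
```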
Tests of Normality
For the S&P 500 returns, the Shapiro–Wilk test rejects the null hypothesis of normality with a p-value less than 2.2×10−16. The Shapiro–Wilk test also strongly rejects normality for the changes in the DM/dollar exchange rate and for the changes in the risk-free return.
Boxplots
With large sample sizes (e.g., 515 for the changes in the risk-free rate), it is quite likely that normality will be rejected, since any real population deviates to some degree from normality, and any deviation, no matter how small, will be detected with a sufficiently large sample. Of course, one must be aware of differences in scale, so it is worth looking at boxplots of the variables both without and with standardization.
Data Transformation
The transformed data satisfy the assumptions of the t-test, namely that the two populations are normally distributed with equal variance, but of course the original data do not. A second approach is to transform the data so that the transformed data meet the assumptions of the original test or estimator.
The Geometry of Transformations
All statistical estimators and tests make certain assumptions about the distribution of the data. We see that the correlations decrease as α decreases from 1, that is, as the concavity of the transformation increases.
Transformation Kernel Density Estimation
The red dashed curve in Fig. 4.27 is a plot of the TKDE of the earnings data using the square-root transformation. For positive, right-skewed variables such as the earnings data, a concave transformation is required.
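A sketch of the TKDE recipe with the square-root transformation, using simulated lognormal data in place of the earnings data. The back-transformation uses the change-of-variables formula: if Y = sqrt(X), then f_X(x) = f_Y(sqrt(x)) / (2·sqrt(x)).

```r
# TKDE sketch: estimate the density of sqrt(x) with an ordinary KDE, then
# back-transform to the density of x.
set.seed(1)
x <- rlnorm(500)                              # right-skewed stand-in data
kde_y <- density(sqrt(x))
grid <- seq(0.05, quantile(x, 0.99), length.out = 400)
f_y <- approx(kde_y$x, kde_y$y, xout = sqrt(grid))$y
f_x <- f_y / (2 * sqrt(grid))                 # TKDE of the original data
plot(grid, f_x, type = "l", xlab = "x", ylab = "density")
```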
Bibliographic Notes
R Lab
- European Stock Indices
 - McDonald’s Prices and Returns
 
Run the following code to generate normal plots of the four indices and to test the normality of each using the Shapiro–Wilk test. In lines 5–6, a robust estimator of the standard deviation of the t-distribution is computed using the mad() function.
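The lab's verbatim code is not reproduced here; the following sketch performs the analogous steps on the EuStockMarkets data, with the robust scale estimate reduced to a one-line mad() call.

```r
data(EuStockMarkets)
logR <- diff(log(EuStockMarkets))    # log returns of the four indices
par(mfrow = c(2, 2))
for (i in 1:4) {
  y <- as.numeric(logR[, i])
  qqnorm(y, datax = TRUE, main = colnames(logR)[i])  # normal plot
  qqline(y, datax = TRUE)
  print(shapiro.test(y))             # Shapiro-Wilk test of normality
}
mad(as.numeric(logR[, 1]))           # robust estimate of scale
```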
Exercises
Create a second set of six normal plots using n simulated N(0,1) random variables, where n is the number of bp changes plotted in the first figure. Use the following fact about the standard normal cumulative distribution function Φ(·). b) What is the 0.975-quantile of a normal distribution with mean −1 and variance 2?
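For part (b), the answer follows from the fact that the q-quantile of N(μ, σ²) is μ + σΦ−1(q); it can be checked in one line of R.

```r
# 0.975-quantile of a normal with mean -1 and variance 2 (sd = sqrt(2)).
qnorm(0.975, mean = -1, sd = sqrt(2))   # -1 + sqrt(2) * 1.96, about 1.77
```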
Modeling Univariate Distributions
Introduction
Parametric Models and Parsimony
A model should only have as many parameters as are necessary to capture the important features of the data. On the other hand, a statistical model must have enough parameters to adequately describe the behavior of the data.
Location, Scale, and Shape Parameters
A model with too few parameters can introduce bias because the model does not fit the data well. A statistical model with small bias but no redundant parameters is called parsimonious and achieves a good compromise between bias and variance.
Skewness, Kurtosis, and Moments
- The Jarque–Bera Test
 - Moments
 
Estimating the skewness and kurtosis of a distribution is relatively simple if we have a sample, Y1, …, Yn. Deviations of the sample skewness and kurtosis from the values expected under normality (0 and 3) are indicative of nonnormality.
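A sketch of the sample versions by their moment definitions; for normal data the values are near 0 and 3.

```r
skew <- function(y) mean((y - mean(y))^3) / sd(y)^3   # sample skewness
kurt <- function(y) mean((y - mean(y))^4) / sd(y)^4   # sample kurtosis
set.seed(1)
y <- rnorm(1000)
c(skewness = skew(y), kurtosis = kurt(y))             # approximately 0 and 3
```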
Heavy-Tailed Distributions
- Exponential and Polynomial Tails
 - t-Distributions
 - Mixture Models: Discrete Mixtures
 
This is the "outlier" region (along with x < −6). The normal mixture has many more outliers than the normal distribution, and the outliers come from the 10% of the population with a variance of 25. In summary, the normal mixture is much more prone to outliers than a normal distribution with the same mean and standard deviation.
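A quick simulation sketch of this comparison. A 90%/10% mixture of N(0,1) and N(0,25) has overall variance 0.9·1 + 0.1·25 = 3.4, so the matching normal distribution is N(0, 3.4).

```r
set.seed(1)
n <- 1e6
heavy <- runif(n) < 0.1                        # the 10% high-variance component
x_mix  <- rnorm(n, sd = ifelse(heavy, 5, 1))   # normal mixture
x_norm <- rnorm(n, sd = sqrt(3.4))             # normal, same mean and variance
c(mixture = mean(abs(x_mix) > 6),              # fraction in the outlier region
  normal  = mean(abs(x_norm) > 6))             # far smaller for the normal
```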
Generalized Error Distributions
The generalized error distributions can give tail weights between those of the normal and double-exponential distributions by having 1 < ν < 2. Because t-distributions have polynomial tails, any t-distribution is more heavily tailed than any generalized error distribution.
Creating Skewed from Symmetric Distributions
Figure: symmetric (solid) and skewed (dashed) t-densities, both with mean 0, standard deviation 1, and ν = 10; ξ = 2 in the skewed density. Note that the mode of the skewed density lies to the left of its mean, typical behavior for right-skewed densities.
Quantile-Based Location, Scale, and Shape Parameters
The parameters ξ, ω, and α determine location, scale, and skewness and are called the direct parameters, or DP. The parameters ξ and ω are the location and scale of the φ(z) factor, and α determines the amount of skewness induced by the Φ(αz) factor.
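For reference, the density being described is the skew-normal in its direct parameterization, stated here from the standard Azzalini form rather than quoted from the text:

$$f(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right).$$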
Maximum Likelihood Estimation
Fisher Information and the Central Limit Theorem for the MLE
The covariance matrix of the MLE can be estimated from the inverse of the observed Fisher information matrix. If the negative of the log-likelihood is minimized by the R function optim(), then the observed Fisher information matrix is computed numerically and returned if hessian = TRUE.
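A self-contained sketch of this workflow on simulated data; the model (a normal location-scale family) and the parameterization are illustrative.

```r
# Minimize the negative log-likelihood with optim(); the inverse of the
# returned Hessian (the observed Fisher information) estimates the
# covariance matrix of the MLE.
set.seed(1)
y <- rnorm(200, mean = 1, sd = 2)
negloglik <- function(theta) -sum(dnorm(y, mean = theta[1],
                                        sd = exp(theta[2]), log = TRUE))
fit <- optim(c(0, 0), negloglik, hessian = TRUE)
se <- sqrt(diag(solve(fit$hessian)))   # standard errors of (mean, log sd)
rbind(estimate = fit$par, se = se)
```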
Likelihood Ratio Tests
The left-hand side of (5.27) is twice the logarithm of the likelihood ratio L(θ̂ML)/L(θ̂0,ML), hence the name likelihood ratio test. When an exact critical value is unknown, the usual choice is the approximate critical value from the large-sample χ² distribution of the test statistic.
AIC and BIC
In general, from a group of candidate models, one chooses the model that minimizes whichever criterion, AIC or BIC, is used. However, it is common for both criteria to choose the same or almost the same model.
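For reference, with p parameters, sample size n, and maximized likelihood L(θ̂ML), the criteria in their standard forms are

$$\mathrm{AIC} = -2\log L(\widehat{\theta}_{\mathrm{ML}}) + 2p, \qquad \mathrm{BIC} = -2\log L(\widehat{\theta}_{\mathrm{ML}}) + p\log n,$$

so BIC penalizes extra parameters more heavily than AIC whenever log n > 2, that is, for n ≥ 8.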
Validation Data and Cross-Validation
The inappropriate use of the training data for validation would have led to the erroneous conclusion that the separate means estimator is more accurate. With leave-one-out cross-validation, each observation takes a turn to be the validation data set, with the other n−1 observations as training data.
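A compact sketch of the leave-one-out mechanics on a toy problem (estimating a mean); the setup is illustrative, not the book's example.

```r
set.seed(1)
y <- rnorm(50, mean = 1)
pred_err <- sapply(seq_along(y), function(i) {
  muhat <- mean(y[-i])   # train on the other n-1 observations
  (y[i] - muhat)^2       # validate on the held-out observation
})
mean(pred_err)           # leave-one-out cross-validated prediction error
```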
Fitting Distributions by Maximum Likelihood
The flows in pipelines 1 and, to a lesser extent, 2 are fit reasonably well by the A-C skewed normal distribution. The red reference line through the quartiles in the QQ plot is created in lines 20–22.
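For readers without the lab code at hand, here is how such a quartile reference line is typically drawn in R; qqline() does exactly this, and the data below are simulated.

```r
# QQ plot with a reference line through the first and third quartile
# pairs, which is what qqline() draws by default.
set.seed(1)
y <- rexp(200)                 # right-skewed stand-in data
qqnorm(y)
qqline(y, col = "red")         # red line through the quartiles
```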
Profile Likelihood
For each fixed α, the remaining parameters can be estimated by maximizing the likelihood, and these values can be plugged into the log-likelihood to obtain the profile log-likelihood for α. This can be done with the function boxcox() in R's MASS package, which plots the profile log-likelihood with confidence intervals.
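A sketch of the boxcox() call on simulated positive data standing in for a real response.

```r
library(MASS)
set.seed(1)
y <- rlnorm(100, meanlog = 1)
# Plots the profile log-likelihood for the Box-Cox parameter, with a 95%
# confidence interval marked on the plot.
boxcox(y ~ 1, lambda = seq(-1, 1, by = 0.05))
```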
Robust Estimation
Let k = nα rounded to an integer; k is the number of observations removed from each end of the sample when computing the α-trimmed mean. The sample standard deviation is the most common estimate of dispersion but, as just stated, it is not robust.
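A sketch contrasting robust and nonrobust estimates on data with one gross outlier; mean(x, trim = α) drops the fraction α from each end, and mad() is a robust alternative to sd().

```r
set.seed(1)
x <- c(rnorm(100), 50)            # one gross outlier
c(mean = mean(x), trimmed = mean(x, trim = 0.1),
  sd = sd(x), mad = mad(x))       # sd is inflated; the robust estimates are not
```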
Transformation Kernel Density Estimation with a Parametric Transformation
The removal of the heavy tails can be seen in Fig. 5.16, which is a normal plot of the transformed data. Line 8 calculates the KDE for the untransformed data, and line 11 calculates the KDE for the transformed data.
Bibliographic Notes
R Lab
- Earnings Data
 - DAX Returns
 - McDonald’s Returns
 
This section uses log returns for the DAX index in the EuStockMarkets dataset. The $par component of the fitted object contains the MLE, and the $value component contains the minimum value of the objective function.
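A sketch of this kind of fit: a t location-scale model for the DAX log returns, fit by minimizing the negative log-likelihood with optim(). The objective, starting values, and object names here are illustrative, not the lab's code.

```r
data(EuStockMarkets)
dax <- diff(log(EuStockMarkets[, "DAX"]))
# Negative log-likelihood of a t location-scale model; the scale and
# degrees of freedom are optimized on the log scale to keep them positive.
nll <- function(theta) {
  m <- theta[1]; s <- exp(theta[2]); nu <- exp(theta[3])
  -sum(dt((dax - m) / s, df = nu, log = TRUE) - log(s))
}
fit <- optim(c(mean(dax), log(sd(dax)), log(5)), nll, hessian = TRUE)
fit$par    # MLE of (mean, log scale, log nu)
fit$value  # minimum of the objective (negative log-likelihood)
```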
Exercises
Based on this information alone, what would you use as an estimate of ν, the tail-index parameter?
Resampling
- Introduction
 - Bootstrap Estimates of Bias, Standard Deviation, and MSE
 - Bootstrapping the MLE of the t-Distribution
 - Bootstrap Confidence Intervals
 - Normal Approximation Interval
 - Bootstrap-t Intervals
 - Basic Bootstrap Interval
 - Percentile Confidence Intervals
 - Bibliographic Notes
 - R Lab
 - BMW Returns
 - Simulation Study: Bootstrapping the Kurtosis
 - Exercises
 
The bootstrap is based on approximating the population probability distribution by the sample. Most estimators satisfy a CLT, e.g., the CLTs for sample quantiles and for the MLE in Sects. 4.3.1 and 5.10, respectively.
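The resampling idea in a few lines (a toy example; the estimator and data are illustrative): resample the data with replacement and recompute the estimator to approximate its sampling distribution.

```r
set.seed(1)
x <- rexp(100)                     # the observed sample
B <- 2000                          # number of bootstrap resamples
boot_med <- replicate(B, median(sample(x, replace = TRUE)))
sd(boot_med)                       # bootstrap standard error of the median
```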
Multivariate Statistical Models
- Introduction
 - Covariance and Correlation Matrices
 - Linear Functions of Random Variables
 - Independence and Variances of Sums
 - Scatterplot Matrices
 - The Multivariate Normal Distribution
 - The Multivariate t-Distribution
 
The covariance matrix of the standardized variables is equal to the correlation matrix of the original variables, which is also the correlation matrix of the standardized variables. This is because the return on a portfolio is the weighted average of the returns on the assets.
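The first statement can be checked in one line; the data below are simulated for illustration.

```r
# Standardizing the columns with scale() makes their covariance matrix
# equal to the correlation matrix of the original variables.
set.seed(1)
X <- matrix(rnorm(300), ncol = 3)
X[, 2] <- X[, 1] + X[, 2]          # induce some correlation
all.equal(cov(scale(X)), cor(X), check.attributes = FALSE)  # TRUE
```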
- Using the t-Distribution in Portfolio Analysis
 - Fitting the Multivariate t-Distribution by Maximum Likelihood
 - Elliptically Contoured Densities
 - The Multivariate Skewed t-Distributions
 - The Fisher Information Matrix
 - Bootstrapping Multivariate Data
 
In that figure, it can be seen that the MLE of ν is 5.94, and there is relatively little uncertainty about the value of this parameter: the 95% profile likelihood confidence interval is narrow. When the data are t-distributed, maximum likelihood estimates are superior to the sample mean and covariance matrix in several respects: the MLE is less variable and is less sensitive to outliers.