An adaptive econometric system for statistical arbitrage

Academic year: 2023

The proposed system is compared with classical pairs trading for each of the markets examined. In this study, sensitivity analysis of the system is also conducted to investigate the robustness of the proposed system.

  • Introduction to financial trading
    • Trading of securities
    • Statistical arbitrage
  • Problem statement
  • Research objectives
    • Objective 1: Classification of securities from price data
    • Objective 2: Modelling of mean-reversion characteristic
    • Objective 3: Trade signal generation and risk management
    • Objective 4: Sensitivity analysis of proposed system
  • Beneficiaries
  • Research limitations
    • Security universe
    • Data granularity
  • Research methodology
    • Data acquisition and verification
    • Statistical tests implementation
    • Backtesting of system
    • Verification of results
  • Document conventions
  • Reading suggestions and document layout
    • Reading suggestions
    • Document layout

A model should be created that looks for statistical arbitrage opportunities by forming linear combinations of securities that have been divided into subgroups. Active investment managers and traders can also potentially benefit from the proposed research findings.

Figure 1-1: Entry/Exit signals of mean-reverting strategy
  • Overview of background study
  • High frequency trading
  • Arbitrage
  • Statistical arbitrage
  • Stationary processes
    • Augmented Dickey-Fuller test
  • Mean-reversion strategies
  • Correlation and cointegration
    • Correlation and dependence
    • Testing for unit roots and cointegration
  • Hedging positions
  • Volatility of security prices
  • Modelling volatility
    • Overview of ARCH models
    • ARCH(q) model specification
    • GARCH(p,q) model specification
  • Cluster analysis
    • K-means clustering
    • Affinity propagation clustering
  • Background review

As can be seen from equation (2.2), the general objective of the ADF test is to determine whether the hypothesis 𝜆 = 0 can be rejected. The critical values of the test are valid only asymptotically, which is a known weakness of the test.
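The test reduces to computing the t-statistic of the estimate of 𝜆 and comparing it with Dickey-Fuller critical values. The following numpy sketch illustrates the simplest special case, the plain Dickey-Fuller regression with no lagged difference terms; it is an illustrative simplification, not the thesis's implementation:

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic for lambda = 0 in the regression
    dy_t = c + lambda * y_{t-1} + e_t (no lagged difference
    terms: the simplest special case of the ADF regression)."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta = np.linalg.lstsq(X, dy, rcond=None)[0]
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
e = rng.standard_normal(1000)
walk = np.cumsum(e)                  # unit root: lambda is zero
ar1 = np.zeros(1000)                 # stationary AR(1), phi = 0.5
for t in range(1, 1000):
    ar1[t] = 0.5 * ar1[t - 1] + e[t]

print(round(dickey_fuller_tstat(walk), 2))  # near zero: cannot reject unit root
print(round(dickey_fuller_tstat(ar1), 2))   # strongly negative: reject unit root
```

Because the t-statistic does not follow a standard t-distribution under the null, it must be compared against the (asymptotic) Dickey-Fuller critical values, which is precisely the weakness noted above.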

Figure 2-1: Example of a stationary process

Overview of literature review

The efficient market hypothesis

Arguments for quantitative trading and active investing

Established models for statistical arbitrage

  • Minimum distance method
  • Arbitrage pricing theory model
  • Cointegration for statistical arbitrage

It can be assumed that these factors must be determined by the user of the model. If the two variables are integrated of order 1 (𝐼(1)), the error term of the cointegrating regression must be weakly stationary (𝐼(0)).

Statistical arbitrage in different markets

This would indicate that the spread price series moves around the equilibrium of 𝛼, since 𝜖𝑡 is assumed to be weakly stationary, but not necessarily identically and independently distributed (i.i.d.). It need only be mean-reverting for an effective trade rule to be implemented.

Identifying related securities

Predicting market volatility

The study concludes that the GJR-GARCH model most effectively modelled the daily volatility of five South African indices on the Johannesburg Stock Exchange (JSE). It can be concluded that if a deterministic trend is present in the price series, as is the case with most stock indices, an asymmetric GARCH model should outperform a standard GARCH model in modelling volatility.

Literature review

Although his GARCH model differs from that of Alberg, Shalit and Yosef [50], he reaches a similar conclusion: volatility can be modelled more accurately when allowing for positively or negatively skewed returns. It can be expected that asymmetric GARCH models will not offer an edge over standard GARCH models when modelling the volatility of (weakly) stationary price series in which no deterministic trend is present.

General overview of methodology

  • Chapter layout
  • Implementation approach

This section provides an overview of the proposed model and discusses the standard pairs trading method that will be used for comparison as a benchmark. This section describes the data (type and granularity) to be used for testing the models.

Model descriptions

  • Implementation of the standard model
    • Formation Period
    • Trading Period
  • Implementation of adaptive model
    • Learning period
    • Trading period

For each of the pairs with the smallest SE values, a spread over the formation period is calculated as 𝑥𝑡 = 𝑦𝑡𝑖 − 𝑦𝑡𝑗. The moving average (𝜇𝑥) and standard deviation (𝜎𝑥) of this difference are calculated and stored for use in the trading period. The weights can be used to construct a stationary series over the study period.
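The formation-period computation can be sketched as follows; the pair inputs, the rolling window length, and the use of a simple price difference are illustrative assumptions rather than the thesis's exact configuration:

```python
import numpy as np

def formation_stats(y_i, y_j, window=20):
    """Spread x_t = y_i - y_j with a rolling mean, rolling
    standard deviation, and the resulting z-score. The window
    length of 20 periods is an illustrative assumption."""
    x = np.asarray(y_i, float) - np.asarray(y_j, float)
    mu = np.array([x[max(0, t - window + 1):t + 1].mean()
                   for t in range(len(x))])
    sd = np.array([x[max(0, t - window + 1):t + 1].std(ddof=1)
                   if t > 0 else np.nan for t in range(len(x))])
    z = (x - mu) / sd        # standardised spread for signal generation
    return x, mu, sd, z

# two series that are cointegrated by construction (shared random walk)
rng = np.random.default_rng(1)
y_j = np.cumsum(rng.standard_normal(300))
y_i = y_j + rng.standard_normal(300)
x, mu, sd, z = formation_stats(y_i, y_j)
```

The stored 𝜇𝑥 and 𝜎𝑥 are then carried forward into the trading period, where the z-score drives entry and exit decisions.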

When the volatility of the weakly stationary spread increases, the market entry thresholds increase accordingly. If the value of 𝑧 is lower than the negative of the GARCH-updated threshold value, the stationary portfolio is bought.

Securities universe and sampling frequency

The z-score will be compared with the GARCH-updated volatility forecast to time the entry of market positions. If the value of 𝑧 is higher than the GARCH-updated threshold, a market position is taken such that the stationary portfolio is effectively sold short. When in a market position and the value of 𝑧 crosses zero, the long or short position in the weighted (stationary) series is closed.
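The entry/exit logic described above can be sketched as a small state machine over the z-score series; a fixed entry threshold stands in here for the GARCH-updated one:

```python
def generate_positions(z, entry=2.0):
    """State machine over the z-score: sell the stationary
    portfolio short when z > entry, buy it when z < -entry, and
    close the position when z crosses zero. A fixed `entry`
    threshold is an illustrative stand-in for the GARCH-updated
    threshold described in the text."""
    pos, current, prev = [], 0, 0.0
    for zt in z:
        if current == 0:
            if zt > entry:
                current = -1      # short the stationary portfolio
            elif zt < -entry:
                current = 1       # buy the stationary portfolio
        elif prev * zt <= 0:      # z crossed (or touched) zero: exit
            current = 0
        pos.append(current)
        prev = zt
    return pos

print(generate_positions([0.5, 2.5, 1.0, -0.1, -2.4, -1.0, 0.2]))
# [0, -1, -1, 0, 1, 1, 0]
```

Replacing the constant `entry` with a per-period threshold derived from the GARCH volatility forecast yields the adaptive variant: thresholds widen when forecast volatility rises, as described above.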

The default values are chosen to be comparable to those of the standard model (see [6]). Daily data is available for the entire period that the securities are listed on their respective exchanges.

Implementation of clustering techniques

  • Implementation of the k-means clustering algorithm
  • Implementation of the affinity propagation clustering algorithm
    • Building of initial graph
    • Clustering of data points
    • Responsibility updates
    • Availability updates
    • Considerations of AP clustering

The term "exemplars" is used for existing observations that are chosen as centres of the data, as opposed to "centroids", which do not have to be actual observations and can be created (as with k-means clustering). When 𝑘 = 𝑖 in equation (4.3), the responsibility reflects the input preference for point 𝑘 to be chosen as an exemplar. Setting a damping factor reduces oscillations that would "overshoot" the solution (see [32] and [54] for details).

Negative values of availability decrease the effective values of some input similarities 𝑠(𝑖, 𝑘′) in equation (4.3), removing the corresponding candidate exemplars from competition. As can be seen from equation (4.5), the availability 𝑎(𝑖, 𝑘) is set to the sum of the self-responsibility 𝑟(𝑘, 𝑘) and the positive responsibilities that candidate exemplar 𝑘 receives from other points.
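The two update rules can be iterated directly. The sketch below is a bare-bones affinity propagation loop in numpy, with damping as discussed; the preference value of -20 on the diagonal of the similarity matrix and the toy data are assumptions for illustration:

```python
import numpy as np

def affinity_propagation(S, damping=0.7, iters=200):
    """Minimal affinity propagation: alternate the responsibility
    update (equation (4.3)) and availability update (equation (4.5))
    with damping to suppress oscillations."""
    n = S.shape[0]
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    rows = np.arange(n)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max_{k' != k} [ a(i,k') + s(i,k') ]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * Rnew
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())   # keep self-responsibility as-is
        Anew = Rp.sum(axis=0)[None, :] - Rp
        diag = Anew.diagonal().copy()        # a(k,k) is not clipped at zero
        Anew = np.minimum(Anew, 0)
        np.fill_diagonal(Anew, diag)
        A = damping * A + (1 - damping) * Anew
    return R, A

# two well-separated 2-D clusters; preference -20 is an assumption
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
S = -((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(S, -20.0)
R, A = affinity_propagation(S)
exemplars = np.flatnonzero(np.diag(A + R) > 0)
print(exemplars)   # one exemplar per cluster
```

Points whose combined self-responsibility and self-availability are positive are the exemplars; every other point is assigned to the exemplar with the highest similarity, which is how the cluster memberships in Chapter 5 are obtained.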

Implementation of the Johansen cointegration test

  • Overview of the Johansen test
  • Review of the Johansen method
  • Johansen method: Input
  • Johansen method: Step 1
  • Johansen method: Step 2
  • Johansen method: Step 3
    • Hypothesis testing

In the first step of the Johansen cointegration test, a VAR(𝑝 − 1) model is estimated for the differenced series ∆𝑦𝑡. This is done by regressing ∆𝑦𝑖𝑡 on a constant and all elements of the vectors ∆𝑦𝑡−1, … , ∆𝑦𝑡−𝑝+1. In the third step of the Johansen cointegration test, the maximum likelihood estimates of the parameters are calculated.

In this case, the maximum value of the log-likelihood function is given by equation (4.20). Here, there are no restrictions on the constant term when estimating the regressions in Section 4.5.4.
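The steps above can be sketched in numpy as follows (statsmodels' `coint_johansen` provides a complete implementation); the lag order, the use of 𝑦𝑡−1 as the lagged-level regressor, and the simulated data are illustrative assumptions:

```python
import numpy as np

def johansen_trace(y, p=2):
    """Sketch of the Johansen steps for a T x n system y:
    step 1 regresses dy_t on a constant and lagged differences,
    step 2 does the same regression for the lagged levels, and
    step 3 solves the eigenvalue problem of the residual
    product-moment matrices to form the trace statistics."""
    dy = np.diff(y, axis=0)
    lags = p - 1
    T = dy.shape[0]
    # regressors: constant plus dy_{t-1}, ..., dy_{t-p+1}
    Z = np.column_stack([np.ones(T - lags)] +
                        [dy[lags - j:T - j] for j in range(1, lags + 1)])
    Y0 = dy[lags:]       # dy_t
    Y1 = y[lags:-1]      # lagged levels
    def resid(Y):
        return Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]
    R0, R1 = resid(Y0), resid(Y1)
    Teff = R0.shape[0]
    S00 = R0.T @ R0 / Teff
    S01 = R0.T @ R1 / Teff
    S11 = R1.T @ R1 / Teff
    M = np.linalg.solve(S11, S01.T) @ np.linalg.solve(S00, S01)
    eig = np.sort(np.linalg.eigvals(M).real)[::-1]
    # trace statistic for each hypothesised cointegration rank r
    return [-Teff * np.log(1 - eig[r:]).sum() for r in range(len(eig))]

# cointegrated pair by construction (shared random-walk factor)
rng = np.random.default_rng(2)
w = np.cumsum(rng.standard_normal(400))
y = np.column_stack([w + 0.1 * rng.standard_normal(400),
                     w + 0.1 * rng.standard_normal(400)])
trace = johansen_trace(y)
```

For this constructed pair, the trace statistic for rank 0 greatly exceeds its critical value while the statistic for rank 1 does not, indicating exactly one cointegrating relation, which is the behaviour exploited by the adaptive system.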

Implementation of GARCH model

  • Overview of GARCH models
  • Parameter estimation in the GARCH model
  • Gaussian quasi maximum-likelihood estimation
  • Fat-tailed maximum-likelihood estimation
  • Implementing the Nelder-Mead algorithm
    • Description of Nelder-Mead algorithm
    • Implementation overview of Nelder-Mead algorithm
    • Initial simplex construction
    • Simplex transformation algorithm

Maximum likelihood functions will be used with the Nelder-Mead method to find the optimal values of the GARCH model parameters given historical time-series data. For the general GARCH(p,q) process, the likelihood function is maximised as a function of the parameters 𝛼𝑖 and 𝛽𝑗. The resulting value in the parameter space is the Gaussian quasi-maximum-likelihood estimator of the parameters of the GARCH(p,q) process.

In this study, the Nelder-Mead algorithm will be used extensively to estimate the parameters of the GARCH model in conjunction with the chosen log-likelihood function (see Sections 4.6.3 and 4.6.4). As an illustrative example, the steps of a simple Nelder-Mead search will be shown for the case where the minimum of a function (𝑓) is sought.
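A sketch of this estimation procedure, using scipy's Nelder-Mead implementation on a simulated GARCH(1,1) series; the penalty value, the starting point, and the simulated parameters are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, r):
    """Negated Gaussian quasi log-likelihood of a GARCH(1,1):
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return 1e10   # penalty in place of explicit constraints
    s2 = np.empty_like(r)
    s2[0] = r.var()   # initialise with the sample variance
    for t in range(1, len(r)):
        s2[t] = omega + alpha * r[t - 1] ** 2 + beta * s2[t - 1]
    return 0.5 * np.sum(np.log(s2) + r ** 2 / s2)

# simulate a GARCH(1,1) path with omega=0.05, alpha=0.1, beta=0.8
rng = np.random.default_rng(3)
T, r = 3000, np.empty(3000)
s2, prev = 0.5, 0.0
for t in range(T):
    s2 = 0.05 + 0.1 * prev ** 2 + 0.8 * s2
    prev = r[t] = np.sqrt(s2) * rng.standard_normal()

fit = minimize(garch11_neg_loglik, x0=[0.1, 0.05, 0.7],
               args=(r,), method='Nelder-Mead')
omega_hat, alpha_hat, beta_hat = fit.x
print(np.round(fit.x, 3))
```

Since Nelder-Mead is a derivative-free simplex search, the stationarity and positivity constraints are handled here with a simple penalty; swapping the Gaussian likelihood for the Student's t version changes only the objective function.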

Figure 4-2: Centroid and initial simplex

Performance evaluation metrics

  • Compound annual growth rate
  • Sharpe ratio
  • Sortino ratio
  • Benchmark comparison metrics
    • Alpha
    • Beta
    • Information ratio

It is an indicator of downside risk, measuring the largest peak-to-trough loss the portfolio experienced over the evaluation period. Together with the maximum drawdown, the drawdown duration is calculated as the time from the start of the maximum drawdown to the point at which the portfolio regains its pre-drawdown value. Alpha can be expressed as the CAGR of the portfolio minus the CAGR of the benchmark.

The information ratio (or valuation ratio) is a measure of a portfolio's risk-adjusted return. Active return is the difference between the portfolio's return and the return of a given benchmark.
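The metrics above can be computed directly from an equity curve; the sampling frequency and the toy input values are assumptions for illustration:

```python
import numpy as np

def performance_metrics(equity, bench, periods_per_year=252):
    """CAGR, maximum drawdown, and information ratio for an equity
    curve against a benchmark, following the definitions above."""
    eq, bm = np.asarray(equity, float), np.asarray(bench, float)
    years = (len(eq) - 1) / periods_per_year
    cagr = (eq[-1] / eq[0]) ** (1 / years) - 1
    peak = np.maximum.accumulate(eq)          # running high-water mark
    max_dd = ((peak - eq) / peak).max()       # largest peak-to-trough loss
    r_p = np.diff(eq) / eq[:-1]
    r_b = np.diff(bm) / bm[:-1]
    active = r_p - r_b                        # active return vs benchmark
    ir = active.mean() / active.std(ddof=1) * np.sqrt(periods_per_year)
    return {"cagr": cagr, "max_drawdown": max_dd, "information_ratio": ir}

# toy three-period example (periods_per_year=3 makes the span one year)
m = performance_metrics([100, 110, 99, 121], [100, 105, 103, 110],
                        periods_per_year=3)
```

On this toy curve the CAGR is 21% and the maximum drawdown is 10% (the fall from 110 to 99); the information ratio is the annualised mean active return divided by the tracking error, as defined above.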

Methodology review

In general, a beta value of less than one indicates that a portfolio's equity curve is less volatile than the market. The information ratio can be calculated as the expected active return divided by the tracking error. The implementation details of the GARCH volatility model were discussed, along with three likelihood function derivations.

This study will use the Student's t likelihood function because the results of the literature review show that it generally provides the best fit when applied to financial data. To search for the parameter values of the GARCH model, the Nelder-Mead simplex search algorithm will be used.

Overview of evaluation

Backtesting system setup

Verification of system components

  • Affinity propagation clustering
  • K-means clustering
  • Comparison of clustering techniques
  • Johansen cointegration test
    • Johansen test applied to two index funds
    • Johansen test applied to two US stocks during the 2008 crash
    • Johansen test applied to US stocks in the same industry
    • Johansen test applied to three country index ETFs
    • Johansen test applied to clustered German stocks
  • GARCH volatility model
    • Applying GARCH(1,1) models to the S&P 500 sectors ETFs
    • Applying GARCH(1,1) models to the MSCI country index ETFs
    • GARCH-updated entry threshold versus fixed entry thresholds

The results of the k-means clustering on shares listed on the Deutsche Börse Xetra over the period January 2004 – January 2005 can be seen in Table 5-3. The results of the two hypothesis tests of the Johansen test are contained in Table 5-8 and Table 5-9 respectively. The results of the Johansen test applied to the time series of these securities are shown in Table 5-10 and Table 5-11.

The results of the two hypothesis tests of the Johansen test are contained in Table 5-12 and Table 5-13 respectively. As discussed in section 4.6, a persistence value can be calculated by summing the parameters 𝛼 and 𝛽.
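The persistence computation is a one-liner; as a small illustration, the implied half-life of a volatility shock (a standard derived quantity, not necessarily a metric reported in the thesis) follows from the same sum:

```python
import math

def garch_persistence(alpha, beta):
    """Persistence of a GARCH(1,1) as the sum alpha + beta, plus
    the implied half-life (in periods) of a volatility shock: the
    lag at which a shock's effect on the variance forecast has
    decayed by half."""
    p = alpha + beta
    half_life = math.log(0.5) / math.log(p)
    return p, half_life

p, hl = garch_persistence(0.1, 0.85)
print(round(p, 2), round(hl, 1))   # 0.95, ~13.5 periods
```

A persistence close to one means volatility shocks decay slowly, which is why the GARCH-updated entry thresholds remain widened for some time after a volatility spike.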

Table 5-2: Clustering of German stocks (2004-2005) - AP

Validation of system

  • Evaluation on Deutsche Börse Xetra
  • Evaluation on TSE ETFs
  • Evaluation on TSE stocks
  • Evaluation on JSE stocks
  • Evaluation on US ETFs
  • Evaluation on US stocks
  • Evaluation over different market regimes
  • Sensitivity analysis
    • Deutsche Börse Xetra
    • TSE ETFs
    • TSE Stocks
    • JSE Stocks
    • US ETFs
    • US Stocks
    • Transaction cost sensitivity

For comparison, another version of the adaptive system (which trades up to eight baskets of securities) was also implemented. Two variations of the adaptive system, the pairs trading system and the Bollinger Bands strategy were able to outperform the S&P 500 during the non-trending period, with positive alphas (e.g. 2.72%). The version of the adaptive system that trades up to eight baskets had lower volatility than the index and an alpha of -0.78%.

A sensitivity analysis of the adaptive system will be performed for all financial markets examined in this study. As discussed in Section 5.2, the default transaction fee was chosen to be 0.4% of the transaction value.
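The one-at-a-time procedure can be sketched generically; `toy_backtest`, its parameters, and the grids below are hypothetical stand-ins for the full system backtest and its default configuration:

```python
def oat_sensitivity(backtest, defaults, grids):
    """One-at-a-time (OAT) sensitivity analysis: vary a single
    parameter over its grid while all other parameters stay at
    their default values, and record the resulting metric.
    `backtest` is a stand-in for the full system backtest."""
    results = {}
    for name, grid in grids.items():
        scores = []
        for value in grid:
            params = dict(defaults, **{name: value})
            scores.append(backtest(**params))
        results[name] = list(zip(grid, scores))
    return results

# illustrative stand-in objective, not the thesis backtest
def toy_backtest(entry_z=2.0, fee=0.004):
    return -(entry_z - 2.0) ** 2 - 50 * fee

table = oat_sensitivity(toy_backtest,
                        {"entry_z": 2.0, "fee": 0.004},
                        {"entry_z": [1.0, 1.5, 2.0, 2.5],
                         "fee": [0.0, 0.002, 0.004]})
```

Each entry of `table` maps a parameter name to (value, metric) pairs, which is exactly the shape of result needed to identify the parameter regions recommended in the conclusions.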

Figure 5-24: System performance on DAX stocks
Table 5-18: Performance metrics summary (DAX stocks)

Review of evaluation

Study overview

Concluding remarks

  • Adaptive system overview
  • Performance discussion

The third objective was to generate trade signals to profit from temporary mispricings in the mean-reverting constructed series. The final objective of this study was to perform a sensitivity analysis of the statistical arbitrage system. This was completed with a one-at-a-time (OAT/OFAT) sensitivity analysis across all the different markets examined in this study.

The results show that the adaptive system was able to generate positive alpha for five of the six security universes on which the system was tested over the investigated period. The results of the sensitivity analysis gave an indication of the regions in which parameter values should be chosen if the system is to be applied practically.

Recommendations for future research

Closure

C. Alexander and A. Dimitriu, "The Cointegration Alpha: Enhanced Index Tracking and Long-Short Equity Market Neutral Strategies," ISMA Centre Discussion Papers in Finance, August 5, 2002.
E. F. Fama, "Efficient Capital Markets: A Review of Theory and Empirical Work," The Journal of Finance, vol. 25, no. 2, pp. 383-417, 1970.
N. Jegadeesh and S. Titman, "Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency," The Journal of Finance, vol. 48, no. 1, pp. 65-91, 1993.

D. Arthur and S. Vassilvitskii, "k-means++: The Advantages of Careful Seeding," in Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2007, pp. 1027-1035.
J. C. Lagarias, J. A. Reeds, M. H. Wright and P. E. Wright, "Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions," SIAM Journal on Optimization, vol. 9, no. 1, pp. 112-147, 1998.

JOHANSEN METHOD

  • Overview of the cointegration approach
  • Maximum likelihood estimation of cointegration vectors
  • Maximum likelihood estimator of the cointegration space

First, the likelihood ratio test for the hypothesis given by equation (6.4) is derived, together with the maximum likelihood estimator of the cointegration space. Second, a likelihood ratio test is derived for the hypothesis that the cointegration space is constrained to lie in a particular subspace, representing a linear constraint one might want to impose on the cointegration vectors. The parameters 𝛼 and 𝛽 cannot be estimated uniquely because they form an over-parameterisation of the model.

The impact matrix П is found as the coefficient of the lagged levels in a nonlinear least-squares regression of ∆𝑋𝑡 on lagged differences and lagged levels. Let 𝐷 denote the diagonal matrix of ordered eigenvalues 𝜆̂1 > ⋯ > 𝜆̂𝑝.

RE-PARAMETERIZING A VAR MODEL

COMPARISON OF CLUSTERING METHODS ON SYSTEM PERFORMANCE

COMPARISON OF FIXED AND DYNAMICALLY UPDATED MARKET ENTRY

SENSITIVITY ANALYSIS OF TRANSACTION COSTS ON THE DEUTSCHE BÖRSE

ANALYSIS OF DIFFERENT GARCH-UPDATED MODELS ON THE ADAPTIVE

Figures

Figure 1-1: Entry/Exit signals of mean-reverting strategy
Figure 2-1: Example of a stationary process
Figure 2-2: Pearson correlation coefficient for different data sets [18]
Figure 5-1: Price series from DE cluster 1
