The previous results are based on machine learning models trained on the first 240 months of data. This setting does not allow us to update the models when new information arrives, which might affect their performance. To study whether our results are robust to this concern, we adopt a different training process that updates the models regularly.9 In particular, we train the models from Jan 1971 to Dec 1990 and test them over the 10-year period from Jan 1991 to Dec 2000.
We then retrain the models from Jan 1981 to Dec 2000 and test them for another ten years, from Jan 2001 to Dec 2010. Finally, we train the models from Jan 1991 to Dec 2010 and test them for the remaining period.
We want to investigate whether a rolling training process leads to better results than a fixed training process, since a rolling training process can uncover insights and patterns from the most recent data.
9The best way is to use a rolling method that updates the models every month. However, this is extremely computationally intensive, so we could not implement it. We do not consider RNN, GRU, and LSTM for the same reason.
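To make the rolling scheme above concrete, the following is a minimal sketch of the retraining loop, assuming a monthly stock-level panel indexed by date. The column names, the fit_model/predict callables, and the end date of the final test window (taken here to be Dec 2020) are illustrative assumptions rather than the paper's actual implementation.

```python
import pandas as pd

# Train/test windows of the rolling scheme described in the text.
# The end of the last test window is assumed here to be Dec 2020.
WINDOWS = [
    ("1971-01", "1990-12", "1991-01", "2000-12"),
    ("1981-01", "2000-12", "2001-01", "2010-12"),
    ("1991-01", "2010-12", "2011-01", "2020-12"),
]

def rolling_forecasts(panel, features, fit_model, predict):
    """Retrain on each 20-year window and forecast the following decade.

    `panel` is a DataFrame indexed by month with the predictor columns in
    `features` and the next-month return in 'ret_next' (hypothetical names).
    """
    pieces = []
    for train_start, train_end, test_start, test_end in WINDOWS:
        train = panel.loc[train_start:train_end]
        test = panel.loc[test_start:test_end].copy()
        model = fit_model(train[features], train["ret_next"])
        test["forecast"] = predict(model, test[features])
        pieces.append(test)
    return pd.concat(pieces)
```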
Panel A of Table 13 reports the R2OS. As can be seen, most of the R2OS results are significantly better than the previous results that use a fixed training process (training on the first 240 months of data). This highlights the advantage of continually retraining the model to take advantage of the most recent training data. However, unlike the previous results for training on all 221 variables and on the 127 technical indicators, in which all machine learning models significantly outperform OLS, here two machine learning models fail to outperform OLS both when training on all 221 variables and when training on the 127 technical indicators.
Panel B of Table 13 reports the returns of the zero-investment portfolios. As can be seen, most of the mean return results are significantly better than the previous results that use a fixed training process. However, similar to the previous results, most machine learning models trained on the 94 stock characteristics do not outperform OLS (except XGBoost, LightGBM, and Combination, which are statistically significant at the 1% level). For training on all 221 variables and on the 127 technical indicators, all machine learning models outperform OLS.
[Insert Table 13 about here]
6 Conclusion
In this paper, we conduct a comprehensive study of machine learning models on the cross-section of stock returns. We study whether various machine learning models can forecast stock returns. We also use a large set of predictors to implement the machine learning models. These predictors contain both 94 stock characteristic variables and 127 technical indicators.
Empirical results show that, relative to the OLS model, machine learning models substantially improve the forecasting performance of various predictors. The improvements are both statistically and economically significant. These findings are consistent with [18]. Results for different sets of predictors indicate that technical indicators have stronger predictive power than the bond characteristics used in [18].
We also study whether stock characteristic variables offer an incremental contribution to stock return forecasts. When we use all stocks in the machine learning models, we do not find an incremental contribution from the stock characteristic variables.
Results for the rolling training process highlight the advantage of continually retraining the model to take advantage of the most recent training data, as the retrained models deliver significantly better R2OS and zero-investment portfolio returns.
Our study contains several new findings. We document the importance of technical indicators above and beyond the stock characteristic variables used in [4]. It is important to include technical indicators as a source of information about the stock market.
We show that a wide range of hyperparameter tuning is required for some machine learning models to achieve optimal performance. This finding deepens our understanding of how to implement machine learning models to enhance learning performance and reduce overfitting.
References
[1] Yaser S. Abu-Mostafa and Amir F. Atiya. Introduction to financial forecasting. Applied Intelligence, 6:205–213, 1996.
[2] Dana Angluin and Philip Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
[3] Jennie Bai, Turan G Bali, and Quan Wen. Common risk factors in the cross-section of corporate bond returns. Journal of Financial Economics, 131(3):619–642, 2019.
[4] Turan G Bali, Amit Goyal, Dashan Huang, Fuwei Jiang, and Quan Wen. The cross-sectional pricing of corporate bonds using big data and machine learning. (20-110), Sep 2020.
[5] Jack Bao, Jun Pan, and Jiang Wang. The illiquidity of corporate bonds. Journal of Finance, 66(3):911–946, 2011.
[6] Parth Bhavsar, Ilya Safro, Nidhal Bouaynaya, Robi Polikar, and Dimah Dera. Machine learning in transportation data analytics. Data Analytics for Intelligent Transportation Systems, pages 283–307, 12 2017.
[7] William Brock, Josef Lakonishok, and Blake LeBaron. Simple technical trading rules and the stochastic properties of stock returns. Journal of Finance, 47(5):1731–1764, 1992.
[8] David P Brown and Robert H Jennings. On technical analysis. Review of Financial Studies, 2(4):527–551, 1989.
[9] Bryan T. Kelly, Seth Pruitt, and Yinan Su. Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, 134(3):501–524, 2019.
[10] Li-Juan Cao and Francis Eng Hock Tay. Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on Neural Networks, 14(6):1506–1518, 2003.
[11] Giovanni Cespa and Xavier Vives. Dynamic trading and asset prices: Keynes vs. Hayek. The Review of Economic Studies, 79(2):539–580, 2012.
[12] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014.
[13] Fu-Lai Chung, Tak-Chung Fu, Vincent Ng, and Robert Luk. An evolutionary approach to pattern-based time series segmentation. IEEE Transactions on Evolutionary Computation, 8:471–489, 2004.
[14] Todd E Clark and Kenneth D West. Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1):291–311, 2007.
[15] Amy K Edwards, Lawrence E Harris, and Michael S Piwowar. Corporate bond market transaction costs and transparency. Journal of Finance, 62(3):1421–1451, 2007.
[16] Florentin Butaru, Qingqing Chen, Brian Clark, Sanmay Das, Andrew W. Lo, and Akhtar Siddique. Risk and risk management in the credit card industry. Journal of Banking & Finance, 72:218–239, 2016.
[17] Bruce D Grundy and J Spencer Martin. Understanding the nature of the risks and the source of the rewards to momentum investing. The Review of Financial Studies, 14(1):29–78, 2001.
[18] Shihao Gu, Bryan Kelly, and Dacheng Xiu. Empirical asset pricing via machine learning. The Review of Financial Studies, 33(5):2223–2273, 2020.
[19] Xu Guo, Hai Lin, Chunchi Wu, and Guofu Zhou. Predictive information in corporate bond yields. Journal of Financial Markets, 59:100687, 2022.
[20] Erkam Guresen, Gulgun Kayakutlu, and Tugrul U Daim. Using artificial neural network models in stock market index prediction. Expert Systems with Applications, 38(8):10389–10397, 2011.
[21] Tin Kam Ho. Random decision forests. In Proceedings of 3rd International Conference on Document Analysis and Recognition, volume 1, pages 278–282. IEEE, 1995.
[22] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[23] Zhongsheng Hua, Yu Wang, Xiaoyan Xu, Bin Zhang, and Liang Liang. Predicting corporate financial distress based on integration of support vector machine and logistic regression. Expert Systems with Applications, 33(2):434–440, 2007.
[24] Kyoung-jae Kim. Financial time series forecasting using support vector machines. Neurocomputing, 55(1):307–319, 2003.
[25] Shuaiqiang Liu, Cornelis Oosterlee, and Sander Bohte. Pricing options and computing implied volatilities using neural networks. Risks, 7:16, 02 2019.
[26] Andrew W Lo, Harry Mamaysky, and Jiang Wang. Foundations of technical analysis: Computational algorithms, statistical inference, and empirical implementation. Journal of Finance, 55(4):1705–1765, 2000.
[27] Scott M Lundberg, Gabriel G Erion, and Su-In Lee. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888, 2018.
[28] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4768–4777, 2017.
[29] Christopher J Neely, David E Rapach, Jun Tu, and Guofu Zhou. Forecasting the equity risk premium: the role of technical indicators. Management Science, 60(7):1772–1791, 2014.
[30] David Nelson, Adriano Pereira, and Renato de Oliveira. Stock market's price movement prediction with LSTM neural networks. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1419–1426, 2017.
[31] Mingyue Qiu and Yu Song. Predicting the direction of stock market index movement using an optimized artificial neural network model. PLOS ONE, 11:e0155133, 2016.
[32] David E Rapach, Jack K Strauss, and Guofu Zhou. Out-of-sample equity premium prediction: Combination forecasts and links to the real economy. The Review of Financial Studies, 23(2):821–862, 2010.
[33] David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986.
[34] Fadil Santosa and William W Symes. Linear inversion of band-limited reflection seismograms. SIAM Journal on Scientific and Statistical Computing, 7(4):1307–1330, 1986.
[35] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
[36] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2015.
[37] Daniel L Thornton and Giorgio Valente. Out-of-sample predictions of bond excess returns and forward rates: An asset allocation perspective. The Review of Financial Studies, 25(10):3141–3168, 2012.
[38] Jack L Treynor and Robert Ferguson. In defense of technical analysis. Journal of Finance, 40(3):757–773, 1985.
[39] Xu Guo, Hai Lin, Chunchi Wu, and Guofu Zhou. Predictive information in corporate bond yields. Journal of Financial Markets, page 100687, 2021.
[40] Yakup Kara, Melek Acar Boyacioglu, and Ömer Kaan Baykan. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Systems with Applications, 38(5):5311–5319, 2011.
[41] Yauheniya Shynkevich, T.M. McGinnity, Sonya A. Coleman, Ammar Belatreche, and Yuhua Li. Forecasting price movements using technical indicators: Investigating the impact of varying input window length. Neurocomputing, 264:71–88, 2017.
[42] Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320, 2005.
Table 1: Summary statistics
This table reports the number of stock-month observations, the cross-sectional mean, standard deviation, and the 1st, 25th, 50th, 75th, and 99th percentiles of monthly stock returns, dividends per share, net income, sales, and market capitalization. Stock returns and market capitalization are at the monthly frequency. Dividends per share, net income, and sales are at the annual frequency.
Variable N Mean SD 1% 25% 50% 75% 99%
Stock return (%) 360708 1.016 15.214 -35.246 -5.352 0.690 6.513 44.782
Dividend / share ($) 36895 0.56 1.50 0.00 0.00 0.04 0.74 4.50
Net income ($000’s) 36981 333213 1994161 -1014400 -4655 24444 158027 6955800
Sales ($000’s) 36981 4743031 18796037 0 106712 591625 2540873 83735800
Market cap ($000’s) 360708 7887159 34602870 9378 213765 1005630 4096619 138363550
Table 2: R2OS of stock return forecasts
This table reports the proportional reduction in mean squared forecast error (R2OS) for various predictive models (the linear predictive model or machine learning models) that use three sets of predictors vs. the naive benchmark model that assumes a value of zero.
Predictive models are trained using 20 years (240 months) of data from Jan 1971 to Dec 1990 and the testing period is from Jan 1991 to Jan 2021. We evaluate the significance of R2OS using the MSPE-adjusted statistic of [14]. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
Model Set 1 Set 2 All variables
OLS -1.6853 -2.1443 -4.223
FFN 1-layer -5.1273 0.5993 -1.3743
FFN 2-layers -0.7453 -1.3593 -3.256
FFN 3-layers 0.4763 -0.03 -1.1093
XGBoost -1.6023 0.6293 -0.6043
LightGBM -0.793 0.793 -0.2673
RF 0.2773 0.7853 0.7163
GRU 0.1913 0.4823 -0.9483
LSTM -1.0543 -0.0363 -0.0663
Combination -0.23 0.8323 0.023
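For reference, the out-of-sample R² against a zero benchmark and the MSPE-adjusted test used in this table take the standard forms from this literature; the notation below is ours, added for clarity, since the formulas are not reproduced in the caption:

$$R^2_{OS} = 1 - \frac{\sum_{j,t}\left(r_{j,t+1}-\hat{r}_{j,t+1}\right)^2}{\sum_{j,t} r_{j,t+1}^2},$$

where $\hat{r}_{j,t+1}$ is a model's forecast and the naive benchmark forecast is zero. The Clark and West [14] MSPE-adjusted statistic is the t-statistic from regressing

$$\hat{f}_{t+1} = r_{t+1}^2 - \left[\left(r_{t+1}-\hat{r}_{t+1}\right)^2 - \hat{r}_{t+1}^2\right]$$

on a constant, with a one-sided test of whether its mean is positive.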
Table 3: Return of zero-investment portfolios
This table reports the returns of the zero-investment portfolios sorted by stocks' expected returns over the one-month holding horizon. We forecast an individual stock's expected return using the linear predictive model and various machine learning models. The predictors consist of 127 technical indicators and 94 stock/firm characteristics. For the machine learning models, we train the models using the data from Jan 1971 to Dec 1990. The zero-investment portfolio is formed by going long the stocks with the highest 10% expected returns and shorting the stocks with the lowest 10% expected returns. The portfolios are equally weighted and rebalanced each month. We also report the standard deviation of the returns and the p-value for the two-sample t-test with the null hypothesis (H0): RML − RLR ≤ 0, where RML is the return of the zero-investment portfolio from the machine learning models and RLR denotes the return from the linear predictive model. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The testing period is from Jan 1991 to Dec 2020.
Set 1 Set 2 All
Model Return std dev p-value Return std dev p-value Return std dev p-value
OLS 0.801 4.943 0.672 3.328 0.886 4.270
FFN 1-layer 0.652 6.294 0.638 1.875 4.965 0.000 2.130 6.233 0.001
FFN 2-layers 0.880 6.387 0.426 2.045 5.049 0.000 1.437 6.321 0.085
FFN 3-layers 0.979 6.501 0.340 1.745 5.562 0.001 1.728 6.099 0.016
XGBoost 1.762 6.405 0.012 2.182 5.293 0.000 2.815 6.699 0.000
LightGBM 1.795 5.864 0.007 2.248 5.650 0.000 2.782 5.925 0.000
RF 0.655 5.033 0.654 2.143 5.029 0.000 2.276 6.132 0.000
GRU 0.504 5.188 0.784 1.441 4.127 0.003 1.073 5.069 0.296
LSTM 1.531 4.319 0.017 1.444 4.488 0.004 1.744 5.882 0.013
Combination 2.029 6.445 0.002 2.330 5.795 0.000 2.639 6.924 0.000
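As a complement to the portfolio description in the Table 3 caption, here is a minimal sketch of the monthly decile sort and the one-sided test against the OLS portfolio. The DataFrame layout and column names ('month', 'forecast', 'ret_next') are assumptions, and scipy is used only for the illustrative t-test.

```python
import pandas as pd
from scipy import stats

def high_minus_low_returns(df):
    """Equal-weighted H-L decile return each month, rebalanced monthly.

    `df` holds one row per stock-month with columns 'month', 'forecast'
    (the model's expected return) and 'ret_next' (the realized return over
    the holding month); the names are placeholders.
    """
    def one_month(g):
        lo, hi = g["forecast"].quantile([0.10, 0.90])
        long_leg = g.loc[g["forecast"] >= hi, "ret_next"].mean()
        short_leg = g.loc[g["forecast"] <= lo, "ret_next"].mean()
        return long_leg - short_leg

    return df.groupby("month").apply(one_month)

# One-sided two-sample t-test of H0: R_ML - R_LR <= 0 (illustrative):
# r_ml, r_lr = high_minus_low_returns(ml_df), high_minus_low_returns(ols_df)
# t, p = stats.ttest_ind(r_ml, r_lr, alternative="greater")
```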
Table 4: Cumulative returns of the zero-investment portfolios
This table reports the cumulative returns of the zero-investment portfolios in the testing period.
A zero-investment portfolio goes long the stocks with the highest 10% expected returns and shorts the stocks with the lowest 10% expected returns. Expected returns are from various predictive models (the linear predictive model or the machine learning models) and three sets of predictors. Predictive models are trained using the data from July 2005 to June 2010 and the testing period is from July 2010 to August 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Model Set 1 Set 2 All
OLS 11.132 9.125 16.907
FFN 1-layer 4.549 543.127 919.769
FFN 2-layers 10.290 983.733 75.423
FFN 3-layers 13.786 306.943 228.578
XGBoost 258.595 1517.625 10184.239
LightGBM 315.734 1815.581 10552.787
RF 6.500 1404.060 1711.734
GRU 1-layer 3.594 131.279 27.303
LSTM 1-layer 172.670 124.283 260.535
Combination 615.107 2339.420 4442.614
Table 5: Alphas of the zero-investment portfolios
This table reports alphas from four factor models: (1) the Fama and French (2015) 5-factor model (FF-5); (2) the Hou, Xue, and Zhang (2015) 4-factor model (HXZ-4); (3) the Stambaugh and Yuan (2016) mispricing-factor model (SY-4); and (4) the Daniel, Hirshleifer, and Sun (2020) behavioral-factor model (DHS-3). Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
FF-5 HXZ-4 SY-4 DHS-3
Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 0.9393 0.5963 0.8653 0.7471 0.6263 0.4913 1.2383 0.9833 1.0533 0.913 0.8413 0.7753
FFN 1-lyr 2.1893 1.7083 0.843 2.043 2.0823 0.4433 2.8563 2.4093 1.0063 2.3493 2.1333 0.7043
FFN 2-lyr 1.6623 1.9413 1.0283 1.2291 2.1033 0.743 1.9113 2.583 1.3353 1.5953 2.3363 0.9063
FFN 3-lyr 2.0083 1.5923 1.2133 1.5031 1.8693 0.9963 2.1843 2.4773 1.5793 1.8493 2.1083 1.1923
XGBoost 2.5863 2.0873 1.6493 3.0553 2.2893 2.1323 3.8593 2.9293 2.813 3.1723 2.4853 2.1533
LightGBM 2.6483 2.1463 1.7973 2.83 2.3483 1.9113 3.5973 2.943 2.6873 3.0113 2.5133 2.093
RF 2.0733 2.0633 0.6773 2.3533 2.2713 0.9383 3.0993 2.8123 1.4083 2.4463 2.3943 0.7763
GRU 1.0072 1.2463 0.5973 0.8021 1.343 0.4113 1.4853 1.6853 0.8033 1.123 1.5183 0.73
LSTM 1.3483 1.1733 1.463 1.5823 1.3843 1.6173 2.2583 1.593 1.9763 1.8953 1.3543 1.7483
Combination 2.6393 2.1943 2.0943 2.5313 2.4373 2.1953 3.4963 3.0153 2.9353 2.8933 2.6253 2.3383
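The alphas above are the intercepts of time-series regressions of the monthly zero-investment portfolio returns on each set of factor returns. A minimal sketch using statsmodels follows; the factor data source is left unspecified and the interface is an assumption, not the paper's code.

```python
import pandas as pd
import statsmodels.api as sm

def portfolio_alpha(portfolio_returns: pd.Series, factors: pd.DataFrame) -> float:
    """Intercept (alpha) from regressing monthly portfolio returns on factor
    returns, e.g. the five Fama-French factors for the FF-5 alpha."""
    X = sm.add_constant(factors)
    result = sm.OLS(portfolio_returns, X, missing="drop").fit()
    return result.params["const"]
```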
Table 6: Cross-sectional regressions
This table reports the results of cross-sectional regressions of monthly returns of individual stocks on the expected return predicted by all 221 variables, the 127 technical indicators, and the 94 stock characteristic variables, respectively:

$$r_{j,t+1} = z_0 + z_1 E_t[r_{j,t+1}] + \sum_{k=1}^{m} f_k B_{j,kt} + \varepsilon_{j,t+1}, \qquad (6)$$

where $E_t[r_{j,t+1}]$ is the future (month t+1) return of stock j forecast by the independent variables in month t, and $B_{j,kt}$, k = 1, ..., m, are stock characteristic variables. The regression is a Fama-MacBeth cross-sectional regression. We consider four models that use different stock characteristics in the regression: (1) dividend to price, earnings to price, sales to price, cash flow to price ratio, the moving average price over the last six months ($MA_{t-1,6}$) and the moving average price over the last four years ($MA_{t-1,48}$); (2) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, the moving average return over the last six months ($MA^{ret}_{t-1,6}$) and the moving average return over the last four years ($MA^{ret}_{t-1,48}$); (3) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, return volatility and earnings volatility; (4) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, return volatility, earnings volatility, return on assets, return on equity and return on invested capital. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
Model (1) Model (2) Model (3) Model (4)
Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 0.1363 0.0773 0.0923 0.2643 0.2293 0.2023 0.2223 0.2223 0.2223 0.283 0.2233 0.2043
FFN 1-lyr 0.2633 0.4003 0.2993 0.6683 1.2493 0.493 1.2253 1.2253 1.2253 0.793 1.2283 0.5163
FFN 2-lyr 0.2653 0.5893 0.2093 0.8423 1.0813 -0.031 1.0413 1.0413 1.0413 0.9623 1.0443 -0.044
FFN 3-lyr 1.6013 0.3353 0.2863 6.1043 0.8613 -0.131 0.8453 0.8453 0.8453 6.5463 0.8473 -0.167
XGBoost 0.1953 0.4043 0.3223 0.893 0.9433 0.9883 0.9273 0.9273 0.9273 0.8783 0.9263 0.9793
LightGBM 0.3523 0.7933 0.4913 0.9673 2.0343 0.9663 1.9893 1.9893 1.9893 1.0113 1.9883 0.9793
RF 0.2133 0.8453 1.6773 0.9273 2.2053 4.9753 2.1663 2.1663 2.1663 0.8483 2.1623 4.8723
GRU 0.2423 0.3653 0.1773 0.6353 1.2453 0.5993 1.243 1.243 1.243 0.7023 1.2393 0.7213
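Equation (6) is estimated month by month and the reported coefficients are time-series averages of the monthly slopes. Below is a minimal Fama-MacBeth sketch with plain (non-Newey-West) standard errors; the column names are hypothetical and the code is illustrative rather than the paper's implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fama_macbeth(df, y, regressors):
    """Run a cross-sectional OLS each month and average the slopes over time.

    `df` has columns 'month', the dependent variable `y` (next-month return)
    and the columns in `regressors` (the expected-return forecast plus the
    characteristics in equation (6)); all names are placeholders.
    """
    monthly_params = []
    for _, group in df.groupby("month"):
        X = sm.add_constant(group[regressors])
        monthly_params.append(sm.OLS(group[y], X, missing="drop").fit().params)
    params = pd.DataFrame(monthly_params)
    t_stats = params.mean() / (params.std(ddof=1) / np.sqrt(len(params)))
    return pd.DataFrame({"coef": params.mean(), "t_stat": t_stats})
```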
Table 7: Statistics of zero-investment portfolios
Panel A reports the Sharpe ratios of the zero-investment portfolios. A zero-investment portfolio goes long the stocks with the highest 10% expected returns and shorts the stocks with the lowest 10% expected returns. Panel B reports the maximum drawdown of the zero-investment portfolios. All panels are for models trained on the first 20 years (240 months) of data from Jan 1971 to Dec 1990, and the testing period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Panel A
Model Set 1 Set 2 All
OLS 0.162 0.202 0.208
FFN 1-layer 0.104 0.378 0.342
FFN 2-layers 0.138 0.405 0.227
FFN 3-layers 0.151 0.314 0.283
XGBoost 0.275 0.412 0.420
LightGBM 0.306 0.398 0.470
RF 0.130 0.426 0.371
GRU 1-layer 0.097 0.349 0.212
LSTM 1-layer 0.355 0.322 0.296
Combination 0.315 0.402 0.381
Panel B
Model Set 1 Set 2 All
OLS -57.409 -37.172 -51.305
FFN 1-layer -70.946 -27.543 -64.705
FFN 2-layers -71.122 -30.724 -76.460
FFN 3-layers -71.182 -31.589 -73.539
XGBoost -65.665 -40.301 -61.709
LightGBM -66.692 -41.378 -60.108
RF -65.626 -27.292 -60.593
GRU 1-layer -68.938 -21.139 -66.615
LSTM 1-layer -44.681 -37.371 -66.798
Combination -72.406 -37.339 -70.485
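For completeness, one common way to compute the two statistics in Table 7 is sketched below: a monthly, non-annualized Sharpe ratio and a drawdown on compounded wealth. These conventions are assumptions, since the caption does not spell them out.

```python
import pandas as pd

def sharpe_ratio(monthly_returns: pd.Series) -> float:
    """Mean over standard deviation of monthly H-L returns (a self-financing
    portfolio, so no risk-free rate is subtracted here)."""
    return monthly_returns.mean() / monthly_returns.std(ddof=1)

def max_drawdown(monthly_returns_pct: pd.Series) -> float:
    """Largest peak-to-trough decline of compounded wealth, in percent."""
    wealth = (1 + monthly_returns_pct / 100).cumprod()
    drawdown = wealth / wealth.cummax() - 1
    return 100 * drawdown.min()
```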
Table 8: Transaction costs of decile-sorted portfolios
Panel A reports the turnover ratios of the decile-sorted portfolios (H-L). Panel B reports the corresponding break-even transaction costs (BETCs). The zero-return BETCs are the transaction costs that completely offset the returns. The insignificance BETCs are the costs that make the returns of the H-L portfolios insignificantly different from zero at the 5% level. We report the turnover rates of the High and Low portfolios and of the H-L portfolio that longs the High and shorts the Low decile-sorted portfolios. Both panels are for models trained on the first 20 years (240 months) of data from Jan 1971 to Dec 1990, and the testing period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Panel A: turnover ratio
Set 1 Set 2 All
Model H (%) L(%) H-L(%) H(%) L(%) H-L(%) H(%) L(%) H-L(%)
OLS 16.900 16.902 33.803 23.596 23.601 47.197 21.379 21.378 42.757
FFN 1-layer 17.904 17.889 35.793 27.069 27.063 54.132 24.138 24.124 48.262
FFN 2-layers 17.294 17.275 34.569 27.772 27.757 55.529 25.211 25.203 50.414
FFN 3-layers 18.299 18.283 36.582 26.003 25.996 52.000 24.830 24.823 49.653
XGBoost 29.398 29.419 58.817 31.431 31.442 62.873 28.156 28.174 56.330
LightGBM 27.228 27.232 54.459 30.369 30.366 60.735 28.059 28.061 56.121
RF 22.169 22.180 44.349 32.022 32.033 64.055 21.518 21.512 43.030
GRU 1-layer 26.544 26.547 53.091 28.459 28.467 56.926 28.802 28.832 57.634
LSTM 1-layer 26.713 26.700 53.413 17.637 17.610 35.248 27.894 27.909 55.803
Combination 26.035 26.031 52.066 27.820 27.807 55.627 26.295 26.291 52.585
Panel B: BETCs
Zero-return BETCs (%) Insignificance BETCs (%)
Model Set 1 Set 2 All Set 1 Set 2 All
OLS 2.370 1.424 2.073 0.862 0.697 1.043
FFN 1-layer 1.822 3.463 4.414 0.008 2.517 3.081
FFN 2-layers 2.547 3.682 2.851 0.641 2.744 1.557
FFN 3-layers 2.676 3.355 3.481 0.843 2.251 2.214
XGBoost 2.995 3.471 4.998 1.872 2.603 3.771
LightGBM 3.295 3.701 4.957 2.185 2.741 3.868
RF 1.476 3.345 5.289 0.305 2.535 3.819
GRU 1-layer 0.950 2.531 1.862 -0.058 1.783 0.955
LSTM 1-layer 2.866 4.097 3.125 2.032 2.784 2.037
Combination 3.898 4.188 5.018 2.621 3.113 3.660
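The zero-return BETC in Panel B can be read as the per-unit-of-turnover cost that exactly offsets the portfolio's mean return. This reading is an inference from the reported numbers rather than a formula stated in the paper; a small sketch, checked against the OLS Set 1 figures (0.801% mean return in Table 3 and 33.803% turnover in Panel A):

```python
def zero_return_betc(mean_return_pct: float, mean_turnover_pct: float) -> float:
    """Transaction cost (in % per unit of turnover) at which the H-L
    portfolio's mean monthly return is fully consumed by trading costs."""
    return mean_return_pct / (mean_turnover_pct / 100.0)

# Example: OLS, Set 1 — 0.801% mean return with 33.803% monthly turnover
# implies a zero-return BETC of roughly 2.37%, matching Panel B.
print(zero_return_betc(0.801, 33.803))  # ~2.370
```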
Table 9: Variable importance for all models
This table reports the top 5 most important variables for linear regression (OLS), feedforward neural networks with 1, 2 and 3 hidden layers (FFN 1-layer, FFN 2-layers and FFN 3-layers), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM) and Random Forest (RF), using the SHAP feature importance methodology.
Panel A
OLS FFN 1-lyr FFN 2-lyr FFN 3-lyr XGBoost LightGBM RF
EMA 27 WILLR 15 securedind securedind ep ep ep
RSI 18 WILLR 24 ROC 21 RSI 1 ROC 3 ROC 21 sp
RSI 15 ROC 21 WILLR 15 dolvol ROC 21 ROC 3 bm
EMA 30 dolvol ROC 3 ROC 21 secured indmom cashpr
RSI 21 securedind dolvol WILLR 15 indmom secured ROC 3
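A minimal sketch of how the SHAP global importance ranking in Table 9 could be produced for the tree-based models, using the shap package of [27, 28]; the model and data objects are placeholders, and the neural networks would require a different explainer (e.g., shap.DeepExplainer).

```python
import numpy as np
import shap  # SHAP package accompanying [27] and [28]

def top_shap_features(tree_model, X, feature_names, k=5):
    """Rank features by mean absolute SHAP value over the sample.

    Works for tree ensembles such as XGBoost, LightGBM and Random Forest;
    `tree_model` and `X` are placeholders for a fitted model and its
    predictor matrix.
    """
    explainer = shap.TreeExplainer(tree_model)
    shap_values = explainer.shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)
    top = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in top]
```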
Table 10: Results of high and low market value stocks
This table reports the results for high and low market value stocks. Panel A reports the proportional reduction in mean squared forecast error (R2OS) for various predictive models (the linear predictive model or machine learning models) that use three sets of predictors vs. the naive benchmark model that assumes a value of zero. Panel B reports the returns and the p-value for the two-sample t-test with the null hypothesis (H0): RML − RLR ≤ 0, where RML is the return of the zero-investment portfolio from the machine learning models and RLR denotes the return from the linear predictive model. Panel C reports alphas from regressions on the Stambaugh and Yuan (2016) mispricing-factor model (SY-4). Panel D reports the results of Fama-MacBeth regressions including the control variables of bond size, age, time to maturity, coupon rate, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, IRC, Amihud illiquidity, and the four bond factor betas of [3]. Predictive models are trained on the data from Jan 1971 to Dec 1990. The out-of-sample period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 1%, 5% and 10% level, respectively.
Panel A: R2OS
High mkt val stocks Low mkt val stocks
Model Set 1 Set 2 All Set 1 Set 2 All
OLS -1.7513 -5.0623 -7.305 -3.8493 -1.4853 -5.4313
FFN 1-layer -1.2193 -0.6143 -2.8313 -4.1043 -1.4753 -3.388
FFN 2-layers 0.6373 -0.1843 -3.1683 -1.43 1.0793 -3.0073
FFN 3-layers 0.5933 0.4053 0.7173 -3.0883 0.7553 -1.3532
XGBoost -0.4943 -33.6653 -0.2333 -1.5493 0.6533 -5.7083
LightGBM 0.6723 0.8543 0.1893 -0.0923 0.3233 -0.2623
RF 0.7543 0.7783 0.783 0.4573 0.4013 0.1693
GRU -1.8643 0.5453 -0.6523 0.3813 0.4573 0.5283
LSTM -2.8313 0.0123 0.4573 0.3833 0.4453 0.4673
Panel B: Portfolio analysis
High mkt val stocks Low mkt val stocks
Set 1 Set 2 All Set 1 Set 2 All
Model Return p-value Return p-value Return p-value Return p-value Return p-value Return p-value
OLS 0.899 0.404 0.846 2.019 1.592 1.857
FFN 1-layer 0.932 0.455 1.339 0.003 1.860 0.001 2.401 0.226 3.920 0.000 3.621 0.001
FFN 2-layers 0.816 0.609 1.271 0.005 1.893 0.001 2.755 0.101 4.038 0.000 4.511 0.000
FFN 3-layers 1.159 0.224 1.077 0.011 0.858 0.487 3.083 0.027 3.298 0.001 1.301 0.813
XGBoost 1.302 0.144 1.733 0.000 1.954 0.002 3.867 0.001 3.705 0.000 4.763 0.000
LightGBM 0.928 0.464 1.836 0.000 1.682 0.007 4.507 0.000 3.687 0.000 5.014 0.000
RF 1.038 0.350 1.279 0.001 1.310 0.084 4.544 0.000 3.247 0.003 4.759 0.000
GRU 0.965 0.415 1.013 0.017 0.842 0.506 1.266 0.917 1.985 0.226 0.603 0.986
LSTM 0.993 0.377 0.893 0.034 1.395 0.033 0.700 0.996 1.930 0.268 1.667 0.647
Combination 1.416 0.068 1.595 0.000 2.317 0.000 4.236 0.000 4.099 0.000 4.647 0.000
Panel C: alpha Panel D: Fama-MacBeth regression
High mkt val stocks Low mkt val stocks High mkt val stocks Low mkt val stocks
Model Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 1.2253 0.502 1.3723 -0.041 0.3101 -0.050 0.4133 0.1843 0.4063 0.1353 0.029 0.1543
Lasso 1.2233 0.387 1.3533 0.079 0.5263 0.149 0.7833 0.4943 0.8213 0.1823 0.2483 0.3172
Enet 1.2303 0.291 1.3493 0.089 0.4993 0.269 0.7683 0.4813 0.8333 0.1803 0.2103 0.3643
FFN 1-lyr 1.2163 1.1052 1.2593 0.9793 0.5983 0.6211 0.7113 0.5423 0.6173 0.1853 0.0872 0.3123
FFN 2-lyr 1.2813 1.0493 1.2813 0.698 0.5583 0.719
FFN 3-lyr 1.1763 0.815 1.2413 -0.006 0.6743 0.477 0.7133 0.5483 0.9803 0.2903 0.0912 0.3033
XGBoost 1.3303 1.2013 1.4763 0.770 0.5783 0.375 0.8053 0.3583 0.9263 0.3001 0.0982 0.2212
LightGBM 1.2853 0.635 1.4453 0.296 0.6743 0.174 0.9013 0.4703 1.0103 0.223 0.1332 0.1771
RF 1.3413 0.123 1.2923 -0.277 0.6363 0.174 0.9573 0.2662 0.8513 -0.187 0.0671 0.2192
RNN 1.2293 0.411 1.3413 0.486 0.5803 0.306 0.8593 0.7793 0.0183 0.2153 0.0953 0.1312
GRU 1.0123 0.632 1.1583 0.604 0.5172 0.033 0.2993 0.2233 0.1663 0.1171 0.0862 1.1021
LSTM 0.7383 0.201 1.0443 0.624 0.6513 0.532 0.2193 0.4703 0.2963 0.2653 0.1813 0.171
Combination 1.4463 0.785 1.5183 0.355 0.7543 0.491 1.1053 0.9243 1.0483 0.5313 0.2073 0.4413