The previous results are based on machine learning models trained on the first 240 months of data. This setting does not allow us to update the models when new information arrives, which might affect their performance. To study whether our results are robust to this concern, we adopt a different training process that updates the models regularly.9 In particular, we train the models from Jan 1971 to Dec 1990 and test them over the 10-year period from Jan 1991 to Dec 2000.
We then retrain the models from Jan 1981 to Dec 2000 and test them for another ten years, from Jan 2001 to Dec 2010. Finally, we train the models from Jan 1991 to Dec 2010 and test them for the remaining period.
We want to investigate whether a rolling training process leads to better results than a fixed training process, since a rolling training process can uncover insights and patterns from the most recent data.
9The best way is to use a rolling method that updates the models every month. However, this is extremely computationally intensive, so we could not implement it. We do not consider RNN, GRU, and LSTM for the same reason.
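To make the rolling scheme above concrete, the following is a minimal sketch of the retraining loop, assuming a monthly stock-level panel indexed by date. The column names, the fit_model/predict callables, and the end date of the final test window (taken here to be Dec 2020) are illustrative assumptions rather than the paper's actual implementation.

```python
import pandas as pd

# Train/test windows of the rolling scheme described in the text.
# The end of the last test window is assumed here to be Dec 2020.
WINDOWS = [
    ("1971-01", "1990-12", "1991-01", "2000-12"),
    ("1981-01", "2000-12", "2001-01", "2010-12"),
    ("1991-01", "2010-12", "2011-01", "2020-12"),
]

def rolling_forecasts(panel, features, fit_model, predict):
    """Retrain on each 20-year window and forecast the following decade.

    `panel` is a DataFrame indexed by month with the predictor columns in
    `features` and the next-month return in 'ret_next' (hypothetical names).
    """
    pieces = []
    for train_start, train_end, test_start, test_end in WINDOWS:
        train = panel.loc[train_start:train_end]
        test = panel.loc[test_start:test_end].copy()
        model = fit_model(train[features], train["ret_next"])
        test["forecast"] = predict(model, test[features])
        pieces.append(test)
    return pd.concat(pieces)
```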
Panel A of Table 13 reports the R2OS. As can be seen, most of the R2OS results are significantly better than the previous results that use a fixed training process (training on the first 240 months of data). This highlights the advantage of continually retraining the model to take advantage of the most recent training data. However, unlike the previous results for training on all 221 variables and on the 127 technical indicators, in which all machine learning models significantly outperform OLS, here two machine learning models fail to outperform OLS both when training on all 221 variables and when training on the 127 technical indicators.
Panel B of Table 13 reports the returns of the zero-investment portfolios. As can be seen, most of the mean return results are significantly better than the previous results that use a fixed training process. However, similar to the previous results, most machine learning models trained on the 94 stock characteristics do not outperform OLS (except XGBoost, LightGBM, and Combination, which are statistically significant at the 1% level). For training on all 221 variables and on the 127 technical indicators, all machine learning models outperform OLS.
[Insert Table 13 about here]
6 Conclusion
In this paper, we conduct a comprehensive study of machine learning models on the cross-section of stock returns. We study whether various machine learning models can forecast stock returns. We also use a large set of predictors to implement the machine learning models. These predictors contain both 94 stock characteristic variables and 127 technical indicators.
Empirical results show that, relative to the OLS model, machine learning models substantially improve the forecasting performance of various predictors. The improvements are both statistically and economically significant. These findings are consistent with [18]. Results for different sets of predictors indicate that technical indicators have stronger predictive power than the bond characteristics used in [18].
We also study whether stock characteristic variables offer an incremental contribution to stock return forecasts. When we use all stocks in the machine learning models, we do not find an incremental contribution from the stock characteristic variables.
Results for the rolling training process highlight the advantage of continually retraining the model to take advantage of the most recent training data, as the retrained models deliver significantly better R2OS and zero-investment portfolio returns.
Our study contains several new findings. We document the importance of technical indicators above and beyond the stock characteristic variables used in [4]. It is important to include technical indicators as a source of information about the stock market.
We show that a wide range of hyperparameter tuning is required for some machine learning models to achieve optimal performance. This finding deepens our understanding of how to implement machine learning models to enhance learning performance and reduce overfitting.
References
[1] Yaser S. Abu-Mostafa and Amir F. Atiya. Introduction to financial forecasting. Applied Intelligence, 6:205–213, 1996.
[2] Dana Angluin and Philip Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
[3] Jennie Bai, Turan G Bali, and Quan Wen. Common risk factors in the cross-section of corporate bond returns. Journal of Financial Economics, 131(3):619–642, 2019.
[4] Turan G Bali, Amit Goyal, Dashan Huang, Fuwei Jiang, and Quan Wen. The cross-sectional pricing of corporate bonds using big data and machine learning. (20-110), Sep 2020.
[5] Jack Bao, Jun Pan, and Jiang Wang. The illiquidity of corporate bonds. Journal of Finance, 66(3):911–946, 2011.
[6] Parth Bhavsar, Ilya Safro, Nidhal Bouaynaya, Robi Polikar, and Dimah Dera. Machine learning in transportation data analytics. Data Analytics for Intelligent Transportation Systems, pages 283–307, 12 2017.
[7] William Brock, Josef Lakonishok, and Blake LeBaron. Simple technical trading rules and the stochastic properties of stock returns. Journal of Finance, 47(5):1731–1764, 1992.
[8] David P Brown and Robert H Jennings. On technical analysis. Review of Financial Studies, 2(4):527–551, 1989.
[9] Bryan T. Kelly, Seth Pruitt, and Yinan Su. Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, 134(3):501–524, 2019.
[10] Li-Juan Cao and Francis Eng Hock Tay. Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on Neural Networks, 14(6):1506–1518, 2003.
[11] Giovanni Cespa and Xavier Vives. Dynamic trading and asset prices: Keynes vs. Hayek. The Review of Economic Studies, 79(2):539–580, 2012.
[12] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014.
[13] Fu-Lai Chung, Tak-Chung Fu, Vincent Ng, and Robert Luk. An evolutionary approach to pattern-based time series segmentation. IEEE Transactions on Evolutionary Computation, 8:471–489, 2004.
[14] Todd E Clark and Kenneth D West. Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1):291–311, 2007.
[15] Amy K Edwards, Lawrence E Harris, and Michael S Piwowar. Corporate bond market transaction costs and transparency. Journal of Finance, 62(3):1421–1451, 2007.
[16] Florentin Butaru, Qingqing Chen, Brian Clark, Sanmay Das, Andrew W. Lo, and Akhtar Siddique. Risk and risk management in the credit card industry. Journal of Banking & Finance, 72:218–239, 2016.
[17] Bruce D Grundy and J Spencer Martin. Understanding the nature of the risks and the source of the rewards to momentum investing. The Review of Financial Studies, 14(1):29–78, 2001.
[18] Shihao Gu, Bryan Kelly, and Dacheng Xiu. Empirical asset pricing via machine learning. The Review of Financial Studies, 33(5):2223–2273, 2020.
[19] Xu Guo, Hai Lin, Chunchi Wu, and Guofu Zhou. Predictive information in corporate bond yields. Journal of Financial Markets, 59:100687, 2022.
[20] Erkam Guresen, Gulgun Kayakutlu, and Tugrul U Daim. Using artificial neural network models in stock market index prediction. Expert Systems with Applications, 38(8):10389–10397, 2011.
[21] Tin Kam Ho. Random decision forests. In Proceedings of 3rd International Conference on Document Analysis and Recognition, volume 1, pages 278–282. IEEE, 1995.
[22] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[23] Zhongsheng Hua, Yu Wang, Xiaoyan Xu, Bin Zhang, and Liang Liang. Predicting corporate financial distress based on integration of support vector machine and logistic regression. Expert Systems with Applications, 33(2):434–440, 2007.
[24] Kyoung-jae Kim. Financial time series forecasting using support vector machines. Neurocomputing, 55(1):307–319, 2003.
[25] Shuaiqiang Liu, Cornelis Oosterlee, and Sander Bohte. Pricing options and computing implied volatilities using neural networks. Risks, 7:16, 02 2019.
[26] Andrew W Lo, Harry Mamaysky, and Jiang Wang. Foundations of technical analysis: Computational algorithms, statistical inference, and empirical implementation. Journal of Finance, 55(4):1705–1765, 2000.
[27] Scott M Lundberg, Gabriel G Erion, and Su-In Lee. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888, 2018.
[28] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4768–4777, 2017.
[29] Christopher J Neely, David E Rapach, Jun Tu, and Guofu Zhou. Forecasting the equity risk premium: the role of technical indicators. Management Science, 60(7):1772–1791, 2014.
[30] David Nelson, Adriano Pereira, and Renato de Oliveira. Stock market's price movement prediction with LSTM neural networks. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1419–1426, 2017.
[31] Mingyue Qiu and Yu Song. Predicting the direction of stock market index movement using an optimized artificial neural network model. PLOS ONE, 11:e0155133, 2016.
[32] David E Rapach, Jack K Strauss, and Guofu Zhou. Out-of-sample equity premium prediction: Combination forecasts and links to the real economy. The Review of Financial Studies, 23(2):821–862, 2010.
[33] David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986.
[34] Fadil Santosa and William W Symes. Linear inversion of band-limited reflection seismograms. SIAM Journal on Scientific and Statistical Computing, 7(4):1307–1330, 1986.
[35] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
[36] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2015.
[37] Daniel L Thornton and Giorgio Valente. Out-of-sample predictions of bond excess returns and forward rates: An asset allocation perspective. The Review of Financial Studies, 25(10):3141–3168, 2012.
[38] Jack L Treynor and Robert Ferguson. In defense of technical analysis. Journal of Finance, 40(3):757–773, 1985.
[39] Xu Guo, Hai Lin, Chunchi Wu, and Guofu Zhou. Predictive information in corporate bond yields. Journal of Financial Markets, page 100687, 2021.
[40] Yakup Kara, Melek Acar Boyacioglu, and Ömer Kaan Baykan. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Systems with Applications, 38(5):5311–5319, 2011.
[41] Yauheniya Shynkevich, T.M. McGinnity, Sonya A. Coleman, Ammar Belatreche, and Yuhua Li. Forecasting price movements using technical indicators: Investigating the impact of varying input window length. Neurocomputing, 264:71–88, 2017.
[42] Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320, 2005.
Table 1: Summary statistics
This table reports the number of stock-month observations, the cross-sectional mean, standard deviation, and the 1st, 25th, 50th, 75th, and 99th percentiles of monthly stock returns, dividends per share, net income, sales, and market capitalization. Stock returns and market capitalization are at the monthly frequency. Dividends per share, net income, and sales are at the annual frequency.
Variable N Mean SD 1% 25% 50% 75% 99%
Stock return (%) 360708 1.016 15.214 -35.246 -5.352 0.690 6.513 44.782
Dividend / share ($) 36895 0.56 1.50 0.00 0.00 0.04 0.74 4.50
Net income ($000’s) 36981 333213 1994161 -1014400 -4655 24444 158027 6955800
Sales ($000’s) 36981 4743031 18796037 0 106712 591625 2540873 83735800
Market cap ($000’s) 360708 7887159 34602870 9378 213765 1005630 4096619 138363550
Table 2: R2OS of stock return forecasts
This table reports the proportional reduction in mean squared forecast error (R2OS) for various predictive models (the linear predictive model or machine learning models) that use three sets of predictors vs. the naive benchmark model that assumes a value of zero.
Predictive models are trained using 20 years (240 months) of data from Jan 1971 to Dec 1990 and the testing period is from Jan 1991 to Jan 2021. We evaluate the significance of R2OS using the MSPE-adjusted statistic of [14]. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
Model Set 1 Set 2 All variables
OLS -1.6853 -2.1443 -4.223
FFN 1-layer -5.1273 0.5993 -1.3743
FFN 2-layers -0.7453 -1.3593 -3.256
FFN 3-layers 0.4763 -0.03 -1.1093
XGBoost -1.6023 0.6293 -0.6043
LightGBM -0.793 0.793 -0.2673
RF 0.2773 0.7853 0.7163
GRU 0.1913 0.4823 -0.9483
LSTM -1.0543 -0.0363 -0.0663
Combination -0.23 0.8323 0.023
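For reference, the out-of-sample R² against a zero benchmark and the MSPE-adjusted test used in this table take the standard forms from this literature; the notation below is ours, added for clarity, since the formulas are not reproduced in the caption:

$$R^2_{OS} = 1 - \frac{\sum_{j,t}\left(r_{j,t+1}-\hat{r}_{j,t+1}\right)^2}{\sum_{j,t} r_{j,t+1}^2},$$

where $\hat{r}_{j,t+1}$ is a model's forecast and the naive benchmark forecast is zero. The Clark and West [14] MSPE-adjusted statistic is the t-statistic from regressing

$$\hat{f}_{t+1} = r_{t+1}^2 - \left[\left(r_{t+1}-\hat{r}_{t+1}\right)^2 - \hat{r}_{t+1}^2\right]$$

on a constant, with a one-sided test of whether its mean is positive.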
Table 3: Return of zero-investment portfolios
This table reports the returns of the zero-investment portfolios sorted by stocks' expected returns over the one-month holding horizon. We forecast an individual stock's expected return using the linear predictive model and various machine learning models. The predictors consist of 127 technical indicators and 94 stock/firm characteristics. For the machine learning models, we train the models using the data from Jan 1971 to Dec 1990. The zero-investment portfolio is formed by going long the stocks with the highest 10% expected returns and shorting the stocks with the lowest 10% expected returns. The portfolios are equally weighted and rebalanced each month. We also report the standard deviation of the returns and the p-value for the two-sample t-test with the null hypothesis (H0): RML − RLR ≤ 0, where RML is the return of the zero-investment portfolio from the machine learning models and RLR denotes the return from the linear predictive model. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The testing period is from Jan 1991 to Dec 2020.
Set 1 Set 2 All
Model Return std dev p-value Return std dev p-value Return std dev p-value
OLS 0.801 4.943 0.672 3.328 0.886 4.270
FFN 1-layer 0.652 6.294 0.638 1.875 4.965 0.000 2.130 6.233 0.001
FFN 2-layers 0.880 6.387 0.426 2.045 5.049 0.000 1.437 6.321 0.085
FFN 3-layers 0.979 6.501 0.340 1.745 5.562 0.001 1.728 6.099 0.016
XGBoost 1.762 6.405 0.012 2.182 5.293 0.000 2.815 6.699 0.000
LightGBM 1.795 5.864 0.007 2.248 5.650 0.000 2.782 5.925 0.000
RF 0.655 5.033 0.654 2.143 5.029 0.000 2.276 6.132 0.000
GRU 0.504 5.188 0.784 1.441 4.127 0.003 1.073 5.069 0.296
LSTM 1.531 4.319 0.017 1.444 4.488 0.004 1.744 5.882 0.013
Combination 2.029 6.445 0.002 2.330 5.795 0.000 2.639 6.924 0.000
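As a complement to the portfolio description in the Table 3 caption, here is a minimal sketch of the monthly decile sort and the one-sided test against the OLS portfolio. The DataFrame layout and column names ('month', 'forecast', 'ret_next') are assumptions, and scipy is used only for the illustrative t-test.

```python
import pandas as pd
from scipy import stats

def high_minus_low_returns(df):
    """Equal-weighted H-L decile return each month, rebalanced monthly.

    `df` holds one row per stock-month with columns 'month', 'forecast'
    (the model's expected return) and 'ret_next' (the realized return over
    the holding month); the names are placeholders.
    """
    def one_month(g):
        lo, hi = g["forecast"].quantile([0.10, 0.90])
        long_leg = g.loc[g["forecast"] >= hi, "ret_next"].mean()
        short_leg = g.loc[g["forecast"] <= lo, "ret_next"].mean()
        return long_leg - short_leg

    return df.groupby("month").apply(one_month)

# One-sided two-sample t-test of H0: R_ML - R_LR <= 0 (illustrative):
# r_ml, r_lr = high_minus_low_returns(ml_df), high_minus_low_returns(ols_df)
# t, p = stats.ttest_ind(r_ml, r_lr, alternative="greater")
```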
Table 4: Cumulative returns of the zero-investment portfolios
This table reports the cumulative returns of the zero-investment portfolios in the testing period.
A zero-investment portfolio goes long the stocks with the highest 10% expected returns and shorts the stocks with the lowest 10% expected returns. Expected returns are from various predictive models (the linear predictive model or the machine learning models) and three sets of predictors. Predictive models are trained using the data from July 2005 to June 2010 and the testing period is from July 2010 to August 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Model Set 1 Set 2 All
OLS 11.132 9.125 16.907
FFN 1-layer 4.549 543.127 919.769
FFN 2-layers 10.290 983.733 75.423
FFN 3-layers 13.786 306.943 228.578
XGBoost 258.595 1517.625 10184.239
LightGBM 315.734 1815.581 10552.787
RF 6.500 1404.060 1711.734
GRU 1-layer 3.594 131.279 27.303
LSTM 1-layer 172.670 124.283 260.535
Combination 615.107 2339.420 4442.614
Table 5: Alphas of the zero-investment portfolios
This table reports alphas from four factor models: (1) the Fama and French (2015) 5-factor model (FF-5); (2) the Hou, Xue, and Zhang (2015) 4-factor model (HXZ-4); (3) the Stambaugh and Yuan (2016) mispricing-factor model (SY-4); and (4) the Daniel, Hirshleifer, and Sun (2020) behavioral-factor model (DHS-3). Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
FF-5 HXZ-4 SY-4 DHS-3
Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 0.9393 0.5963 0.8653 0.7471 0.6263 0.4913 1.2383 0.9833 1.0533 0.913 0.8413 0.7753
FFN 1-lyr 2.1893 1.7083 0.843 2.043 2.0823 0.4433 2.8563 2.4093 1.0063 2.3493 2.1333 0.7043
FFN 2-lyr 1.6623 1.9413 1.0283 1.2291 2.1033 0.743 1.9113 2.583 1.3353 1.5953 2.3363 0.9063
FFN 3-lyr 2.0083 1.5923 1.2133 1.5031 1.8693 0.9963 2.1843 2.4773 1.5793 1.8493 2.1083 1.1923
XGBoost 2.5863 2.0873 1.6493 3.0553 2.2893 2.1323 3.8593 2.9293 2.813 3.1723 2.4853 2.1533
LightGBM 2.6483 2.1463 1.7973 2.83 2.3483 1.9113 3.5973 2.943 2.6873 3.0113 2.5133 2.093
RF 2.0733 2.0633 0.6773 2.3533 2.2713 0.9383 3.0993 2.8123 1.4083 2.4463 2.3943 0.7763
GRU 1.0072 1.2463 0.5973 0.8021 1.343 0.4113 1.4853 1.6853 0.8033 1.123 1.5183 0.73
LSTM 1.3483 1.1733 1.463 1.5823 1.3843 1.6173 2.2583 1.593 1.9763 1.8953 1.3543 1.7483
Combination 2.6393 2.1943 2.0943 2.5313 2.4373 2.1953 3.4963 3.0153 2.9353 2.8933 2.6253 2.3383
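The alphas above are the intercepts of time-series regressions of the monthly zero-investment portfolio returns on each set of factor returns. A minimal sketch using statsmodels follows; the factor data source is left unspecified and the interface is an assumption, not the paper's code.

```python
import pandas as pd
import statsmodels.api as sm

def portfolio_alpha(portfolio_returns: pd.Series, factors: pd.DataFrame) -> float:
    """Intercept (alpha) from regressing monthly portfolio returns on factor
    returns, e.g. the five Fama-French factors for the FF-5 alpha."""
    X = sm.add_constant(factors)
    result = sm.OLS(portfolio_returns, X, missing="drop").fit()
    return result.params["const"]
```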
Table 6: Cross-sectional regressions
This table reports the results of cross-sectional regressions of monthly returns of individual stocks on the expected return predicted by all 221 variables, the 127 technical indicators, and the 94 stock characteristic variables, respectively:

$$r_{j,t+1} = z_0 + z_1 E_t[r_{j,t+1}] + \sum_{k=1}^{m} f_k B_{j,kt} + \varepsilon_{j,t+1}, \qquad (6)$$

where $E_t[r_{j,t+1}]$ is the future (month t+1) return of stock j forecast by the independent variables in month t, and $B_{j,kt}$, k = 1, ..., m, are stock characteristic variables. The regression is a Fama-MacBeth cross-sectional regression. We consider four models that use different stock characteristics in the regression: (1) dividend to price, earnings to price, sales to price, cash flow to price ratio, the moving average price over the last six months ($MA_{t-1,6}$) and the moving average price over the last four years ($MA_{t-1,48}$); (2) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, the moving average return over the last six months ($MA^{ret}_{t-1,6}$) and the moving average return over the last four years ($MA^{ret}_{t-1,48}$); (3) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, return volatility and earnings volatility; (4) dividend to price, earnings to price, sales to price, cash flow to price ratio, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, return volatility, earnings volatility, return on assets, return on equity and return on invested capital. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 10%, 5% and 1% level, respectively.
Model (1) Model (2) Model (3) Model (4)
Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 0.1363 0.0773 0.0923 0.2643 0.2293 0.2023 0.2223 0.2223 0.2223 0.283 0.2233 0.2043
FFN 1-lyr 0.2633 0.4003 0.2993 0.6683 1.2493 0.493 1.2253 1.2253 1.2253 0.793 1.2283 0.5163
FFN 2-lyr 0.2653 0.5893 0.2093 0.8423 1.0813 -0.031 1.0413 1.0413 1.0413 0.9623 1.0443 -0.044
FFN 3-lyr 1.6013 0.3353 0.2863 6.1043 0.8613 -0.131 0.8453 0.8453 0.8453 6.5463 0.8473 -0.167
XGBoost 0.1953 0.4043 0.3223 0.893 0.9433 0.9883 0.9273 0.9273 0.9273 0.8783 0.9263 0.9793
LightGBM 0.3523 0.7933 0.4913 0.9673 2.0343 0.9663 1.9893 1.9893 1.9893 1.0113 1.9883 0.9793
RF 0.2133 0.8453 1.6773 0.9273 2.2053 4.9753 2.1663 2.1663 2.1663 0.8483 2.1623 4.8723
GRU 0.2423 0.3653 0.1773 0.6353 1.2453 0.5993 1.243 1.243 1.243 0.7023 1.2393 0.7213
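Equation (6) is estimated month by month and the reported coefficients are time-series averages of the monthly slopes. Below is a minimal Fama-MacBeth sketch with plain (non-Newey-West) standard errors; the column names are hypothetical and the code is illustrative rather than the paper's implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fama_macbeth(df, y, regressors):
    """Run a cross-sectional OLS each month and average the slopes over time.

    `df` has columns 'month', the dependent variable `y` (next-month return)
    and the columns in `regressors` (the expected-return forecast plus the
    characteristics in equation (6)); all names are placeholders.
    """
    monthly_params = []
    for _, group in df.groupby("month"):
        X = sm.add_constant(group[regressors])
        monthly_params.append(sm.OLS(group[y], X, missing="drop").fit().params)
    params = pd.DataFrame(monthly_params)
    t_stats = params.mean() / (params.std(ddof=1) / np.sqrt(len(params)))
    return pd.DataFrame({"coef": params.mean(), "t_stat": t_stats})
```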
Table 7: Statistics of zero-investment portfolios
Panel A reports the Sharpe ratios of the zero-investment portfolios. A zero-investment portfolio goes long the stocks with the highest 10% expected returns and shorts the stocks with the lowest 10% expected returns. Panel B reports the maximum drawdown of the zero-investment portfolios. All panels are for models trained on the first 20 years (240 months) of data from Jan 1971 to Dec 1990, and the testing period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Panel A
Model Set 1 Set 2 All
OLS 0.162 0.202 0.208
FFN 1-layer 0.104 0.378 0.342
FFN 2-layers 0.138 0.405 0.227
FFN 3-layers 0.151 0.314 0.283
XGBoost 0.275 0.412 0.420
LightGBM 0.306 0.398 0.470
RF 0.130 0.426 0.371
GRU 1-layer 0.097 0.349 0.212
LSTM 1-layer 0.355 0.322 0.296
Combination 0.315 0.402 0.381
Panel B
Model Set 1 Set 2 All
OLS -57.409 -37.172 -51.305
FFN 1-layer -70.946 -27.543 -64.705
FFN 2-layers -71.122 -30.724 -76.460
FFN 3-layers -71.182 -31.589 -73.539
XGBoost -65.665 -40.301 -61.709
LightGBM -66.692 -41.378 -60.108
RF -65.626 -27.292 -60.593
GRU 1-layer -68.938 -21.139 -66.615
LSTM 1-layer -44.681 -37.371 -66.798
Combination -72.406 -37.339 -70.485
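For completeness, one common way to compute the two statistics in Table 7 is sketched below: a monthly, non-annualized Sharpe ratio and a drawdown on compounded wealth. These conventions are assumptions, since the caption does not spell them out.

```python
import pandas as pd

def sharpe_ratio(monthly_returns: pd.Series) -> float:
    """Mean over standard deviation of monthly H-L returns (a self-financing
    portfolio, so no risk-free rate is subtracted here)."""
    return monthly_returns.mean() / monthly_returns.std(ddof=1)

def max_drawdown(monthly_returns_pct: pd.Series) -> float:
    """Largest peak-to-trough decline of compounded wealth, in percent."""
    wealth = (1 + monthly_returns_pct / 100).cumprod()
    drawdown = wealth / wealth.cummax() - 1
    return 100 * drawdown.min()
```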
Table 8: Transaction costs of decile-sorted portfolios
Panel A reports the turnover ratios of the decile-sorted portfolios (H-L). Panel B reports the corresponding break-even transaction costs (BETCs). The zero-return BETCs are the transaction costs that completely offset the returns. The insignificance BETCs are the costs that make the returns of the H-L portfolios insignificantly different from zero at the 5% level. We report the turnover rates of the High and Low portfolios and of the H-L portfolio that longs the High and shorts the Low decile-sorted portfolios. Both panels are for models trained on the first 20 years (240 months) of data from Jan 1971 to Dec 1990, and the testing period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively.
Panel A: turnover ratio
Set 1 Set 2 All
Model H (%) L(%) H-L(%) H(%) L(%) H-L(%) H(%) L(%) H-L(%)
OLS 16.900 16.902 33.803 23.596 23.601 47.197 21.379 21.378 42.757
FFN 1-layer 17.904 17.889 35.793 27.069 27.063 54.132 24.138 24.124 48.262
FFN 2-layers 17.294 17.275 34.569 27.772 27.757 55.529 25.211 25.203 50.414
FFN 3-layers 18.299 18.283 36.582 26.003 25.996 52.000 24.830 24.823 49.653
XGBoost 29.398 29.419 58.817 31.431 31.442 62.873 28.156 28.174 56.330
LightGBM 27.228 27.232 54.459 30.369 30.366 60.735 28.059 28.061 56.121
RF 22.169 22.180 44.349 32.022 32.033 64.055 21.518 21.512 43.030
GRU 1-layer 26.544 26.547 53.091 28.459 28.467 56.926 28.802 28.832 57.634
LSTM 1-layer 26.713 26.700 53.413 17.637 17.610 35.248 27.894 27.909 55.803
Combination 26.035 26.031 52.066 27.820 27.807 55.627 26.295 26.291 52.585
Panel B: BETCs
Zero-return BETCs (%) Insignificance BETCs (%)
Model Set 1 Set 2 All Set 1 Set 2 All
OLS 2.370 1.424 2.073 0.862 0.697 1.043
FFN 1-layer 1.822 3.463 4.414 0.008 2.517 3.081
FFN 2-layers 2.547 3.682 2.851 0.641 2.744 1.557
FFN 3-layers 2.676 3.355 3.481 0.843 2.251 2.214
XGBoost 2.995 3.471 4.998 1.872 2.603 3.771
LightGBM 3.295 3.701 4.957 2.185 2.741 3.868
RF 1.476 3.345 5.289 0.305 2.535 3.819
GRU 1-layer 0.950 2.531 1.862 -0.058 1.783 0.955
LSTM 1-layer 2.866 4.097 3.125 2.032 2.784 2.037
Combination 3.898 4.188 5.018 2.621 3.113 3.660
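The zero-return BETC in Panel B can be read as the per-unit-of-turnover cost that exactly offsets the portfolio's mean return. This reading is an inference from the reported numbers rather than a formula stated in the paper; a small sketch, checked against the OLS Set 1 figures (0.801% mean return in Table 3 and 33.803% turnover in Panel A):

```python
def zero_return_betc(mean_return_pct: float, mean_turnover_pct: float) -> float:
    """Transaction cost (in % per unit of turnover) at which the H-L
    portfolio's mean monthly return is fully consumed by trading costs."""
    return mean_return_pct / (mean_turnover_pct / 100.0)

# Example: OLS, Set 1 — 0.801% mean return with 33.803% monthly turnover
# implies a zero-return BETC of roughly 2.37%, matching Panel B.
print(zero_return_betc(0.801, 33.803))  # ~2.370
```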
Table 9: Variable importance for all models
This table reports the top 5 most important variables for linear regression (OLS), feedforward neural networks with 1, 2 and 3 hidden layers (FFN 1-layer, FFN 2-layers and FFN 3-layers), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM) and Random Forest (RF), using the SHAP feature importance methodology.
Panel A
OLS FFN 1-lyr FFN 2-lyr FFN 3-lyr XGBoost LightGBM RF
EMA 27 WILLR 15 securedind securedind ep ep ep
RSI 18 WILLR 24 ROC 21 RSI 1 ROC 3 ROC 21 sp
RSI 15 ROC 21 WILLR 15 dolvol ROC 21 ROC 3 bm
EMA 30 dolvol ROC 3 ROC 21 secured indmom cashpr
RSI 21 securedind dolvol WILLR 15 indmom secured ROC 3
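A minimal sketch of how the SHAP global importance ranking in Table 9 could be produced for the tree-based models, using the shap package of [27, 28]; the model and data objects are placeholders, and the neural networks would require a different explainer (e.g., shap.DeepExplainer).

```python
import numpy as np
import shap  # SHAP package accompanying [27] and [28]

def top_shap_features(tree_model, X, feature_names, k=5):
    """Rank features by mean absolute SHAP value over the sample.

    Works for tree ensembles such as XGBoost, LightGBM and Random Forest;
    `tree_model` and `X` are placeholders for a fitted model and its
    predictor matrix.
    """
    explainer = shap.TreeExplainer(tree_model)
    shap_values = explainer.shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)
    top = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in top]
```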
Table 10: Results of high and low market value stocks
This table reports the results for high and low market value stocks. Panel A reports the proportional reduction in mean squared forecast error (R2OS) for various predictive models (the linear predictive model or machine learning models) that use three sets of predictors vs. the naive benchmark model that assumes a value of zero. Panel B reports the returns and the p-value for the two-sample t-test with the null hypothesis (H0): RML − RLR ≤ 0, where RML is the return of the zero-investment portfolio from the machine learning models and RLR denotes the return from the linear predictive model. Panel C reports alphas from regressions on the Stambaugh and Yuan (2016) mispricing-factor model (SY-4). Panel D reports the results of Fama-MacBeth regressions including the control variables of bond size, age, time to maturity, coupon rate, $MA_{t-1,6}$, $MA_{t-1,48}$, $MA^{ret}_{t-1,6}$, $MA^{ret}_{t-1,48}$, IRC, Amihud illiquidity, and the four bond factor betas of [3]. Predictive models are trained on the data from Jan 1971 to Dec 1990. The out-of-sample period is from Jan 1991 to Dec 2020. Sets 1 and 2 indicate the 94 stock characteristic variables and 127 technical indicators, respectively. The symbols 1, 2, 3 denote significance at the 1%, 5% and 10% level, respectively.
Panel A: R2OS
High mkt val stocks Low mkt val stocks
Model Set 1 Set 2 All Set 1 Set 2 All
OLS -1.7513 -5.0623 -7.305 -3.8493 -1.4853 -5.4313
FFN 1-layer -1.2193 -0.6143 -2.8313 -4.1043 -1.4753 -3.388
FFN 2-layers 0.6373 -0.1843 -3.1683 -1.43 1.0793 -3.0073
FFN 3-layers 0.5933 0.4053 0.7173 -3.0883 0.7553 -1.3532
XGBoost -0.4943 -33.6653 -0.2333 -1.5493 0.6533 -5.7083
LightGBM 0.6723 0.8543 0.1893 -0.0923 0.3233 -0.2623
RF 0.7543 0.7783 0.783 0.4573 0.4013 0.1693
GRU -1.8643 0.5453 -0.6523 0.3813 0.4573 0.5283
LSTM -2.8313 0.0123 0.4573 0.3833 0.4453 0.4673
Panel B: Portfolio analysis
High mkt val stocks Low mkt val stocks
Set 1 Set 2 All Set 1 Set 2 All
Model Return p-value Return p-value Return p-value Return p-value Return p-value Return p-value
OLS 0.899 0.404 0.846 2.019 1.592 1.857
FFN 1-layer 0.932 0.455 1.339 0.003 1.860 0.001 2.401 0.226 3.920 0.000 3.621 0.001
FFN 2-layers 0.816 0.609 1.271 0.005 1.893 0.001 2.755 0.101 4.038 0.000 4.511 0.000
FFN 3-layers 1.159 0.224 1.077 0.011 0.858 0.487 3.083 0.027 3.298 0.001 1.301 0.813
XGBoost 1.302 0.144 1.733 0.000 1.954 0.002 3.867 0.001 3.705 0.000 4.763 0.000
LightGBM 0.928 0.464 1.836 0.000 1.682 0.007 4.507 0.000 3.687 0.000 5.014 0.000
RF 1.038 0.350 1.279 0.001 1.310 0.084 4.544 0.000 3.247 0.003 4.759 0.000
GRU 0.965 0.415 1.013 0.017 0.842 0.506 1.266 0.917 1.985 0.226 0.603 0.986
LSTM 0.993 0.377 0.893 0.034 1.395 0.033 0.700 0.996 1.930 0.268 1.667 0.647
Combination 1.416 0.068 1.595 0.000 2.317 0.000 4.236 0.000 4.099 0.000 4.647 0.000
Panel C: alpha Panel D: Fama-MacBeth regression
High mkt val stocks Low mkt val stocks High mkt val stocks Low mkt val stocks
Model Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All Set 1 Set 2 All
OLS 1.2253 0.502 1.3723 -0.041 0.3101 -0.050 0.4133 0.1843 0.4063 0.1353 0.029 0.1543
Lasso 1.2233 0.387 1.3533 0.079 0.5263 0.149 0.7833 0.4943 0.8213 0.1823 0.2483 0.3172
Enet 1.2303 0.291 1.3493 0.089 0.4993 0.269 0.7683 0.4813 0.8333 0.1803 0.2103 0.3643
FFN 1-lyr 1.2163 1.1052 1.2593 0.9793 0.5983 0.6211 0.7113 0.5423 0.6173 0.1853 0.0872 0.3123
FFN 2-lyr 1.2813 1.0493 1.2813 0.698 0.5583 0.719
FFN 3-lyr 1.1763 0.815 1.2413 -0.006 0.6743 0.477 0.7133 0.5483 0.9803 0.2903 0.0912 0.3033
XGBoost 1.3303 1.2013 1.4763 0.770 0.5783 0.375 0.8053 0.3583 0.9263 0.3001 0.0982 0.2212
LightGBM 1.2853 0.635 1.4453 0.296 0.6743 0.174 0.9013 0.4703 1.0103 0.223 0.1332 0.1771
RF 1.3413 0.123 1.2923 -0.277 0.6363 0.174 0.9573 0.2662 0.8513 -0.187 0.0671 0.2192
RNN 1.2293 0.411 1.3413 0.486 0.5803 0.306 0.8593 0.7793 0.0183 0.2153 0.0953 0.1312
GRU 1.0123 0.632 1.1583 0.604 0.5172 0.033 0.2993 0.2233 0.1663 0.1171 0.0862 1.1021
LSTM 0.7383 0.201 1.0443 0.624 0.6513 0.532 0.2193 0.4703 0.2963 0.2653 0.1813 0.171
Combination 1.4463 0.785 1.5183 0.355 0.7543 0.491 1.1053 0.9243 1.0483 0.5313 0.2073 0.4413