Predicting Mutual Fund Performance Using Machine Learning Techniques
Dulani Jayasuriya Daluwathumullagamage
University of Auckland
23 September 2018
This study conducts a comparative analysis of machine learning techniques, such as gradient boosted decision trees, against traditional regression models for predicting mutual fund performance. At the broadest level, the study finds that machine learning offers significantly improved predictive power and accuracy relative to traditional methods. The implementation sets a new standard for accuracy in measuring mutual fund performance through high out-of-sample return prediction capability. Moreover, the best predictors are identified along with the nature of their relationship with mutual fund performance. Improved prediction of mutual fund performance through machine learning can further extend to developing investment strategies, and it underscores the importance of machine learning techniques in academic finance and their role in fintech.
JEL Classification: C02, O33, G11, G20, G23
Key words: machine learning, mutual fund performance, gradient boosting.
Acknowledgements: The author wishes to acknowledge the insightful comments of various seminar participants and industry practitioners. Any remaining errors or ambiguities are solely the responsibility of the author.
Contact author email: [email protected]
1. Introduction
The objective of the majority of traditional statistical models utilized in the finance literature is typically inference, not prediction; hence, any predictions are conducted within this particular sphere of statistical inference. In machine learning models the prime objective is prediction: given a training data set, the model recognizes patterns and learns the relationship between the input variables (predictors, or independent variables in traditional regression models) and the output variable (the dependent variable in traditional regression models). Given this setting, I implement machine learning techniques to test the predictability of future mutual fund performance and to identify the most important predictors and the direction of their relationship with mutual fund performance.
In general, mutual fund investors chase past returns, but fund flows fail to predict future performance. According to prior literature, past performance fails to strongly predict future fund performance. Berk and Green (2004) and Berk and van Binsbergen (2014) explain this as an equilibrium phenomenon: capital flows to highly skilled fund managers, and their performance suffers due to decreasing returns to scale. In addition, large funds' trades move prices, subsequently decreasing trade profitability. Moreover, prior empirical evidence identifies a negative relationship between risk-adjusted returns and fund size (Pastor, Stambaugh and Taylor (2015); Jakab (2015)). However, if capital moves slowly to highly skilled managers (measuring skill from noisy returns data is cumbersome), mutual fund performance might persist, and mutual fund returns might indeed be predictable to some extent.
Given this setting, in this study I pose the following research questions. How well can mutual fund performance be predicted using machine learning techniques? What is the relative importance of different predictors such as past returns, fund size, fund age and other fund characteristics? What is the direction of the relationship between these predictor variables and mutual fund performance?
My study is based on a monthly panel of actively managed mutual funds from 2001 to 2016. Machine learning models are built on a training data set; the accuracy of their predictions is then measured on a test data set. In addition, I compare the out-of-sample predictive performance of several simple linear regression models with the machine learning models implemented in this study. I find that the machine learning model significantly outperforms the traditional regression models both in and out of sample. Moreover, the predictions from the machine learning model have significantly higher correlation with the actual observed mutual fund performance measures. The results are robust to alternative factor models used as benchmarks, different measures of alpha, the annual data set, and various regression models with different controls.
I proceed as follows. Section 2 presents the literature review. Section 3 describes the data and sample selection procedure. In Section 4 I present the machine learning methodology. Section 5 explains the empirical results. Finally, Section 6 concludes the study. The robustness tables and the different alpha measures are all available upon request.
2. Literature Review
The introduction of the CRSP Survivor-Bias-Free US Mutual Fund database, first used in the work of Carhart (1997), spurred an academic literature identifying a variety of variables that predict mutual fund performance. The only predictor, to my knowledge, that predates this database is the lagged one-year return, identified by Hendricks et al. (1993). Carhart's (1997) seminal paper identifies a variety of predictor variables that are utilized in this study; Carhart (1997) finds that past alphas from his own four-factor model can predict future risk-adjusted mutual fund performance. Moreover, Chen et al. (2004) show that larger funds perform worse, particularly for small-cap funds, while fund family size is positively related to performance. Pastor et al. (2015) state that the interpretation of this relationship is not clear. However, the empirical relationship between fund performance and fund size appears relatively established in prior literature. In addition, the tendency of mutual funds to deviate from benchmark indices seems to be related to future fund performance as well. Cohen et al. (2005) show that funds whose holdings resemble those of other strong-performing funds themselves show superior performance.
Cremers and Petajisto (2009) measure these deviations using fund holdings, as in the active share measure. Moreover, Amihud and Goyenko (2013) provide an alternative measure for the same purpose using the R² of the fund's returns on one or more benchmarks. Both papers suggest that funds that deviate more from their benchmarks perform better. Kacperczyk et al. (2005) state that fund managers possess an informational advantage in certain industries, making industry concentration a significant predictor of fund performance and managerial investment skill. Kacperczyk and Seru (2007) show that managers relying on more public information when making portfolio decisions perform worse. Christoffersen and Sarkissian (2009) show that funds located in bigger cities, with more access to private information and a higher likelihood of knowledge spillovers, perform better than their smaller-city counterparts.
In addition, several performance measures have been derived by calculating the so-called return gap, proposed by Kacperczyk et al. (2008), defined as the difference between the actual fund returns and the holdings-based returns. Prior literature shows that a higher return gap is positively related to future fund returns. Da et al. (2010) show that funds that trade stocks with a high probability of informed trading (the PIN measure of Easley et al., 1996) perform better than those that do not. Cao et al. (2013) show that funds that appear to time market liquidity, by increasing market beta prior to periods of high liquidity, on average perform better. Finally, Mamaysky et al. (2007) find that backtesting can be used to better identify funds with nonzero alphas. Kacperczyk et al. (2014) introduce a skill index that combines both market timing and stock picking abilities and find strong forecasting ability for future mutual fund returns.
3. Data
The final fund-level monthly and annual data sets used in this study are constructed as follows. The Survivor-Bias-Free US Mutual Fund Database from the Center for Research in Security Prices (CRSP) is used to collect the monthly fund returns and fund characteristics data. The CRSP returns are net of fees, expenses, and brokerage
commissions but before any front-end or back-end loads. This database, which identifies each fund share class separately, is merged with the MFLINKS database available on WRDS, where the MFLINKS table assigns each share class to the underlying fund. In addition, monthly return factors were obtained from Kenneth French's data library. The sample period spans 2001 to 2016. When a fund has multiple share classes in the CRSP database, weighted CRSP net returns and other characteristics are calculated for the fund, with weights given by the most recent total net assets of each share class. The sample includes actively managed equity funds whose investment objective codes are provided by Weisenberger and Lipper as aggressive growth, growth, growth and income, equity income, growth with current income, income, long-term growth, maximum capital gains, small capitalization growth, micro-cap, mid-cap, unclassified, or missing. The strategic insight objective code is used to identify the fund style when both the Weisenberger and the Lipper codes are missing. If no code is available for a fund for a given month, the style from the previous month is assigned to the missing value; if the fund style still cannot be identified, the fund is excluded from the sample. Funds with no name in the CRSP database are also excluded. Moreover, index funds are deleted by excluding those whose name includes the word "index" or the abbreviations "ind", "S&P", "DOW", "Wilshire", and/or "Russell". Balanced, international, and sector funds, as well as funds holding less than 70% of their assets in common stocks, are excluded; only funds with at least $15 million in total net assets (TNA) are included. Observations before the fund's starting year reported by CRSP are eliminated to address the incubation bias documented by Evans (2010).
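For concreteness, the share-class aggregation step can be sketched as follows in pandas; the column names ('wficn', 'date', 'ret', 'lag_tna') are hypothetical placeholders, not the actual variable names used in this study.

```python
import pandas as pd

# A sketch of the share-class aggregation, assuming a merged CRSP/MFLINKS
# panel with hypothetical columns: 'wficn' (MFLINKS fund identifier),
# 'date', 'ret' (net return), and 'lag_tna' (most recent total net assets
# of the share class).
def aggregate_share_classes(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse share classes into one observation per fund-month, weighting
    each class's return by its most recent total net assets."""
    def tna_weighted_return(group: pd.DataFrame) -> float:
        weights = group["lag_tna"] / group["lag_tna"].sum()
        return float((weights * group["ret"]).sum())

    return (
        df.groupby(["wficn", "date"])
          .apply(tna_weighted_return)
          .rename("fund_ret")
          .reset_index()
    )
```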
4. Empirical Methodology
This section first explains the machine learning algorithm and the process implemented to train and test the machine learning model. Secondly, the traditional models utilized for mutual fund performance prediction in prior literature are explained.
Machine Learning
In general, inductive algorithms that 'learn' are termed machine learning algorithms (Provost & Kohavi, 1998).
The model employed in this study is the Extreme Gradient Boosting, or XGBoost, machine learning model. It is a nonlinear inductive algorithm used to approximate the function (relationship) between inputs and outputs. The fundamental concept behind gradient boosting is to "boost" many weak learners, or predictive models, into a stronger overall prediction model: a meta-model is constructed from a large ensemble of weak models, where a weak model simply has to predict slightly better than a random guess. To combine the weak learners, one first trains a weak model, n, using data samples drawn from a particular weight distribution, then increases the weights of samples that are misclassified by model n and decreases the weights of those classified correctly. The next weak learner is then trained using samples drawn according to the updated weight distribution. This weight-updating mechanism ensures that the algorithm always uses data samples that were hard to learn in previous rounds to train the subsequent models. The result is an ensemble that is adept at learning a large range of seemingly inscrutable patterns in the training data. In this study, decision trees are used as the weak learner and, following the weighting process, the sum of all weak learners is taken to produce the overall prediction.
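As an illustration, a minimal XGBoost setup along these lines might look as follows; the hyperparameter values are purely illustrative rather than the tuned values used in this study, and X_train, y_train and X_test are assumed arrays holding the predictors and performance measures.

```python
from xgboost import XGBRegressor

# A minimal sketch of the boosted-tree setup described above.
model = XGBRegressor(
    n_estimators=500,              # number of weak tree learners in the ensemble
    max_depth=4,                   # shallow trees act as weak learners
    learning_rate=0.05,            # shrinkage applied to each tree's contribution
    objective="reg:squarederror",  # MSE loss, as in the main results
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)     # predicted fund performance measures
```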
A differentiable loss function to be minimized is established first for the regression model in order to optimize its structure and performance. The loss function is then minimized numerically via steepest descent, an iterative process that approaches the minimum of the loss function by repeatedly moving down the slope until no step brings it closer; differentiability is required for this procedure. As robustness tests, several different loss functions are utilized; the main results are shown for the mean squared error (MSE) loss function. The minimization process encompasses several stages: the first and then successive trees are added, each emulating a gradient-based correction, with the required gradients computed at each iteration; finally, this process yields the fitted function with its corresponding weights.
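In standard gradient-boosting notation (the text does not write the update out, so the notation here is an assumption), each stage fits a weak learner h_m to the negative gradient of the loss L and takes a steepest-descent step:

```latex
% Stage-m boosting step: r_{i,m} is the pseudo-residual,
% h_m the weak tree learner, and \rho_m the step size.
r_{i,m} = -\left[ \frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)} \right]_{F = F_{m-1}},
\qquad
F_m(x) = F_{m-1}(x) + \rho_m \, h_m(x)
```

For the MSE loss the pseudo-residual reduces to the ordinary residual y_i - F_{m-1}(x_i), and the step size is chosen by a line search along h_m.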
In this study, the prediction model incorporates the following process to predict mutual fund performance.
The set of predictors or inputs specified in the data section are fed into the machine learning model.
Figure 1 depicts the overall prediction process employed in this study.
Figure 1: Process Tree (the predictor inputs are fed through the machine learning model to produce the predicted performance measures)
Validation Set
This section explains the cross-validation technique used in this study. The monthly and annual mutual fund data sets are sorted by time, and the first 80% of the data is used for training while the remaining 20% is used as the validation set. The validation set is utilized to perform model selection and hyperparameter adjustment in the prediction models. This approach ensures that the testing data never contains data older than the training data, preserving prediction integrity. Moreover, the model can be tested for robustness over time without the problems of large data gaps common in ordinary cross-validation procedures. Figure 2 depicts the cross-validation technique. As more data becomes available, the training set increases, allowing for an improved machine learning prediction. The size of the test set stays constant, as the final metric is a simple average over the different splits; although constant in size, the test set shifts forward to test distinct, non-overlapping periods. Each of the training splits is then fed into the machine learning model to predict the target output values (the mutual fund performance measures), and these predictions are compared against the test set's target values to calculate the prediction success metrics. A sketch of this scheme follows Figure 2 below.
Figure 2: Block Form Train-Test Splits in Time Series
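A minimal sketch of this expanding-window evaluation, assuming the panel has been sorted by date and flattened into arrays X (predictors) and y (next-period performance); scikit-learn's TimeSeriesSplit grows the training window while the test window stays a constant size and rolls forward, as described above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit
from xgboost import XGBRegressor

tscv = TimeSeriesSplit(n_splits=10)  # ten forward-rolling train/test splits
fold_mae = []
for train_idx, test_idx in tscv.split(X):
    model = XGBRegressor(objective="reg:squarederror")
    model.fit(X[train_idx], y[train_idx])   # train only on earlier observations
    preds = model.predict(X[test_idx])      # test on the next block of data
    fold_mae.append(mean_absolute_error(y[test_idx], preds))

print("average MAE across splits:", np.mean(fold_mae))  # final metric is a simple average
```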
In sum, functional approximation is any method that helps us fit a function of the inputs to the target/response or output variable; it can also be viewed as a means of finding a pattern in the data. The training data comprise the first part of the full time-series sample and are used by the model to learn the relationships between the predictor inputs and the target output variable. After the model is trained, the test set's predictor inputs are used to predict the actual target output variable, and the predicted outputs are compared with the actual observed output variable to calculate accuracy metrics. For tuning the parameters, I use ten-fold cross-validation, repeated a hundred times, and in each case choose the model with the lowest cross-validated MAE, MSE and MAPE, as sketched below.
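A minimal sketch of this tuning step, assuming an illustrative candidate grid (the grid values are not the paper's); scoring uses negative MAE because scikit-learn maximizes scores.

```python
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from xgboost import XGBRegressor

# Illustrative hyperparameter grid, not the tuned values from the study.
param_grid = {"max_depth": [3, 4, 6], "learning_rate": [0.01, 0.05, 0.1]}

# Ten-fold cross-validation repeated a hundred times, as described in the text.
cv = RepeatedKFold(n_splits=10, n_repeats=100, random_state=0)

search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror"),
    param_grid,
    scoring="neg_mean_absolute_error",  # selects the lowest cross-validated MAE
    cv=cv,
)
search.fit(X_train, y_train)
print(search.best_params_)              # parameters of the selected model
```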
A regression task in machine learning involves a regressor, XGBoost in this case, that predicts a continuous value for an observation based on the patterns learned from a training set; this usage is distinct from the econometric sense of regression. The training set consists of past observations of the predictors (input variables) and the output variable. The XGBoost regressor simply provides the best forecast of the output variable, here the mutual fund performance measures. In this regression study, I test the machine learning model against prediction models used in prior literature, and in the results I show that it outperforms traditional mutual fund performance prediction models by a significant margin.
Therefore, to establish a simple baseline, I first estimate a linear regression model of the form

Y_{i,t+1} = \alpha + \beta X_{i,t} + \varepsilon_{i,t}, (1)

where Y_{i,t+1} is a vector of the different mutual fund performance measures for the ith mutual fund in month t+1, and X_{i,t} is a vector containing the appropriate independent variables. For monthly data, raw and dollar returns are used as output prediction variables. The independent variables used for prediction in the monthly data set are log fund size, i.e., total net assets (TNA, $mm); log fund age, computed as the difference in years between the current date and the date the fund was first offered; a retail dummy; an institutional dummy; dividend yield; and the cash, other-securities, and preferred-stock holdings. The annual data set has the same return outputs and the same input predictors as the monthly data set. In addition, fund performance is measured by alpha, raw return, or the risk-adjusted excess fund return based on several benchmark models: the CAPM; the Fama and French (1993) three-factor model with return vectors RM-Rf (the market excess return), SMB (small minus big size stocks), and HML (high minus low book-to-market ratio stocks); and the Carhart (1997) four-factor model with the additional UMD (winners minus losers) factor. Moreover, additional input variables in the form of the mean and standard deviation of the previous year's monthly returns are included. In robustness tests, the benchmark model of Cremers, Petajisto, and Zitzewitz (2010), which includes the excess return on the S&P 500 index, the return on the Russell 2000 index minus the return on the S&P 500 index, and the return on the Russell 3000 value index minus the return on the Russell 3000 growth index, is also implemented to calculate the risk-adjusted mutual fund return. Robustness tests also include additional control or input variables: manager tenure, the difference in years between the current date and the date when the current manager took control; industry size, the aggregate size of the actively managed fund industry as a fraction of the total market capitalization of all stocks; competitor size, the size of the fund's competitors relative to total market capitalization; and the Herfindahl index of the fund's portfolio holdings, a measure of portfolio concentration. The robustness test results are available upon request and have been omitted for brevity. A sketch of the baseline estimation follows.
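A minimal sketch of estimating the baseline in equation (1) with statsmodels; the DataFrame `panel` and its column names are hypothetical placeholders for the variables described above.

```python
import statsmodels.api as sm

# Hypothetical column names standing in for the monthly predictors in the text.
predictors = ["log_tna", "log_age", "retail", "institutional",
              "div_yield", "cash", "other", "preferred"]

X = sm.add_constant(panel[predictors])               # adds the intercept alpha
ols = sm.OLS(panel["y_next"], X, missing="drop").fit()  # y_next: next-month performance
print(ols.summary())                                 # coefficients and t-statistics
```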
The final section investigates whether the predicted mutual fund performance measures can be utilized in building a profitable active investing strategy. Each month, I select the funds with predicted returns in the top third, quarter, or fifth of the cross-section, form an equally weighted portfolio, and calculate the cumulative portfolio return for each month. I then create random portfolios of the same size, use them as benchmark portfolios, and compare profitability, as sketched below.
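A minimal sketch of the portfolio construction, assuming a DataFrame `preds` with hypothetical columns 'date', 'pred_ret' (predicted return) and 'real_ret' (realized next-month return).

```python
import pandas as pd

def top_fraction_portfolio(preds: pd.DataFrame, frac: float) -> pd.Series:
    """Each month, hold an equally weighted portfolio of the funds in the
    top `frac` of predicted returns; return the monthly portfolio returns."""
    def one_month(g: pd.DataFrame) -> float:
        cutoff = g["pred_ret"].quantile(1 - frac)           # e.g. top-fifth cutoff
        return g.loc[g["pred_ret"] >= cutoff, "real_ret"].mean()
    return preds.groupby("date").apply(one_month)

monthly = top_fraction_portfolio(preds, frac=1 / 5)   # top fifth of predictions
cumulative = (1 + monthly).cumprod() - 1              # cumulative portfolio return
```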
5. Empirical Results
Table 1 reports summary statistics for the input variables (predictors) and the predicted output variables. The mean age of the funds in the sample is 7.89 years, and the average fund size is $486 million. Manager tenure is 5.28 years on average, the average expense ratio is 1.68%, and average turnover is 89.29%.
Table 1
Summary Statistics
Variable Mean Std. Dev. Min Max
Panel A: Input Variables
Fund Age 7.89 0.92 3.00 15.00
Fund Size ($ million) 486 2.22 22.00 6671
Expenses (%) 1.68 0.35 0.03 6.47
Turnover (%) 89.29 0.50 0.00 5332
Manager tenure (years) 5.28 0.34 0.00 15.00
Dividend yield 0.03 0.11 0.00 75.05
52 week High NAV 16.99 17.99 0.03 1243
52 week low NAV 18.58 19.17 1.00 21274
Annualised Std Raw Returns 0.007 0.02 -0.47 1.23
Annualised Avg Raw Returns 0.003 0.03 0.00 3.19
Panel B: Output Variables
raw return 0.0048 0.04 -0.98 11.89
risk adjusted return 0.0035 0.04 -0.98 10.90
Dollar return 2.22 87.26 -10107 14566
Panel A shows the fund characteristics, or input variables, used to predict the mutual fund performance measures. Fund Age is the number of years since the fund was first offered. Expenses is the annual expense ratio. Turnover is the minimum of aggregated sales or aggregated purchases of securities divided by the average twelve-month TNA of the fund. Manager tenure is the number of years since the current manager took control. CRSP fund returns are net of expenses. Data are for actively managed equity funds that satisfy the sample selection requirements. Dividend Yield is the dividend yield for the funds in the sample. 52 week High NAV is the 52-week average high of the net asset value, and 52 week Low NAV is the 52-week average low of the net asset value. Panel B shows statistics for the output, or predicted, mutual fund performance measures. The risk-adjusted returns are calculated from regressions of fund returns (in excess of the one-month T-bill rate) on the returns of the Carhart (1997) four-factor model; in robustness tests the market model and the Fama and French (1993) model are used as well, over a window of twenty-four months. This estimation period moves one month at a time, preceding the 252 one-month test periods that span the years 2001–2016. The robustness test results are available upon request and have been omitted for brevity.
Table 2 shows the linear regressions implemented from equation (1). Only past lags of the performance measures are significant in predicting current mutual fund performance measures, and the low R² indicates substantial noise. Therefore, the traditional linear model possesses very little predictive power.
Table 2
Linear regression models for raw, risk adjusted and dollar value mutual fund returns
Explanatory variables, lagged Dependent Variables Yt
Raw Return Risk Adjusted Return Dollar Value Return
Y(t-1) 0.0021 (2.98) 0.0019 (1.99) 1.003 (2.75)
Y(t-2) 0.001 (2.49) 0.0012 (1.32) 1.021 (2.40)
log TNA (total net assets, in $millions) -0.03 (0.14) -0.01 (0.24) -0.13 (1.14)
log Fund age (years) 0.17 (1.03) 0.17 (1.03) 0.29 (1.28)
Expenses (%) 0.009 (0.21) 0.004 (0.48) 0.009 (0.21)
Turnover (%) 0.01 (0.72) 0.009 (0.61) 0.01 (0.72)
log Manager tenure (years) -0.044(0.35) -0.039(0.24) -0.044(0.35)
Retail Fund Indicator 0.01 (0.35) 0.001 (0.49) 0.35 (0.28)
Institutional Fund Indicator 0.01 (0.34) 0.001 (0.29) 1.01 (1.09)
Dividend Yield 0.007 (1.05) 0.002 (0.89) 0.01 (0.97)
Cash 0.004 (1.02) 0.001 (0.98) 0.003 (0.74)
Other securities -0.003(0.68) -0.0015(1.08) -0.006(1.68)
Preferred stock 0.002(1.03) 0.004(0.73) 0.007(1.29)
Common stocks 0.001 (1.88) 0.021 (0.64) 0.046 (1.73)
52 Week High NAV 0.0001 (1.01) 0.0005 (1.22) 0.0027 (1.39)
52 Week Low NAV -0.0002(1.68) -0.0001(1.07) -0.0024(1.42)
R² 0.10 0.09 0.09
This table reports the estimation results of the linear model shown in equation (1). As robustness tests, various regressions are run excluding the lags of the dependent variables and also excluding some of the control variables. The dependent variables are monthly measures of fund performance in the form of raw, risk-adjusted and dollar value returns. Descriptions of the control variables are given in Table 1.
Table 3 reports conventional measures of the various models' out-of-sample predictive performance. According to all the metrics, the XGBoost machine learning model outperforms the traditional regression models. This is further substantiated by the high correlations between the predicted values from the XGBoost model and the actual observed values of the output variables (a sketch of the metric computation follows Table 3).
Table 3
Evaluating machine learning model performance vs traditional mutual fund performance prediction models
MAE (XGBoost / linear regression)  MAPE (XGBoost / linear regression)  MSE (XGBoost / linear regression)
raw returns 0.041 2.9 4.29 7.3 0.0026 1.34
risk adjusted 0.042 2.49 3.98 6.49 0.0029 1.92
Dollar return 0.38 2.89 3.76 6.89 0.0031 1.93
Correlation between actual and predicted
linear regression XGBoost
raw returns -0.0043 0.612
risk adjusted 0.0228 0.571
Dollar return 0.034 0.475
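For reference, the metrics in Table 3 can be computed as in the following sketch, where y_true and y_pred are assumed arrays of actual and predicted performance measures; MAPE is computed manually to make the exclusion of zero returns explicit, which is an implementation assumption.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)

nonzero = y_true != 0                     # MAPE is undefined where the actual value is zero
mape = 100 * np.mean(np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero]))

corr = np.corrcoef(y_true, y_pred)[0, 1]  # actual-vs-predicted correlation
```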
Detailed empirical results of the machine learning model
A large part of this study is concerned with improving the predicted output/target variable ŷ. However, in academic research the value of the coefficient estimate β̂ is significant as well, and in machine learning interpreting coefficient values is not as straightforward as in traditional statistical models. Tables 5 to 7 therefore report a variable importance score and a direction indicator to describe the non-linear relationships of the most important input variables for predicting the mutual fund performance measures. Table 5 identifies the five most important inputs, or predictors, other than the past values of the output variable, for predicting future mutual fund performance.
Table 4 shows the cross-validation results, where the accuracy of an inducer is calculated by dividing the data into k mutually exclusive subsets of approximately equal size; this is essentially the same as performing multiple tests to ensure the model's robustness over time.
Table 4
Evaluating model performance over different intervals.
Start_Date End_Date MAE MAPE(%) MSE Train Size Test Size
30/11/2001 28/04/2002 0.031794 9.4 0.00279 264836 264828
28/04/2004 31/08/2003 0.035615 4.5 0.003319 529664 264828
31/08/2005 31/10/2007 0.041129 2.4 0.004385 794492 264828
31/10/2006 31/12/2008 0.031854 3.4 0.002228 1059320 264828
31/12/2007 31/01/2009 0.020533 4.2 0.000928 1324148 264828
31/01/2009 28/02/2010 0.021129 5.7 0.000941 1588976 264828
28/02/2010 30/03/2012 0.014151 1.2 0.000476 1853804 264828
30/03/2011 29/02/2013 0.024918 1.0 0.00135 2118632 264828
29/02/2012 30/01/2014 0.058266 3.1 0.007627 2383460 264828
30/01/2013 30/11/2016 0.046715 6.8 0.004489 2648288 264828
Train size is the size of the training set, i.e., the number of observations from which to approximate the response function (the relationship between the input variables and the output variables). Start date and end date are the approximate dates over which the model has been trained and tested. The test set is a holdout set on which the results of the model are tested; the test set is always 20% of the data, and the splits roll forward so that the model is tested across all parts of the sample. The tests are performed by comparing the predictions against the actual values. The results include the MAE, MAPE and MSE of the machine learning model.
Tables 5 to 7 identify the most important variables for predicting each type of future mutual fund performance, as identified by the Gini importance, which measures the average gain in information. The measure is based on the number of times a variable is selected for splitting, weighted by the squared improvement to the model as a result of each split, and averaged over all trees; the direction column identifies the sign of the variable's relationship with the output. According to Tables 5 to 7, past lags of the mutual fund performance measures come in as the most important predictors of future mutual fund performance. According to Table 5, log fund age, log fund size and the retail dummy indicator come in as the next most important predictors, each with a positive relationship with the output variable except for the retail dummy, which has a negative sign. A sketch of how such importances can be extracted follows.
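A minimal sketch of extracting gain-based importances from the fitted XGBoost model in the earlier sketch; note that XGBoost itself reports only the importance score, so the direction indicator in the tables would come from a separate analysis (e.g., partial-dependence slopes), which is an assumption about the procedure.

```python
import pandas as pd

# 'total_gain' aggregates the improvement contributed by each split of a
# variable, in the spirit of the gain-based importance described in the text.
booster = model.get_booster()
gain = booster.get_score(importance_type="total_gain")
importance = pd.Series(gain).sort_values(ascending=False)
print(importance.head(10))   # the most important predictors
```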
Table 5
Aggregated importance and direction for risk adjusted mutual fund return prediction
Input Gini Importance Direction
Risk adjusted return (t-1) 10244 +
Risk adjusted return (t-2) 8932 +
LogFundAge 795 +
LogFundSize 649 +
Retail Dummy 84 -
Institutional Dummy 0 +
Dividend Yield 0 +
Table 6 shows the same pattern as Table 5; however, more predictors appear to be important in predicting raw mutual fund returns. Log fund size, log fund age and the retail dummy indicator come in as the next most important predictors, each with a positive relationship with the output variable except for the retail dummy, which has a negative sign.
Table 6
Aggregated importance and direction for raw mutual fund return prediction
Input Gini Importance Direction
Raw return (t-1) 10135 +
Raw return (t-2) 9815 +
LogFundSize 2870 +
LogFundAge 2737 +
Retail Dummy 2111 -
Dividend Yield 954 +
Cash 799 +
Other Securities 555 +
Preferred Stock 24 +
Institutional Dummy 0 +
According to Table 7, past dollar returns have a positive relationship with future dollar returns. Moreover, log fund age and dividend yield surpass fund size in importance for predicting future dollar returns and are negatively related to them.
Table 7
Aggregated importance and direction for dollar mutual fund return prediction
Input Gini Importance Direction
Dollar Return (t-1) 3570 +
Dollar Return (t-2) 3242 +
LogFundAge 7109 -
Dividend Yield 6384 -
LogFundSize 4614 -
Cash 2392 +
Other Securities 1631 +
Retail Dummy 1047 -
Preferred Stock 11 +
Institutional Dummy 0 +
6. Conclusion
This study analyses the out-of-sample predictability of actively managed mutual fund performance measures using conventional linear regression methods, as well as the XGBoost machine learning model, which relies on cross-validation for tuning model parameters. The study finds that machine learning tools can indeed accurately predict future mutual fund performance and outperform traditional regression models. It thereby offers a novel tool for predicting mutual fund performance, which can be further extended to developing investment strategies based on the predictions from these models.
References
Agarwal, V., V. Fos, and W. Jiang. 2010. Inferring reporting biases in hedge fund databases from hedge fund equity holdings. Working Paper, Columbia Business School.
Amihud, Y., and H. Mendelson. 2010. Transactions costs and asset management. In Operational control in asset management: Processes and costs. Ed. M. Pinedo. New York: Palgrave Macmillan.
Berk, J. B., and R. Green. 2004. Mutual fund flows and performance in rational markets. Journal of Political Economy 112:1269–95.
Bessembinder, H., K. M. Kahle, W. F. Maxwell, and D. Xu. 2008. Measuring abnormal bond performance. Review of Financial Studies 22:4219–58.
Bollen, N. P. B., and J. A. Busse. 2001. On the timing ability of mutual fund managers. Journal of Finance 56:1075–94.
———. 2005. Short-term persistence in mutual fund performance. Review of Financial Studies 18:569–97.
Brands, S., S. J. Brown, and D. R. Gallagher. 2006. Portfolio concentration and investment manager performance. International Review of Finance 5:149–74.
Brown, S. J., and W. N. Goetzmann. 1995. Performance persistence. Journal of Finance 50:679–98.
Carhart, M. 1997. On persistence in mutual fund returns. Journal of Finance 52:57–82.
Chen, J., H. Hong, M. Huang, and J. D. Kubik. 2004. Does fund size erode mutual fund performance? American Economic Review 94:1276–302.
Chevalier, J.A., and G. Ellison. 1999. Career concerns of mutual fund managers. Quarterly Journal of Economics 114:389–432.
Cox, D. R. 1970. The analysis of binomial data. London: Methuen & Co.
Cremers, M., and A. Petajisto. 2009. How active is your fund manager? A new measure that predicts performance. Review of Financial Studies 22:3329–65.
Cremers, M., A. Petajisto, and E. Zitzewitz. 2010. Should benchmark indices have alpha? Revisiting performance evaluation. Working Paper, Yale School of Management.
Cremers, M., M. Ferreira, P. Matos, and L. Starks. 2011. The mutual fund industry worldwide: Explicit and closet indexing, fees, and performance. Working Paper.
Daniel, K., M. Grinblatt, S. Titman, and R. Wermers. 1997. Measuring mutual fund performance with characteristic-based benchmarks. Journal of Finance 52:1035–58.
Dimson, E. 1979. Risk measurement when shares are subject to infrequent trading. Journal of Financial Economics 7:197–226.
Elton, E. J., M. J. Gruber, and C. R. Blake. 1995. Fundamental economic variables, expected returns, and bond fund performance. Journal of Finance 50:1229–56.
———. 1996. Survivorship bias and mutual fund performance. Review of Financial Studies 9:1097–120.
———. 2009. An examination of mutual fund timing ability using monthly holdings data. Working paper, New York University.
———. 2011. Does size matter? The relationship between size and performance. Working paper, New York University.
Evans, R. B. 2010. Mutual fund incubation. Journal of Finance 65:1581–611.
Fama, E. F., and K. R. French. 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33:3–56.
Fama, E. F., and J. D. MacBeth. 1973. Risk, return and equilibrium: Empirical tests. Journal of Political Economy 71:607–36.
Ferson, W. E., and C. R. Harvey. 1999. Conditioning variables and the cross section of stock returns. Journal of Finance 54:1325–60.
Ferson, W., and H. Mo. 2012. Performance measurement with market and volatility timing and selectivity.
Working Paper, University of Southern California.
Goetzmann, W. N., Z. Ivkovic, and K. G. Rouwenhorst. 2001. Day trading international mutual funds: Evidence and policy solutions. Journal of Financial and Quantitative Analysis 36:287–309.
Gruber, M. J. 1996. Another puzzle: The growth in actively managed mutual funds. Journal of Finance 51:
783–810.
Henriksson, R. D., and R. C. Merton. 1981. On market timing and investment performance. II. Statistical procedures for evaluating forecasting skills. Journal of Business 54:513–33.
Kacperczyk, M. T., and A. Seru. 2007. Fund manager use of public information: New evidence on managerial skills. Journal of Finance 62:485–528.
Kacperczyk, M. T., C. Sialm, and L. Zheng. 2005. On industry concentration of actively managed equity mutual funds. Journal of Finance 60:1983–2011.
———. 2008. Unobserved actions of mutual funds. Review of Financial Studies 21:2379–419.
Kane, A., D. Santini, and J. Aber. 1992. Lessons from the growth history of mutual funds. Working Paper, University of California, San Diego.
Koijen, R. S. J. 2008. The cross-section of managerial ability and risk preference. Working Paper, University of Chicago.
Krasny, Y. 2010. Asset pricing with status risk. Quarterly Journal of Finance 1:495–549.
Litzenberger, R., and K. Ramaswamy. 1979. The effects of personal taxes and dividends on capital asset prices:
Theory and empirical evidence. Journal of Financial Economics 7:163–95.
Newey, W. K., and K. D. West. 1987. A simple positive semi-definite heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55:703–708.
Sun, Z., A. Wang, and L. Zheng. 2012. The road less traveled: Strategy distinctiveness and hedge fund performance. Review of Financial Studies 25:96–143.
Titman, S., and C. Tiu. 2011. Do the best hedge funds hedge? Review of Financial Studies 24:123–68.
Treynor, J. L., and K. Mazuy. 1966. Can mutual funds outguess the market? Harvard Business Review 44:131–36.
Wermers, R. 2003. Are mutual fund shareholders compensated for active management “bets”? Working Paper, University of Maryland.
White, H. 1980. A heteroscedasticity-consistent covariance matrix estimator and a direct test for
heteroscedasticity. Econometrica 48:817–38.