Valuations
It is well-established that valuations can predict long-term asset returns (lower valuations are associated with higher future returns). This is true of asset classes, individual securities, and portfolios of securities. But that does not necessarily mean that valuations are an effective timing signal. For example, the U.S. stock market has been trading well above its historical average cyclically adjusted price/earnings ratio, or CAPE (based on data from 1880), since 2010. However, anyone who acted on that information—trimming or liquidating their U.S. stock allocation—probably regretted it, as the market delivered strong performance from January 2010 through August 2018 despite its seemingly high valuation.
If using valuations to time the market is hard, using them to time factors might be even harder, as Cliff Asness and his colleagues at AQR argue in their paper, “Contrarian Factor Timing is Deceptively Difficult.”3 That’s because turnover in these portfolios reduces the predictive power of their valuations, as many of the current holdings may not stay in the portfolio long. Portfolio-level valuations are particularly unreliable for high-turnover strategies like momentum.
The relationship between valuations and factor performance is probably modest at best. In a previous article,4 I found a moderate positive relationship between valuation spreads (value versus growth stocks and small- versus large-cap stocks) and the performance of the corresponding value and small-cap factors over the next five years, based on data from June 1987 through August 2016. However, much of this effect can be attributed to extreme valuation spreads and subsequent reversals in 1999 and 2000. Additionally, the results were mixed for quality, and I did not find a significant relationship between the valuation spreads of the low-volatility and momentum factors relative to the market and their future performance.
Other approaches to valuation-timing can lead to different results. BlackRock found that valuation-timing worked by tilting toward factors that were trading the cheapest relative to their own history over the past three years. So, if quality was trading at a significantly lower valuation than in the recent past and value was only a little cheaper than normal, this timing strategy would favor quality. But the fact that the efficacy of valuation-timing depends on how it is defined suggests that it is not particularly robust. Ideally, we’d like to see similar results with different versions of the same idea because that suggests the metric presented wasn’t cherry-picked, providing greater confidence that it may work out of sample.
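To make the "cheap relative to its own history" idea concrete, here is a minimal Python sketch. It is not BlackRock's methodology, which the article does not spell out. It assumes monthly readings of some valuation ratio per factor, a 36-month lookback, and a simple z-score as the measure of relative cheapness. The function names and the sample numbers are hypothetical.

```python
from statistics import mean, stdev


def relative_cheapness(valuation_history, lookback=36):
    """Z-score of the latest valuation versus its own trailing window.

    Negative values mean the factor is cheaper than its recent norm.
    The z-score is an assumed measure, not one taken from the article.
    """
    window = valuation_history[-lookback:]
    if len(window) < 2:
        raise ValueError("need at least two observations")
    dispersion = stdev(window)
    if dispersion == 0:
        return 0.0
    return (window[-1] - mean(window)) / dispersion


def cheapest_factor(histories, lookback=36):
    """Return the factor with the lowest z-score, plus all the scores."""
    scores = {name: relative_cheapness(h, lookback) for name, h in histories.items()}
    return min(scores, key=scores.get), scores


# Made-up monthly price/earnings readings, purely for illustration.
histories = {
    "quality": [20.0, 21.0] * 17 + [20.0, 16.0],  # well below its own norm
    "value": [12.0, 12.4] * 17 + [12.2, 12.0],    # only slightly below its norm
}
pick, scores = cheapest_factor(histories)
print(pick, scores)
```

In this toy data, value has the lower absolute ratio, but quality is further below its own recent history, so a rule of this kind would favor quality. Changing the lookback or the cheapness measure can change the pick, which is the robustness concern raised above.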
In practice, the two factor-timing ETFs on the market that incorporate a valuation-timing component use contrarian performance signals rather than traditional value signals to time their factor tilts. The idea is that investors may overreact to a stretch of poor performance, giving up on styles after they have become cheap. Nobel Prize winner Richard Thaler and his colleague Werner De Bondt documented such long-term performance reversals among stocks in their 1985 paper, “Does the Stock Market Overreact?”5
A similar effect seems to hold at the portfolio level. To test this, I developed a strategy that targets the three of the five factor indexes in Exhibit 1 (in part 1 of this article) with the worst performance over the previous five years, weights them equally, and rebalances once a year. Using data from November 2003 through August 2018, it modestly outperformed the static equal-weighted portfolio of the five indexes by 92 basis points annually. So, there may be something to this approach. It is less subject to data-mining risk than signals based on economic data and has been more extensively tested out of sample.
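The selection rule in this contrarian test can be sketched in a few lines of Python. This is an illustration under assumptions, not the code behind the result above: it assumes monthly index levels, so five years is 60 observations. The index names and growth rates are placeholders, since the five indexes from Exhibit 1 are not listed here.

```python
def trailing_return(levels, periods):
    """Cumulative return over the last `periods` steps of an index level series."""
    if len(levels) <= periods:
        raise ValueError("not enough history for the requested lookback")
    return levels[-1] / levels[-1 - periods] - 1.0


def contrarian_weights(index_levels, lookback=60, n_select=3):
    """Equal-weight the n_select indexes with the worst trailing return."""
    returns = {name: trailing_return(lv, lookback) for name, lv in index_levels.items()}
    laggards = sorted(returns, key=returns.get)[:n_select]
    return {name: (1.0 / n_select if name in laggards else 0.0) for name in returns}


# Hypothetical index names and constant monthly growth rates (illustrative only).
monthly_growth = {
    "value": 0.002, "size": 0.003, "momentum": 0.010,
    "quality": 0.008, "low_vol": 0.005,
}
index_levels = {
    name: [100.0 * (1.0 + g) ** t for t in range(61)]
    for name, g in monthly_growth.items()
}
weights = contrarian_weights(index_levels)
print(weights)
```

Calling `contrarian_weights` once a year on the history available at that date would reproduce the annual rebalance described above. The three placeholder indexes with the weakest trailing five-year return each receive a one-third weight.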
Momentum
While relative performance tends to mean revert in the long term, it tends to persist in the short term. This short-term persistence, known as momentum, is found nearly everywhere in financial markets, just like value. Given its well-documented ability to predict short-term performance, I would expect it to be one of the more-promising candidates for use as a factor-timing signal. But while BlackRock found evidence that momentum-driven factor-timing works, I did not.
Using the same five factor indexes from the contrarian strategy, I tested a strategy that targets the three factor indexes with the best performance over the past 12 months, using data from November 1999 through August 2018. That strategy lagged the static equal-weighted basket of factor indexes. The results were similar when targeting just the top-performing index or the two best performers. This suggests that momentum isn’t a robust factor-timing signal on its own.
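A comparison of this kind, a momentum-selected basket against a static equal-weighted one, can be laid out as a small backtest loop. The Python sketch below is hypothetical. It assumes monthly returns and reselection every month, since the rebalancing frequency of the momentum test is not stated above, and it runs on made-up constant returns. Its output therefore says nothing about the actual finding.

```python
def top_n_by_trailing_return(returns_by_factor, t, lookback, n_select):
    """Names of the n_select factors with the highest compounded return
    over periods [t - lookback, t), using only data available before t."""
    def compounded(series):
        growth = 1.0
        for r in series[t - lookback:t]:
            growth *= 1.0 + r
        return growth - 1.0

    ranked = sorted(
        returns_by_factor,
        key=lambda name: compounded(returns_by_factor[name]),
        reverse=True,
    )
    return ranked[:n_select]


def backtest(returns_by_factor, lookback=12, n_select=3):
    """Growth of 1 unit for the timed and the static equal-weighted portfolios.

    Assumption: both portfolios are reset to their target weights every period.
    """
    names = list(returns_by_factor)
    n_periods = len(returns_by_factor[names[0]])
    timed = static = 1.0
    for t in range(lookback, n_periods):
        picks = top_n_by_trailing_return(returns_by_factor, t, lookback, n_select)
        timed *= 1.0 + sum(returns_by_factor[k][t] for k in picks) / n_select
        static *= 1.0 + sum(returns_by_factor[k][t] for k in names) / len(names)
    return timed, static


# Made-up constant monthly returns for five placeholder factors.
toy_returns = {
    "a": [0.010] * 24, "b": [0.000] * 24, "c": [-0.010] * 24,
    "d": [0.005] * 24, "e": [0.002] * 24,
}
# Mirror the robustness check: top one, top two, and top three performers.
for n in (1, 2, 3):
    print(n, backtest(toy_returns, lookback=12, n_select=n))
```

Running the same loop for several values of `n_select` is the kind of variation that helps show whether a result depends on one particular specification.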
Dispersion
The argument for using dispersion as a timing signal is that the return to each factor should be greater when there is greater separation among stocks in the starting universe on the metrics used to construct the factor portfolio. For example, if highly profitable stocks are more profitable than usual relative to stocks with weak profitability, the profitability/quality factor should do better.
BlackRock found that dispersion was the weakest stand-alone timing signal and that it worked better for value and quality than it did for other factors. The firm’s study looked at tilting toward factors with the widest dispersion relative to their own history over the past three years, which had some predictive power.
To test the robustness of dispersion as a timing signal, I looked at how correlated each factor’s returns were with the spread in the metric used to construct it, using annual data for the value, size, momentum, low volatility, and profitability portfolios from the French Data Library from 1964 through 2018. This is a less-sophisticated approach than the one BlackRock used, but if dispersion is a good predictor of performance, the results should directionally line up.
However, I only found the expected relationship for the value factor, and the correlation was only moderate, consistent with my findings for valuation-timing. There was virtually no correlation between the size, profitability, and past momentum spreads and the performance of those factors. And, contrary to expectations, the low-volatility factor did worse when past volatility spreads were wider. This suggests that dispersion is at best a weak factor-timing signal.
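The correlation check described in this section can be sketched as follows. The Python snippet assumes one spread reading and one factor return per year, and it pairs each year's spread with the following year's return. The one-year lag is an assumption, since the exact alignment used in the test is not stated. The sample numbers are invented and are not French Data Library figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def spread_return_correlation(spreads, factor_returns, lag=1):
    """Correlate the spread in year t with the factor return in year t + lag."""
    if len(spreads) != len(factor_returns):
        raise ValueError("series must cover the same years")
    usable = len(spreads) - lag
    return pearson(spreads[:usable], factor_returns[lag:])


# Invented annual data for one factor, for illustration only.
spreads = [1.0, 2.0, 3.0, 4.0, 5.0]
factor_returns = [0.00, 0.02, 0.04, 0.06, 0.08]
print(spread_return_correlation(spreads, factor_returns))
```

If dispersion were a useful signal, the coefficient would be positive and reasonably large. A value near zero, or a negative one as described for low volatility, would point the other way.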
The Jury Is Still Out
Given the complexity of factor-timing strategies, potential data-mining issues, limited research on this topic, and results that don’t appear to be robust, more out-of-sample testing and live performance are necessary to build confidence in their efficacy. While it isn’t prudent to write factor-timing off just yet, it’s important not to lose sight of one of the main goals of multifactor investing: diversification. Tilting toward certain factors at different times reduces diversification and can increase risk if the timing model gets the call wrong. There is also a risk that timing models that rely on valuations or momentum might effectively double down on those factors, potentially causing the portfolio to behave as if it had greater exposure to value or momentum stocks.
If factor-timing has any place at all in a portfolio, it will be important to keep factor tilts modest to maintain diversification. It is also probably best to diversify across multiple signals that tend to work, which limits the pain when any one of them doesn’t.
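One hypothetical way to combine modest tilts with multiple signals is sketched below in Python. It is not a method from this article or from any of the products mentioned. It assumes each signal assigns a score to every factor, with higher meaning more attractive. It converts each signal to evenly spaced ranks, averages them, and caps the tilt at an arbitrary 5 percentage points around equal weight. The signal names, scores, and the cap are all illustrative.

```python
def centered_ranks(scores):
    """Map scores to evenly spaced values in [-1, 1] that sum to zero.

    Ties are broken by input order, a simplification for this sketch.
    """
    if len(scores) < 2:
        raise ValueError("need at least two factors to rank")
    ordered = sorted(scores, key=scores.get)
    half = (len(ordered) - 1) / 2
    return {name: (i - half) / half for i, name in enumerate(ordered)}


def blended_weights(signals, max_tilt=0.05):
    """Equal weights plus a capped tilt driven by the average rank across signals."""
    names = list(next(iter(signals.values())))
    base = 1.0 / len(names)
    ranks = [centered_ranks(scores) for scores in signals.values()]
    weights = {}
    for name in names:
        average_rank = sum(r[name] for r in ranks) / len(ranks)
        weights[name] = base + max_tilt * average_rank
    return weights


# Invented scores: higher means the signal finds the factor more attractive.
signals = {
    "contrarian": {"value": 0.9, "size": 0.7, "momentum": 0.1, "quality": 0.2, "low_vol": 0.5},
    "momentum": {"value": 0.1, "size": 0.6, "momentum": 0.8, "quality": 0.3, "low_vol": 0.4},
}
weights = blended_weights(signals)
print(weights)
```

Because the ranks for each signal sum to zero, the weights always sum to 1, and no factor moves more than `max_tilt` away from its equal weight. Where the two signals disagree about a factor, their tilts partly cancel.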
3 Asness, C.S., Chandra, S., Ilmanen, A., & Israel, R. 2017. “Contrarian Factor Timing is Deceptively Difficult.” SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2928945
4 Bryan, A. 2016. “Don’t Try to Time Factor Strategies.” ETFInvestor – October 2016.
5 De Bondt, W.F.M., & Thaler, R. 1985. “Does the Stock Market Overreact?” Journal of Finance, Vol. 40, No. 3. http://breesefine7110.tulane.edu/wp-content/uploads/sites/110/2015/10/Debondt-and-Thaler.pdf