The Core Challenge: Decision Making under Uncertainty
All investment asset allocation methodologies start with two core assumptions. First, that a range of different scenarios could occur in the future. Second, that investment alternatives are available whose performance will vary depending on the scenario that eventually develops. A critical issue is the extent to which a decision-maker believes it is possible to accurately predict future outcomes. Traditional finance theory, which is widely used in the investment management industry, assumes that both the full range of possible outcomes and their associated probabilities are known to the decision-maker. This is the classic problem of making decisions in the face of risk.
However, when you dig a bit deeper, you find that this approach is based on some questionable assumptions. The obvious question is: how can a decision-maker know the full range of possible future outcomes and their associated probabilities? One explanation is that they understand the workings of the process that produces future outcomes. In physical systems, and even in simple social systems, this may be true. But it is unlikely to be the case when it comes to investment outcomes. Financial markets are complex adaptive systems, filled with positive feedback loops and nonlinear effects caused by the interaction of competing strategies (for example, value, momentum, and passive approaches) and underlying decisions made by people with imperfect information and limited cognitive capacities who are often pressed for time, affected by emotions, and subject to the influence of other people. An investor can never fully understand the way this system produces outcomes.
Even without such causal understanding, an investor could still believe that the range of possible future outcomes can be described mathematically, based on an analysis of past outcomes. For example, you could use historical data to construct a statistical distribution to describe the range of possible future outcomes, or devise a formula for projecting a time series into the future. The validity of both these approaches rests on two further assumptions. The first is that the historical data used to construct the distribution or time-series algorithm contain sufficient information to capture the full range of possible future outcomes. The second is that the unknown underlying process that generates the historical data will remain constant, or only change slowly over time. Over the past decade, we have seen repeated evidence that in financial markets these two assumptions are not true, for example in the meltdown of the Long Term Capital Management hedge fund in 1998, the crash of the technology stock bubble in 2001, and the worldwide financial market panic in 2008. In these cases, models based on historical data failed to identify the full range of possible outcomes, or to accurately assess the probability of the possible outcomes they identified. People will live with the consequences of these failures for years.
This is not to say that skilled forecasters do not exist, however. They certainly do. Unfortunately, it is usually easier to identify them with the benefit of hindsight (which also helps to distinguish between skill and luck) than it is to pick them in advance.
This discussion leads to an important conclusion. In the real world, asset allocators must make decisions not in the face of risk, but rather under conditions of true uncertainty, in which neither the full range of possible future outcomes nor their associated probabilities are fully known in advance. This has two critical implications. First, there is an inescapable trade-off between any forecasting model’s fidelity to historical data and its robustness to uncertainty. The more carefully a model is backtested and tightly calibrated to accurately reproduce past outcomes, the less likely it is to accurately predict the future behavior of a complex adaptive system. Second, confidence in a forecast increases only when models based on differing methodologies (for example, causal, statistical, time-series, and judgmental forecasts) reach similar conclusions, and/or when their individual forecasts are combined to reduce the impact of their individual errors. In short, decision-making under uncertainty is much harder than decision-making under risk.
Asset Allocation: A Simple Example
Let us now move on to a more concrete, yet still simple, example to illustrate some key issues that underlie the most common asset allocation methodology in use today. Our quantitative data and results are summarized in the following table:
|                                              | Asset A | Asset B |
|----------------------------------------------|---------|---------|
| Year 1 return                                | 1%      | 3%      |
| Year 2 return                                | 5%      | 7%      |
| Year 3 return                                | 9%      | 20%     |
| Year 4 return                                | 5%      | –5%     |
| Year 5 return                                | 1%      | 8%      |
| Sample arithmetic mean                       | 4.2%    | 6.6%    |
| Standard error of the mean                   | 1.5%    | 4.1%    |
| Sample geometric mean                        | 4.1%    | 6.3%    |
| Sample standard deviation                    | 3.3%    | 9.1%    |
| Covariance of A and B                        | 0.12%   |         |
| Correlation of A and B                       | 0.41    |         |
| Expected arithmetic annual portfolio return  | 5.6%    |         |
| Expected portfolio standard deviation        | 6.1%    |         |
| Expected geometric annual portfolio return   | 4.9%    |         |
Our portfolio comprises two assets, for which we have five years of historical data. In line with industry norms, we will treat each data point as an independent sample (i.e. we will assume that no momentum or mean-reversion processes are at work in our data series) drawn from a distribution which includes the full range of results that could be produced by the unknown return-generating process. As you can see, the sample mean (i.e. arithmetic average) annual return is 4.2% for Asset A and 6.6% for Asset B. So it is clear that Asset B should produce higher returns, right? Wrong. The next line of the table shows the standard error for our estimate of the mean. The standard error is equal to the sample standard deviation (which we’ll discuss below) divided by the square root of the number of data points used in the estimate (in our case, there are five). Assuming that the data come from a normal distribution (that is, one in the shape of the bell curve), there is a 68% chance that the true mean will lie within plus or minus one standard error of our sample mean, and a 95% chance that it will lie within two standard errors. In our example, the short data history, along with the relatively high standard deviation of Asset B’s returns, means that the standard errors are high relative to the sample means, and we really can’t be sure that Asset B has a higher expected return than Asset A. In fact, we’d need a lot more data to increase our confidence in this conclusion. Assuming no change in the standard deviations, the standard error of the mean declines very slowly as the length of the historical data sample increases: the square root of 5 is about 2.2; of 10, about 3.2; and of 20, about 4.5. Cutting the standard error in half (that is, doubling the accuracy of your estimate of the true mean) requires roughly a fourfold increase in the length of the data series.
Considering that 20 years is about the limit of the available data series for many asset classes, you can see how this can create problems when it comes to generating asset allocation results in which you can have a high degree of confidence.
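The standard error calculation described above can be verified with a short Python sketch, using the return data from the table (the function name `standard_error` is ours, not from any particular library):

```python
import math

# Annual returns from the table, in percent
asset_a = [1, 5, 9, 5, 1]
asset_b = [3, 7, 20, -5, 8]

def standard_error(returns):
    """Sample standard deviation divided by the square root of n."""
    n = len(returns)
    mean = sum(returns) / n
    sample_var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(sample_var) / math.sqrt(n)

print(round(standard_error(asset_a), 1))  # 1.5, matching the table
print(round(standard_error(asset_b), 1))  # 4.1
```

Note how Asset B's standard error (4.1%) is nearly two-thirds the size of its sample mean (6.6%): a one-standard-error band around that mean easily overlaps Asset A's.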
The next line in the table, the sample geometric mean, highlights another issue: as long as there is any variability in returns, the average return in a given year is not the same as the actual compound return that would be earned by an investor who held an asset for the full five years. In fact, the realized return (that is, the geometric mean) will be lower, and can quickly be approximated by subtracting half the squared standard deviation (i.e. half the variance) from the arithmetic mean. In summary, the higher the variability of returns, the larger the gap will be between the arithmetic and the geometric mean.
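A quick Python sketch makes the arithmetic-versus-geometric gap concrete for Asset B (the variable names are ours; the returns come from the table):

```python
# Asset B's annual returns from the table, in percent
asset_b = [3, 7, 20, -5, 8]
n = len(asset_b)

arith_mean = sum(asset_b) / n  # 6.6

# Exact geometric (compound) mean: grow $1 through each year, then annualize
growth = 1.0
for r in asset_b:
    growth *= 1 + r / 100
geo_mean = (growth ** (1 / n) - 1) * 100  # about 6.3, matching the table

# Rule-of-thumb approximation: arithmetic mean minus half the variance
sample_var = sum((r - arith_mean) ** 2 for r in asset_b) / (n - 1)
approx_geo = arith_mean - sample_var / 2 / 100  # variance is in percent-squared

print(round(arith_mean, 1), round(geo_mean, 1), round(approx_geo, 1))
```

The rule of thumb is only an approximation, but here it lands within roughly a tenth of a percentage point of the exact compound figure.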
The following line in the table shows the sample standard deviation of returns for Assets A and B, which measures the extent to which returns are dispersed around the sample mean. In many asset allocation analyses, the standard deviation (also known as volatility) is used as a proxy for risk. Common sense tells you that the correspondence between standard deviation and most investors’ understanding of risk is rough at best. Most investors find variability on the downside much less attractive than variability on the upside, and they like uncertainty even less than risk, which they can (or think that they can) measure. Also, when it comes to the distribution of returns, it is not just the average and standard deviation that are of interest to investors. Whether the distribution is Gaussian (normal), with the typical bell curve shape, is also important. Distributions that are slightly tilted toward positive returns (as is the case with Assets A and B) are preferable to ones that are negatively skewed. Skewness should also affect an investor’s preference for distributions with a higher percentage of extreme returns than the normal distribution (i.e. ones with high kurtosis): preference for higher kurtosis should rise as skewness becomes more positive, and fall as it becomes more negative (i.e. as the probability of large negative returns rises). In fact, in our example, Asset B has positive skewness and higher than normal kurtosis (compared to Asset A’s lower than normal kurtosis). Hence, some investors might be willing to trade off higher positive skewness and kurtosis against higher standard deviation in their assessment of the overall riskiness of Asset B. This might be particularly true when, as in the case of some hedge fund strategies, the expected returns on an investment have a distribution that is far from normal.
However, many asset allocation methodologies still do not take these trade-offs into account, because they either assume that the returns on assets are normally distributed, or they assume that investors only have preferences concerning standard deviation, and not skewness or kurtosis.
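The skewness and kurtosis claims above can be checked directly. A minimal sketch, assuming the sample-adjusted (Fisher) formulas for small samples; the function name is ours:

```python
import math

def skew_and_excess_kurtosis(returns):
    """Sample-adjusted (Fisher) skewness and excess kurtosis."""
    n = len(returns)
    mean = sum(returns) / n
    devs = [r - mean for r in returns]
    s = math.sqrt(sum(d ** 2 for d in devs) / (n - 1))  # sample std deviation
    skew = n / ((n - 1) * (n - 2)) * sum(d ** 3 for d in devs) / s ** 3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
            * sum(d ** 4 for d in devs) / s ** 4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return skew, kurt

asset_a = [1, 5, 9, 5, 1]
asset_b = [3, 7, 20, -5, 8]
# Asset A: positive skew, negative excess kurtosis (thinner tails than normal)
# Asset B: positive skew, positive excess kurtosis (fatter tails than normal)
print(skew_and_excess_kurtosis(asset_a))
print(skew_and_excess_kurtosis(asset_b))
```

With these formulas, both assets show positive skewness, while only Asset B shows above-normal kurtosis, consistent with the discussion above.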
Covariance and correlation
Covariance and correlation are two ways of measuring the relationship between the time series of returns on two or more assets. Covariance is found by multiplying each year’s return for Asset A by the corresponding return for Asset B, calculating the average result, and subtracting from this the product of the average return for Asset A and the average return for Asset B; or, more pithily, it is the average of the products less the product of the averages. Correlation standardizes the covariance by dividing it by the product of the standard deviation of Asset A’s returns and the standard deviation of Asset B’s returns. Correlation takes a value between minus one (for returns that move in exactly opposite directions) and plus one (for returns that move exactly together). In theory, a correlation close to zero implies no relationship between the two sets of returns. Unfortunately, most people forget that correlation only measures the strength of the linear relationship between variables; if the relationship is nonlinear, the correlation coefficient can be deceptively close to zero even when a strong relationship exists. Finally, covariance and correlation measure the average relationship between two return series; their relationship under extreme conditions (i.e. in the tails of the two return distributions) may differ from this average. This was another lesson taught by the events of 2008.
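The recipe above translates directly into Python. One wrinkle, flagged in the comments: reproducing the table's figures (0.12% and 0.41) appears to require the n-divisor "average of products" covariance combined with the n−1 sample standard deviations, a mix of conventions:

```python
# Returns as decimals (1% -> 0.01), from the table
a = [0.01, 0.05, 0.09, 0.05, 0.01]
b = [0.03, 0.07, 0.20, -0.05, 0.08]
n = len(a)

mean_a = sum(a) / n
mean_b = sum(b) / n

# "The average of the products less the product of the averages"
cov = sum(x * y for x, y in zip(a, b)) / n - mean_a * mean_b

def sample_sd(xs):
    """Sample (n-1) standard deviation, as used elsewhere in the table."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

corr = cov / (sample_sd(a) * sample_sd(b))

print(round(cov * 100, 2))  # 0.12 (in percent), matching the table
print(round(corr, 2))       # 0.41
```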
Forming a Portfolio
Let us now combine Asset A and Asset B into a portfolio in which the first has a 40% weight and the second has a 60% weight. The second-to-last row of our table shows the expected arithmetic portfolio return of 5.6% per year. This is simply the weighted average of each asset’s expected return. The calculation of the expected standard deviation of the portfolio is more complicated, but it highlights the mathematical logic of diversification. The portfolio standard deviation equals the square root of the portfolio variance. The latter is calculated as follows: [(Asset A weight squared multiplied by Asset A standard deviation squared) plus (Asset B weight squared multiplied by Asset B standard deviation squared) plus (two times Asset A weight multiplied by Asset B weight times the covariance of A and B)]. As you can see, the portfolio standard deviation is 6.1%, which is less than 6.8%—the weighted average of Asset A’s and Asset B’s standard deviations. The cause of this result is the relatively low covariance between A’s returns and B’s returns (or alternatively, their relatively low correlation of 0.41). The fact that their respective returns apparently move in less than perfect lockstep with each other reduces the overall expected variability of the portfolio return. However, this encouraging conclusion is subject to two critical caveats. First, it assumes the absence of a nonlinear relationship between A’s returns and B’s returns that has not been picked up by the correlation estimate. Second, it assumes that the underlying factors giving rise to the correlation of 0.41 will remain unchanged in the future. In practice, however, this is not the case, and correlations tend to be unstable over time. For example, in 2008, investors discovered that despite relatively low estimated correlations between their historical returns, many asset classes shared a nonlinear exposure to a market liquidity risk factor. 
When liquidity fell sharply, correlations rose rapidly and undermined many of the expected benefits from portfolio diversification.
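The portfolio arithmetic in this section can be sketched as follows, plugging in the table's rounded figures (variable names are ours):

```python
w_a, w_b = 0.40, 0.60          # portfolio weights
sd_a, sd_b = 0.033, 0.091      # sample standard deviations from the table
cov_ab = 0.0012                # covariance of A and B from the table

# Expected arithmetic portfolio return: weighted average of the asset means
mean_a, mean_b = 0.042, 0.066
port_mean = w_a * mean_a + w_b * mean_b  # about 0.056, i.e. 5.6%

# Portfolio variance: wA^2*sdA^2 + wB^2*sdB^2 + 2*wA*wB*cov(A,B)
port_var = (w_a ** 2 * sd_a ** 2
            + w_b ** 2 * sd_b ** 2
            + 2 * w_a * w_b * cov_ab)
port_sd = port_var ** 0.5      # about 0.061, i.e. 6.1%

# For comparison: the naive weighted average of the standard deviations
weighted_avg_sd = w_a * sd_a + w_b * sd_b  # about 0.068, i.e. 6.8%
print(round(port_sd, 3), round(weighted_avg_sd, 3))
```

The gap between 6.1% and 6.8% is the diversification benefit; it exists only because the covariance term is smaller than it would be for perfectly correlated assets.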
Expected Portfolio Returns
The last line in our table is an estimate of the geometric or compound average rate of return that an investor might be expected to actually realize on this portfolio over a multiyear period, assuming that we have accurately estimated the underlying means, standard deviations, and correlations and that they remain stable over time (all questionable assumptions, as we have noted). As you can see, it is less than the expected arithmetic annual return. Unfortunately, too many asset allocation analyses make the mistake of assuming that the arithmetic average return will be earned over time, rather than the geometric return. In the example we have used, for an initial investment of $1,000,000 and a 20-year holding period, this difference in returns results in terminal wealth that is lower by $370,358, or 12.5%, than the use of the arithmetic average would have led us to expect. This is not a trivial difference.
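The terminal wealth comparison is easy to verify. A quick Python check, using the table's rounded portfolio returns:

```python
initial = 1_000_000
years = 20
arith, geo = 0.056, 0.049  # expected arithmetic and geometric portfolio returns

wealth_if_arith = initial * (1 + arith) ** years  # what the naive projection promises
wealth_realized = initial * (1 + geo) ** years    # what compounding actually delivers

shortfall = wealth_if_arith - wealth_realized
print(round(shortfall))  # close to the $370,358 difference cited in the text
print(round(100 * shortfall / wealth_if_arith, 1))  # roughly 12.5 (percent)
```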