The need to make trade-offs between the effort exerted on specific activities is felt universally by individuals, organizations, and nations. In many cases, activities are mutally-exclusive so partaking in one option excludes participation in another. Deciding how to make these trade-offs can be immensely difficult, especially when we lack quantitative data about risk and reward to support our decision.

In the case of financial assets, there is a wealth of numerical data available. So in this post, I explore how historical data can be leveraged to choose specific mixes of assets based on investment goals. The tools used are quite simple and rely on the mean-variance of assets’ returns to find the efficient frontier of a portfolio.

*Note: The code and data used to generate the plots in this post are available here.*

Most people have probably interacted with Google Finance or Yahoo! Finance to access financial information about stocks. The Python library Pandas provides an exceedingly simple interface for pulling stock quotes from either of these sources:

This integration makes pulling historical pice data very easy, which is great since we want to estimate portfolio performance using past pricing data.

Investopedia defines “Portfolio theory” as:

Modern Portfolio Theory - MPT A theory on how risk-averse investors can construct portfolios to optimize or maximize expected return based on a given level of market risk, emphasizing that risk is an inherent part of higher reward. “Modern Portfolio Theory - MPT.”

Investopedia.com.Investopedia, 2015. Web. 5 December 2015.

The idea is the fundamental concept of diversification and essentially boils down to the idea that pooling risky assets together, such that the combined expected return is aligned with an investor’s target, will provide a lower risk-level than carrying just one or two assets by themselves. The technique is even more effective when the assets are not well correlated with one another.

Portfolio theory’s efficacy can be seen by examining the underlying statistical thoughts behind the concept. For a single asset, the expected mean-variance of returns ‘r’ over ‘n’ observations is given by the standard formulas below.

\[\begin{align*} & Mean(Returns) = \frac{1}{n} \sum_{i=1}^{n}r_i \\ & Var(Returns) = \sum_{i=1}^{n}(r_i-\mu)^2 \\ \end{align*}\]Expanding this to the mean-variance of a portfolio of assets that are potentially correlated to one another, we have the weighted average of the mean returns, and the sum of the terms in the covariance matrix for the assets:

\[\begin{align*} & Mean(Portfolio\ Returns) = \sum_{i=1}^{n}w_i r_i \\ & Var(Portfolio\ Returns) = \sum_{i=1}^{n}\sum_{j=1}^{n}w_i w_j \rho_{i,j} \sigma_i \sigma_j = \mathbf{w}^T \bullet \big(\mathbf{cov} \bullet \mathbf{w}\big) \\ \end{align*}\]The Python package NumPy provides us with all of the functions necessary to calculate the mean-variance of returns for individual assets and portfolios, including vector math operations to keep the code clean.

This is great, if we have a set of assets and their weights within a portfolio, abut 20 lines of Python will retrieve historical quotes, calculate the mean-variance of the assets, and return the expected portfolio performance. But what if we didn’t know which assets to purchase and in what proportion to one another? Surely there are multiple ways to achieve a target return and perhaps those different asset blends will yield varying amounts of risk while providing the same return.

One approach to optimizing a portfolio is application of the Monte Carlo Method. For unfamiliar readers, this is the idea of carrying out repeated trials using randomly generated inputs and observing the outcomes. A physical example of this would be flipping a coin 100 times and counting the number of heads and tails. Based on the results, the observer could estimate whether the coin is fair or not. In the digital world, computers can rapid generate random numbers extremely quickly, enabling observation of outcomes from complex scenarios that are based on the probabilities of certain events occurring. The code below illustrates how simple it is to implement a Monte Carlo simulation using Python:

In the plot below, 25,000 portfolios with randomly varying weights of the following assets were generated and evaluated. Their expected annual return is then plotted versus the historical volatility of the portfolio. Further, each point representing a portfolio has been shaded according to the “Sharpe Ratio.”

From Figure 1 it is obvious that changing the weight of each asset in the portfolio will have a dramatic effect on the expected return and the level of risk that the investor is exposed to. Most notably, if the investor is targeting a 12% return it is evident that the volatility could be reduced to 11.5% but some portfolios share that same expected return with as high as 17.5% volatility. Figure 1 makes it very clear that we must be thoughtful when choosing how much weight each asset in our portfolio should carry.

Based on the insights from Figure 1, it is evident that a target return can be achieved with a wide range of risk levels. This introduces the concept of the “Efficient Frontier.” The Efficient Frontier is the set of portfolios that achieve a given return with the minimum amount of risk for that return. In Figure 1, these were the portfolios furthest to the left for each expected return.

Keeping this concept in mind, a more structured approach can be applied to the selection of asset weights such that we consider only these efficient portfolios which meet a criteria important to the investor. First, she could optimize the weights to target a metric such as the Sharpe Ratio. Alternatively, the investor could opt to find the minimum volatility portfolio and accept the return that that provides. The code below consists of several helper functions that make use of SciPy’s optimization library to solve these two problems.

Figure 2 shows results from these optimizations, the portfolios with the highest Sharpe Ratio and lowest volatility are denoted by the red and yellow stars respectively.

This structured approach can be taken a step further to find the Efficient Frontier across a range of desired target returns. The set of portfolios that provide a target return with minimum volatility is found by iteratively applying the SciPy optimizer as shown in the code below. Figure 3 shows the results for this optimization.

The framework developed up to the point can readily be applied to find the optimum portfolio when an investor is faced with choosing from many assets. Figure 4 below shows the results for a portfolio consisting of Apple, Amazon, BMW, Coke, Ford, Google, Microsoft, Pepsi, and Toyota. Figure 5 shows the weights of each asset for the Sharpe Ratio maximizing and minimum volatility portfolios.