Graph algorithms and currency arbitrage, part 2

In the previous post (which should definitely be read first!) we explored how graphs can be used to represent a currency market, and how we might use shortest-path algorithms to discover arbitrage opportunities. Today, we will apply this to real-world data. It should be noted that we are not attempting to build a functional arbitrage bot, but rather to explore how graphs could potentially be used to tackle the problem. Later on we’ll discuss why this is unlikely to result in actionable arbitrage.

Rather than using fiat currencies as presented in the previous post, we will examine a market of cryptocurrencies because it is much easier to acquire crypto order book data. We’ll narrow down the problem further by making two more simplifications. Firstly, we will focus on arbitrage within a single exchange. That is, we’ll look to see if there are pathways between different coins on an exchange which leave us with more of a coin than we started with. Secondly, we will only be considering a single snapshot of data from the exchange. Obviously markets are highly dynamic, with thousands of new bids and asks coming in each second. A proper arbitrage system needs to constantly be scanning for opportunities, but I feel that this is only tangentially related to the core goal of this post.

With all this in mind, the overall implementation strategy was as follows:

  1. For a given exchange, acquire the list of pairs that will form the vertices.
  2. For each of these pairs, download a snapshot of the bid/ask.
  3. Process these values accordingly, assigning them to directed edges on the graph.
  4. Using Bellman-Ford, find and return negative-weight cycles if they exist.
  5. Calculate the arbitrage that these negative-weight cycles correspond to.

The full code for this project can be found in this GitHub repo. If you find this post interesting, don’t forget to leave a star!

Star

Raw data

For the raw data, I decided to use the CryptoCompare API which has a load of free data compiled across multiple exchanges. To get started, you’ll need to register to get a free API key.

As mentioned previously, we will only look at data from Binance. I chose Binance not because it has a large selection of altcoins, but because most altcoins can trade directly with multiple pairs (e.g BTC, ETH, USDT, BNB). Some exchanges have many altcoins but you can only buy them with BTC – this is not well suited for arbitrage.

Firstly, we need to find out which pairs Binance offers. This is done with a simple call (AUTH is your API key string):

import requests
import json 

def top_exchange_pairs():
    url = (
        "https://min-api.cryptocompare.com/data/v3/all/" + 
        "exchanges?topTier=true&api_key=" + AUTH
    )
    r = requests.get(url)
    with open("pairs_list.json", "w") as f:
        json.dump(r.json(), f)

This is an excerpt from the resulting JSON file – for each exchange, the pairs field lists all other coins that the key coin can be traded with:

"Data":{  
   "Binance":{  
      "isActive":true,
      "isTopTier":true,
      "pairs":{  
         "ETH":["PAX", "TUSD", "USDT", "USDC", "BTC"],
         "ONGAS":["BTC", "BNB", "USDT"],
         "PHX":["ETH","BNB","BTC"]
      }
   },
   "Coinbase":{  
      "isActive":true,
      "isTopTier":true,
      "pairs":{  
         "ETH":["DAI", "USD", "USDC", "EUR", "GBP", "BTC"],
         "BCH":["BTC", "GBP", "EUR", "USD"]
      }
   }
}

I then filtered out coins with fewer than three tradable pairs. These coins are unlikely to participate in arbitrage – we would rather have a graph that is more connected.

def binance_connected_pairs():
    with open("pairs_list.json", "r") as f:
        data = json.load(f)
    pairs = data["Data"]["Binance"]["pairs"]
    return {k: v for k, v in pairs.items() if len(v) > 3}

We are now ready to download a snapshot of the available exchange rates for each of these coins.

import os
import tqdm  # progress bar

def download_snapshot(pair_dict, outfolder):
    if not os.path.exists(outfolder):
        os.makedirs(outfolder)

    # Download data and write to files
    for p1, p2s in tqdm(pair_dict.items()):
        url = (
            "https://min-api.cryptocompare.com/data/" 
            + f"ob/l1/top?fsyms={p1}&tsyms={','.join(p2s)}"
            + "&e=Binance&api_key=" + AUTH
        )
        r = requests.get(url)
        with open(f"{outfolder}/{p1}_pairs_snapshot.json", "w") as f:
            json.dump(r.json(), f)

We can then run all of the above functions to produce a directory full of the exchange rate data for the listed pairs.

top_exchange_pairs()
connected = binance_connected_pairs()
download_snapshot(connected, "binance_data")
"EOS": {
    "BNB": {
        "BID": ".2073",
        "ASK": ".2077"
    },
    "BTC": {
        "BID": ".0007632",
        "ASK": ".0007633"
    },
    "ETH": {
        "BID": ".02594",
        "ASK": ".025964"
    },
    "USDT": {
        "BID": "7.0441",
        "ASK": "7.046"
    },
    "PAX": {
        "BID": "7.0535",
        "ASK": "7.07"
    }
}

This excerpt reveals something that we glossed over completely in the previous post. As anyone who has tried to exchange currency on holiday will know, there are actually two exchange rates for a given currency pair depending on whether you are buying or selling the currency. In trading, these two prices are called the bid (the current highest price someone will buy for) and the ask (the current lowest price someone will sell for). As it happens, this is very easy to deal with in the context of graphs.

Preparing the data

Having downloaded the raw data, we must now prepare it so that it can be put into a graph. This effectively means parsing it from the raw JSON and putting it into a pandas dataframe. We will arrange it in the dataframe such that it constitutes an adjacency matrix:

  • Column ETH row BTC is the bid:
    • i.e someone will pay x BTC to buy my 1 ETH
    • this is then the weight of the ETH $\to$ BTC edge.
  • Column BTC row ETH is the ask:
    • i.e I have to pay y BTC to buy someone’s 1 ETH
    • the reciprocal of this is the weight of the BTC $\to$ ETH edge.

I chose this particular row-column scheme because it results in intuitive indexing: df.X.Y is the amount of Y gained by selling 1 unit of X, and df.A.B * df.B.C * df.C.D is the total amount of D gained by trading 1 unit of A when trading via $A \to B \to C \to D$.

The column headers will be the same as the row headers, consisting of all the coins we are considering. The function that creates the adjacency matrix is shown here:

def create_adj_matrix(pair_dict, folder, outfile="snapshot.csv"):
    # Union of 'from' and 'to' pairs
    flatten = lambda l: [item for sublist in l for item in sublist]
    keys, vals = pair_dict.items()
    all_pairs = list(set(keys).union(flatten(values)))

    # Create empty df
    df = pd.DataFrame(columns=all_pairs, index=all_pairs)

    for p1 in pair_dict.keys():
        with open(f"{folder}/{p1}_pairs_snapshot.json", "r") as f:
            res = json.load(f)
        quotes = res["Data"]["RAW"][p1]
        for p2 in quotes:
            try:
                df[p1][p2] = float(quotes[p2]["BID"])
                df[p2][p1] = 1 / float(quotes[p2]["ASK"])
            except KeyError:
                print(f"Error for {p1}/{p2}")
                continue
    df.to_csv(outfile)

Putting the data into a graph

We will be using the NetworkX package, an intuitive yet extremely well documented library for dealing with all things graph-related in python.

In particular, we will be using nx.DiGraph, which is just a (weighted) directed graph. I was initially concerned that it’d be difficult to get the data in: python libraries often adopt their own weird conventions and you have to modify your data so that is in the correct format. This was not really the case with NetworkX, it turns out that we already did most of the hard work when we put the data into our pandas adjacency matrix.

Firstly, we take negative logs as discussed in the previous post. Secondly, in our dataframe we currently have NaN whenever there is no edge between two vertices. To make a valid nx.DiGraph, we need to set these to zero. Lastly, we transpose the dataframe because NetworkX uses a different row/column convention. We then pass this processed dataframe into the nx.Digraph constructor. Summarised in one line:

g = nx.DiGraph(-np.log(df).fillna(0).T)

Bellman-Ford

To implement Bellman-Ford, we make use of the funky defaultdict data structure. As the name suggests, it works exactly like a python dict, except that if you query a key that is not present you get a certain default value back. The first part of our implementation is quite standard, as we are just doing the $n - 1$ edge-relaxations where n is the number of vertices.

But because the ‘classic’ Bellman-Ford does not actually return negative-weight cycles, the second part of our implementation is a bit more complicated. The key idea is that if after $n-1$ relaxations, there is an edge that can be relaxed further then that edge must be on a negative weight cycle. So to find this cycle we walk back along the predecessors until a cycle is detected, then return the cyclic portion of that walk. In order to prevent subsequent redundancy, we mark these vertices as ‘seen’ via another defaultdict. This procedure adds a linear cost to Bellman-Ford since we have to iterate over all the edges, but the asymptotic complexity overall remains $O(VE)$.

def bellman_ford_return_cycle(g, s):
    n = len(g.nodes())
    d = defaultdict(lambda: math.inf)  # distances dict
    p = defaultdict(lambda: -1)  # predecessor dict
    d[s] = 0

    for _ in range(n - 1):
        for u, v in g.edges():
            # Bellman-Ford relaxation
            weight = g[u][v]["weight"]
            if d[u] + weight < d[v]:
                d[v] = d[u] + weight
                p[v] = u  # update pred

        # Find cycles if they exist
        all_cycles = []
        seen = defaultdict(lambda: False)

        for u, v in g.edges():
            if seen[v]:
                continue
            # If we can relax further there must be a neg-weight cycle
            weight = g[u][v]["weight"]
            if d[u] + weight < d[v]:
                cycle = []
                x = v
                while True:
                    # Walk back along preds until a cycle is found
                    seen[x] = True
                    cycle.append(x)
                    x = p[x]
                    if x == v or x in cycle:
                        break
                # Slice to get the cyclic portion
                idx = cycle.index(x)
                cycle.append(x)
                all_cycles.append(cycle[idx:][::-1])
        return all_cycles

As a reminder, this function returns all negative-weight cycles reachable from a given source vertex (returning the empty list if there are none). To find all negative-weight cycles, we can simply call the above procedure on every vertex then eliminate duplicates.

def all_negative_cycles(g):
    all_paths = []
    for v in g.nodes():
        all_paths.append(bellman_ford_negative_cycles(g, v))
    flattened = [item for sublist in all_paths for item in sublist]
    return [list(i) for i in set(tuple(j) for j in flattened)]

Tying it all together

The last thing we need is a function that calculates the value of an arbitrage opportunity given a negative-weight cycle on a graph. This is easy to implement: we just find the total weight along the path then exponentiate the negative total (because our weights are the negative log of the exchange rates).

def calculate_arb(cycle, g, verbose=True):
    total = 0
    for (p1, p2) in zip(cycle, cycle[1:]):
        total += g[p1][p2]["weight"]
    arb = np.exp(-total) - 1
    if verbose:
        print("Path:", cycle)
        print(f"{arb*100:.2g}%\n")
    return arb


def find_arbitrage(filename="snapshot.csv"):
    df = pd.read_csv(filename, header=0, index_col=0)
    g = nx.DiGraph(-np.log(df).fillna(0).T)

    if nx.negative_edge_cycle(g):
        print("ARBITRAGE FOUND\n" + "=" * 15 + "\n")
        for p in all_negative_cycles(g):
            calculate_arb(p, g)
    else:
        print("No arbitrage opportunities")

Running this function gives the following output:

ARBITRAGE FOUND
===============

Path: ['USDT', 'BAT', 'BTC', 'BNB', 'ZEC', 'USDT']
0.087%

Path: ['BTC', 'XRP', 'USDT', 'BAT', 'BTC']
0.05%

Path: ['BTC', 'BNB', 'ZEC', 'USDT', 'BAT', 'BTC']
0.087%

Path: ['BNB', 'ZEC', 'USDT', 'BAT', 'BTC', 'BNB']
0.087%

Path: ['USDT', 'BAT', 'BTC', 'XRP', 'USDT']
0.05%

0.09% is not exactly a huge money, but it is still risk-free profit, right?

Why wouldn’t this work?

Notice that we haven’t mentioned exchange fees at any point. In fact, Binance charges a standard 0.1% commission on every trade. It is easy to modify our code to incorporate this, we just multiply each rate by 0.999, but we don’t need to compute anything to see that we would certainly be losing much more money than gained from the arbitrage.

Secondly, it is likely that this whole analysis is flawed because of the way the data was collected. The function download_snapshot makes a request for each coin in sequence, taking a few seconds in total. But in these few seconds, prices may move – so really the above “arbitrage” may just be a result of our algorithm selecting some of the price movements. This could be fixed by using timestamps provided by the exchange to ensure that we are looking at the order book for each pair at the exact same moment in time.

Thirdly, we have assumed that you can trade an infinite quantity of the bid and ask. An order consists of a price and a quantity, so we will only be able to fill a limited quantity at the ask price. Thus in practice we would have to look at the top few levels of the order book and consider how much of it we’d eat into.

It is not difficult to extend our methodology to arb between different exchanges. We would just need to aggregate the top of the order book from each exchange, then put the best bid/ask onto the respective edges. Of course, to do run this strategy live would require us to manage our inventory not just on a currency level but per currency per exchange, and factors like the congestion of the bitcoin network would come into play.

Lastly, this analysis has only been for a single snapshot. A proper arbitrage bot would have to constantly look for opportunities simultaneously across multiple order books. I think this could be done by having a websocket stream which keeps the graph updated with the latest quotes, and using a more advanced method for finding negative-weight cycles that does not involve recomputing the shortest paths via Bellman-Ford.

Conclusion

All this begs the question: why is it so hard to find arbitrage? The simple answer is that other people are doing it smarter, better, and (more importantly) faster. With highly optimised algorithms (probably implemented in C++), ‘virtual colocation’ of servers, and proper networking software/hardware, professional market makers are able to exploit these simple arbitrage opportunities extremely rapidly.

In any case, the point of this post was not to develop a functional arbitrage bot but rather to demonstrate the power of graph algorithms in a non-standard use case. Hope you found it as interesting as I did!


Graph algorithms and currency arbitrage, part 1

Arbitrage is the holy grail for traders and the bedrock of financial academia. Let’s say you are in an open marketplace with Alice selling oranges for \$1 each and Bob buying them for \$2. As a cunning trader, you realise you can buy an orange from Alice and immediately sell it to Bob for \$1 of “risk-free” profit. However, as you keep reaping this \$1 profit by buying up Alice’s oranges, she raises her prices. Eventually, Alice would be charging \$2 per orange and the arbitrage would be shut. In other words, acting on arbitrage opportunities causes the market to become more efficient, with everything as close to its fair price as possible. Because of this, the only way to profit from true arbitrage is to find and execute opportunities faster than everybody else.

Currency exchange is a natural place to search for arbitrage because there are many different pairs that we can trade. For example, assume we are able to convert GBP to USD, then USD to EUR, then EUR to AUD for an effective rate of 1:2 GBP:AUD. If we were then able to directly swap AUD back to GBP in the ratio 2:1.1, this would be an arbitrage opportunity because we end up with more GBP than we started with while taking zero risk.

Our goal is to develop a systematic method for detecting arbitrage opportunities by framing the problem in the language of graphs. This is a reasonably large topic, so we’ll split it into two parts. Part 1 (this post) will lay the theoretical groundwork, introducing graph algorithms and giving an overview of their application to currency arbitrage. In Part 2 we will present an actual implementation of these ideas in python, applied to cryptocurrencies.

Graphs

Formally, we define a graph as a set of vertices (also called nodes) and edges. These edges can be directed (i.e have little arrows on them) and may have weights. It is completely up to us to decide what the vertices, edges, and weights represent. For example, if we are trying to model a road network we might say that the vertices are cities (or junctions), the edges are the roads themselves, and the weights are the length of the roads. The diagram below shows how a weighted directed graph is typically depicted.

It is useful to formulate real-world problems in terms of graphs because we have about three centuries worth of relevant theory – they were first investigated by Euler in 1736. In particular, there exist many efficient algorithms related to finding the shortest path along a graph, which have widespread applications e.g in mapping.

The Bellman-Ford algorithm finds the minimum weight path from a single source vertex to all other vertices on a weighted directed graph. For the purposes of this post, we don’t need to know anything other than the fact that once the algorithm terminates, each vertex is labelled with the total weight of the minimum weight path from the source to that vertex.

Actually, the above statement has an incredibly important caveat. What happens if we try to find the minimum weight path from vertices C to D in the above graph? The obvious guess would be that it is the direct path along the CD edge, which has total weight 2.1. However, notice that we can in fact reduce this further by following the path CDECD, which would have a total weight of -1. But why stop there? If we go around the loop again, we can reduce it stil further. In fact, each time we traverse the loop, our minimum total weight will reduce by 1. Hence this is called a negative-weight cycle, the existence of which means that the shortest path between C and D is not defined.

This is actually the main reason why we are choosing to use Bellman-Ford over another shortest path algorithm like Dijkstra, despite the latter being asymptotically more efficient. Bellman-Ford is able to detect if there is a negative-weight cycle and as will be seen shortly, this will be the key to detecting arbitrage opportunities.

Graphs for currency arbitrage

How can graphs be used to model currency markets? At a very high level, we will assign currencies to different vertices, and let the edge weight represent the exchange rate. A simple example is presented below: in this market, we can convert 1 GBP to 1.27 USD, 1 USD to 1.43 AUD, etc.

To find the exchange rate between two currencies that do not share an edge, we simply find a path between the two currency vertices and walk along it, multiplying by each successive edge weight (exchange rate). If there are multiple possible paths, choose the path that has the greatest weight product, which corresponds to the most favourable rate.

What would happen if we start with 1 GBP and convert it along GBP $\to$ USD $\to$ AUD $\to$ GBP? Well, multiplying out the edges we see that our initial 1 GBP becomes $1.27 \times 1.43 \times 0.55 \approx 0.999$ GBP. That is to say, we lose a tenth of a penny on the round trip. However, if this number were greater than 1, it would mean that by following this path we end up with more GBP than we started with – this is textbook arbitrage.

So the problem of finding arbitrage can thus be formally posed as follows: find a cycle in the graph such that multiplying all the edge weights along that cycle results in a value greater than 1. In fact we have already described an algorithm that can find such a path – the problem is equivalent to finding a negative-weight cycle, provided we do some preprocessing on the edges.

Firstly, we note that Bellman-Ford computes the path weight by adding the individual edge weights. To make this work for exchange rates, which are multiplicative, an elegant fix is to first take the logs of all the edge weights. Thus when we sum edge weights along a path we are actually multiplying exchange rates – we can recover the multiplied quantity by exponentiating the sum. Secondly, Bellman-Ford attempts to find minimum weight paths and negative edge cycles, whereas our arbitrage problem is about maximising the amount of currency received. Thus as a simple modification, we must also make our log weights negative.

With these two tricks in hand, we are able to apply Bellman-Ford. The minimal weight between two currency vertices corresponds to the optimal exchange rate, the value of which can be found by by exponentiating the negative sum of weights along the path. As a corollary, we have the wonderful insight that: a negative-weight cycle on the negative-log graph corresponds to an arbitrage opportunity.

Conclusion

We have seen that graphs provide an elegant framework for thinking about currency exchange. Concretely, by representing different currencies as vertices and using the negative log of the exchange rate as the edge weight we can apply shortest-path algorithms to find negative-weight cycles if they exist, which correspond to arbitrage opportunities. In the next post, we code up the methodology presented in this post and apply it to some real-world data. Should be fun!


Portfolio optimisation: lessons learnt

Over the past few months I have been busy doing a mixture of blockchain consulting and quantitative finance work for a couple of companies in South East Asia. In particular, I have had the opportunity to investigate the interesting problem of portfolio management for cryptoassets – it was not my first experience with portfolio optimisation, having implemented efficient frontier portfolios at a roboadvisor startup, but this time I took the opportunity to do a deep dive into the subject.

Though I am not an expert, I do believe I am at the stage where I can do a kind of write-up, recounting some of the things I’ve learnt along the way. A lot of my experience has been baked into PyPortfolioOpt, a comprehensive portfolio optimisation package which I recently revamped, but this post will act as an informal guidebook for those interested in pursing the subject.

1. Be careful with in-sample vs out-of- testing

For most people, ‘portfolio optimisation’ basically refers to mean-variance optimisation (MVO) in the style of Markowitz. MVO will tell you the mathematically optimal portfolio allocation, which sounds great, but the caveat is that it requires the (future) expected returns and the covariance matrix as inputs. For those of us who aren’t clairvoyant, these quantities are inaccessible, so we must make do with estimates. The quality of these estimates will be discussed in the next section, but for now I would like to highlight a common mistake, which is backtesting with a forward-looking bias.

The standard way (in textbooks or pedagogical materials) to estimate the expected returns is to take the mean annual return over the past few years. However, it is surprisingly easy to accidentally let future data worm its way into the estimate, resulting in vastly overinflated performance. One of the early mistakes I made was to optimise an equity portfolio using data from 2006-2010, then testing it from 2006-2015. The problem here is the overlap – the 2006 test has had access to data up to 2010, so performance in the 2006-2010 portion of the backtest will be very good.

To be fair, this is often quite an obvious mistake to diagnose, because your portfolios will outperform the benchmark by a ridiculous margin and you will realise that something is wrong. But there are also more subtle ways that you can include future data (e.g survivorship bias), in which case it may not be so clear.

2. Forget about expected returns

Despite their strong theoretical guarantees, efficient frontier portfolios often have poor real-life performance owing to estimation errors in the inputs.

Anyone who has ever tried to pick stocks will know how absurd it is to expect that a stock’s mean return over the past few years will be a good indicator of its future returns. Such simple relationships have almost certainly been arbitraged away – I therefore contend that mean historical returns are almost pure noise. The problem with this is that MVO has no way of distinguishing noise and signal, so often what happens is that the optimiser highlights the noise in the input. This is why there is a ‘running joke’ in the portfolio optimisation literature that a mean-variance optimiser is really just an “error maximiser”1.

In practice, I have found that standard MVO can perform at least in line with the benchmark for equities, but for other asset classes (particularly those which are primarily driven by speculation, like cryptocurrency), the mean historical return is a useless estimator of future return. However, in the next section we will see a simple way round this.

3. You can do better than 1/N

A significant body of research suggests that, in light of the aforementioned failures of MVO, we should instead use 1/N portfolios2 – it is noted that they often beat efficient frontier optimisation significantly in out-of-sample testing.

However, an interesting paper by Kritzman et al. (2010) finds that it is not the case that there is anything special about 1/N diversification: it is just that the expected returns are an incredibly poor estimator and any optimisation scheme which relies on them will likely go astray.

The easiest way to avoid this problem is to not provide the expected returns to the optimiser, and just optimise on the sample covariance matrix instead. Effectively we are saying that although previous returns won’t predict future returns, previous risks might predict future risks. This is intuitively a lot more reasonable – the sample covariance matrix really seems like it should contain a lot of information. Empirical results support this, showing that minimum variance portfolios outperform both standard MVO and 1/N diversification.

In my own work I have found that the standard minimum variance portfolio is a very good starting point, from which you can try a lot of new things:

  • Shrinkage estimators on the covariance matrix
  • Exponential weighting
  • Different historical windows
  • Additional cost terms in the objective function (e.g small-weights penalty)

4. Don’t ignore rebalance costs

When it comes to investment strategy or algorithmic trading, most people leave the “realities” like commission, slippage, and latency to the last stage. However, because of the general high turnover nature of efficient frontier portfolios, this could be quite a dangerous mistake.

In the context of portfolio management, the two main realities that must be accounted for in all backtests are commission and slippage. Latency is not very important because of the longer time horizons involved. Commissions are not hard to analyse: just multiply your broker’s percentage commission by the turnover. Slippage, on the other, hand is a little bit more subtle (though it is only really an issue in less liquid markets). There are a number of variables that affect slippage, but the important ones are transaction volume and urgency (refer to models such as that of Almgren for more).

Specifically, in the case of cryptoasset portfolios, one major problem is the difference in liquidity between bitcoin and a small altcoin like OmiseGo. A portfolio that shifts positions too much among the altcoins may look like it performs well in backtests, but when traded in real life the 0.5% slippage will be a killer.

Once all transaction costs have been included, you may find that it can be difficult to outperform a buy-and-hold market-cap weighted benchmark. Incidentally, this is another failure with 1/N portfolios: for reasonably volatile assets, there is high turnover at each rebalance and thus a large transaction cost.

5. Overfitting

One thing that I respect a lot about machine learning education is that they are fundamentally honest about one of the major issues in the subject: overfitting. Essentially, it is possible to encourage most classifiers to wiggle their decision boundary to accommodate all of the training examples at the cost of generalisation ability. This is done by adjusting the hyperparameters (which do things like specify learning rates) and seeing which ones perform better. Hyperparameter tuning is not a bad thing, but it must be done responsibly lest we simply choose the best parameters that fit the particular train/test setup.

When it comes to standard portfolio optimisation, one doesn’t really have to worry about this because there aren’t any hyperparameters. However, once you start adding different terms to the cost function (each scaled by some parameter), things get a little bit more complicated. For example, let us consider the simple example of minimum variance optimsiation with an L2 regularisation term.

The performance of this portfolio varies greatly depending on the choice of $\gamma$. A natural instinct is to run the optimisation for 10 different values of $\gamma$, but if this is not done carefully, we are probably fitting to the noise within the train and test sets.

Remember that ceteris paribus, a simpler model should be preferred. Black-Litterman, or optimisation with an exotic objective with multiple parameters may be seem to improve performacne on the backtest, but the proof will always be in the out-of-sample pudding. Approach portfolio optimisation like you would any financial machine learning task: with a healthy dose of skepticism.

Conclusion

I still have a lot to learn about portfolio optimisation, particularly with respect to optimisation of higher moments (skew and kurtosis) and things like copula, but the lessons highlighted in this post definitely still apply. If you found this interesting, check out:

PyPortfolioOpt   Star

References

  1. Michaud, R. (1989). “The Markowitz Optimization Enigma: Is Optimization Optimal?” Financial Analysts Journal 45(1), 31–42. 

  2. Demiguel, Victor & Garlappi, Lorenzo & Uppal, Raman. (2009). Optimal Versus Naive Diversification: How Inefficient is the 1/N Portfolio Strategy?. Review of Financial Studies. 22. 10.1093/rfs/hhm075.