Quantitative Investing

Day 1: Benchmarks

Yesterday we set out our plan to backtest a strategy using the SPY ETF, which tracks the S&P 500. Before we commence, we need to establish a baseline. What metrics will we use to assess the strategy? How will we define success? What benchmarks will we use? Typically, for a single-asset strategy the comparison is buy-and-hold performance. That is, if you’re using Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility to generate buy and sell signals, you’ll usually compare the performance of that strategy to one in which you bought the underlying on the first day of the test and held it until the end.
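
As a concrete sketch, the buy-and-hold benchmark boils down to compounding daily returns. The snippet below assumes the daily closing prices already sit in a pandas Series called spy_prices; the data source, variable names, and the simple (no risk-free rate) Sharpe calculation are our own illustrative choices.

```python
# Minimal buy-and-hold benchmark, assuming `spy_prices` is a pandas Series
# of daily SPY closes indexed by date (data source left to the reader).
import pandas as pd

def buy_and_hold_stats(prices: pd.Series, periods_per_year: int = 252) -> dict:
    rets = prices.pct_change().dropna()
    cumulative = (1 + rets).cumprod()          # growth of $1 invested on day one
    total = cumulative.iloc[-1] - 1
    ann_ret = (1 + total) ** (periods_per_year / len(rets)) - 1
    ann_vol = rets.std() * periods_per_year ** 0.5
    return {"total_return": total,
            "annualized_return": ann_ret,
            "annualized_vol": ann_vol,
            "sharpe": ann_ret / ann_vol}       # ignores the risk-free rate

# Example: stats = buy_and_hold_stats(spy_prices)
```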

Day 0: So it begins

Over the next 30 days or so, we’ll be conducting a test of the emergency backtesting system. The test will attempt to go through all the steps one might usually follow to analyze, build, test, and then deploy an investment strategy. Now this probably won’t be a strategy that knocks the cover off the ball. It may not even be profitable. But our thought process is this: how often does one see the entire backtesting process presented step-by-step in a reproducible way?

Closing the loop

Summer has a way of getting away from you. That is as relevant for blog writing as it is for life. Nonetheless, before summer ends we wanted to dust off our series on regime prediction and close the loop on the remaining techniques we had yet to investigate. That is, in our last post we introduced a relatively simple rolling method to retrain the model on more recent (and perhaps more relevant) data.
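
For reference, a rolling retrain can be as simple as the loop below. The window lengths, the scikit-learn-style model object, and the X/y arrays are placeholders, not the exact setup from the earlier post.

```python
# Hypothetical rolling retraining loop: refit on the most recent `train_window`
# observations, predict the next `test_window`, then roll forward.
import numpy as np

def rolling_predict(model, X, y, train_window=252, test_window=21):
    preds = np.full(len(y), np.nan)
    for start in range(train_window, len(y), test_window):
        end = min(start + test_window, len(y))
        model.fit(X[start - train_window:start], y[start - train_window:start])
        preds[start:end] = model.predict(X[start:end])
    return preds
```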

Rolling regime

Our last post finished up examining the three different methods used to predict market regimes in the Gold Miners ETF, GDX – namely, clustering, Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs). We found that GMMs performed best as a proof of concept. But there was a lot of work to do to go from backtest to viable trading strategy. In the next few posts, we’ll look at some of the ways we can improve our backtests.

Hidden miners

We conclude our discussion of market regime detection by examining Hidden Markov Models (HMMs). Recall this series was inspired by a post from PyQuant News that highlighted a longer article from the London Stock Exchange Group (LSEG). Those who took the CFA exams have probably forgotten the HMMs covered in the quant section. Whatever the case, the intuition behind them is clever: HMMs use observable data to infer non-observable data, or hidden states.
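
A minimal sketch of that idea, assuming the hmmlearn package and a numpy array of daily GDX returns called returns (both stand-ins, not our exact setup): the observed returns are fed to the model, which infers the most likely hidden regime for each day.

```python
# Sketch: infer hidden regimes from observed returns with a Gaussian HMM.
# Assumes hmmlearn is installed and `returns` is a 1-D numpy array of daily GDX returns.
import numpy as np
from hmmlearn.hmm import GaussianHMM

X = returns.reshape(-1, 1)                     # observations: one feature (daily return)
hmm = GaussianHMM(n_components=3, covariance_type="full", n_iter=200, random_state=42)
hmm.fit(X)
hidden_states = hmm.predict(X)                 # most likely regime for each day
state_probs = hmm.predict_proba(X)             # posterior probability of each regime per day
```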

Gaussian gold

Our previous post used hierarchical clustering to identify market regimes in the gold miners ETF, GDX. This was inspired by a post from PyQuant News that highlighted a longer article from the London Stock Exchange Group (LSEG). In this post, we’ll continue looking at identifying market regimes and using those predictions as signals for a simple trading strategy. As noted, the LSEG article showed three different machine learning methods to segregate regimes: clustering, Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs).
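
For flavor, here is roughly what the GMM step looks like with scikit-learn. The features array (say, returns and rolling volatility for GDX) and the three-regime choice are illustrative assumptions rather than the article's exact specification.

```python
# Sketch of regime labeling with a Gaussian Mixture Model.
# Assumes `features` is a numpy array of shape (n_days, n_features),
# e.g. daily returns and rolling volatility for GDX.
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
regimes = gmm.fit_predict(features)        # hard regime label per day
regime_probs = gmm.predict_proba(features) # soft probabilities, usable as signal weights
```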

Golden clusters

We recently saw a post from PyQuant News that piqued our interest, compelling us to dust off the old blog files and get back into the saddle. The post highlights a longer article from the London Stock Exchange Group (LSEG) on how to use different machine learning models to identify and forecast market regimes. That article uses Refinitiv, a market data service like Bloomberg, which we don’t have access to.

One-N against the world!

We’re taking a short break from neural networks to return to portfolio optimization. Our last posts in the portfolio series discussed risk-constrained optimization. Before that we examined satisficing vs. mean-variance optimization (MVO). In our last post on that topic, we simulated 1,000 60-month (5-year) return series using the 1987-1991 period for our four assets: stocks, bonds, commodities (gold), and real estate. We then iterated through the samples, using weights derived on the previous sample from the naive portfolio, the satisficing algorithm1, and the maximum Sharpe ratio portfolio to form portfolios on the next sample.
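
To make the weighting schemes concrete, here is a rough sketch of the naive (1/N) and maximum Sharpe ratio weights computed on a single sample (the satisficing algorithm is omitted). The sample array of monthly returns for the four assets is assumed, and scipy handles the optimization.

```python
# Illustrative weight calculations for the naive and max Sharpe portfolios.
# `sample` is an assumed (60, 4) array of monthly returns for stocks, bonds,
# gold, and real estate; long-only, fully invested.
import numpy as np
from scipy.optimize import minimize

def naive_weights(n_assets):
    return np.ones(n_assets) / n_assets

def max_sharpe_weights(sample):
    mu, cov = sample.mean(axis=0), np.cov(sample, rowvar=False)
    n = sample.shape[1]
    neg_sharpe = lambda w: -(w @ mu) / np.sqrt(w @ cov @ w)
    res = minimize(neg_sharpe, naive_weights(n),
                   bounds=[(0, 1)] * n,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
    return res.x

# Weights fit on the previous sample are then applied to the next sample's returns.
```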

Not so soft softmax

Our last post examined the correspondence between a logistic regression and a simple neural network using a sigmoid activation function. The downside of such models is that they only produce binary outcomes. We argued (not very forcefully) that if investing is about assessing the probability of achieving an attractive risk-adjusted return, then it makes sense to model investment decisions as probability functions. Moreover, most practitioners would probably prefer to know not only whether next month’s return is likely to be positive but also how confident they should be in that prediction.
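
That is where the softmax comes in: it converts a vector of raw scores into probabilities across several classes, say negative, flat, or positive next-month returns. A quick illustration with made-up numbers:

```python
# Numerically stable softmax: turns raw model scores into class probabilities,
# e.g. P(next month's return is negative / flat / positive).
import numpy as np

def softmax(z):
    z = z - z.max()                  # shift for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

logits = np.array([0.2, -1.0, 1.1])  # hypothetical model outputs for three classes
print(softmax(logits))               # probabilities summing to 1
```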

Nothing but (neural) net

We start a new series on neural networks and deep learning. Neural networks and their use in finance are not new. But they still represent only a fraction of the research output. A recent Google Scholar search found that only 6% of the articles on stock price forecasting discussed neural networks.1 Artificial neural networks, as they were first called, have been around since the 1940s. But development was slow until at least the 1990s, when computing power rapidly increased.