Welcome to the last day of the first week of 30 days of backtesting! We hope you’re enjoying the ride. If you have any questions or concerns, you can reach us at the contact details listed at the bottom of this post.
On Day 6 we defined momentum rather roughly and ran a bunch of tests to identify the linear relationship between different lookback and look forward periods. However, we didn’t go into detail about the results.
Yesterday we examined the eponymous Fama-French factors to see if we could find something that will help us develop an investment strategy to backtest. It turned out the best performing factor was the market risk premium, which is essentially the return to the market in excess of the risk-free rate. In other words, the best factor is buy-and-hold! I guess that means we’ve finished 24 days early. Just buy the index.
The day has finally arrived! Time to start backtesting! We’ve always wanted to test how Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility would perform while implementing rolling stop-loss updates based on the ATR scaled by the 7-day minus 5-day implied volatility rank.1 Maybe we’re getting ahead of ourselves. Expeditions are fun and it’s always thrilling to explore uncharted territory. But it’s also easy to get lost and forget that we’re ultimately trying to generate superior, risk-adjusted returns.
We’re four days in and you’re probably wondering when are we actually going to start backtesting?! The answer is that while it is natural to want to rush to the fun part – the hope and elation of generating outsized returns and Sharpe Ratios greater than 2 – the reality is getting the foundation right should serve us well in the future. Or so that’s what we always hear from those longer in the tooth.
Yesterday we investigated the effect of using the 200-day simple moving average (200SMA) as a proxy for a rules-based investing method. The idea was to approximate what a reasonably rational actor/agent might do in addition to the buy-and-hold approach. When folks talk about research, backtesting, and forecast comparisons, they usually use a naive model against which one compares performance. In econometric forecasting, that naive approach is often represented mathematically as \(f(x_n) = x_{n-1}\) or something like that.
On Day 1, we decided on a few benchmarks to use for our backtest. That is, a 60-40 and 50-50 weighting of the SPY and IEF ETFs. What we want to add in now is the Hello World version of trading strategies – the 200-DAY MOVING AVERAGE! Why are we adding this to our analysis? As we pointed out yesterday, the typical benchmark against which to compare a trading strategy is buy-and-hold.
Yesterday we set out our plan to backtest a strategy using the SPY ETF, which tracks the S&P 500. Before we commence, we obviously need to establish a baseline. What metrics will we use to assess the strategy? How will we define success? What benchmarks will we use? Typically, for a single asset strategy the comparison is buy-and-hold performance. That is, if you’re using Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility to generate buy and sell signals, you’ll usually compare the performance of that strategy to one in which you bought the underlying on the first day of the test and held it until the end.
Over the next 30 days or so, we’ll be conducting a test of the emergency backtesting system. The test will attempt to go through all the steps one might usually follow to analyze, build, test, and then deploy an investment strategy. Now this probably won’t be a strategy that knocks the cover off the ball. It may not even be profitable. But our thought process is as follows.
How often does one see the entire backtesting process presented step-by-step in a reproducible way?
Summer has a way of getting away from you. That is as much relevant for blog writing as it is for life. Nonetheless, before summer ends we wanted to dust off our series on regime prediction and close the loop on the remaining techniques we had yet to investigate. That is, in our last post we initiated a relatively simple rolling method to retrain the model on more near term (and perhaps more relevant) data.
Our last post finished up examining the three different methods used to predict market regimes in the Gold Miners ETF, GDX – namely, clustering, Gaussian Mixture Methods (GMMs), and Hidden Markov Models (HMMs). We found GMMs performed the best in terms of proof-of-concept. But there was a lot of work to do to go from backtest to viable trading strategy.
In the next few posts, we’ll look at some of the ways we can improve our backtests.