Forecasting Cryptocurrency Prices with Machine Learning
Investors seek alpha through many techniques, and one of the most popular modern approaches to gaining a market edge is to leverage machine learning for asset price forecasting. These algorithms draw on all forms of data, from forum chatter to weather patterns, in an attempt to create a robust representation of the factors that influence an underlying asset’s price action. These coded features are then used to detect meaningful correlations with the asset of interest. Modeling price action with these more expressive datasets can be highly effective. Unfortunately, data availability is often the limiting factor that prevents sufficiently sophisticated, but non-institutionally backed, analysts from employing these more complex modeling techniques.
Without these robust datasets, such analysts can only develop models from the daily change in prices, which limits them to a specific class of models: time series models fit to the price history alone. While machine learning techniques, in general, have historically achieved high predictive power while remaining relatively simple, their shortcomings must be well understood to maximize their utility in the presence of price uncertainty.
Over the last year, our team has published monthly machine learning-based predictions for our followers on Twitter. In this article, we share our insights into the successes, and failures, of using this class of time series modeling techniques to trade.
Before any forecasts can be made, data must first be collected. Pricing data for Bitcoin can be procured directly via download or API from various sources across the web and, once gathered, typically does not require substantial restructuring before analysis. With this data in tow, our approach involves feeding the price series directly into five statistical learning algorithms: an Autoregressive Integrated Moving Average (ARIMA) model, a Brownian Motion model, a Long Short-Term Memory (LSTM) neural network, a simple linear regression model, and the open-source Prophet forecasting model. The following figure displays each model’s forecast.
The horizontal axis tracks time, while the price of Bitcoin is mapped to the vertical axis. Each graph shows the actual price movement in black, covering the last 12 months of market volatility. Blue prediction bounds are included for each model where they could be calculated. It is immediately obvious that these techniques produce quite varied predictions: they differ in the slope of the forecast lines, the width of the prediction bounds, the overall trend, and how they represent inter-period variation. For example, the simple linear regression and ARIMA both predict flat price movement, but with wildly different margins of error around their estimates. Both can be further contrasted with the LSTM model, which does not provide any bounds for its forecasts. When multiple distinct modeling techniques are fit to the data, the algorithm-derived predictions are ensembled into a meta-model, in which the average of all estimates is taken as the final forecast. The figure below shows one such meta-model, derived for the month of June 2021.
The presumption is that an aggregate estimate is sufficiently robust to mitigate the effects of model bias or outliers, which could take the form of an over-emphasis on specific aspects of the series by a single time series model. Using our final set of predictions, we can begin to assess the accuracy of these models.
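The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for the published forecasts: the two component models (a least-squares trend line and the expected path of a geometric Brownian motion) are simple stand-ins for the five models named earlier, the function names are hypothetical, and the price history is made up.

```python
import math
from statistics import mean, stdev


def linear_forecast(prices, horizon):
    """Extrapolate a least-squares straight line fitted to the price history."""
    n = len(prices)
    t_mean = (n - 1) / 2
    p_mean = mean(prices)
    slope = sum((t - t_mean) * (p - p_mean) for t, p in enumerate(prices)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = p_mean - slope * t_mean
    return [intercept + slope * t for t in range(n, n + horizon)]


def gbm_forecast(prices, horizon):
    """Expected path of a geometric Brownian motion fitted to daily log returns.

    Assumes daily log returns are i.i.d. normal with mean mu and standard
    deviation sigma, so E[S_k] = S_0 * exp((mu + sigma^2 / 2) * k).
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mu, sigma = mean(log_returns), stdev(log_returns)
    drift = mu + 0.5 * sigma ** 2
    return [prices[-1] * math.exp(drift * k) for k in range(1, horizon + 1)]


def ensemble_forecast(forecasts):
    """Equal-weight average, day by day, of forecasts of the same length."""
    return [mean(day) for day in zip(*forecasts)]


# Made-up daily closing prices, for illustration only.
history = [100.0, 102.0, 101.0, 105.0, 107.0]
component_forecasts = [linear_forecast(history, 3), gbm_forecast(history, 3)]
print(ensemble_forecast(component_forecasts))
```

An equal-weight mean is the simplest choice of meta-model. Weighting each model by its past accuracy is a common alternative, but it requires a held-out evaluation period.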
The forecast for the closing price on June 30th, made on the first of that month, is quite reflective of the predictive power of ensembling these techniques. Aside from two instances of the price exceeding the bounds, the price action of Bitcoin was contained within the blue predictive bounds. With an actual closing price of $36,860, the forecast’s estimate of $35,040 was off by only $1,820, or about 4.9% of the closing price. Considering that our model can, in principle, produce accurate results, the question becomes: how do you actually trade on this information? Trading platforms such as dYdX offer isolated margin positions whose trade parameters align well with the outputs of our algorithms.
The initial mapping comes from the slope of the forecast, that is, the sign of the difference between its starting and ending prices. A positive slope and increasing price suggest that the value of a long position should increase over the period. The upper bound serves as the model’s most liberal estimate of the price increase, projecting the best-case scenario, or maximum profit estimate, of the trade. Likewise, the lower bound is the model’s estimate of the lowest likely price during the time window. Treated as a floor, this value can be a guideline for setting the liquidation price of an isolated margin trade. Following this rubric, a trader guided by this model would have entered into a leveraged long position on BTC with a liquidation price of $33,500. At the end of the month, the trader would have been down ever so slightly, as the price ended the month roughly flat. They would have, however, had two opportunities to close their position at a profit, as the price either touched or came close to the upper bound. The price did not fall below the lower bound, which means the position would not have been liquidated. From a risk perspective, the model suggested a reasonable trade even in the midst of extreme volatility. Below are the meta-model ensemble predictions for Bitcoin’s price action in July 2021. Given what we’ve learned, how would you construct a trade based on the model’s forecast?
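One way to make the rubric above concrete is a small helper that turns a forecast and its bounds into rough trade parameters. This is a hedged sketch, not a trading system: the names `TradePlan`, `plan_trade` and `implied_max_leverage` are hypothetical, the short-side branch is our own mirror image of the long case described in the text, and the leverage formula is a simplification that ignores fees, funding payments and the maintenance margin a real platform applies.

```python
from dataclasses import dataclass


@dataclass
class TradePlan:
    side: str                 # "long", "short" or "flat"
    entry_price: float
    take_profit: float        # best-case target taken from the forecast bounds
    liquidation_guide: float  # bound used as a guideline for the liquidation price


def plan_trade(start_price, end_forecast, upper_bound, lower_bound):
    """Map a forecast and its prediction bounds to rough trade parameters."""
    if end_forecast > start_price:
        # Rising forecast: go long, aim for the upper bound, and keep the
        # liquidation price at or below the lower bound.
        return TradePlan("long", start_price, upper_bound, lower_bound)
    if end_forecast < start_price:
        # Falling forecast (assumed mirror of the long case): go short.
        return TradePlan("short", start_price, lower_bound, upper_bound)
    return TradePlan("flat", start_price, start_price, start_price)


def implied_max_leverage(entry_price, liquidation_price):
    """Approximate leverage of a long that is liquidated at liquidation_price.

    Simplification: the position is liquidated once losses equal the posted
    margin, so liquidation = entry * (1 - 1 / leverage).
    """
    if liquidation_price >= entry_price:
        raise ValueError("a long's liquidation price must be below its entry price")
    return entry_price / (entry_price - liquidation_price)


# Hypothetical numbers, for illustration only.
plan = plan_trade(start_price=100.0, end_forecast=110.0,
                  upper_bound=130.0, lower_bound=80.0)
print(plan)
print(implied_max_leverage(plan.entry_price, plan.liquidation_guide))
```

Read this way, a wider gap between the entry price and the lower bound implies lower usable leverage, which is how the width of the prediction bounds feeds into position sizing.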
In this article, we’ve shown how time series models, based solely on daily price movement, can produce reasonable forecasts of the monthly closing value of cryptocurrency assets. Because the data is readily available on the web and common implementations of these models exist, machine learning estimates are quite attainable for a reasonably sophisticated crypto-analyst. Though these models will not anticipate black swan events, they provide reasonable mappings to common trading parameters that can help investors manage risk by taking positions based on objective data analysis and not just their own hunches.