Forecasting SPY prices using Facebook’s Prophet
Using Facebook’s Prophet, an open-source time series forecasting procedure, to predict SPY (SPDR S&P 500 ETF Trust) closing prices.
The goal is to apply Prophet to historical SPY market data and generate predictions of future prices.
A few notes
- I’m by no means a data scientist, so this is more of an exploratory analysis than an accurate one
- For the sake of brevity, I won’t use a training/test split or measure the model’s error; I’ll just train the model on the entire dataset and then make a prediction
The process
- Downloading the data: exporting the data from Yahoo Finance as a CSV
- Exploring the data: loading and exploring the data using Pandas
- Fitting the model: reading in the data and applying a basic fit of the Prophet model to the data
- Visualizing the forecast: visualizing the forecasted pricing data
This article is not investment advice; please conduct your own due diligence. This is merely a simple analysis.
Before we jump in, let’s give a little background on SPY and on Facebook’s Prophet.
The SPDR S&P 500 ETF Trust (SPY) is an ETF (exchange-traded fund) that tracks the performance of the S&P 500 index. SPY is also one of the largest ETFs in the world, and it is popular relative to other ETFs that track the S&P 500 because of its high volume, meaning the number of shares traded on a given day (the CSV we export from Yahoo Finance includes the volume for each day).
For more information on ETFs, Investopedia gives a good overview.
Prophet is Facebook’s open-source, automated forecasting procedure for time series data. I’m not going to dive too deep into the mathematics or implementation details of Prophet, but if you’re interested in more detail, you can read the research paper. Prophet makes it easy to handle outliers, adjust to different time intervals, account for holidays, and tune the forecasting model.
Now that we have a general idea of what we’re trying to predict and the tool we’ll use to forecast, let’s dive into the actual data.
Downloading the data
Thanks to Yahoo Finance, we can download historical pricing data for free. Start from the SPY page on Yahoo Finance.
Click on the Historical Data tab, then set the Time Period to Max (which goes back to January 1993).
Now we can click Download to get our CSV and start diving into the data.
Exploring the data
Let’s fire up Pandas and load our data into a DataFrame to see what general insights we can extract.
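A minimal sketch of this step, assuming the export has the `Date` column that Yahoo Finance CSVs normally include and that it was saved as `SPY.csv` (a hypothetical file name). The helper names `load_prices` and `summarize` are my own, for illustration only.

```python
import pandas as pd


def load_prices(source):
    """Load a Yahoo Finance style CSV into a DataFrame sorted by date.

    `source` may be a file path or a file-like object.
    Assumes the export contains a "Date" column.
    """
    df = pd.read_csv(source, parse_dates=["Date"])
    return df.sort_values("Date").reset_index(drop=True)


def summarize(df):
    """Return a few quick facts about the loaded data."""
    return {
        "rows": len(df),
        "first_date": df["Date"].min(),
        "last_date": df["Date"].max(),
        "missing_values": int(df.isna().sum().sum()),
    }


# Usage, assuming the file name above:
# df = load_prices("SPY.csv")
# print(df.head())       # first five rows
# print(df.describe())   # summary statistics for the numeric columns
# print(summarize(df))   # row count, date range, missing values
```

`head()` and `describe()` are usually enough to confirm that the date range and the price columns look sane before modeling.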
Now that we know a bit more about our data in general, we can create a model using Prophet.
Fitting the model
Since we’re not concerned in this post about making our model the best it can be, we can train our model on the entire dataset.
This typically isn’t good practice. To make an accurate prediction, you should split the data into training and test subsets, measure the model’s error on the test subset, and use those results to tune hyperparameters.
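For reference, here is a rough sketch of what that split and error measurement could look like with Pandas alone. It is not used anywhere else in this post; the helper names, the 20% test fraction, and the choice of mean absolute error are assumptions for illustration.

```python
import pandas as pd


def chronological_split(df, test_fraction=0.2):
    """Split a time-ordered DataFrame into (train, test) without shuffling.

    The most recent rows become the test set, so the model is never
    evaluated on dates that come before its training data.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, int(round(len(df) * test_fraction)))
    cutoff = len(df) - n_test
    return df.iloc[:cutoff], df.iloc[cutoff:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    actual = pd.Series(list(actual), dtype="float64")
    predicted = pd.Series(list(predicted), dtype="float64")
    return float((actual - predicted).abs().mean())
```

The split is chronological rather than random because shuffling a time series would let the model peek at the future.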
Nevertheless, let’s continue.
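A minimal sketch of the fit, assuming the CSV columns are named `Date` and `Close`. Prophet expects a DataFrame with a `ds` column (dates) and a `y` column (values), so the data has to be renamed first. The helper names are my own, and the package is imported as `prophet` here (older releases were published as `fbprophet`).

```python
import pandas as pd


def to_prophet_frame(df, date_col="Date", value_col="Close"):
    """Return a two-column frame with the `ds`/`y` names Prophet expects."""
    out = df[[date_col, value_col]].rename(
        columns={date_col: "ds", value_col: "y"}
    )
    return out.assign(ds=pd.to_datetime(out["ds"]))


def fit_model(prophet_df):
    """Fit a Prophet model with default settings on the given frame."""
    # Imported lazily so the data preparation above works
    # even where Prophet is not installed.
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    model = Prophet()
    model.fit(prophet_df)
    return model


# Usage, assuming `df` holds the loaded SPY data:
# model = fit_model(to_prophet_frame(df))
```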
Just like that, we have built our model for a forecast. All we have left to do is generate dates to predict values for, and run the actual prediction.
Visualizing the forecast
Now let’s forecast with our model and visualize the results.
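A sketch of the forecasting step, assuming `model` is an already fitted Prophet model. The one-year horizon (`days=365`) is an arbitrary choice for illustration, and `forecast_and_plot` is a hypothetical helper name.

```python
def forecast_and_plot(model, days=365):
    """Forecast `days` past the end of the training data and plot it.

    `model` is expected to be a fitted Prophet model.
    Returns the forecast DataFrame and the matplotlib figure.
    """
    # History plus `days` future dates, at the default daily frequency.
    future = model.make_future_dataframe(periods=days)
    forecast = model.predict(future)
    fig = model.plot(forecast)
    return forecast, fig


# Usage:
# forecast, fig = forecast_and_plot(model)
# fig.savefig("forecast.png")
```

Note that the default daily frequency also generates weekend dates, when the market is closed, so you may want to drop those from `future` before predicting.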
Here is the output:
A few things to notice
- The black dots are the training data points
- The light blue band is the confidence interval (Prophet calls it the uncertainty interval, and it is 80% wide by default)
- The line within the confidence interval is the actual forecast
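The same information is available as numbers. The DataFrame returned by Prophet’s `predict` includes `yhat` (the forecast line) along with `yhat_lower` and `yhat_upper` (the edges of the band). Here is a small sketch for summarizing them; `interval_summary` is a hypothetical helper name.

```python
def interval_summary(forecast):
    """Summarize the forecast line and the band around it.

    Expects the columns returned by Prophet's `predict`:
    `ds`, `yhat`, `yhat_lower`, `yhat_upper`.
    """
    band = forecast["yhat_upper"] - forecast["yhat_lower"]
    return {
        "last_date": forecast["ds"].iloc[-1],
        "last_forecast": float(forecast["yhat"].iloc[-1]),
        "mean_band_width": float(band.mean()),
        "last_band_width": float(band.iloc[-1]),
    }


# Usage, assuming `forecast` came from model.predict(...):
# print(interval_summary(forecast))
```

Comparing `mean_band_width` between two models gives a rough numeric handle on how much wider one band is than the other.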
Based on our results, we can see the forecast is fairly linear and the confidence interval is relatively narrow (due to the volume of data). The behavior of the stock market since Covid-19 started back around February 2020 has been a little unorthodox, so let’s retrain the model using only data from 2017 onward to see if there is an effect.
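Restricting the training window is a one-line filter in Pandas. A sketch, assuming the date column is named `Date` and taking January 1, 2017 as the cutoff; `rows_since` is a hypothetical helper name.

```python
import pandas as pd


def rows_since(df, start="2017-01-01", date_col="Date"):
    """Keep only the rows dated on or after `start`."""
    dates = pd.to_datetime(df[date_col])
    return df.loc[dates >= pd.Timestamp(start)].reset_index(drop=True)


# Usage, assuming `df` holds the full SPY history:
# recent = rows_since(df)
# The filtered frame can then be renamed to ds/y and fitted
# with a fresh Prophet() instance, exactly like the full dataset.
```

A Prophet object can only be fit once, so the second model needs its own new instance rather than a refit of the first.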
This results in our new output:
Now we can see a much wider confidence interval and a bumpier forecast line; however, this looks much more realistic for a stock market prediction.
All in all, Facebook’s Prophet is a very fast, impressive, and highly abstracted library. The entire script, including reading in the data, training and forecasting with two models, and plotting both forecasts, took right around 25 seconds.
I would love to see this tool in the hands of an actual data scientist to see the accuracy of the models they’d be able to create using Prophet.