Forecasting SPY prices using Facebook’s Prophet

Example prediction of SPY (SPDR S&P 500 ETF Trust) closing prices for a year in the future

Using Facebook’s Prophet, an open-source time series forecasting procedure, to predict SPY (SPDR S&P 500 ETF Trust) closing prices.

tl;dr

Goal

To apply Facebook’s Prophet forecasting procedure to historical SPY (SPDR S&P 500 ETF Trust) market data to gather future pricing predictions.

A few notes

  • I’m by no means a data scientist, so this is more of an exploratory analysis than an accurate one
  • For the sake of brevity, I won’t use a training/test split or measure the error of the model; I will just train the model on the entire dataset and then make a prediction

Process overview

  1. Downloading the data — exporting the data from Yahoo Finance as a CSV
  2. Exploring the data — loading and exploring the data using Pandas
  3. Fitting the model — reading in the data and applying a basic fit of the Prophet model to the data
  4. Visualizing the forecast — visualizing the forecasted pricing data

Python dependencies
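
The walkthrough relies on pandas and Prophet. As a small sketch for checking that both are importable before running anything (the package names are an assumption: Prophet has been published as prophet since version 1.0 and as fbprophet before that, and missing_packages is a hypothetical helper name):

```python
from importlib.util import find_spec

# Assumed package names; swap "prophet" for "fbprophet" on older installs.
REQUIRED = ("pandas", "prophet")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be imported here."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from missing_packages() means everything needed is installed; anything it returns can be installed with pip.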

Important

This article is not investment advice; please conduct your own due diligence. This is merely a simple analysis.

Before we jump in, let’s give a little background on SPY and on Facebook’s Prophet.

The SPDR S&P 500 ETF Trust (SPY) is an ETF (exchange-traded fund) that tracks the performance of the S&P 500 index. SPY is also one of the largest ETFs in the world, and it is popular compared to other ETFs that track the S&P 500 because of its high volume, meaning the number of shares that trade on a given day (we’ll be able to see the volume per day in the CSV we export from Yahoo Finance).

For more information on ETFs, Investopedia gives a good overview.

Facebook Prophet is an open-source, automated forecasting procedure for time series data. I’m not going to dive too much into the mathematics or implementation details of Prophet, but if you are interested in more, you can read the research paper. Prophet makes it easy to handle outliers, adjust to different time intervals, and deal with holidays, while still leaving room to tune the forecasting model.

Now that we have a general idea of what we’re trying to predict and the tool we’ll use to forecast, let’s dive into the actual data.

Downloading the data

Thanks to Yahoo Finance, we can download historical pricing data for free. You can click here to view the SPY historical pricing data.

Click on the Historical Data tab, then set the Time Period to Max (which goes back to January 1993).

Now we can click download to get our CSV and start diving into the data.

Exploring the data

Let’s fire up Pandas and load our data into a DataFrame to see what general insights we can extract.
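
A minimal sketch of this step, assuming the export uses Yahoo Finance’s usual columns (Date, Open, High, Low, Close, Adj Close, Volume); load_prices and summarize are illustrative helper names, not part of pandas:

```python
import pandas as pd


def load_prices(source):
    """Load a Yahoo Finance historical-data export into a DataFrame.

    `source` is a file path or file-like object. The "Date" column
    name is an assumption about the export format.
    """
    df = pd.read_csv(source, parse_dates=["Date"])
    # Sort oldest-first so later time-based slicing behaves predictably.
    return df.sort_values("Date").reset_index(drop=True)


def summarize(df):
    """Collect a few quick facts: row count, date range, missing values."""
    return {
        "rows": len(df),
        "start": df["Date"].min(),
        "end": df["Date"].max(),
        "missing": int(df.isna().sum().sum()),
    }
```

Calling load_prices("SPY.csv") (assuming that is the name of the downloaded file) and passing the result to summarize gives the row count, the date range, and a count of missing values. From there, df.head() and df.describe() are natural next looks.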

Now that we know a bit more about our data in general, we can create a model using Prophet.

Fitting the model

Since we’re not concerned in this post about making our model the best it can be, we can train our model on the entire dataset.

This typically isn’t a good practice. When trying to make an accurate prediction, you should split the data into training and test subsets, measure the model’s error on the test subset, and use those results to tune hyperparameters.
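
For reference, a chronological holdout could look roughly like the sketch below. It is not used in the rest of this post; the 80/20 split, the choice of mean absolute error, and the helper names are all assumptions for illustration. Time series data should be split by date rather than shuffled, so the model is never tested on dates that precede its training data.

```python
import pandas as pd


def chronological_split(df, train_fraction=0.8):
    """Split a date-sorted frame into (train, test) without shuffling."""
    cutoff = int(len(df) * train_fraction)
    return df.iloc[:cutoff], df.iloc[cutoff:]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    # Reset indexes so the two series are compared position by position.
    actual = pd.Series(actual).reset_index(drop=True)
    predicted = pd.Series(predicted).reset_index(drop=True)
    return float((actual - predicted).abs().mean())
```

Prophet also ships its own helpers for this in prophet.diagnostics (cross_validation and performance_metrics), which would be the more thorough route.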

Nevertheless, let’s continue.
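
A basic fit might look like the following sketch. Prophet expects a DataFrame with a ds (date) column and a y (value) column; using Close as the value, and the helper names to_prophet_frame and fit_prophet, are assumptions for illustration.

```python
import pandas as pd


def to_prophet_frame(prices, date_col="Date", value_col="Close"):
    """Return a two-column frame named the way Prophet expects: ds and y."""
    frame = prices[[date_col, value_col]].rename(
        columns={date_col: "ds", value_col: "y"}
    )
    frame["ds"] = pd.to_datetime(frame["ds"])
    # Drop rows with a missing date or price before fitting.
    return frame.dropna().reset_index(drop=True)


def fit_prophet(frame):
    """Fit a Prophet model with default settings on the whole frame."""
    # Imported here so the data preparation above runs without Prophet.
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    model = Prophet()
    model.fit(frame)
    return model
```

With the CSV loaded into a DataFrame called prices, fit_prophet(to_prophet_frame(prices)) returns the fitted model.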

Just like that, we have built our model for a forecast. All we have left to do is generate dates to predict values for, and run the actual prediction.

Visualizing the forecast

Now let’s forecast with our model and visualize the results.
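
Roughly, the forecasting step looks like this sketch. make_future_dataframe, predict, and plot are Prophet’s own methods; forecast_ahead and save_forecast_plot are hypothetical helper names, and model is assumed to be an already fitted Prophet model.

```python
def forecast_ahead(model, days=365):
    """Extend the model's history by `days` and predict over all dates."""
    future = model.make_future_dataframe(periods=days)
    forecast = model.predict(future)
    # Among other columns, the result holds yhat (the point forecast)
    # and yhat_lower / yhat_upper (the bounds of the plotted interval).
    return forecast


def save_forecast_plot(model, forecast, path="forecast.png"):
    """Draw Prophet's built-in forecast plot and write it to `path`."""
    figure = model.plot(forecast)  # returns a matplotlib Figure
    figure.savefig(path)
    return path
```

One thing to keep in mind: make_future_dataframe generates calendar days by default, so the year ahead includes weekends, when the market is closed.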

Here is the output:

A few things to notice

  • The black dots are the training data points
  • The light blue band is the confidence interval (Prophet calls it the uncertainty interval, and it is 80% wide by default)
  • The dark blue line within the confidence interval is the forecast itself

Based on our results, we can see the forecast is fairly linear and the confidence interval is relatively narrow (due to the volume of data). The behavior of the stock market since Covid-19 started back around February 2020 has been a little unorthodox, so let’s retrain our model on only the data from 2017 onward to see if there is an effect.
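
Narrowing the training window is a one-line filter in pandas. A sketch, assuming the frame already uses Prophet’s ds and y column names and that “starting in 2017” means January 1, 2017 (since is a hypothetical helper name):

```python
import pandas as pd


def since(frame, start="2017-01-01", date_col="ds"):
    """Keep only rows dated on or after `start`."""
    recent = frame[frame[date_col] >= pd.Timestamp(start)]
    return recent.reset_index(drop=True)
```

A second model is then fitted on the filtered frame and forecast exactly as before.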

This results in our new output:

Now we can see a much wider confidence interval and a bumpier forecast line; however, this looks much more realistic in terms of stock market prediction.
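
To put a number on “wider”, one option is to compare the average gap between yhat_upper and yhat_lower over the forecast year for both models. A sketch, assuming forecast is the DataFrame returned by Prophet’s predict and that the forecast horizon is its last 365 rows (mean_interval_width is a hypothetical helper name):

```python
def mean_interval_width(forecast, last_n=365):
    """Average width of the interval over the final `last_n` rows."""
    tail = forecast.tail(last_n)
    return float((tail["yhat_upper"] - tail["yhat_lower"]).mean())
```

Running this on both forecasts gives two dollar amounts that can be compared directly instead of eyeballing the plots.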

Conclusion

All in all, Facebook’s Prophet is a very fast, impressive, and strongly abstracted library. The entire script, which reads in the data, trains and forecasts with two models, and plots both forecasts, took right around 25 seconds.

I would love to see this tool in the hands of an actual data scientist to see the accuracy of the models they’d be able to create using Prophet.

Bradley Schoeneweis

@bschoeneweis | Software Developer in Fort Worth, TX