Forecasting electricity generation - ARIMA (R)

Aug 2021 by Francisco Juretig

In this example, we will use the famous auto.arima() function to select the best ARIMA model for us. The objective is to predict the daily electricity generation, in GWh, of a company in Argentina, based on the data from previous years. The ARIMA approach consists of:

  1. removing the trend from the series
  2. analyzing whether the series is stationary. If not stationary, we difference the series
  3. plotting the autocorrelation and partial autocorrelation function for the residuals
  4. proposing an AR, MA or ARIMA model
  5. checking that the residuals of the ARIMA model are stationary, and have no structure
  6. combining the trend + the ARIMA estimated model to make predictions
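To make the manual steps concrete, here is a minimal sketch using only base R. The series is simulated (it is not the electricity data), and the AR(1) choice, the lag of 10 and the 14-day horizon are illustrative assumptions rather than the values used in the project.

```r
# Simulated stand-in for the real data (an assumption, not the article's series):
# a trending series whose daily changes follow an AR(1) process around 0.5.
set.seed(42)
n <- 300
increments <- as.numeric(arima.sim(model = list(ar = 0.6), n = n)) + 0.5
y <- ts(cumsum(increments))

# Steps 1-2: first differences remove the trend and make the series stationary
dy <- diff(y)

# Step 3: autocorrelation and partial autocorrelation of the differenced series
acf_dy  <- acf(dy, plot = FALSE)
pacf_dy <- pacf(dy, plot = FALSE)

# Step 4: propose a model; here an AR(1) with a mean (the mean plays the role of a drift)
fit_manual <- arima(dy, order = c(1, 0, 0))

# Step 5: Ljung-Box test on the residuals (fitdf = number of estimated AR/MA terms)
lb <- Box.test(residuals(fit_manual), lag = 10, type = "Ljung-Box", fitdf = 1)

# Step 6: forecast the differences and add them back to the last observed level
h <- 14
pred_dy <- as.numeric(predict(fit_manual, n.ahead = h)$pred)
pred_y  <- as.numeric(tail(y, 1)) + cumsum(pred_dy)
```

Each of these choices (how many differences, which orders to try, which lag to test) is a judgment call, which is where most of the manual effort goes.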

This usually involves a lot of manual work, and also creates a lot of difficulty in choosing the best model (steps 4-5). The auto.arima function from the forecast package is meant to do this automatically for us. Conceptually, we just need to provide a time series and auto.arima will choose the best model for us.

Here we just load the data. As you can see, it gets automatically shown in a table next to the panel.



This is actually a bigger project meant to show several things for ARIMA models. Let's focus on the branch shown here. In this branch (under the auto.arima approach title) we have three panels (two shown here). We use the decompose function to split the series into a trend, a seasonal component, and a random component. Ideally, we would like to see a linear trend; here we have a nonlinear one, which probably indicates that there are unobserved components.
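As a rough illustration of what decompose does, the sketch below builds a synthetic daily series and splits it into the three components. The weekly period (frequency = 7), the trend slope and the weekday pattern are all assumptions made for the example; the real project may use a different period.

```r
# Synthetic daily series: level + linear trend + weekly pattern + noise
# (all numbers here are made up for illustration)
set.seed(1)
n_weeks <- 52
t <- seq_len(7 * n_weeks)
weekly_pattern <- rep(c(5, 6, 6, 6, 5, -12, -16), times = n_weeks)
gwh <- 100 + 0.02 * t + weekly_pattern + rnorm(length(t), sd = 2)

# decompose() needs a ts object with a seasonal period; 7 = weekly cycle (assumed)
y <- ts(gwh, frequency = 7)

parts <- decompose(y)   # additive decomposition by default
plot(parts)             # panels: observed, trend, seasonal, random

# The estimated seasonal effect for each position in the week
parts$figure
```

The trend component is a centered moving average, so it is undefined (NA) for the first and last few observations.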



The second panel shows the auto-arima approach. We literally just call the auto.arima() function. And then, we can predict very easily.
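A minimal sketch of that call, assuming the forecast package is installed. The series is simulated rather than the electricity data, and the weekly frequency and the 14-day horizon are illustrative assumptions.

```r
library(forecast)  # provides auto.arima() and forecast()

# Simulated stand-in for the daily series (not the article's data)
set.seed(123)
sim <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = 364))
y <- ts(100 + sim, frequency = 7)  # frequency = 7 is an assumed weekly cycle

# Let auto.arima() search over candidate orders and pick one by information criterion
fit <- auto.arima(y)
summary(fit)

# Forecast 14 steps ahead; by default 80% and 95% intervals are computed
fc <- forecast(fit, h = 14)
plot(fc)
```

The selected orders depend on the data, so the model printed for this simulated series will generally differ from the one found for the electricity series.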



The plot function plots the forecast along with its prediction intervals. Pay attention to the output printed here: the best model is an ARIMA with AR=2, MA=0, I=0, and a seasonal part with SAR=2. The drift is the trend that was automatically estimated.

We finally check whether the residuals have any remaining structure. This is easy to evaluate. We call the checkresiduals() function to do a Ljung-Box test (the null hypothesis is that the residual autocorrelations are all zero, that is, the residuals behave like white noise). As you can see here, we don't reject the null hypothesis. This means that we find no evidence of remaining structure.
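The same kind of test can be run directly with Box.test from base R, which makes it easier to see what is being tested: whether the residual autocorrelations up to a chosen lag are jointly zero. The sketch below uses a simulated AR(2) series and a lag of 10, both of which are assumptions for illustration.

```r
# Simulated AR(2) series (not the article's data)
set.seed(7)
y <- arima.sim(model = list(ar = c(0.5, 0.2)), n = 400)

# Fit the matching AR(2) model
fit <- arima(y, order = c(2, 0, 0))

# Ljung-Box test on the residuals; fitdf = 2 accounts for the two AR coefficients,
# so the test has lag - fitdf = 8 degrees of freedom
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 2)
lb$p.value   # a large p-value means no evidence of leftover autocorrelation

# For contrast: the raw series is clearly autocorrelated, so the test rejects
lb_raw <- Box.test(y, lag = 10, type = "Ljung-Box")
lb_raw$p.value
```

Note that not rejecting the null is not proof that the residuals are white noise; it only means the test found no evidence against it at the chosen lag.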

Prefer a video?

Here we use the auto.arima function, and we also do a manual ARIMA model

The code, project and files can also be downloaded from a GitHub repo.

https://github.com/fjuretig/amazing_data_science_projects.git