SARIMAX

SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous elements) is a time series forecasting model that extends the classical SARIMA (Seasonal AutoRegressive Integrated Moving Average) model by incorporating exogenous variables. SARIMAX is mainly useful for forecasting time series that have known seasonality and can be influenced by external factors. Seasonality is a deterministic source of nonstationarity common in real-world data. To understand it, we will expand on the ARMA family of time series models by introducing SARIMA, a model that adds seasonal-period-based autoregressive and moving average components to represent seasonal time series data. We will go over each parameter and how it connects to its counterpart in the ARIMA model. We will then fit our SARIMA model with the Box-Jenkins technique and use a rolling forecast to verify its accuracy.

Code: Importing Libraries and Reading the Dataset

Seasonality

Seasonality is a prevalent characteristic and predictable cause of nonstationarity that we must account for in real-world data. Whereas our ARIMA model has just three parameters to contend with, modeling seasonality requires four more. The four seasonal factors are the seasonal period (m), the seasonal autoregressive order (P), the seasonal difference order (D), and the seasonal moving average order (Q).
These are specified as SARIMA(p,d,q)(P,D,Q)[m]. We will again use the ACF/PACF plots to estimate the seasonal parameter values. Choosing values for each parameter is not an exact exercise; there are general guidelines, and iteration will be required to obtain the proper model parameters.

Seasonal Period (m)

Our seasonal period, m, represents the number of timesteps in each season. We can see this in the ACF and PACF plots, where m corresponds to the lag with the highest autocorrelation coefficient. (The value at lag 0 is always 1, since a timestep correlates perfectly with itself.) If our data is seasonal, we would expect the value most strongly correlated with the current timestep to be the same point in the season exactly one season prior. The seasonal period value will also help us determine P and Q.

Seasonal Autoregressive Order (P)

The seasonal autoregressive order is analogous to the ordinary autoregressive order, except that instead of determining how many previous timesteps influence the value at the present timestep, we look at prior timesteps spaced m steps apart, i.e., whole seasons back. This is why the mth lag is used to set the value of P: if the autocorrelation at lag m is positive, P should be greater than or equal to one; otherwise, P should be zero. We may fit the model with a value of 1 and then increment as needed.

Seasonal Difference Order (D)

The rule of thumb for D is that the ordinary and seasonal differencing orders combined should not exceed 2. If the seasonal pattern remains constant over time, we may set D = 1, whereas setting D = 0 indicates that the seasonal pattern is unstable.

Seasonal Moving Average Order (Q)

We determine Q in the same way we determine P, but with the sign reversed: if the autocorrelation at lag m is negative, Q >= 1. We normally do not want P + Q to exceed two, and we want to keep parameter values small, since the risk of overfitting increases as models become more sophisticated.
Data Exploration

Looking at our time series, we see a pattern that might be corrected with a square root transform, were it not for the significant dip during 2007-2008. This dip is caused by business cycles and other exogenous influences that affect economic statistics. To remove the trend, we can use the Box-Cox transformation. Next, we can difference the time series to account for the trend, and plot both the result and the resulting ACF and PACF graphs. The Box-Cox transformation converts non-normal data toward a normal distribution. We may supply a value for lambda, which selects, for example, a log transform, square root transform, or reciprocal transform; if we don't supply lambda, the function estimates it automatically and returns the fitted lambda.

Parameter Estimation

After the Box-Cox transformation and differencing, we no longer have a trend, although seasonality remains apparent and erratic. Because our fitted lambda value is near 1 (essentially no transform) and the Box-Cox transformation does not appear to make a significant difference to the series, we have opted to drop it, so the ACF and PACF plots show the untransformed series. We can use the ACF and PACF graphs to find the right parameters for our SARIMA model. First, looking at the ACF, we see that our largest autocorrelation occurs at lag 12, which makes sense given our original figure and the fact that this is monthly data. Since the autocorrelation at lag m is positive, this implies P = 1 and Q = 0. The ACF and PACF charts both show their first significant lag at 1, so p = 1 and q = 1. Because we differenced the series once, d = 1. So our parameters are: SARIMA(1,1,1)(1,0,0)[12]. We'll divide our data into two sets, training and testing, and then fit our SARIMA model.

Model Measurements

Scale-dependent errors

Scale-dependent errors are errors on the same scale as the data.
Percentage errors

Percentage errors are unit-free and are often used to compare forecast performance across data sets. A percentage error typically takes the form (actual_value - forecast_value) / actual_value. The drawback of percentage errors is that they can produce infinite or undefined results when the true value is zero. Also, when the data lacks a meaningful zero, such as temperature in Celsius or Fahrenheit, dividing and taking the absolute value, as in MAPE, can produce error measures that do not reflect the genuine difference.
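As a concrete illustration, MAPE (mean absolute percentage error) can be computed as below; the sample values are made up:

```python
# MAPE: mean of |(actual - forecast) / actual|, expressed as a percentage.
# It is undefined whenever any actual value is zero.
import numpy as np

def mape(actual, forecast):
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

print(round(mape([100, 200, 400], [110, 180, 400]), 2))  # -> 6.67
```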
Scaled errors

Scaled errors aim to address some of the issues with percentage errors by dividing forecast errors by the in-sample error of a simple benchmark forecast rather than by the observed values.
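One common scaled error is MASE (mean absolute scaled error), sketched here with a one-step naive benchmark; the numbers are illustrative:

```python
# MASE: forecast errors scaled by the in-sample MAE of the naive
# (previous-value) forecast on the training series. Values below 1
# mean the forecast beats the naive benchmark on average.
import numpy as np

def mase(train, actual, forecast):
    # Scale: MAE of the one-step naive forecast on the training data.
    scale = np.mean(np.abs(np.diff(np.asarray(train, dtype=float))))
    errors = np.abs(np.asarray(actual, dtype=float)
                    - np.asarray(forecast, dtype=float))
    return np.mean(errors / scale)

print(mase([1, 2, 3, 4], [5, 6], [4, 6]))
```

For seasonal data, the benchmark is often the seasonal naive forecast (the value m steps back) instead of the previous value.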
Model Evaluation

Unfortunately, our model does not fit the trend particularly well. The issue with the confectionery data is that it follows an unsteady pattern. What we might do next is separate the trend from the rest of the time series using a Hodrick-Prescott filter, a bandpass filter designed to cope with business cycles in economic data. Instead, we will make a rolling forecast. In a rolling forecast, we forecast one step ahead and then refit our model on the new data, including the observation from the test set. It is costly, since the model must be refitted at every timestep, but it lets us see where a bad step adds to the overall error without influencing future predictions. This means that early departures from the time series due to the trend will not impair our ability to predict later steps.

As you can see, the rolling forecast tracks the test cases closely, so our model is performing well.
