Forecasting method: ARIMA and SARIMA


The Box-Jenkins workhorse

ARIMA (AutoRegressive Integrated Moving Average) is the classical statistical alternative to exponential smoothing. Where exponential smoothing models the data by decomposing it into level, trend, and seasonality, ARIMA models the relationship between an observation and its recent past. The two families overlap (the linear, additive exponential smoothing models all have ARIMA equivalents, although the multiplicative ones do not) but they have different strengths. ARIMA tends to do better when the autocorrelation structure of the data is the most important feature; exponential smoothing tends to do better when seasonality and trend dominate. This article walks through what ARIMA is, what the parameters mean, when SARIMA (the seasonal extension) is worth the complexity, and the cases where ARIMA in a contact centre is and is not a good investment of effort.

The name breaks into three parts (autoregressive, integrated, moving average), and each part corresponds to one of the model’s three parameters.

What the letters mean

An ARIMA model is written as ARIMA(p, d, q). The three numbers tell you what is in the model. p is the number of autoregressive terms — how many previous observations the model uses to predict the next one. An AR(1) model predicts the next value from the immediately previous value; an AR(2) model uses the previous two; and so on. d is the order of differencing — how many times the data has been transformed by subtracting the previous value from each observation. Differencing removes trend; one round of differencing handles linear trend, two rounds handle changing trend. q is the number of moving-average terms — not moving averages of the data itself, but moving averages of the recent forecast errors. This sounds odd but works: if your last forecast was 10 too low, you might want to nudge today’s forecast up a little.
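The effect of d, and the way the AR and MA terms enter a forecast, can be sketched in a few lines of plain Python. This is a toy illustration, not a fitted model: the series, the helper names `difference` and `one_step_forecast`, and the coefficient values are all made up for the example.

```python
def difference(series, lag=1):
    """Subtract the value `lag` steps earlier from each observation."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Made-up volumes with a steady trend of +5 contacts per day.
linear = [100 + 5 * t for t in range(10)]
# Made-up volumes whose daily growth itself keeps increasing.
curved = [100 + 5 * t + 2 * t * t for t in range(10)]

once = difference(linear)               # d = 1 leaves a constant: [5, 5, ...]
twice = difference(difference(curved))  # d = 2 leaves a constant: [4, 4, ...]

def one_step_forecast(prev_value, prev_error, const, phi, theta):
    """ARMA(1, 1)-style forecast: AR term on the last value, MA term on the last error."""
    return const + phi * prev_value + theta * prev_error

# Illustrative coefficients only; real values come from fitting a model.
# prev_error = actual - forecast, so +10 means the last forecast was 10 too low.
nudged = one_step_forecast(prev_value=100, prev_error=10, const=40, phi=0.6, theta=0.4)
baseline = one_step_forecast(prev_value=100, prev_error=0, const=40, phi=0.6, theta=0.4)
```

With these made-up coefficients the MA term lifts the forecast from about 100 to about 104, which is the "nudge" described above.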

A typical model might be ARIMA(1, 1, 1): one AR term, one round of differencing, one MA term. SARIMA adds seasonal counterparts, written SARIMA(p, d, q)(P, D, Q)_m, where the uppercase letters describe the seasonal AR, differencing, and MA terms and m is the seasonal period (7 for weekly seasonality in daily data, 12 for annual seasonality in monthly data).
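For readers who prefer the notation written out, the standard textbook form of the model is given below. It uses the backshift operator B, defined by B y_t = y_{t-1}; the symbols c (a constant) and ε_t (the forecast error at time t) are introduced here for the formula and do not appear elsewhere in this article. Sign conventions for the MA polynomials vary between software packages.

```latex
\phi(B)\,\Phi(B^{m})\,(1 - B)^{d}\,(1 - B^{m})^{D}\, y_t
  = c + \theta(B)\,\Theta(B^{m})\,\varepsilon_t

\phi(B)   = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}

\Phi(B^{m})   = 1 - \Phi_1 B^{m} - \dots - \Phi_P B^{Pm}, \qquad
\Theta(B^{m}) = 1 + \Theta_1 B^{m} + \dots + \Theta_Q B^{Qm}
```

Setting P = D = Q = 0 removes the seasonal factors and leaves plain ARIMA(p, d, q).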

How to choose the model

Three approaches consistently work. The first and most common in modern practice is auto-ARIMA — let the software try many combinations and pick the one that minimises a fit criterion such as AIC. Auto-ARIMA is in R’s forecast package, Python’s pmdarima, and many WFM platforms. It is fast, repeatable, and removes most of the discretion (and most of the opportunity to get it wrong). For most contact centre problems, auto-ARIMA is the right starting point.
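The idea of "try several candidates and keep the one with the lowest AIC" can be shown with a deliberately tiny candidate set. The sketch below is not what auto-ARIMA actually searches (real implementations also consider MA terms, seasonal terms, and differencing orders); it only compares three simple regressions on made-up data, and the helper name `aic_for_lag` is hypothetical.

```python
import math

def aic_for_lag(y, lag, start):
    """AIC (up to a constant) of y[t] = a + b * y[t - lag]; mean-only if lag is None.

    Every candidate is scored on the same observations, y[start:], so the
    AIC values are comparable.
    """
    target = y[start:]
    n = len(target)
    mean_t = sum(target) / n
    if lag is None:
        rss = sum((v - mean_t) ** 2 for v in target)
        k = 2  # intercept + error variance
    else:
        x = y[start - lag:len(y) - lag]
        mean_x = sum(x) / n
        sxx = sum((a - mean_x) ** 2 for a in x)
        sxy = sum((a - mean_x) * (b - mean_t) for a, b in zip(x, target))
        slope = sxy / sxx
        intercept = mean_t - slope * mean_x
        rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, target))
        k = 3  # intercept + slope + error variance
    return n * math.log(rss / n) + 2 * k

# Made-up daily volumes: a weekly shape plus a small deterministic wiggle.
pattern = [500, 480, 470, 460, 450, 200, 150]
y = [pattern[t % 7] + (t * 37) % 11 - 5 for t in range(84)]

candidates = {"mean only": None, "lag 1": 1, "lag 7": 7}
scores = {name: aic_for_lag(y, lag, start=7) for name, lag in candidates.items()}
best = min(scores, key=scores.get)
```

On this strongly weekly series the lag-7 candidate wins by a wide margin, which mirrors why a seasonal period of 7 matters so much for daily contact data.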

The second is the classical Box-Jenkins methodology: examine the autocorrelation and partial-autocorrelation plots of the data and the residuals, infer which terms are needed, fit the model, and check the residuals are white noise. This is a more skilled process and produces models that the practitioner understands deeply, but it is slow and is rarely justified outside a dedicated statistical team.
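The first step of that process, reading the autocorrelation plot, amounts to computing sample autocorrelations and seeing which lags fall outside an approximate 95% band. A minimal sketch on made-up data follows; `sample_acf` is a hypothetical helper, and the partial autocorrelation is omitted for brevity.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

# Made-up daily volumes: the same weekly shape repeated for 20 weeks.
weekly = [500, 480, 470, 460, 450, 200, 150] * 20

acf = sample_acf(weekly, max_lag=14)
band = 1.96 / math.sqrt(len(weekly))  # approximate 95% band under white noise
outside_band = [k for k in range(1, 15) if abs(acf[k]) > band]
```

Large spikes at lags 7 and 14 are the signature that a seasonal term with m = 7 is needed; on real data the pattern is noisier and takes more judgment to read.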

The third is to pick conservative defaults and not over-fit. SARIMA(1, 1, 1)(1, 1, 1)₇, that is ARIMA(1, 1, 1) with a matching seasonal part at period 7, is a defensible model for most weekly-seasonal contact centre data. It will not be optimal but it will be robust, and the planner can always upgrade later.

Where ARIMA outperforms exponential smoothing

Two situations consistently reward the move from Holt-Winters to SARIMA. The first is queues with strong autocorrelation, where today’s volume is meaningfully predicted by how far yesterday’s volume sat above or below expectation, not just by the smoothed long-term level. Some call types behave this way: a service that gets a wave of related contacts (a bill goes out, a notification fires) shows clear autocorrelation that an AR term captures and Holt-Winters does not.

The second is data that has been over-smoothed by exponential smoothing. If Holt-Winters is producing a forecast that looks visibly too smooth — missing turning points the data clearly shows — an ARIMA model with appropriately chosen AR and MA terms can pick up the structure exponential smoothing has flattened away.

Where ARIMA is not worth the effort

The honest answer is most of the time. For routine contact centre forecasting on stable queues with clear weekly seasonality, Holt-Winters and SARIMA produce nearly identical forecasts, and the additional complexity of ARIMA (more parameters, harder to explain, more sensitivity to misspecification) does not earn its keep. The cases where ARIMA materially outperforms are real but narrow. Operations that adopt ARIMA as a default tend to spend more time on the modelling and less time on the things that actually move forecast accuracy — data quality, event awareness, driver overlays, post-mortem learning.

Practical setup for contact centre data

If you are going to use SARIMA in a contact centre, three practical points matter. Use daily data with weekly seasonal period 7 for voice queues. Handle outliers explicitly before fitting — a single anomalous day in the training data can distort the model parameters substantially. Set the differencing carefully: most contact centre data does not have strong long-term trend at the timescales we forecast on, so d=0 with seasonal D=1 is often the right starting point. Test stationarity formally if you are unsure (the augmented Dickey-Fuller test is the standard).
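One simple way to handle outliers before fitting is to compare each day with the median for its own weekday and replace values that sit implausibly far away. The sketch below is one possible approach under stated assumptions, not a prescribed method: the helper `replace_weekday_outliers`, the threshold of 5 median absolute deviations, and the data are all made up for illustration. It finishes with one round of seasonal differencing (D = 1 at lag 7).

```python
from statistics import median

def replace_weekday_outliers(daily, period=7, threshold=5.0):
    """Replace values far from their own weekday's median with that median."""
    cleaned = list(daily)
    for offset in range(period):
        idx = range(offset, len(daily), period)
        values = [daily[i] for i in idx]
        centre = median(values)
        mad = median(abs(v - centre) for v in values)  # median absolute deviation
        if mad == 0:
            continue  # no spread to judge against; leave this weekday untouched
        for i in idx:
            if abs(daily[i] - centre) > threshold * mad:
                cleaned[i] = centre
    return cleaned

# Made-up data: 8 weeks of a weekly shape with a small repeating wiggle.
pattern = [500, 480, 470, 460, 450, 200, 150]
daily = [pattern[t % 7] + (t // 7 % 3) * 4 for t in range(56)]
daily[17] = 1400  # inject one anomalous day

cleaned = replace_weekday_outliers(daily)

# Seasonal differencing with D = 1: compare each day with the same weekday last week.
seasonal_diff = [cleaned[t] - cleaned[t - 7] for t in range(7, len(cleaned))]
```

Known events (outages, bank holidays) are usually better handled by flagging them explicitly than by a blanket statistical rule; a rule like this is a backstop for the ones nobody recorded.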

For longer-horizon monthly forecasts the data behaves differently: the noise is lower, seasonality is annual, and structural changes (new products, channel migrations) matter more than autocorrelation. ARIMA can work well here but a regression-based model with explicit drivers usually works better.

Combining ARIMA with drivers (ARIMAX / SARIMAX)

The X suffix — ARIMAX or SARIMAX — adds external regressors to the model. The model then forecasts the next value using both its own autocorrelation structure and the external drivers. This is the right path for operations that have meaningful drivers (marketing campaigns, weather, system status) and that have outgrown pure Holt-Winters. Building this is more involved than a vanilla ARIMA, but it captures effects no pure time-series method can. See the regression with drivers article for the broader treatment of this approach.
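One common way to write such a model is as a regression on the drivers whose error term follows a SARIMA process. Some software defines ARIMAX slightly differently, so treat the form below as one standard formulation rather than the only one. The symbols x_{j,t} (the value of driver j on day t), β_j (its coefficient), η_t (the regression error), and ε_t (white noise) are introduced here for the formula; B is the backshift operator, B y_t = y_{t-1}.

```latex
y_t = \beta_0 + \sum_{j=1}^{k} \beta_j\, x_{j,t} + \eta_t

\phi(B)\,\Phi(B^{m})\,(1 - B)^{d}\,(1 - B^{m})^{D}\,\eta_t
  = \theta(B)\,\Theta(B^{m})\,\varepsilon_t
```

A practical consequence is that forecasting y requires future values of every driver, so drivers that cannot themselves be planned or forecast add little.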

Common mistakes

Three patterns recur. The first is choosing an over-complex model: ARIMA(3, 2, 2)(2, 1, 1)₇ has eight AR and MA coefficients on top of three rounds of differencing, and fitted to two years of data it is almost certainly over-fitted. The second is treating auto-ARIMA’s output as authoritative without inspecting the model; sometimes it picks something silly because of a data quirk. The third is ignoring residual analysis: a model whose residuals show clear autocorrelation has missed structure, and the forecast can be improved.
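A residual check does not need to be elaborate. The Ljung-Box statistic sums the squared residual autocorrelations over the first few lags and compares the total with a chi-square critical value. The sketch below computes it by hand on made-up residuals that still contain a weekly pattern; `ljung_box_q` is a hypothetical helper, and the critical value is hard-coded for 7 degrees of freedom (strictly, the degrees of freedom should be reduced by the number of fitted AR and MA coefficients).

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# 95th percentile of the chi-square distribution with 7 degrees of freedom.
CHI2_95_DF7 = 14.07

# Made-up residuals from a model that missed the weekly pattern.
leftover_weekly = [3, -1, -1, -1, -1, -1, 2] * 20

q_stat = ljung_box_q(leftover_weekly, max_lag=7)
missed_structure = q_stat > CHI2_95_DF7
```

Here the statistic is far above the critical value, so the residuals are clearly not white noise and the model needs a seasonal term. Statistical libraries provide this test with proper p-values; the hand-rolled version is only meant to show what is being measured.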

Conclusion

ARIMA and SARIMA are powerful, classical methods that earn their place in the planner’s toolkit but rarely as the default. For most contact centre operations, Holt-Winters is a more pragmatic baseline and ARIMA is the next step up when the data shows autocorrelation Holt-Winters cannot capture, or when external drivers justify the move to SARIMAX. The planners who get the most from ARIMA are the ones who treat it as a tool to reach for when needed, not as a flag of methodological sophistication.

Pair this with exponential smoothing for the comparison point most planners default to, and regression with drivers for the natural extension when external factors matter.