Forecasting method: regression with drivers

When the time series isn’t the whole story

Pure time-series methods — moving averages, exponential smoothing, ARIMA — assume that everything you need to forecast the future is contained in the recent past of the same series. For many contact centre queues that assumption is sufficient. For others, it is materially incomplete. A motor insurance line that handles claims is partly predictable from its own history and partly predictable from weather. A utility customer-service line is partly its own history and partly the temperature, the price-cap calendar, and the most recent billing cycle. A retail support line is partly seasonal pattern and partly the marketing team’s send schedule. In any operation where the external drivers materially shape demand, regression with drivers is the right method — and it is usually the highest-return forecasting upgrade a planning team can make.

What the method is

Regression forecasts the target variable (contact volume, AHT, or shrinkage) as a function of one or more input variables (drivers). The simplest version is a linear regression: volume = baseline + a×feature1 + b×feature2 + …. More flexible variants — gradient-boosted trees (XGBoost, LightGBM), random forests, generalised additive models — allow the relationship between each driver and the target to be non-linear, and allow drivers to interact with each other. The general structure is the same: feed in a feature table where each row is a time period and each column is either the historical observation or a driver, and let the model learn how the drivers map to the target.
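The feature-table-to-forecast mapping for the linear case can be sketched in a few lines of NumPy. Column names, figures, and coefficients below are illustrative, not taken from any real queue:

```python
import numpy as np

# Feature table: each row is a day, each column a driver.
# Columns: [intercept, is_monday, temp_deviation_c, email_sends_k]
X = np.array([
    [1, 1, -2.0, 10.0],
    [1, 0,  0.5,  0.0],
    [1, 0,  3.0,  0.0],
    [1, 1, -1.0, 12.0],
    [1, 0,  1.0,  5.0],
    [1, 0, -4.0,  0.0],
], dtype=float)
# Observed contact volumes for the same days.
y = np.array([1450, 1180, 1090, 1470, 1260, 1330], dtype=float)

# Ordinary least squares: learn baseline + per-driver coefficients.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Forecast a future day once its driver values are known:
# a Monday, 3 degrees colder than norm, 8k emails going out.
x_future = np.array([1, 1, -3.0, 8.0])
forecast = x_future @ coef
```

The same row-per-period, column-per-driver table is what every more flexible model class consumes; only the fitting step changes.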

The features that consistently earn their place

Six families of features matter in most contact centre regression models.

Calendar features. Day of week, week of month, month of year, public holidays (and the day before and after), school-term flags. These are free and almost always significant. A regression model without calendar features is fighting with one hand tied behind its back.

Lagged history. The same-day value last week, last fortnight, last month. These bring the time-series signal into the regression in a way the model can use alongside the drivers.

Weather. Temperature deviation from seasonal norm, precipitation, snow depth, severe-weather warning flags. See the weather in your forecast article for the wider treatment.

Marketing and business events. Email send volume, campaign flag, product launch, billing cycle date, regulatory notification date. Most operations have this data sitting in a marketing system or a calendar; the planner just has to ask for it.

System and operational signals. System outage flags, planned-maintenance windows, IVR changes, recent product updates. These are usually flagged manually but pay back for years once the discipline is in place.

External economic or social signals. Petrol price, unemployment rate, customer-base growth, social-media sentiment, competitor outage data. These matter more in some industries than others; they are usually worth a once-a-year review to see whether any of them have crept into significance.
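The first two families are the cheapest to build. A sketch of what they look like as a pandas feature table, with illustrative figures and column names:

```python
import pandas as pd

# Daily volume history (figures are illustrative).
idx = pd.date_range("2024-01-01", periods=28, freq="D")
df = pd.DataFrame({"volume": range(1000, 1028)}, index=idx)

# Calendar features: free and almost always significant.
df["day_of_week"] = df.index.dayofweek            # 0 = Monday
df["month"] = df.index.month
df["is_weekend"] = (df.index.dayofweek >= 5).astype(int)

# Lagged history: same day last week and last fortnight.
df["lag_7"] = df["volume"].shift(7)
df["lag_14"] = df["volume"].shift(14)

# Rows whose lags reach back before the history starts are unusable.
features = df.dropna()
```

Weather, marketing, and operational flags join the same table as extra columns once the data owner has been asked for them.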

Choosing the model

For a contact centre starting out with regression, two model choices consistently work.

Linear regression — or its regularised cousins, ridge and lasso — is the right starting point. It is explainable (each coefficient tells you exactly how much each driver moves the forecast), runs in milliseconds, and is robust to small data. The output is a model finance and audit will accept without resistance.
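Ridge is worth reaching for as soon as drivers correlate with each other, which in contact centre data they usually do. A minimal closed-form sketch in NumPy (the penalty value is illustrative, and a careful implementation would exclude the intercept from the penalty):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Least squares plus an L2 penalty on the coefficients,
    which keeps them stable when drivers are nearly collinear.
    Closed form: (X'X + lam*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Two nearly collinear drivers and a noise-free target, for illustration.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 0.9 * x1 + rng.normal(scale=0.1, size=50)   # almost a copy of x1
X = np.column_stack([np.ones(50), x1, x2])
y = 100 + 3 * x1 + 2 * x2

coef_ols = ridge_fit(X, y, lam=0.0)    # plain least squares
coef_ridge = ridge_fit(X, y, lam=5.0)  # shrunk, more stable coefficients
```

With lam=0 the function reduces to ordinary least squares; raising lam trades a little bias for much lower variance in the coefficients.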

Gradient-boosted trees (XGBoost, LightGBM, CatBoost) are the right next step when the data is large enough and the relationships are clearly non-linear. They typically outperform linear regression by a few percentage points of accuracy when the drivers interact in complex ways. They are also the model class most modern WFM platforms use under the hood when they claim to use AI for forecasting. See the AI for forecasting article for the wider picture.
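The payoff from trees comes from kinked, non-linear driver effects. A sketch using scikit-learn's GradientBoostingRegressor as a dependency-light stand-in for the XGBoost/LightGBM/CatBoost family, on synthetic data where volume responds to cold deviations but not warm ones:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic non-linear driver effect: cold days drive volume up,
# warm days do nothing -- a kink a linear model cannot capture.
rng = np.random.default_rng(1)
temp_dev = rng.uniform(-10, 10, size=400)
weekday = rng.integers(0, 7, size=400)
volume = 1200 + 40 * np.maximum(-temp_dev, 0) + 30 * (weekday == 0)

X = np.column_stack([temp_dev, weekday])
model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
model.fit(X, volume)

# A cold Monday should forecast well above the warm-day baseline.
cold_monday = model.predict([[-8.0, 0]])[0]
warm_midweek = model.predict([[5.0, 3]])[0]
```

The trees learn the kink from the data without being told where it is; a linear model on the same features would average the cold and warm slopes into one misleading coefficient.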

Avoid jumping straight to deep learning. The marginal accuracy gain over a well-tuned gradient-boosted model is usually small, and the cost in training time, explainability, and maintenance burden is large.

Engineering the features

The single biggest determinant of regression performance is feature quality, not model choice. Three habits separate good feature engineering from sloppy work.

Normalise carefully. Temperature should be deviation from seasonal norm, not raw degrees Celsius. Volume should be log-transformed if the variance grows with the level. Calendar features should be one-hot encoded (separate flags for Monday, Tuesday, etc.), not numeric.
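All three normalisations are one-liners in pandas. The seasonal norm and the figures below are illustrative:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=10, freq="D")
df = pd.DataFrame({
    "volume": [1200, 1350, 1100, 1500, 1250, 900, 950, 1300, 1400, 1280],
    "temp_c": [3.0, 1.5, -2.0, 4.0, 6.0, 5.5, 2.0, 0.0, -1.0, 3.5],
}, index=idx)

# Temperature as deviation from the seasonal norm, not raw degrees.
seasonal_norm = 4.0                  # illustrative January norm
df["temp_dev"] = df["temp_c"] - seasonal_norm

# Log-transform the target if variance grows with the level.
df["log_volume"] = np.log(df["volume"])

# One-hot encode day of week rather than leaving it numeric.
dow_flags = pd.get_dummies(df.index.dayofweek, prefix="dow")
```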

Avoid data leakage. A feature that uses information from the future is not allowed. The classic mistake is to include a lagged variable that will not actually have been observed by forecast time. For a one-week-ahead daily forecast, the seven-day lag of the target is already in hand when you forecast, but the one-day lag refers to a day that has not happened yet — so a one-week lag is fine and a one-day lag is not.
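The rule generalises: for an h-step-ahead forecast, only lags of h steps or more are observable at forecast time. A tiny helper makes the check explicit (the function name is my own, not a library API):

```python
def safe_lags(candidate_lags, horizon):
    """For an h-step-ahead forecast, a lag of k periods is only
    observed at forecast time when k >= h; shorter lags leak
    future information into the training data."""
    return [k for k in candidate_lags if k >= horizon]

# Forecasting seven days ahead: the 7- and 14-day lags are safe,
# the 1-day lag is not yet observed for the target day.
usable = safe_lags([1, 7, 14], horizon=7)
```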

Test the feature individually. A driver that does not move the target in a univariate analysis rarely earns its place in a multivariate model. Inspect the correlation, plot the relationship, then add it to the model.

Explainability and governance

One of regression’s underappreciated strengths is that the model is explainable in a way pure time-series methods are not. The planner can answer the question “why is next week’s forecast 8% higher than last week’s?” with specifics: the temperature forecast is colder, a marketing send is going out on Wednesday, the bank holiday effect kicks in. Each driver’s contribution to the forecast is calculable and reportable. For governance, audit, and finance conversations, this explainability is genuinely valuable. Modern tooling (SHAP values, partial dependence plots) makes the same explainability available for gradient-boosted models, where the underlying maths is otherwise opaque.
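For a linear model the week-on-week explanation is literal arithmetic: each driver's contribution to the forecast change is its coefficient times the change in the driver. Coefficients and driver values below are illustrative:

```python
# Each coefficient times the change in its driver gives that
# driver's contribution to the forecast delta (illustrative values).
coef = {"temp_dev": -25.0, "email_sends_k": 12.0, "bank_holiday": 180.0}

last_week = {"temp_dev": 0.0, "email_sends_k": 4.0, "bank_holiday": 0}
next_week = {"temp_dev": -3.0, "email_sends_k": 9.0, "bank_holiday": 1}

contributions = {
    name: coef[name] * (next_week[name] - last_week[name])
    for name in coef
}
total_change = sum(contributions.values())
# The planner can now report: "+75 from colder weather, +60 from
# the marketing send, +180 from the bank holiday effect."
```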

Combining with a time-series baseline

A pragmatic pattern that often outperforms either method alone: run a Holt-Winters or SARIMA baseline, then use a regression model to predict the residual (the part Holt-Winters missed). The final forecast is the baseline plus the regression’s predicted residual. This approach gets you the seasonality discipline of Holt-Winters and the driver awareness of regression in one workflow. It is also robust — if the regression layer breaks, the Holt-Winters baseline still produces something sensible.
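A minimal sketch of the pattern, with a per-day-of-week average standing in for the Holt-Winters baseline so the example stays dependency-free (data and driver are synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 7 * 16                                    # 16 weeks of daily data
dow = np.arange(n) % 7                        # day of week, 0..6
weekly = np.array([1.2, 1.0, 1.0, 1.0, 1.1, 0.6, 0.5])
marketing = rng.integers(0, 2, size=n)        # campaign-day flag
volume = 1000 * weekly[dow] + 150 * marketing

# Step 1: a seasonal baseline (per-day-of-week mean stands in for
# Holt-Winters here) captures the weekly shape.
baseline = np.array([volume[dow == d].mean() for d in range(7)])[dow]
residual = volume - baseline                  # what the baseline missed

# Step 2: regress the residual on the driver the baseline cannot see.
X = np.column_stack([np.ones(n), marketing])
coef, *_ = np.linalg.lstsq(X, residual, rcond=None)

# Final forecast = seasonal baseline + driver-predicted residual.
def forecast(day_of_week, campaign_flag):
    base = volume[dow == day_of_week].mean()
    return base + coef[0] + coef[1] * campaign_flag
```

If the driver feed fails, `forecast` degrades gracefully to roughly the seasonal baseline rather than falling over.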

Common mistakes

Four patterns recur.

Too many features for the amount of data. A model with twenty features fitted to one year of weekly data has more parameters than independent observations and will overfit.

Forgetting to validate out-of-sample. A regression that fits the training data perfectly often forecasts the future badly; always hold out the last few months as a validation set.

Treating coefficients as causal. The model tells you correlation, not causation; a coefficient on “day of school holidays” does not mean school holidays caused the change, just that the two are associated.

Forgetting the driver forecast. To forecast next month’s volume using temperature as a driver, you need to forecast temperature. Errors in driver forecasts compound; sometimes the simpler model that doesn’t use the driver is more accurate overall.
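Out-of-sample validation is the cheapest of the four to get right. A sketch with synthetic weekly data: fit on everything except the last twelve weeks, then score only on the held-out tail:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return np.mean(np.abs((actual - forecast) / actual)) * 100

# Two years of weekly volumes with a trend and noise (illustrative).
rng = np.random.default_rng(3)
t = np.arange(104)
volume = 1000 + 5 * t + rng.normal(scale=40, size=104)

# Hold out the last 12 weeks -- the model never sees them in training.
train, test = volume[:-12], volume[-12:]
t_train, t_test = t[:-12], t[-12:]

# Fit on the training window only, score on the held-out tail.
coef = np.polyfit(t_train, train, deg=1)
pred = np.polyval(coef, t_test)
holdout_mape = mape(test, pred)
```

Training-set error for the same model would look flattering; `holdout_mape` is the number that predicts how the model will behave next quarter.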

Conclusion

Regression with drivers is the right next step for a planning function that has outgrown pure time-series methods. It is more work than Holt-Winters — data assembly, feature engineering, ongoing validation — but it pays back where external drivers materially shape demand. The planners who adopt it carefully, start with linear regression, validate honestly, and combine it with a robust time-series baseline produce forecasts that are both more accurate and more defensible than anything a pure time-series method can offer. The planners who adopt it carelessly produce sophisticated-looking models that do not beat Holt-Winters by enough to justify the complexity. The discipline is in the feature engineering, not the model choice.

Pair this with exponential smoothing for the baseline, weather in your forecast for the most reliable driver, and using AI for contact centre forecasting for where this leads.
