Where forecasts most often go wrong


The failure is rarely the algorithm

Most forecasts that fail don’t fail because the forecasting method was wrong. The Holt-Winters was fine; the ARIMA was reasonable; the foundation model produced sensible numbers. What failed was upstream of the model, downstream of the model, or in the conversation about what the model was actually being asked to do. A planning team that obsesses over algorithm choice while ignoring the surrounding failure modes ends up with a beautifully-tuned forecast that’s confidently wrong.

This series walks through the six failure modes that recur across operations, sectors, and forecasting methods. Five of them are each the subject of one of the next five pieces (the algorithm itself is already covered by the existing methods articles); this opening article names all six so the diagnostic frame is visible from the start.

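Figure: the diagnostic landscape. Most operations spend most of their time on box 5 (the algorithm), while most of the actual damage lives in boxes 1, 2, and 6 (bad raw data, forecasting the wrong thing, and the missing business-intelligence layer).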

The six failure modes

1. Bad raw data. The most common failure and the one nobody usually owns. ACD reports that double-count abandoned calls. Wrap codes that drift over six months without anyone noticing. The IVR-deflection number nobody can reproduce. The marketing-campaign flag column the planner is supposed to populate that’s been blank since April. No algorithm fixes any of this. See the data problem nobody owns.

2. Forecasting the wrong thing. A technically correct forecast of the wrong metric. Voice-only volume in a multi-channel operation. Gross dialled volume when routing happens on net. Daily totals when the SL conversation lives at interval level. Each of these is a category error that planners regularly make and most operations don’t challenge until something visibly fails. See forecasting the wrong thing.

3. The point estimate. The single-number forecast that nobody believes by the end of the year. Finance treats every miss as evidence that planning “can’t forecast,” the planner accumulates a reputation for over-confidence, and the credibility loss isn’t recoverable by a better algorithm. See the point estimate trap.

4. Noise treated as signal (and vice versa). Contact arrival data is genuinely noisy. Most operations over-react to short-term variances that mean nothing and under-react to small persistent drifts that mean a lot. Both failures look the same on a dashboard. See noise vs signal.

5. The algorithm itself. Genuinely the smallest of the six. Algorithm choice matters at the margin, and operations should care about it — but they should care about it after the surrounding failures are addressed. The order matters.

6. The business-intelligence layer that isn’t there. Marketing knows about the campaign that’s about to launch. IT knows about the system release that will distort AHT. Finance knows about the pricing move that will produce a complaint spike. None of it reaches the planning team unless the planning team has built the forum that surfaces it. See the business-intelligence layer.
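
Failure mode 4 is the easiest of the six to make concrete. The sketch below assumes arrivals are roughly Poisson, so a forecast of λ contacts carries a standard deviation of about √λ. The function names, the 2-sigma limit, and the five-period run length are illustrative assumptions, not thresholds taken from this series, and real contact data is often more variable than pure Poisson.

```python
import math

def poisson_z(actual: float, forecast: float) -> float:
    """Miss expressed in Poisson standard deviations (sqrt of the forecast mean)."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    return (actual - forecast) / math.sqrt(forecast)

def classify(actuals, forecasts, z_limit=2.0, run_length=5):
    """Label the most recent periods as 'drift', 'spike', or 'noise'.

    'drift': the last run_length periods all missed in the same direction,
             even if each miss is individually small.
    'spike': the latest period missed by more than z_limit sigma.
    'noise': neither of the above.
    The thresholds are assumptions chosen for illustration.
    """
    zs = [poisson_z(a, f) for a, f in zip(actuals, forecasts)]
    if not zs:
        return "noise"
    recent = zs[-run_length:]
    same_sign = all(z > 0 for z in recent) or all(z < 0 for z in recent)
    if len(recent) == run_length and same_sign:
        return "drift"
    if abs(zs[-1]) > z_limit:
        return "spike"
    return "noise"
```

With a forecast of 400 contacts per period (one sigma is 20), actuals of 410, 385, 395, 420, 390 classify as noise, while 412, 409, 415, 411, 414 classify as drift even though no single miss exceeds one sigma. That second case is the small persistent drift that tends to go unnoticed on a dashboard.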

Why this order matters

Operations that try to improve forecasting by upgrading the algorithm first usually find the upgrade doesn’t deliver the accuracy lift they expected. The bigger gains live in the failures the algorithm can’t see — the data, the granularity, the BI layer. A planning team that audits its data, confirms it’s forecasting the right thing, builds a prediction interval, and runs a weekly drivers conversation typically lifts forecast accuracy by 30–50% before the algorithm choice matters at all. A team that upgrades the algorithm without fixing the surrounding failures usually finds the new model is no better than the old one.
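
Building a prediction interval does not require a new model. One minimal approach, sketched here as an assumption rather than as the method this series prescribes, is to scale the point forecast by percentiles of past actual-to-forecast ratios. The function name and the 10th/90th percentile defaults are hypothetical choices.

```python
import statistics

def empirical_interval(point_forecast, past_actuals, past_forecasts,
                       lower_pct=10, upper_pct=90):
    """Return (low, high) around point_forecast from past actual/forecast ratios.

    Assumes past forecast errors are representative of future ones, which
    will not hold if the process or the forecasting method has changed.
    """
    if not (1 <= lower_pct < upper_pct <= 99):
        raise ValueError("percentiles must satisfy 1 <= lower < upper <= 99")
    ratios = [a / f for a, f in zip(past_actuals, past_forecasts) if f > 0]
    if len(ratios) < 2:
        raise ValueError("need at least two past periods with a positive forecast")
    # quantiles(n=100) returns 99 cut points; index i - 1 is the i-th percentile.
    cuts = statistics.quantiles(ratios, n=100, method="inclusive")
    return (point_forecast * cuts[lower_pct - 1],
            point_forecast * cuts[upper_pct - 1])
```

For example, if five past periods were each forecast at 1,000 and came in at 900, 950, 1,000, 1,050, and 1,100, a new point forecast of 1,000 gets an interval of about 920 to 1,080. Five periods is far too short a history in practice; the point is only that the range comes from the operation's own track record.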

The honest exception

Some operations genuinely have a model problem: a small-volume queue with a long history of being treated as a single average; a series with strong but unmodelled seasonality; a foundation-model implementation that’s been left untuned. In those cases the algorithm is the bottleneck. The diagnostic that tells the two situations apart is a single question: is the model performing materially worse than a naive baseline (seasonal naive, same-day-last-week)? If yes, the algorithm is the problem. If no, the algorithm is fine and one of the other five failures is doing the damage.
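
The baseline test can be sketched in a few lines. The version below uses mean absolute error and a seasonal-naive forecast that repeats the value from one season earlier. The function names are hypothetical, and where the line for "materially worse" sits is a judgment call this sketch does not settle.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast each future step with the value observed one season earlier."""
    if season_length <= 0 or len(history) < season_length:
        raise ValueError("need at least one full season of history")
    start = len(history) - season_length
    return [history[start + (h % season_length)] for h in range(horizon)]

def mae(actuals, forecasts):
    """Mean absolute error over paired actuals and forecasts."""
    if not actuals or len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be non-empty and equal length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)

def relative_mae(actuals, model_forecasts, baseline_forecasts):
    """Model MAE divided by baseline MAE; above 1 means the model loses to the baseline."""
    base = mae(actuals, baseline_forecasts)
    if base == 0:
        raise ValueError("baseline error is zero; ratio is undefined")
    return mae(actuals, model_forecasts) / base
```

For daily volumes, `seasonal_naive(history, season_length=7, horizon=7)` gives the same-day-last-week baseline. A ratio well above 1 on held-out periods points at the algorithm; a ratio comfortably below 1 suggests looking at the other five failure modes first.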

The series ahead

The next five pieces in this series each take one of the failure modes (the algorithm isn’t covered — the existing methods articles already do that). Each piece walks through what the failure actually looks like, why it’s so persistent, and what the operations that catch it early do differently. None of the fixes is technically difficult; all of them are operationally unglamorous. The reason most operations still struggle with forecast accuracy isn’t that the maths is hard. It’s that the disciplines are.

Next in the series: The data problem nobody owns.

Pair this with forecast accuracy metrics, your forecast is probably more accurate than you think, and Poisson and natural noise.