The data problem nobody owns

Forecasting · ~7 minute read

No algorithm fixes bad data

The single most consistent way contact-centre forecasts go wrong isn’t in the algorithm. It’s in the raw data that feeds the algorithm. ACD reports double-count abandoned calls. Wrap codes drift over six months without anyone noticing. The IVR-deflection figure can’t be reproduced from week to week. The marketing-campaign flag column the planner is supposed to populate has been blank since April. The model runs cleanly on top of all of this and produces a confidently wrong answer. The planner tunes the model. The accuracy doesn’t move. The data was always going to be the limit.

The five data failures that recur

ACD aggregation inconsistencies. The most common: contacts that abandoned in the IVR are counted differently in different reports. The volume number the planner forecasts against and the volume number the operations report quotes don’t match. Both are produced from the same ACD. Both look authoritative. Neither owner knows the other exists.

Wrap-code drift. The agent codes for “billing query” and “account query” were 50/50 a year ago and are now 80/20. Nothing changed in the operation. The agents drifted toward one code because it was easier to find in the dropdown. The demand-decomposition forecast that splits volume by call reason is now systematically wrong, and the trend looks like a real shift in customer behaviour.

IVR-deflection numbers that can’t be reproduced. The IVR vendor produces a number; the ACD produces a different number; the planner picks one and forecasts against it. Six months later nobody can reproduce either figure. The forecast accuracy looks erratic because the denominator is.

The flag column nobody updates. The marketing-campaign column, the system-release column, the regulatory-event column: each is a manual flag that someone is supposed to populate, and each is empty for two-thirds of the year. The model treats the blank weeks as “no event” weeks when in fact nobody recorded the event.

Timezone and DST inconsistencies. The ACD reports in GMT, the WFM reports in local time, the data warehouse stores in UTC. The clock-change weekends produce a 23-hour day in March and a 25-hour day in October that nobody flags. Forecast accuracy for those two weeks silently looks worse than it is.
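The clock-change days can be flagged mechanically rather than remembered. A minimal sketch using Python's standard zoneinfo module, assuming the operation works in Europe/London local time; the zone and the function names are illustrative assumptions, not taken from any ACD or WFM platform:

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def local_day_length_hours(day: date, tz_name: str = "Europe/London") -> float:
    """Return the real elapsed hours between local midnight and the next local midnight."""
    tz = ZoneInfo(tz_name)  # assumed zone; swap in the operation's own
    start = datetime(day.year, day.month, day.day, tzinfo=tz)
    following = day + timedelta(days=1)
    end = datetime(following.year, following.month, following.day, tzinfo=tz)
    # Convert to UTC first: subtracting two datetimes that share a tzinfo
    # compares wall-clock times and would always report 24 hours.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed.total_seconds() / 3600


def clock_change_days(first: date, last: date, tz_name: str = "Europe/London") -> dict:
    """Map each day in [first, last] whose length is not 24 hours to that length."""
    flagged = {}
    day = first
    while day <= last:
        hours = local_day_length_hours(day, tz_name)
        if hours != 24:
            flagged[day] = hours
        day += timedelta(days=1)
    return flagged


print(clock_change_days(date(2024, 1, 1), date(2024, 12, 31)))
```

For 2024 this flags 31 March (23 hours) and 27 October (25 hours). Those are the days whose interval counts need checking before they are scored against a forecast built on a 24-hour day.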

In indicative terms, the data layer is the biggest accuracy lift available to most operations, and the algorithm is the smallest.

Why nobody owns this

The data layer sits in a no-man’s-land between three functions. The ACD is owned by IT or telephony. The WFM platform is owned by the planning team. The data warehouse is owned by BI or analytics. Each function owns its piece, none owns the consistency between them, and the reconciliation work falls into the gaps. When a discrepancy shows up, the conversation is “that’s an IT issue” or “that’s a planning issue,” and the resolution slips until the next discrepancy.

The planning team is usually the function that suffers most from the gap and the function least empowered to fix it. The pragmatic move is for the planning team to take ownership of the reconciliation even when they don’t own the underlying systems, because the planning team is the one that ends up explaining the missed forecast.

The reconciliation discipline

Three habits, run consistently, fix most data problems.

The weekly source-of-truth reconciliation. One spreadsheet, one hour a week. ACD volume vs WFM volume vs data warehouse vs ops report. The four numbers should agree; when they don’t, the planner chases the difference. Within a quarter, the chase produces a list of named issues that the planning team can drive to resolution.
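The weekly check is simple enough to script once the four numbers are in one place. A minimal sketch, assuming one weekly volume per source; the source names, the volumes, and the 1% tolerance are hypothetical and should be replaced with whatever the operation considers a material gap:

```python
from itertools import combinations


def reconcile(volumes: dict, tolerance: float = 0.01) -> list:
    """Return (source_a, source_b, relative_gap) for every pair that disagrees.

    relative_gap is the absolute difference divided by the larger of the two
    volumes, so 0.04 means the sources are 4% apart.
    """
    issues = []
    for (name_a, vol_a), (name_b, vol_b) in combinations(volumes.items(), 2):
        larger = max(vol_a, vol_b)
        gap = abs(vol_a - vol_b) / larger if larger else 0.0
        if gap > tolerance:
            issues.append((name_a, name_b, round(gap, 4)))
    return issues


# Hypothetical weekly contact volumes, one per source.
week = {"acd": 10_000, "wfm": 10_000, "warehouse": 9_990, "ops_report": 10_450}
for name_a, name_b, gap in reconcile(week):
    print(f"{name_a} vs {name_b}: {gap:.1%} apart")
```

In this made-up week the ops report disagrees with all three other sources while those three agree with each other, which points the chase at the ops report's definition of a contact rather than at the ACD.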

The wrap-code quarterly audit. The top fifteen wrap codes, this quarter vs last quarter vs same quarter last year. Codes that have drifted by more than 20% without an operational change are red-flagged for review. Almost always there’s a behaviour story behind the drift (a code is easier to find, a code has been re-labelled, a team leader has told their team to use a specific code). Catching the drift early prevents the forecast model from learning the wrong pattern.
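The comparison itself is a few lines once the counts per code are extracted. A minimal sketch, assuming the 20% threshold means a relative change in a code's share of all wrapped contacts (the text does not say whether it is relative or in percentage points); the function name and the counts are hypothetical:

```python
def drifted_codes(previous: dict, current: dict,
                  threshold: float = 0.20, top_n: int = 15) -> dict:
    """Map each drifted code to the relative change in its share of contacts.

    Only the top_n codes by previous-period volume are examined. Codes that
    appear only in the current period are not covered by this sketch.
    """
    prev_total = sum(previous.values())
    curr_total = sum(current.values())
    if prev_total == 0 or curr_total == 0:
        return {}
    flagged = {}
    top_codes = sorted(previous, key=previous.get, reverse=True)[:top_n]
    for code in top_codes:
        prev_share = previous[code] / prev_total
        if prev_share == 0:
            continue
        curr_share = current.get(code, 0) / curr_total
        change = (curr_share - prev_share) / prev_share
        if abs(change) > threshold:
            flagged[code] = change
    return flagged


# Hypothetical counts mirroring a 50/50 split that has become 80/20.
last_year = {"billing query": 500, "account query": 500}
this_year = {"billing query": 800, "account query": 200}
for code, change in drifted_codes(last_year, this_year).items():
    print(f"{code}: share changed by {change:+.0%}")
```

Both codes are flagged here, one up by roughly 60% and one down by roughly 60%. The script only surfaces the drift; deciding whether an operational change explains it is still a conversation with the floor.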

The flag-column audit. Every quarter, scan the manual flag columns for the proportion of weeks correctly populated. If it’s below 80%, the data isn’t reliable enough to use in the model. Either get it populated properly (often via an automated feed) or stop relying on it in the forecast.
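The population check can be sketched as follows. It assumes every week is supposed to carry an explicit value, including an explicit “none” for weeks with no event, so that a blank can only mean nobody recorded anything. That convention, the function names, and the sample values are assumptions for illustration. The sketch measures whether a value is present, not whether it is correct:

```python
from typing import Optional, Sequence


def populated_share(flags: Sequence[Optional[str]]) -> float:
    """Proportion of weeks with any explicit value; None and blank count as missing."""
    if not flags:
        return 0.0
    populated = sum(1 for value in flags if value is not None and value.strip())
    return populated / len(flags)


def reliable_enough(flags: Sequence[Optional[str]], threshold: float = 0.80) -> bool:
    """True when the column meets the population threshold (80% by default)."""
    return populated_share(flags) >= threshold


# Hypothetical 13-week quarter of a marketing-campaign flag column.
quarter = ["none"] * 5 + ["spring campaign"] * 2 + [None] * 4 + [""] * 2
print(f"{populated_share(quarter):.0%} populated, usable: {reliable_enough(quarter)}")
```

Seven of the thirteen weeks carry a value in this example, so the column falls below the threshold and should be fixed or dropped from the model.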

The conversation to have with IT

Most data discrepancies have a technical fix that takes IT roughly half a day. The reason it doesn’t happen is that it’s never anyone’s priority. The planning team that brings IT a specific issue, a specific impact (“this is the biggest single source of forecast error we have, costing us roughly £X”), and a specific ask gets the half-day. The planning team that complains generally about “data quality” doesn’t. The conversation is identical to the finance conversation: be specific, name the impact, name the ask.

Conclusion

The data layer is the cheapest forecast-accuracy lift available to most operations and the one most consistently neglected. The work isn’t technically difficult; it’s organisational. Nobody owns it, the planning team suffers from it, and the planning team has to take responsibility for fixing it even though they don’t own the underlying systems. Operations that build the reconciliation discipline find their forecast accuracy lifts more than any algorithm change could deliver. Operations that skip it spend years tuning models that were always going to be limited by the data feeding them.

Next in the series: Forecasting the wrong thing — granularity, metric, and channel.

Pair this with the Excel paradox, demand decomposition by call reason, and speech analytics for planners.