Is traditional service level a dead KPI?


The question that won’t go away

“80% of calls answered in 20 seconds” has been the headline contact centre KPI for forty years. For at least fifteen of those years there has been a steady, credible argument that the metric is past its sell-by date. The argument has been made by academics, consultants, customer-experience leaders, and a growing number of operations directors who have quietly de-emphasised it inside their own businesses. The defenders of SL point out that no replacement has fully landed and that the metric still measures something real. Both sides are partly right. The honest position is that traditional service level isn’t dead, but it isn’t sufficient either, and the operations that treat it as the headline number are over-indexing on a single signal that quietly misleads them on most of what matters.

Where the metric came from — and why “80/20” in particular

The 80/20 target wasn’t derived from customer research. It came from AT&T in the 1970s as a practical compromise between “the customer should be answered immediately” (operationally impossible) and “the customer can wait as long as needed” (commercially unacceptable). The 80% threshold and the 20-second window were both chosen because they were achievable with the staffing the operation could afford and the Erlang maths the engineers understood. The metric became the industry standard because everyone used it, not because it was independently validated as the best measure of customer experience.

That history matters. The number is a historic operational convenience that the industry has carried for half a century without revisiting the rationale. Most operations using 80/20 today couldn’t tell you why those specific numbers, beyond “it’s what we’ve always done.”

What the critics get right

The metric is binary at the threshold. A call answered in 19 seconds counts; a call answered in 21 seconds doesn’t. To the customer, the experience is almost identical. To the metric, one is a hit and one is a miss. Operations that game the metric do so by reshaping the distribution around the threshold — answering more calls just inside 20 seconds and letting the long tail stretch further. The headline number improves; customer experience gets worse.
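The threshold effect can be seen in a minimal Python sketch. The wait times below are invented for illustration, and `service_level` is a hypothetical helper that counts a call as a hit if it is answered within the threshold.

```python
from statistics import mean


def service_level(waits, threshold=20):
    """Share of calls answered within `threshold` seconds."""
    return sum(w <= threshold for w in waits) / len(waits)


# Hypothetical wait times in seconds for ten calls.
# "honest": waits cluster around the threshold, no long tail.
honest = [5, 8, 12, 15, 18, 19, 21, 22, 25, 30]
# "gamed": two near-miss calls are pulled just inside 20 seconds,
# while two other calls are left to wait far longer.
gamed = [5, 8, 12, 15, 18, 19, 19, 20, 90, 150]

for name, waits in (("honest", honest), ("gamed", gamed)):
    print(
        f"{name}: SL={service_level(waits):.0%}, "
        f"mean wait={mean(waits):.1f}s, longest wait={max(waits)}s"
    )
```

In this made-up data the headline number rises from 60% to 80%, while the mean wait roughly doubles and the longest wait goes from 30 seconds to 150.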

It tells you about speed, not outcome. A call answered in 8 seconds and unresolved is a worse customer experience than a call answered in 35 seconds and resolved first time. SL doesn’t know the difference. The customer experience the operation is supposedly delivering is invisible to the KPI that supposedly measures it.

It correlates weakly with CSAT and NPS. Several published studies and many internal correlations have shown SL explains a small minority of the variance in customer-satisfaction outcomes — usually under 20%. The bigger drivers are resolution, advisor knowledge, and effort. SL is a hygiene factor, not a quality factor.

It encourages bad operational behaviour. Cutting calls short, rushing wrap-up, transferring rather than handling, hanging up when the customer pauses. Each behaviour improves SL and damages customer experience. The metric the operation is paid on rewards exactly the wrong things.

It distorts staffing decisions. Erlang-driven SL targets at 90/15 or 95/10 cost meaningfully more than 80/30 or 80/45 and produce a marginal customer-experience benefit at best. Operations chase tighter SLs because they can be measured, not because the customers feel a difference.

What the critics miss

The honest case for keeping SL alongside other measures rests on three points.

It’s the only sub-minute signal you have in real time. CSAT arrives a day or two after the contact. FCR can only be confirmed when (or whether) the customer comes back. NPS is monthly at best. SL is the metric the real-time analyst is watching at 11:23am while a queue is building. Whatever its faults, it’s the most timely operational signal that exists.

It’s a fairness signal as much as an experience signal. The customer who waits longer than the target is having a worse experience than the customer who gets answered inside it. That’s real, even if the threshold is arbitrary. SL captures how many customers end up on the wrong side of that line in a single number, and few alternatives do that as cleanly.

It anchors the staffing maths. Erlang takes a service-level target as input and produces a staffing number as output. The entire planning toolkit rests on having a target. Replacing SL with something less computable means either rebuilding the maths or staffing by intuition. Most replacements proposed in the literature are weaker on this dimension than their advocates acknowledge.
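To make the target-in, staffing-out relationship concrete, here is a minimal sketch using the standard Erlang C formula. It assumes Poisson arrivals, exponentially distributed handle times, and no abandonment. The function names and the example volumes are illustrative, not taken from any particular planning tool.

```python
import math


def erlang_c(agents: int, load: float) -> float:
    """Probability that a call has to wait, for an offered load in erlangs."""
    if agents <= load:
        return 1.0  # unstable queue: every call waits
    # Build the Erlang B blocking probability iteratively (avoids factorials).
    b = 1.0
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)
    # Convert Erlang B to Erlang C.
    return b / (1 - (load / agents) * (1 - b))


def service_level(agents, calls_per_hour, aht_seconds, threshold_seconds):
    """Predicted share of calls answered within the threshold."""
    load = calls_per_hour * aht_seconds / 3600
    if agents <= load:
        return 0.0
    p_wait = erlang_c(agents, load)
    return 1 - p_wait * math.exp(-(agents - load) * threshold_seconds / aht_seconds)


def agents_needed(calls_per_hour, aht_seconds, target, threshold_seconds):
    """Smallest number of agents whose predicted SL meets the target."""
    load = calls_per_hour * aht_seconds / 3600
    agents = math.floor(load) + 1  # start just above the offered load
    while service_level(agents, calls_per_hour, aht_seconds, threshold_seconds) < target:
        agents += 1
    return agents


if __name__ == "__main__":
    # Hypothetical interval: 200 calls per hour, 300-second average handle time.
    for target, threshold in [(0.80, 45), (0.80, 20), (0.90, 15), (0.95, 10)]:
        n = agents_needed(200, 300, target, threshold)
        print(f"{target:.0%} in {threshold}s -> {n} agents")
```

Running the loop for a few targets shows how the staffing number moves as the target tightens. The figures are before shrinkage and occupancy caps, which a real plan would layer on top.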

SL on its own is incomplete. The modern operation watches a small set of metrics together — speed, customer-led patience, outcome, effort, and segment-level fairness.

What’s replacing or supplementing it

The credible alternatives don’t replace SL outright. They complete the picture. Five metrics have earned a place alongside it.

Abandonment rate. The percentage of customers who hang up before being answered. The metric is customer-led: it captures the patience the actual customer base has, rather than the patience the operation assumes they have. An operation with an SL of 65% and an abandonment rate of 3% is working better than one with an SL of 82% and an abandonment rate of 8% — customers in the first case are willing to wait; customers in the second case are leaving. See the Erlang A calculator for the maths that incorporates customer patience.

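Average speed of answer (ASA). The mean wait time across all answered calls. Catches the long tail that SL hides. If the SL is 82% in 20 seconds but the ASA is 90 seconds, the 18% answered outside the threshold are waiting nearly seven minutes on average: even if every in-target call waited the full 20 seconds, the rest would have to average (90 − 0.82 × 20) / 0.18 ≈ 409 seconds. SL doesn’t tell you that; ASA does.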

First-contact resolution. The percentage of contacts resolved in a single interaction without the customer calling back. Outcome-led, not speed-led. The single best predictor of customer satisfaction in most contact centre research. Higher FCR also drives repeat-contact volume down, so the same staffed hours go further.

Customer-effort score (CES). A short customer-side survey asking how much effort the customer had to put in to get their issue resolved. The Harvard Business Review article that introduced CES in 2010 found it predicts loyalty better than NPS or CSAT for service interactions specifically. The metric travels well across channels and contact types.

Outcome by customer segment. Service level for the whole queue can be 82% while vulnerable customers, complex cases, or specific demographic groups are getting materially worse experiences. The aggregate SL hides the segment-level pattern. The FCA Consumer Duty in financial services and similar frameworks in utilities, telecoms, and gambling make segment-level outcome measurement an explicit regulatory expectation. SL alone no longer passes scrutiny.
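The metrics above can be computed side by side from the same contact records. This is a minimal sketch under stated assumptions: the `Contact` record, the segment labels, and the sample data are all hypothetical. Service level is taken here as calls answered within the threshold divided by calls offered, which is one of several conventions in use. CES is left out because it comes from a survey rather than from call records.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Contact:
    segment: str                       # hypothetical label, e.g. "standard"
    wait_seconds: float                # queue time before answer or hang-up
    answered: bool                     # False means the customer abandoned
    resolved_first_time: bool = False  # only meaningful when answered


def metrics(contacts, threshold=20):
    """Speed and outcome metrics for a non-empty list of contacts."""
    offered = len(contacts)
    answered = [c for c in contacts if c.answered]
    in_target = sum(c.wait_seconds <= threshold for c in answered)
    return {
        "service_level": in_target / offered,
        "abandonment_rate": (offered - len(answered)) / offered,
        "asa_seconds": (
            sum(c.wait_seconds for c in answered) / len(answered) if answered else None
        ),
        "fcr": (
            sum(c.resolved_first_time for c in answered) / len(answered)
            if answered
            else None
        ),
    }


def metrics_by_segment(contacts, threshold=20):
    """The same metrics, split by customer segment."""
    groups = defaultdict(list)
    for c in contacts:
        groups[c.segment].append(c)
    return {segment: metrics(group, threshold) for segment, group in groups.items()}


# Invented sample: eight standard contacts and two vulnerable ones.
contacts = [Contact("standard", w, True, True) for w in (5, 8, 10, 12, 15, 18, 19)]
contacts.append(Contact("standard", 20, True, False))
contacts.append(Contact("vulnerable", 60, True, False))
contacts.append(Contact("vulnerable", 90, False))

print("overall:", metrics(contacts))
for segment, result in metrics_by_segment(contacts).items():
    print(f"{segment}:", result)
```

In this invented sample the aggregate service level is 80%, while the vulnerable segment sits at 0% with half of its contacts abandoning, which is exactly the pattern an aggregate number conceals.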

The Consumer Duty dimension

For UK financial-services operations, this isn’t a theoretical conversation. The FCA Consumer Duty’s “consumer support” outcome explicitly asks firms to evidence that customers seeking help get appropriate, timely, and effective support. The regulator has been clear that an 80/20 service level on its own doesn’t evidence that. The expected pack now includes abandonment, ASA, segment-level outcomes for vulnerable customers, complaint volumes and resolution times, and switch-back rates from self-service. An operation that produces only an SL number in response to a regulatory review is not protected. The trend is spreading from financial services to utilities (Ofgem), telecoms (Ofcom), and the gambling sector (UKGC). The direction of travel is clear.

What good operations are doing now

The operations that handle this well have moved through three stages.

Stage 1: SL as the only headline. The pack reports SL and AHT. Everything else is buried in appendices. The operation lives or dies on the SL number. Most operations were here ten years ago. Many still are.

Stage 2: SL alongside abandonment and ASA. The three real-time speed metrics in the same view. Most operations are at this stage now — or moving towards it. It catches the worst SL-gaming behaviours but still doesn’t address outcome.

Stage 3: a balanced pack across speed, outcome, effort, and segment. SL is one of several signals. The operation can answer the regulator’s question, the CFO’s question, the customer experience team’s question, and the real-time analyst’s question with the same pack, each from a different angle. The leading operations are at this stage. The next decade of contact centre measurement looks like this.

The pragmatic answer

Is traditional service level a dead KPI? No, but it’s no longer sufficient on its own. Keep it. The real-time signal, the fairness logic, and the staffing maths all still rely on it. But demote it from “the headline number” to “one of five or six metrics that together describe the operation.” The operations that handle the next decade well will be the ones that can answer five questions with the same pack: How fast did we answer? Did the customer give up first? Did we solve the problem? How hard was it for them? Did the answer differ by segment? An SL of 82% answers one of those questions. Five answers is the bar to aim for.
