Why longer calls can improve your service level

The instinct that’s wrong as often as it’s right

The instinct under service-level pressure is always the same. The queue is building. The dashboard is amber. The team is told to shorten calls, tighten wrap-up, get customers off the line faster. AHT comes down for an hour. Service level recovers. The intervention is declared a success. The instinct gets reinforced. The next time the queue builds, the same lever gets pulled.

The instinct is sometimes right and frequently wrong. The same operations that pull this lever are often the operations whose service level drifts down over months despite the daily firefighting. The reason is counter-intuitive but well-documented in the contact-centre literature: rushed calls produce repeat contacts, transfers, escalations, and complaints — each of which costs the queue more than the time the original call would have taken to handle properly. An extra 30 seconds on the first call is often cheaper to the queue than the second call it prevents.

This article walks through the maths, the operational conditions where the effect dominates, where it doesn’t, how to spot the opportunity in your own data, the quality and coaching implications, and the leadership conversation needed to defend “longer is better” when the AHT trend line is moving the wrong way.

The maths most operations don’t do

Start with a simple model. A queue receives 200 first contacts an hour. AHT is 6 minutes. First-contact resolution (FCR) is 80%, meaning 20% of those contacts will produce a repeat contact within a few days. Real total demand is therefore 200 + (200 × 0.20) = 240 contacts an hour, of which 200 are first contacts and 40 are repeats. (The model counts one repeat per unresolved contact and ignores repeats of repeats.)

Now the intervention. The team pushes AHT down from 6 minutes to 5.5 minutes, a reduction of about 8%. Looks like a clear win. But the corner-cutting drops FCR from 80% to 70%. Repeat-contact rate rises from 20% to 30%. Real demand becomes 200 + (200 × 0.30) = 260 contacts an hour, of which 200 are first contacts and 60 are repeats.

Workload maths (the offered load that an Erlang calculation takes as its input): at AHT 6 / FCR 80%, total handle time per hour is 240 contacts × 6 minutes = 1,440 agent-minutes = 24 agent-hours of work. At AHT 5.5 / FCR 70%, total handle time per hour is 260 × 5.5 = 1,430 agent-minutes = 23.8 agent-hours of work. Marginal saving. But repeat contacts are typically harder (frustrated customer, complicated history), so the average AHT on repeats is closer to 7 minutes than 5.5. Recalculating the rushed scenario: 200 × 5.5 + 60 × 7 = 1,100 + 420 = 1,520 agent-minutes = 25.3 agent-hours. For a like-for-like comparison, the baseline’s 40 repeats need the same 7-minute treatment: 200 × 6 + 40 × 7 = 1,200 + 280 = 1,480 agent-minutes = 24.7 agent-hours.

The “efficiency” intervention has increased total handle-time demand by about 3% like for like (1,480 to 1,520 agent-minutes), or about 5.6% against the flat 1,440 figure. Same staffing produces a worse service level. The metric that moved on the dashboard improved; the operation got worse.
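
The arithmetic above is easy to rerun with your own numbers. The following is a minimal sketch, not a staffing tool: the function name and parameters are illustrative, it counts at most one repeat per first contact, and it measures workload only (it does not compute service level). The last line applies the 7-minute repeat AHT to the baseline as well, which is an added like-for-like check.

```python
def workload_minutes(first_contacts, first_aht, repeat_rate, repeat_aht=None):
    """Agent-minutes of handle time per hour, including the repeats caused.

    repeat_aht defaults to first_aht, i.e. repeats are assumed no harder
    than first contacts. Repeats of repeats are ignored.
    """
    if repeat_aht is None:
        repeat_aht = first_aht
    repeats = first_contacts * repeat_rate
    return first_contacts * first_aht + repeats * repeat_aht


# Figures from the worked example in the text.
baseline = workload_minutes(200, 6.0, 0.20)            # 240 contacts at 6 min
rushed_flat = workload_minutes(200, 5.5, 0.30)         # 260 contacts at 5.5 min
rushed_hard = workload_minutes(200, 5.5, 0.30, 7.0)    # repeats take 7 min
baseline_hard = workload_minutes(200, 6.0, 0.20, 7.0)  # like-for-like baseline

for label, minutes in [
    ("baseline", baseline),
    ("rushed, flat repeat AHT", rushed_flat),
    ("rushed, 7-min repeats", rushed_hard),
    ("baseline, 7-min repeats", baseline_hard),
]:
    print(f"{label}: {minutes:.0f} agent-minutes = {minutes / 60:.1f} agent-hours")
```

Changing the repeat AHT or the post-intervention repeat rate shows how quickly the apparent saving disappears.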

The hidden cost of rushed calls. The time saved per call is often outweighed by the lift in repeat rate multiplied by the longer AHT on repeats, so the result is more total work, not less.

Where the effect dominates

The longer-calls-help-SL effect is strongest in operations with four characteristics.

1. Resolution-driven contact reasons. Account problems, technical support, complaints, complex enquiries. Where the customer’s problem either gets solved or it doesn’t. The effect is weaker in transactional contacts (a balance check, a simple data update) where the answer is binary and short.

2. High repeat-contact rate. Operations above 25% repeat-contact rate almost always benefit from longer first calls. Operations at 8% probably don’t — the marginal call to prevent isn’t there.

3. Long AHT on repeat calls. When a customer calls back, their AHT is typically 20–50% longer than the original call — more history, more frustration, more verification, more escalation. The bigger this delta, the bigger the case for longer first calls.

4. Tight CSAT/complaint sensitivity. If your operation tracks complaints, FCR, or vulnerable-customer outcomes seriously, the cost of rushing isn’t just queue arithmetic — it’s regulatory exposure and complaint volume that takes weeks to clear.
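
Characteristics 2 and 3 can be turned into a break-even check: how far can the repeat rate rise before an AHT cut stops paying for itself? The sketch below assumes the simple model used earlier (one repeat per unresolved contact, a single average repeat AHT); the function name and the 9-minute comparison figure are illustrative.

```python
def break_even_repeat_rate(aht_before, aht_after, repeat_rate_before, repeat_aht):
    """Repeat rate after the cut at which total minutes per first contact are unchanged.

    Solves aht_after + r * repeat_aht == aht_before + repeat_rate_before * repeat_aht
    for r. Any repeat rate above the result means the cut added work.
    """
    minutes_saved = aht_before - aht_after
    return repeat_rate_before + minutes_saved / repeat_aht


# Earlier example: AHT cut from 6 to 5.5 minutes, 20% repeat rate, 7-minute repeats.
limit = break_even_repeat_rate(6.0, 5.5, 0.20, 7.0)
print(f"Break-even repeat rate: {limit:.1%}")

# The longer the repeat calls, the less room there is (illustrative 9-minute repeats).
tighter = break_even_repeat_rate(6.0, 5.5, 0.20, 9.0)
print(f"With 9-minute repeats: {tighter:.1%}")
```

In the earlier example the repeat rate rose to 30%, which is above the break-even point of roughly 27%, so the cut added work.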

Where it doesn’t hold

Three conditions where the conventional “cut AHT” instinct is actually right.

Genuine wasted time. Excess pleasantries, unnecessary recap, hold time the customer doesn’t need. Some operations carry 30–60 seconds of recoverable waste per call without affecting outcome. Cutting that is real efficiency.

Simple transactional volumes. When the call is genuinely simple and the AHT is bloated by process drag, shortening helps without hurting resolution. The operations that find this usually have FCR above 90% already — the headroom is on the call, not on the outcome.

Skill-mismatched contacts. Calls that should have been transferred earlier rather than handled at length by an under-skilled agent. Here “shorter calls” really means “earlier transfers,” which is a routing improvement, not a coaching one.

How to spot the opportunity in your data

The diagnostic is straightforward and rarely run. Pull a quarter’s worth of contact-level data with three fields per contact: AHT, FCR flag, and a flag for a repeat contact within 7 days. Add contact reason as well, since step 3 needs it. Then:

1. Compute the repeat-contact rate by AHT band. Group AHT into deciles (or quintiles). For each band, calculate what percentage of those contacts produced a repeat within 7 days. In most operations, the lowest-AHT band has a meaningfully higher repeat rate than the middle bands. The curve isn’t monotonic — the very longest calls (escalations, complaints already in progress) also have high repeat rates — but the bottom-end uplift is the diagnostic signal.

2. Compute the “total minutes” cost per first-call AHT band. For each band, work out: average first-call AHT + (repeat rate × average repeat AHT). The band that minimises this total is the AHT target the operation should actually be working towards — not the lowest-AHT band, and almost never the headline-AHT-target band that came down from finance.

3. Look at the contact-reason mix in the lowest-AHT band. If the rushed calls are heavily weighted to complex contact reasons (complaints, retention, technical), the repeat-contact effect will be strong. If they’re weighted to simple transactional reasons, less so.
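
Steps 1 and 2 can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a reporting pipeline: the tuple layout, the function names, the equal-sized banding, and the ten toy contacts are all made up for the example, and the average repeat AHT is passed in as a single number rather than measured per band.

```python
from statistics import mean


def band_costs(contacts, repeat_aht, bands=5):
    """Split first contacts into equal-sized AHT bands and cost each band.

    contacts: list of (aht_minutes, caused_repeat) tuples for first contacts,
              where caused_repeat is True if a repeat arrived within 7 days.
    repeat_aht: average AHT of repeat contacts, in minutes.
    Returns one dict per band, ordered from shortest to longest calls.
    """
    ordered = sorted(contacts, key=lambda c: c[0])
    size = len(ordered) / bands
    rows = []
    for i in range(bands):
        chunk = ordered[round(i * size):round((i + 1) * size)]
        if not chunk:
            continue  # fewer contacts than bands
        avg_aht = mean(c[0] for c in chunk)
        repeat_rate = sum(1 for c in chunk if c[1]) / len(chunk)
        rows.append({
            "band": i + 1,
            "avg_aht": avg_aht,
            "repeat_rate": repeat_rate,
            # Step 2: first-call AHT + (repeat rate x average repeat AHT)
            "total_minutes": avg_aht + repeat_rate * repeat_aht,
        })
    return rows


def cheapest_band(rows):
    """The band with the lowest total minutes per first contact."""
    return min(rows, key=lambda r: r["total_minutes"])


# Toy data only: (AHT in minutes, repeat within 7 days?)
contacts = [
    (3.0, True), (3.5, True), (5.0, True), (5.5, False), (6.0, False),
    (6.5, False), (7.5, False), (8.0, False), (12.0, True), (14.0, True),
]
rows = band_costs(contacts, repeat_aht=7.0, bands=5)
for r in rows:
    print(f"band {r['band']}: AHT {r['avg_aht']:.2f}, "
          f"repeat rate {r['repeat_rate']:.0%}, total {r['total_minutes']:.2f} min")
print("cheapest band:", cheapest_band(rows)["band"])
```

On real data, use deciles if volumes allow, and run the same banding per contact reason so that step 3 does not get averaged away.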

What changes in QA and coaching

The longer-calls insight changes the QA conversation. Three shifts.

The form should reward resolution, not speed. See “What to actually score on a quality form”. The category that lifts the operation is customer outcome, not AHT-compatibility. Operations that score “the agent kept the call short” positively are scoring against themselves.

Coaching shifts from speed to depth. “Resolve the issue properly. Take the time. The next call you save is the one you didn’t cause.” This requires the team leader to understand the underlying maths — which most don’t until somebody walks them through it.

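Performance management criteria change. An agent at AHT 7:30 and FCR 88% can be outperforming an agent at AHT 5:15 and FCR 68%, but it depends on what a repeat costs. Per customer issue, the first agent uses 7.5 + 0.12 × repeat AHT minutes and the second 5.25 + 0.32 × repeat AHT, so the first produces fewer total handle-time minutes only when repeats average more than about 11 minutes; at a 7-minute repeat AHT the second is still cheaper on minutes alone, before complaints and CSAT are counted. The performance-management framework needs to run that comparison rather than rank on AHT, or the operation will keep promoting the wrong behaviour.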
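
Whether the longer-call agent really is cheaper depends on what a repeat contact costs, so it is worth checking rather than assuming. A minimal sketch, assuming one repeat per unresolved contact and a shared average repeat AHT; the function names and the 12-minute comparison figure are illustrative.

```python
def minutes_per_issue(aht, fcr, repeat_aht):
    """First-call AHT plus the expected handle time of the repeat it may cause."""
    return aht + (1 - fcr) * repeat_aht


def break_even_repeat_aht(aht_a, fcr_a, aht_b, fcr_b):
    """Repeat AHT above which agent A (longer calls, higher FCR) is cheaper than B."""
    return (aht_a - aht_b) / (fcr_a - fcr_b)


careful = (7.5, 0.88)   # AHT 7:30, FCR 88%
quick = (5.25, 0.68)    # AHT 5:15, FCR 68%

for repeat_aht in (7.0, 12.0):
    a = minutes_per_issue(*careful, repeat_aht)
    b = minutes_per_issue(*quick, repeat_aht)
    print(f"repeat AHT {repeat_aht:.0f} min: careful {a:.2f}, quick {b:.2f}")

threshold = break_even_repeat_aht(*careful, *quick)
print(f"careful agent is cheaper once repeats exceed {threshold:.2f} minutes")
```

With these two agents the crossover sits at about 11 minutes of repeat AHT. Handle time is only one side of the comparison; complaints, CSAT, and regulatory exposure are not in this calculation.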

The leadership conversation

The hardest part isn’t the maths. It’s the conversation with the operations director or CFO when the AHT trend line ticks up and they ask why. The answer needs to be ready:

“AHT is up 4% this quarter. Repeat-contact rate is down 12 points. Total handle-time demand is down 5%. Service level is stable. CSAT is up 3 points. The longer calls are cheaper.”

That conversation only works if the operation is measuring repeat-contact rate alongside AHT, reporting them together, and treating total handle time as the headline efficiency metric. Most operations report AHT in isolation and lose the argument before they start. Operations that report the full picture defend longer calls comfortably; operations that report AHT alone find themselves cutting calls year after year, attrition rising, complaints rising, and SL drifting down despite the visible “efficiency.”
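
Reporting the numbers together can be as simple as computing all three movements from the same two quarters. The sketch below is illustrative only: the dictionary keys, the function name, and both quarters’ figures are hypothetical, and total handle time is modelled as first-call AHT plus repeat rate times repeat AHT.

```python
def quarter_report(prev, curr):
    """Compare two quarters on AHT, repeat rate, and total minutes per issue.

    prev and curr are dicts with 'aht' (minutes), 'repeat_rate' (fraction)
    and 'repeat_aht' (minutes).
    """
    total_prev = prev["aht"] + prev["repeat_rate"] * prev["repeat_aht"]
    total_curr = curr["aht"] + curr["repeat_rate"] * curr["repeat_aht"]
    return {
        "aht_change_pct": 100 * (curr["aht"] / prev["aht"] - 1),
        "repeat_rate_change_points": 100 * (curr["repeat_rate"] - prev["repeat_rate"]),
        "total_minutes_change_pct": 100 * (total_curr / total_prev - 1),
    }


# Hypothetical quarters, for illustration only.
last_quarter = {"aht": 6.00, "repeat_rate": 0.30, "repeat_aht": 7.0}
this_quarter = {"aht": 6.24, "repeat_rate": 0.18, "repeat_aht": 7.0}

report = quarter_report(last_quarter, this_quarter)
print(f"AHT: {report['aht_change_pct']:+.1f}%")
print(f"Repeat-contact rate: {report['repeat_rate_change_points']:+.1f} points")
print(f"Total handle time per issue: {report['total_minutes_change_pct']:+.1f}%")
```

Stating the repeat-rate movement in percentage points keeps it from being confused with a relative change.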

The honest scope of the claim

This isn’t an argument that longer is always better. It’s an argument that AHT is a partial metric, that the queue includes the calls you cause as well as the calls you receive, and that the operation should optimise for total handle time per resolved customer issue rather than for AHT in isolation. In most operations that re-framing tilts the answer towards slightly longer first calls. In a few it doesn’t. The discipline is doing the maths rather than running on instinct — because the instinct, on its own, costs more than it saves.

Conclusion

The instinct under SL pressure is to shorten calls. The maths is more nuanced. Repeat contacts cost the queue more than the original call would have, repeat AHT is typically longer than first-call AHT, and the customer experience is worse. In most operations with FCR below 85% and repeat-contact rates above 20%, a deliberate move to slightly longer first calls produces a lower total handle-time demand, a higher service level for the same headcount, and a meaningfully better customer experience. The operations that get this right report the full picture, coach for resolution rather than speed, and have rehearsed the leadership conversation that defends the AHT line moving in what looks like the wrong direction.

Pair this with “Composite metrics that hide the truth”, “Is service level a dead KPI”, “Designing a meaningful QA programme”, “Sometimes the right thing to do is nothing”, and the Erlang C calculator.