Chat concurrency and AHT — the metrics most operations measure wrong

Concurrency is the chat lever that voice doesn’t have

The single biggest driver of chat economics is concurrency — how many simultaneous conversations an agent handles. It’s also the metric most operations measure incorrectly, target poorly, or both. Pushing concurrency from 2 to 3 looks like a 50% productivity lift on paper. In practice the lift is 25–35% because each additional simultaneous chat adds AHT (the agent task-switches) and reduces FCR (less focused conversation). Above a certain level, the marginal chat costs more in quality than it returns in efficiency.
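A rough way to see why the paper lift overstates the real one: throughput per agent is concurrency divided by AHT, so any AHT increase eats into the gain. The sketch below is illustrative only. The AHT values (20 and 23 minutes) and the function names are assumptions, not figures from this article.

```python
def chats_per_hour(concurrency: float, aht_minutes: float) -> float:
    """Chats completed per agent-hour at a given concurrency and AHT."""
    return concurrency * 60.0 / aht_minutes


def productivity_lift(before: float, after: float) -> float:
    """Relative change in throughput, e.g. 0.30 means a 30% lift."""
    return after / before - 1.0


# Assumed AHT values, for illustration only.
base = chats_per_hour(concurrency=2, aht_minutes=20.0)
pushed = chats_per_hour(concurrency=3, aht_minutes=23.0)

nominal_lift = 3 / 2 - 1.0                     # the 50% "on paper" figure
actual_lift = productivity_lift(base, pushed)  # smaller once AHT growth is counted
print(f"nominal {nominal_lift:.0%}, actual {actual_lift:.0%}")
```

With these assumed inputs the real lift lands near 30%, inside the 25–35% range described above. FCR loss is not modelled here and would reduce the net gain further.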

What concurrency actually is

Three measurements that get conflated.

Maximum concurrency. The platform limit on how many chats an agent can hold simultaneously. Often set to 4 or 5. Operationally meaningless.

Routing concurrency. The number the routing engine assigns to an agent before sending the next chat elsewhere. The planning target.

Observed concurrency. The actual average number of active chats per agent, measured over time. The real number.

Observed concurrency is almost always lower than routing concurrency because of natural gaps: the period between one chat closing and the next starting, wrap-up time, and agent breaks. The honest planning number is observed concurrency, typically 0.3–0.7 below the routing target.
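One simple way to compute observed concurrency is total chat-handling time divided by total logged-in time over the same window. The sketch below assumes chat records are available as start/end timestamps already clipped to the window; the function name, the sample data, and the routing target of 3 are hypothetical.

```python
from datetime import datetime, timedelta


def observed_concurrency(chats, logged_in_seconds):
    """Average active chats per logged-in agent over a window.

    chats: list of (start, end) datetime pairs, clipped to the window.
    logged_in_seconds: total logged-in agent time in the same window.
    """
    if logged_in_seconds <= 0:
        raise ValueError("logged_in_seconds must be positive")
    chat_seconds = sum((end - start).total_seconds() for start, end in chats)
    return chat_seconds / logged_in_seconds


window_start = datetime(2024, 1, 1, 9, 0)


def at(minute):
    """Timestamp a given number of minutes into the window."""
    return window_start + timedelta(minutes=minute)


# Hypothetical data: one agent, logged in for 60 minutes, routing target 3.
sample_chats = [
    (at(0), at(40)),
    (at(5), at(50)),
    (at(10), at(55)),
    (at(45), at(60)),
]
observed = observed_concurrency(sample_chats, logged_in_seconds=60 * 60)
print(round(observed, 2))
```

In this sample the agent never holds more than three chats at once, yet the average comes out below 3 because of the ramp-up at the start of the hour and the gap between one chat closing and the next starting.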

Productivity rises with concurrency; quality falls. The sweet spot is operation-specific — usually 2.0–3.0 observed concurrency for most contact-reason mixes.

The trade-off curve

For most operations the curve sits around: concurrency 1.5 gives AHT ~18 min and FCR 88%; concurrency 2.5 gives AHT ~22 min and FCR 82%; concurrency 3.5 gives AHT ~30 min and FCR 74%; concurrency 4.5+ becomes operationally unsustainable for most contact types.

Translated to per-resolved-customer cost: concurrency 1.5 is wasteful; concurrency 2.5–3.0 is usually optimal for mixed-complexity operations; concurrency 4+ produces more conversations per agent-hour but also more repeat contacts, which destroy the apparent gain. The honest target is where the per-resolved-customer cost is minimised, not where AHT is lowest or concurrency is highest.
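The translation from the curve to per-resolved-customer cost can be sketched with a deliberately simplified model: chats per agent-hour is concurrency × 60 / AHT, and only the FCR share of those chats counts as resolved. The three curve points come from the figures quoted above; the hourly cost of 30 and the function name are assumptions, and AHT is treated here as elapsed minutes per chat.

```python
# Trade-off points quoted above: (observed concurrency, AHT minutes, FCR).
CURVE = [
    (1.5, 18.0, 0.88),
    (2.5, 22.0, 0.82),
    (3.5, 30.0, 0.74),
]

AGENT_COST_PER_HOUR = 30.0  # assumed loaded cost, illustrative only


def cost_per_resolved(concurrency, aht_minutes, fcr, hourly_cost=AGENT_COST_PER_HOUR):
    """Cost per customer resolved at first contact (simplified model)."""
    chats_per_hour = concurrency * 60.0 / aht_minutes
    resolved_per_hour = chats_per_hour * fcr
    return hourly_cost / resolved_per_hour


costs = {c: cost_per_resolved(c, aht, fcr) for c, aht, fcr in CURVE}
best = min(costs, key=costs.get)
for c, cost in costs.items():
    print(f"concurrency {c}: {cost:.2f} per resolved customer")
print("lowest cost at concurrency", best)
```

Under these assumptions the minimum sits at 2.5, with both 1.5 and 3.5 costing more per resolved customer. A fuller model would also charge the handling time of the repeat contacts themselves, which would penalise high concurrency further.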

How to measure it honestly

Three measurements every operation running chat should track.

Observed concurrency by hour. Average active chats per logged-in agent, by interval. Reveals when concurrency drifts above a sustainable level.

AHT by concurrency level. Plot AHT against the concurrency the agent was operating at. The slope tells you the per-additional-chat AHT cost.

FCR / repeat-contact by concurrency level. Same analysis on outcome metrics. The slope here is usually steeper than the AHT one — concurrency damages quality more than it damages speed.
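The slope in both analyses can be estimated with an ordinary least-squares fit of the metric against observed concurrency. The sketch below uses only the standard library. The observations are hypothetical: the end points echo the trade-off figures quoted earlier, and the intermediate values are invented for illustration.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("concurrency values must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical observations, one per concurrency band.
concurrency = [1.5, 2.0, 2.5, 3.0, 3.5]
aht_minutes = [18.0, 20.0, 22.0, 26.0, 30.0]
fcr_percent = [88.0, 85.0, 82.0, 78.0, 74.0]

aht_slope = slope(concurrency, aht_minutes)  # minutes of AHT per additional chat
fcr_slope = slope(concurrency, fcr_percent)  # FCR points per additional chat
print(f"AHT: {aht_slope:+.1f} min per chat, FCR: {fcr_slope:+.1f} points per chat")
```

A single straight line is a simplification. If the relationship bends at higher concurrency, fitting separate slopes for each band shows where the damage accelerates.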

The operational rules that make concurrency sustainable

Five rules that prevent the concurrency target from drifting into damage.

Cap by contact reason. Simple contacts can run at higher concurrency than complex ones. Complaint chats should run at 1; transactional at 3–4. The routing engine should respect this.

Cap by agent tenure. New agents should run at lower concurrency for the first 90 days. Forcing agents in their first 90 days to maximum concurrency damages quality and drives up attrition at the same time.

Drop-off rules. When concurrency is high, the routing engine should prioritise finishing existing chats over starting new ones. Otherwise, the queue grows while agents juggle.

Quality-triggered re-routing. If a chat goes complex (long response times, escalation language), the routing engine should reduce that agent’s concurrency. Most platforms support this; most operations don’t configure it.

Real-time visibility. Agents should see their own concurrency level and the operational target. Hidden concurrency feels arbitrary; visible concurrency invites engagement.
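The first two rules can be sketched as a routing check that applies the most restrictive cap in play. This is a minimal illustration, not any platform's API: the names, the "general" cap, and the new-agent ceiling of 2 are assumptions, while the complaint cap of 1, the transactional cap of 3, and the 90-day window follow the text.

```python
from dataclasses import dataclass

# Caps per contact reason. "general" is an assumed fallback value.
REASON_CAP = {"complaint": 1, "transactional": 3, "general": 2}
NEW_AGENT_DAYS = 90
NEW_AGENT_CAP = 2  # assumed ceiling for agents in their first 90 days


@dataclass
class Agent:
    tenure_days: int
    active_reasons: list  # contact reasons of the chats currently open


def effective_cap(agent: Agent, incoming_reason: str) -> int:
    """Most restrictive cap across open chats, the incoming chat, and tenure."""
    reasons = agent.active_reasons + [incoming_reason]
    cap = min(REASON_CAP.get(r, REASON_CAP["general"]) for r in reasons)
    if agent.tenure_days < NEW_AGENT_DAYS:
        cap = min(cap, NEW_AGENT_CAP)
    return cap


def can_route(agent: Agent, incoming_reason: str) -> bool:
    """True if taking the chat keeps the agent within the effective cap."""
    return len(agent.active_reasons) + 1 <= effective_cap(agent, incoming_reason)


veteran = Agent(tenure_days=400, active_reasons=["transactional", "transactional"])
newcomer = Agent(tenure_days=30, active_reasons=["transactional", "transactional"])
print(can_route(veteran, "transactional"))   # third transactional chat for a veteran
print(can_route(veteran, "complaint"))       # complaint while other chats are open
print(can_route(newcomer, "transactional"))  # third chat for a new agent
```

Taking the minimum across all open chats means one complaint pins the agent at a single conversation until it closes, which is the behaviour the complaint rule asks for.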

Conclusion

Concurrency is the chat-specific lever and the most-mismeasured chat metric. Routing concurrency isn’t observed concurrency; observed concurrency drives the economics. The productivity gain flattens above 3.0 for most operations, so pushing higher costs more in quality than it returns in productivity. The honest target is per-resolved-customer cost, not per-conversation speed. Operations that measure and target concurrency properly run profitable chat; operations that don’t carry hidden cost in repeat contacts and attrition.

Pair with how to calculate staffing for chat, pros and cons of chat, QA for chat vs voice, composite metrics that hide the truth, and planning for async messaging.