ccPlanning Academy · Channel planning track

Webchat & concurrency

Deep-dive lesson · about 10 minutes · short quiz at the end

Live like a call, but one agent handles several at once — and that breaks the maths.

The big idea

Chat is real-time, but concurrent.

Like a call, a chat must be served while the customer waits, so service level still matters. Unlike a call, one agent runs two, three or four chats at once. That single fact, concurrency, breaks the one-agent-one-contact assumption behind plain voice Erlang, so you cannot staff chat with it.

The concurrency factor

How many chats per agent, really.

Example: one agent, 3 concurrent chats.

A planning input as important as AHT — and just as easy to assume wrongly.

Why it isn’t a clean multiplier

Three chats don’t mean triple capacity.

It’s tempting to say “concurrency 3, so one agent = three voice agents.” Not so. Juggling chats adds switching overhead, and at any instant two customers can both be waiting on the same agent for a reply. Effective capacity rises with concurrency, but less than proportionally.

Handle time stretches

Concurrency inflates AHT.

A chat handled alongside two others takes longer in wall-clock time than the same chat handled solo — the agent is dividing attention and there are gaps while the customer types. So “chat AHT” isn’t a fixed number; it depends on the concurrency you actually run. Measure them together.
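One way to measure the two together is to bucket closed chats by the concurrency the agent was running and compare average wall-clock handle time per bucket. The sketch below assumes a hypothetical export of `(concurrency, handle_seconds)` pairs; the sample figures are invented for illustration and are not benchmarks. The capacity multiplier it derives also assumes the agent sustains that concurrency continuously, which real operations rarely do.

```python
from collections import defaultdict
from statistics import mean

def aht_by_concurrency(chats):
    """Average wall-clock handle time (seconds) per concurrency level."""
    buckets = defaultdict(list)
    for concurrency, handle_seconds in chats:
        buckets[concurrency].append(handle_seconds)
    return {level: mean(times) for level, times in sorted(buckets.items())}

def effective_multiplier(aht, level):
    """Throughput at `level` relative to solo handling.

    An agent finishes `level` chats per stretched AHT, versus one chat
    per solo AHT, so the ratio is level * solo_aht / aht_at_level.
    Requires a solo (level 1) bucket in the data.
    """
    return level * aht[1] / aht[level]

# Invented sample data: (concurrency while handled, wall-clock handle seconds)
sample_chats = [(1, 600), (1, 620), (2, 700), (2, 720), (3, 810), (3, 850)]

aht = aht_by_concurrency(sample_chats)
for level in aht:
    print(level, aht[level], round(effective_multiplier(aht, level), 2))
```

With these invented numbers, concurrency 3 gives a little over twice solo throughput rather than three times, because the handle time stretched as the agent juggled more chats.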

The quality ceiling

Push concurrency too high and everything sags.

More chats per agent looks efficient until response times lengthen, customers wait between replies, quality drops and agents burn out. There’s a practical ceiling — often around 2–3 for complex chat — beyond which apparent efficiency turns into bad service. Concurrency is a dial with a danger zone, like occupancy.

How to staff it

Erlang on concurrency-adjusted capacity, or simulate.

The common approach: convert chat demand using a realistic concurrency factor to an equivalent workload, then apply Erlang-style maths — accepting it’s an approximation. For accuracy at high concurrency or blended chat/voice, simulation (from the advanced track) does better than any closed formula.
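A minimal sketch of that approach, under stated assumptions: each chat is taken to consume `solo_aht_s / effective_concurrency` seconds of agent time, and the resulting load is fed into the standard Erlang C formulas. The function names, the parameter names and the input figures are hypothetical, and treating a chat agent as a single faster server is exactly the approximation the lesson warns about.

```python
import math

def erlang_c_wait_probability(agents, load):
    """Erlang C: probability that an arriving contact has to queue."""
    if agents <= load:
        return 1.0
    # Erlang B by recursion, which avoids large factorials
    blocking = 1.0
    for n in range(1, agents + 1):
        blocking = load * blocking / (n + load * blocking)
    return agents * blocking / (agents - load * (1 - blocking))

def service_level(agents, load, aht_s, answer_within_s):
    """Fraction of contacts answered within the target time."""
    if agents <= load:
        return 0.0
    wait_probability = erlang_c_wait_probability(agents, load)
    return 1 - wait_probability * math.exp(-(agents - load) * answer_within_s / aht_s)

def chat_agents_needed(chats_per_hour, solo_aht_s, effective_concurrency,
                       sl_target=0.8, answer_within_s=30):
    """Smallest agent count meeting the service level on adjusted workload."""
    # Assumption: agent time per chat shrinks by the effective factor
    agent_seconds_per_chat = solo_aht_s / effective_concurrency
    load = chats_per_hour * agent_seconds_per_chat / 3600  # erlangs
    agents = max(1, math.ceil(load))
    while service_level(agents, load, agent_seconds_per_chat, answer_within_s) < sl_target:
        agents += 1
    return agents

# Hypothetical inputs: 120 chats/hour, 600 s solo AHT
print("clean multiplier 3.0:", chat_agents_needed(120, 600, 3.0))
print("realistic factor 2.2:", chat_agents_needed(120, 600, 2.2))
```

Running both lines side by side shows how many agents a plan built on the clean multiplier leaves out. The realistic factor itself has to come from measurement, not from this code.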

The blended-agent question

Voice and chat on the same person?

An agent on a call can’t simultaneously give a chat real-time attention — true voice+chat blending usually means one or the other at a time, not genuine concurrency across both. Be honest about what your people can actually do at once, or the plan overstates capacity.

Measuring chat

Watch concurrency, response time and resolution — together.

A healthy chat operation reports the concurrency it’s running, the time between customer message and agent reply (not just the time to first response), and resolution. Looking at any one alone hides the trade-off you’re actually making.
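The difference between time to first response and time between customer message and agent reply can be sketched from a single transcript. The example assumes a hypothetical transcript format of `(seconds_since_start, sender)` tuples sorted by time; the timestamps are invented for illustration.

```python
from statistics import mean

def reply_gaps(messages):
    """Seconds each customer waited for the next agent reply.

    The clock starts at the first unanswered customer message and
    stops at the next agent message; follow-up customer messages sent
    while still waiting do not restart it.
    """
    gaps = []
    waiting_since = None
    for timestamp, sender in messages:
        if sender == "customer":
            if waiting_since is None:
                waiting_since = timestamp
        elif sender == "agent" and waiting_since is not None:
            gaps.append(timestamp - waiting_since)
            waiting_since = None
    return gaps

# Invented transcript for one chat
transcript = [
    (0, "customer"), (20, "agent"),
    (50, "customer"), (65, "customer"), (140, "agent"),
    (150, "customer"), (170, "agent"),
]

gaps = reply_gaps(transcript)
print("first response:", gaps[0])          # 20
print("mean reply gap:", round(mean(gaps), 1))
print("worst reply gap:", max(gaps))       # 90
```

Here the first response looks healthy at 20 seconds, while the worst mid-chat gap is 90 seconds. Reporting only time to first response would hide that wait.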

Why “times three” is a trap

Concurrency 3 isn’t 3× the agents

It’s tempting: “we run 3 chats each, so 10 agents = 30 voice-equivalent.” But switching between chats has overhead, two customers can be waiting on a reply at once, and each chat’s wall-clock AHT stretches the more you juggle. Real effective capacity is more like 2–2.3×, not 3×, so those 10 agents are worth roughly 20–23 voice-equivalent.

Plan on the clean multiplier and you under-staff and miss response times. Capacity rises with concurrency — just less than proportionally.
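The size of the gap is simple arithmetic. This sketch uses the lesson's own figures (10 agents, a clean multiplier of 3, a realistic factor of 2 to 2.3); the function name is made up for illustration.

```python
import math

def agents_to_match(voice_equivalent, factor):
    """Chat agents needed to deliver a given voice-equivalent capacity."""
    return math.ceil(voice_equivalent / factor)

agents = 10
naive_capacity = agents * 3.0  # the "times three" assumption

for factor in (2.0, 2.3):
    real_capacity = agents * factor
    print(
        f"factor {factor}: real capacity {real_capacity:.0f} "
        f"vs planned {naive_capacity:.0f}, "
        f"agents actually needed {agents_to_match(naive_capacity, factor)}"
    )
```

On these numbers, covering a 30 voice-equivalent workload takes 14 to 15 chat agents rather than 10.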

The takeaway

Concurrency is the whole game — and it’s not linear.

Chat is live but concurrent: capacity rises with concurrency but less than proportionally, AHT stretches with it, and there’s a quality ceiling. Staff on a realistic concurrency factor (or simulate), be honest about blending, and watch concurrency, response time and resolution together.

Slides done? Here’s the same idea in a bit more depth — the part worth keeping.

In depth: live like a call, but one agent runs several

Like a call, a chat must be served while the customer waits, so service level still matters. Unlike a call, one agent runs two, three or four chats at once — and that single fact, concurrency, means you cannot staff chat with plain voice Erlang. Concurrency is a planning input as important as AHT, and just as easy to assume wrongly, so getting it right is most of the job.

Why concurrency isn’t a clean multiplier

It’s tempting to say “concurrency 3, so one agent equals three voice agents.” But juggling chats adds switching overhead, and at any instant two customers can both be waiting on the same agent, so effective capacity rises with concurrency less than proportionally. Concurrency also inflates handle time: a chat handled alongside two others takes longer in wall-clock time than the same chat handled solo. “Chat AHT” therefore isn’t a fixed number; it depends on the concurrency you actually run, and the two must be measured together. And there’s a quality ceiling: push concurrency too high and response times lengthen, customers wait between replies, quality drops and agents burn out, with a practical limit often around two to three for complex chat. Like occupancy, it’s a dial with a danger zone.

How to staff it, honestly

The common approach is to convert chat demand into an equivalent workload using a realistic concurrency factor, then apply Erlang-style maths, accepting that it’s an approximation. For accuracy at high concurrency or with blended chat and voice, simulation does better than any closed formula. Be honest about blending, too: an agent on a call can’t simultaneously give a chat real-time attention, so true voice-plus-chat blending usually means one or the other at a time, not genuine concurrency across both, and assuming otherwise overstates capacity. Finally, measure chat in the round: the concurrency you’re running, the time between customer message and agent reply (not just time to first response), and resolution. Any one alone hides the trade-off you’re making.

The principle to remember: concurrency is the whole game, and it’s not linear. Capacity rises with it but less than proportionally, AHT stretches with it, and there’s a quality ceiling — staff on a realistic factor or simulate, be honest about blending, and watch concurrency, response time and resolution together.

Quick quiz

Five questions. Pick an answer to each, then check your score.

1. Why can’t you staff webchat with plain voice Erlang?

Service level still matters, but concurrency breaks the one-agent-one-contact assumption.

2. Does concurrency of 3 mean one agent equals three voice agents?

Switching overhead and simultaneous waits mean it’s not a clean multiplier.

3. What does concurrency do to chat handle time?

Chat AHT depends on the concurrency you run — measure them together.

4. What happens if you push concurrency too high?

There’s a practical ceiling — concurrency is a dial with a danger zone, like occupancy.

5. Can an agent on a call give a chat real-time attention at the same time?

Be honest about what people can actually do at once, or the plan overstates capacity.