Webchat & concurrency
Slides done? Here’s the same idea in a bit more depth — the part worth keeping.
In depth: live like a call, but one agent runs several
Like a call, a chat must be served while the customer waits, so service level still matters. Unlike a call, one agent runs two, three or four chats at once — and that single fact, concurrency, means you cannot staff chat with plain voice Erlang. Concurrency is a planning input as important as AHT, and just as easy to assume wrongly, so getting it right is most of the job.
Why concurrency isn’t a clean multiplier
It’s tempting to say “concurrency 3, so one agent equals three voice agents”, but juggling chats adds switching overhead, and at the same instant two customers can both have sent a message and be waiting on the same agent for a reply. Effective capacity therefore rises with concurrency, but less than proportionally. Concurrency also inflates handle time: a chat handled alongside two others takes longer in wall-clock time than the same chat handled solo, so “chat AHT” isn’t a fixed number. It depends on the concurrency you actually run, and the two must be measured together. And there’s a quality ceiling: push concurrency too high and response times lengthen, customers wait between replies, quality drops and agents burn out, with a practical limit often around two to three for complex chat. Like occupancy, it’s a dial with a danger zone.
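To make the "less than proportional" point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each extra concurrent chat stretches wall-clock handle time by a fixed percentage. That linear stretch, the 15% figure and the function names are assumptions, not measured values or a standard model; your own data will give a different curve.

```python
def aht_at_concurrency(solo_aht_s: float, concurrency: int, stretch_per_extra_chat: float) -> float:
    """Wall-clock handle time of one chat when run alongside (concurrency - 1) others.

    ASSUMPTION: handle time grows linearly with each extra concurrent chat.
    """
    return solo_aht_s * (1 + stretch_per_extra_chat * (concurrency - 1))


def effective_multiplier(concurrency: int, stretch_per_extra_chat: float) -> float:
    """How many solo-chat agents one agent is worth at this concurrency.

    Throughput is concurrency / AHT, so the solo AHT cancels out and the
    multiplier is concurrency divided by the handle-time stretch.
    """
    return concurrency / (1 + stretch_per_extra_chat * (concurrency - 1))


if __name__ == "__main__":
    SOLO_AHT_S = 480      # hypothetical: 8 minutes when handled alone
    STRETCH = 0.15        # hypothetical: +15% wall-clock time per extra chat
    for c in range(1, 5):
        aht = aht_at_concurrency(SOLO_AHT_S, c, STRETCH)
        mult = effective_multiplier(c, STRETCH)
        print(f"concurrency {c}: chat AHT {aht:.0f}s, worth {mult:.2f} solo agents")
```

Under these assumed numbers the multiplier at concurrency 3 comes out well below 3, and the chat AHT printed for each row differs, which is why the two have to be measured as a pair. The sketch does not model the quality ceiling at all.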
How to staff it, honestly
The common approach is to convert chat demand using a realistic concurrency factor into an equivalent workload and apply Erlang-style maths, accepting it’s an approximation; for accuracy at high concurrency or blended chat and voice, simulation does better than any closed formula. Be honest about blending, too — an agent on a call can’t simultaneously give a chat real-time attention, so true voice-plus-chat blending usually means one or the other at a time, not genuine concurrency across both, and assuming otherwise overstates capacity. Finally, measure chat in the round: the concurrency you’re running, the time between customer message and agent reply (not just time to first response), and resolution — any one alone hides the trade-off you’re making.
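The conversion described above can be sketched as follows. This is one common way to do it, not the only one: divide the chat AHT measured at the concurrency you run by that concurrency to get an equivalent per-agent handle time, then apply standard Erlang C. The function names, the 80%-in-60-seconds target and the demand figures are hypothetical, and the result carries all the approximation caveats in the text.

```python
import math


def erlang_c(agents: int, traffic: float) -> float:
    """Erlang C probability that a contact has to wait (traffic in erlangs)."""
    if agents <= traffic:
        return 1.0  # overloaded: every contact waits
    # Erlang B via the standard recursion, then convert to Erlang C.
    b = 1.0
    for n in range(1, agents + 1):
        b = traffic * b / (n + traffic * b)
    return b / (1 - (traffic / agents) * (1 - b))


def service_level(agents: int, traffic: float, aht_s: float, target_s: float) -> float:
    """Fraction of contacts answered within target_s seconds."""
    if agents <= traffic:
        return 0.0
    wait_prob = erlang_c(agents, traffic)
    return 1 - wait_prob * math.exp(-(agents - traffic) * target_s / aht_s)


def chat_agents_required(chats_per_hour: float, aht_at_concurrency_s: float,
                         concurrency: float, target_s: float, sl_goal: float) -> int:
    """Approximate chat agents needed, using a concurrency factor.

    ASSUMPTION: an agent running `concurrency` chats behaves like a
    single-contact agent with handle time AHT / concurrency.
    """
    equivalent_aht_s = aht_at_concurrency_s / concurrency
    traffic = chats_per_hour * equivalent_aht_s / 3600
    agents = max(1, math.floor(traffic) + 1)
    while service_level(agents, traffic, equivalent_aht_s, target_s) < sl_goal:
        agents += 1
    return agents


if __name__ == "__main__":
    # Hypothetical demand: 120 chats/hour, AHT 720s measured at concurrency 2.
    print(chat_agents_required(120, 720, 2, target_s=60, sl_goal=0.80))
```

Pass in the AHT actually observed at the concurrency you plan to run, not the solo figure, or the factor is counted without its cost. For high concurrency or blended chat and voice, treat the output as a starting point to check against simulation.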
The principle to remember: concurrency is the whole game, and it’s not linear. Capacity rises with it but less than proportionally, AHT stretches with it, and there’s a quality ceiling — staff on a realistic factor or simulate, be honest about blending, and watch concurrency, response time and resolution together.
Quick quiz
1. Why can’t you staff webchat with plain voice Erlang?
Service level still matters, but concurrency breaks the one-agent-one-contact assumption.
2. Does concurrency of 3 mean one agent equals three voice agents?
Switching overhead and simultaneous waits mean it’s not a clean multiplier.
3. What does concurrency do to chat handle time?
Chat AHT depends on the concurrency you run — measure them together.
4. What happens if you push concurrency too high?
There’s a practical ceiling — concurrency is a dial with a danger zone, like occupancy.
5. Can an agent on a call give a chat real-time attention at the same time?
Be honest about what people can actually do at once, or the plan overstates capacity.