How to calculate staffing numbers for chat
Erlang doesn’t work for chat
Most operations applying voice-style Erlang maths to their chat queue get the wrong answer. The reasons are structural: chat agents handle multiple concurrent conversations, the service-level definition is different, the arrival pattern interacts with concurrency in non-linear ways, and the natural variability around the mean behaves differently. The result is staffing models that either under-staff (chat queue grows and SL collapses) or over-staff (agents idle while the model recommends “just one more”). This article walks through the staffing calculation that actually works for chat, the assumptions that break it, and the practical method most planning teams should use.
The model that works
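The concurrency-based staffing model has four planning inputs, plus the productivity factor in the formula below, which discounts agent time not spent handling chats.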
Forecast chat volume per interval. Just like voice — arrivals per 15 or 30 minutes.
Average handling time per chat. The time from first agent message to chat close, measured as elapsed (calendar) time rather than agent-active time. Usually 15–30 minutes depending on complexity.
Target concurrency. The number of simultaneous chats an agent handles. Operationally set, typically 2–4. Critical: this is the planning concurrency, not the maximum allowed.
Service-level target. For chat, usually first-response time (typically “90% of chats answered within 60 seconds”) rather than resolution time.
The staffing number falls out as:
Agents required = (Volume × AHT in minutes) ÷ (interval minutes × concurrency × productivity factor)
For example: 60 chats per 30-minute interval, AHT 20 minutes, target concurrency 2.5, productivity 75% gives: (60 × 20) ÷ (30 × 2.5 × 0.75) = 1200 ÷ 56.25 ≈ 21.3 agents. Add a buffer for first-response SL variance (10–15% is typical) and round up.
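The formula and the worked example above can be sketched in a few lines of Python. This is a minimal illustration, not a planning tool: the function name `agents_required` and its parameter names are made up for this sketch, and it returns the raw figure before any buffer or rounding.

```python
def agents_required(volume, aht_minutes, interval_minutes, concurrency, productivity):
    """Raw (unbuffered, unrounded) agents needed for one interval.

    Implements: (volume x AHT) / (interval minutes x concurrency x productivity).
    All names here are illustrative assumptions, not part of any library.
    """
    if min(interval_minutes, concurrency, productivity) <= 0:
        raise ValueError("interval, concurrency and productivity must be positive")
    workload_minutes = volume * aht_minutes                      # chat-minutes to be handled
    capacity_per_agent = interval_minutes * concurrency * productivity  # chat-minutes one agent absorbs
    return workload_minutes / capacity_per_agent


# The worked example from the text: 60 chats, 20-minute AHT, 30-minute interval,
# planning concurrency 2.5, productivity 75%.
raw = agents_required(volume=60, aht_minutes=20, interval_minutes=30,
                      concurrency=2.5, productivity=0.75)
print(round(raw, 1))  # 21.3
```

Keeping the buffer and the rounding out of this function makes it easy to see how much of the final headcount comes from the workload and how much from the safety margin.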
The assumptions that break it
Three assumptions in the model are usually wrong and need calibration to your operation.
Concurrency drift. The “target concurrency” you set isn’t always what agents actually run. New agents handle fewer concurrent chats; experienced ones handle more; the average drifts. Measure your actual concurrency; don’t assume the target is achieved.
Complex-contact AHT inflation. Chat AHT for complex contacts is much longer than for simple ones, sometimes 3–5x. If the mix is skewing complex (often after deflection has removed the simple contacts), the average AHT in your forecast model needs to track the mix change. See capacity planning when mix is changing.
Abandon-and-call behaviour. Customers who abandon a slow chat often call voice. The chat queue and the voice queue are connected; staffing chat too thinly raises voice volume. Plan them together, not separately.
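The first two calibrations can be sketched as follows. Both helpers are assumptions made for illustration: `measured_concurrency` takes actual concurrency to be chat-minutes handled divided by the agent-minutes that were available for chat (paid time already scaled by productivity, so it stays consistent with the formula), and `mix_weighted_aht` takes the forecast AHT to be a share-weighted average across contact types. The figures in the example are invented, not benchmarks.

```python
def measured_concurrency(chat_minutes_handled, agent_minutes_available):
    """Average simultaneous chats per available agent over a period (assumed definition)."""
    if agent_minutes_available <= 0:
        raise ValueError("agent_minutes_available must be positive")
    return chat_minutes_handled / agent_minutes_available


def mix_weighted_aht(mix):
    """Share-weighted AHT in minutes; mix is a list of (share, aht_minutes) pairs."""
    total_share = sum(share for share, _ in mix)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("contact-type shares must sum to 1")
    return sum(share * aht for share, aht in mix)


# Hypothetical week: 4,500 chat-minutes handled in 2,000 available agent-minutes.
actual = measured_concurrency(4500, 2000)

# Hypothetical mix: simple chats at 12 minutes, complex chats at 48 minutes (4x).
aht_before = mix_weighted_aht([(0.7, 12), (0.3, 48)])  # 70% simple
aht_after = mix_weighted_aht([(0.5, 12), (0.5, 48)])   # 50% simple after deflection

print(actual)                # 2.25
print(round(aht_before, 1))  # 22.8
print(round(aht_after, 1))   # 30.0
```

In this made-up case the planning concurrency would be 2.25 rather than a 2.5 target, and the forecast AHT would rise from about 23 to 30 minutes as the simple contacts are deflected away. Both changes push the required headcount up.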
The first-response SL maths
Voice SL is “answered within 20 seconds”; chat SL is usually “first agent response within X seconds.” The maths is different because the queue empties differently — an agent finishing one chat picks up the longest-waiting one. The first-response SL needs slightly more capacity than the calculation suggests; a 10–15% buffer over the formula above is typical for hitting first-response targets reliably.
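As a rough sketch of the buffer step, the snippet below applies the 10% and 15% ends of the typical range to the 21.3-agent example and rounds up. The helper name `buffered_headcount` is hypothetical, and the small `round` guard is a design choice of this sketch to stop floating-point noise from tipping an exact whole number up by one agent.

```python
import math


def buffered_headcount(raw_agents, buffer):
    """Apply a fractional buffer (e.g. 0.10 for 10%) and round up to whole agents."""
    if buffer < 0:
        raise ValueError("buffer must be non-negative")
    # round() guards against results like 22.000000000000004 becoming 23.
    return math.ceil(round(raw_agents * (1 + buffer), 6))


raw = (60 * 20) / (30 * 2.5 * 0.75)   # the 21.3-agent example from the formula
low = buffered_headcount(raw, 0.10)
high = buffered_headcount(raw, 0.15)
print(low, high)  # 24 25
```

For this interval the buffer moves the answer from 22 (the raw figure rounded up) to 24 or 25 agents, depending on which end of the range you choose.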
The practical method
For most planning teams the right approach is: compute the calculation above for each interval; add the buffer; round up; validate against your actual concurrency rather than your target; and recalibrate quarterly as the contact mix and agent experience curve shift. This is more responsive than running a fortnightly simulation and more honest than treating chat as Erlang-able.
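Put together, the per-interval method might look like the sketch below. Everything in it is illustrative: the forecast volumes, the measured concurrency of 2.2, the 10% buffer and the function name `staff_interval` are assumptions chosen for the example, not recommendations.

```python
import math


def staff_interval(volume, aht_minutes, interval_minutes,
                   actual_concurrency, productivity, buffer):
    """Buffered, rounded-up headcount for one interval (hypothetical helper)."""
    raw = (volume * aht_minutes) / (interval_minutes * actual_concurrency * productivity)
    # round() keeps floating-point noise from adding a spurious agent.
    return math.ceil(round(raw * (1 + buffer), 6))


# Hypothetical forecast: chats arriving per 30-minute interval.
forecast = {"09:00": 40, "09:30": 60, "10:00": 75}

plan = {
    slot: staff_interval(volume, aht_minutes=20, interval_minutes=30,
                         actual_concurrency=2.2,  # measured, not the 2.5 target
                         productivity=0.75, buffer=0.10)
    for slot, volume in forecast.items()
}
print(plan)  # {'09:00': 18, '09:30': 27, '10:00': 34}
```

Recalibrating then means updating the AHT, concurrency and productivity arguments each quarter from measured data, rather than changing the calculation itself.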
Conclusion
Chat staffing requires its own model. Volume, AHT, concurrency and productivity go into one formula, with a buffer on top for first-response SL variance. Don’t apply Erlang. Don’t assume target concurrency is achieved. Don’t plan chat independently of voice. Operations that do this properly hit their chat SL with predictable staffing; operations that don’t live with chronic chat backlog or chronic over-staffing.
Pair with pros and cons of chat, does chat remove voice, chat concurrency and AHT, Poisson and natural noise, and the Erlang C calculator (for the voice side).