Quality assurance for chat — what changes from voice

Quality · Leadership · ~7 minute read

The voice form doesn’t port across

Many operations launch chat and use their voice QA form for the new channel. The form has “agent used the customer’s name three times”; chat doesn’t need that. The form has “tone was empathetic and warm”; chat measures empathy differently. The form scores 12 items, most of which don’t translate. The chat QA score that emerges feels arbitrary because it’s scored against criteria that weren’t designed for the medium. Chat needs its own QA programme, built from its own conversational rhythm. This article walks through what changes.

What stays the same

Three categories carry across cleanly: customer outcome (was the issue resolved?), compliance and regulatory items, and process adherence (right path, right tools). These earn their place on the chat form the same way they earn it on voice.

What changes

The chat form needs different items in five categories.

1. Response timing. Time to first response, time between messages, longest gap. Chat conversational rhythm is a quality dimension voice doesn’t have. Items earn a place when poor timing damages the customer experience.

2. Written-communication quality. Clarity, structure, professional tone, appropriate informality. Voice judges this differently — tone of voice carries half the signal. In chat the text carries all of it.

3. Concurrency-related quality. Did the agent juggle multiple chats poorly (long gaps, confused threads, visible copy-paste errors)? Voice doesn’t have this dimension; chat needs items for it.

4. Use of canned responses. Most chat platforms support templates. Quality is about choosing the right template, personalising it, and not over-relying on templates in general. Voice scripting is a different problem.

5. Emoji and informality usage. Chat tolerates informality voice doesn’t. Operations that score chat to a voice standard penalise the conversational ease that customers actually like.

The chat form needs three categories of work: keep, replace, drop. Most operations skip the replace and drop steps.
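The response-timing items in category 1 can be computed directly from message timestamps. The sketch below is a minimal illustration, not a reference implementation. It assumes a transcript is available as a list of messages with a sender label and a timestamp, and it assumes that a "gap" means the time a customer waited for the next agent reply. The names Message, TimingMetrics and timing_metrics are hypothetical, as are the "customer" and "agent" labels.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Message(NamedTuple):
    sender: str        # "customer" or "agent" (hypothetical labels)
    sent_at: datetime


class TimingMetrics(NamedTuple):
    first_response: Optional[timedelta]  # wait before the first agent reply
    mean_gap: Optional[timedelta]        # average customer wait per reply
    longest_gap: Optional[timedelta]     # worst single customer wait


def timing_metrics(messages: list[Message]) -> TimingMetrics:
    """Measure how long the customer waited for each agent reply."""
    waits: list[timedelta] = []
    waiting_since: Optional[datetime] = None
    for msg in sorted(messages, key=lambda m: m.sent_at):
        if msg.sender == "customer":
            # Start the clock at the first unanswered customer message.
            if waiting_since is None:
                waiting_since = msg.sent_at
        elif msg.sender == "agent" and waiting_since is not None:
            waits.append(msg.sent_at - waiting_since)
            waiting_since = None
    if not waits:
        return TimingMetrics(None, None, None)
    return TimingMetrics(
        first_response=waits[0],
        mean_gap=sum(waits, timedelta()) / len(waits),
        longest_gap=max(waits),
    )


# Example: the agent replies after 30 seconds, then after 3 minutes.
start = datetime(2024, 1, 1, 9, 0, 0)
chat = [
    Message("customer", start),
    Message("agent", start + timedelta(seconds=30)),
    Message("customer", start + timedelta(seconds=60)),
    Message("agent", start + timedelta(seconds=240)),
]
metrics = timing_metrics(chat)
```

For this example the first response is 30 seconds, the mean gap is 105 seconds, and the longest gap is 180 seconds. Where each threshold sits on the form (what counts as a damaging gap) remains a judgement for the operation, not something the calculation decides.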

The calibration challenges

Chat calibration is harder than voice calibration in two specific ways. First, evaluators read chats more slowly than they listen to calls: an 800-word chat takes longer to evaluate than a 5-minute call. Second, contested items on chat are usually about timing and rhythm, which are harder to discuss without the transcript in front of everyone. Plan more time for calibration sessions on chat than you allocated for voice.

AI-led QA on chat

AI-led QA works better on chat than on voice in some respects (text is easier to parse) and worse in others (informal language, abbreviations, customer typos). The honest position: AI catches compliance items and specific phrases well, performs reliably on response-time metrics, and struggles with empathy signals in informal text. Pair AI-led QA with a human layer on the contested items, just as with voice. See AI-led vs human QA.
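One way to picture the pairing is as a routing rule per form item. The sketch below is illustrative only: the category names and the reviewer_for function are hypothetical, and the split simply mirrors the strengths and weaknesses described above rather than the behaviour of any particular QA tool.

```python
# Hypothetical item categories that the AI layer is assumed to score well.
AI_SCORED = {"compliance", "required_phrase", "response_time"}


def reviewer_for(category: str, contested: bool = False) -> str:
    """Return "ai" or "human" for one QA form item."""
    if contested:
        # Contested items always get a human look, whatever the category.
        return "human"
    if category in AI_SCORED:
        return "ai"
    # Anything else (for example empathy in informal text) defaults to human.
    return "human"
```

Defaulting unknown categories to human review is a deliberately cautious choice in this sketch; an operation could reasonably make a different one.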

The operating-model question

Should chat and voice QA be the same team or separate? Three arrangements are common.

1. Same team, different forms. The most common arrangement: evaluators trained on both channels, with forms specific to each.

2. Separate teams. For operations with very large chat volumes and the budget to staff a dedicated team.

3. Same form, channel-weighted. A single form with channel-specific weighting on items. Looks clean; usually loses the specificity that makes either form useful.
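To make the channel-weighted arrangement concrete, here is a minimal sketch of how a single form with per-channel weights might be scored. Everything in it is an assumption for illustration: the item names, the weights, the 0 to 1 scoring scale, and the weighted_score function are all hypothetical.

```python
# One shared form; each item carries a weight per channel (hypothetical values).
WEIGHTS = {
    "issue_resolved":  {"voice": 0.4, "chat": 0.4},
    "compliance":      {"voice": 0.3, "chat": 0.3},
    "tone_of_voice":   {"voice": 0.3, "chat": 0.0},
    "response_timing": {"voice": 0.0, "chat": 0.3},
}


def weighted_score(item_scores: dict[str, float], channel: str) -> float:
    """Combine per-item scores (0.0 to 1.0) using the channel's weights."""
    total = 0.0
    total_weight = 0.0
    for item, weights in WEIGHTS.items():
        weight = weights[channel]
        # Skip items that do not count for this channel or were not scored.
        if weight == 0.0 or item not in item_scores:
            continue
        total += weight * item_scores[item]
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError(f"no weighted items scored for channel {channel!r}")
    return total / total_weight


# A chat with a resolved issue, full compliance, and mediocre timing.
chat_score = weighted_score(
    {"issue_resolved": 1.0, "compliance": 1.0, "response_timing": 0.5},
    channel="chat",
)
```

With these weights the example chat scores roughly 0.85. The sketch also shows the cost of the arrangement: every channel-specific item has to sit on the shared form with a zero weight for the other channel, and each item needs wording generic enough to serve both.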

Conclusion

Chat QA is not voice QA in text. The form needs different items, the calibration takes longer, the AI-led layer behaves differently, and the operating model has its own decisions. Operations that build chat QA properly produce scores agents trust and a quality signal that pairs cleanly with voice. Operations that port the voice form to chat produce arbitrary scores and quiet QA failure.

Pair with designing a meaningful QA programme, what to score on a quality form, AI vs human QA, calibration done well, and pros and cons of chat.