Designing a meaningful QA programme
Most QA programmes produce scores nobody acts on
Most contact centres have a QA programme. Most QA programmes produce scores. Few of those scores change anything. The structure is rigorous and the evaluations are thorough, but the coaching conversations rarely happen, and the agents who get scored absorb the unspoken message that the activity doesn’t matter. A meaningful QA programme starts from the opposite premise: every evaluation must drive a behaviour, every score must have an owner, and the operating model must close the loop from evaluation to coaching to measurable change.
The four principles
Every evaluation must drive a behaviour. If the score doesn’t lead to a conversation, and the conversation doesn’t lead to a change, the evaluation didn’t earn its place. Trace every item on the form forward: what action does it trigger when the score comes in low? If the honest answer is “nothing,” remove the item.
The sample must be defensible. Most QA programmes evaluate 4–6 contacts per agent per month. That sample is statistically thin (the sketch after these four principles shows how thin) and trains agents to game it. A defensible sample is bigger (10+ per agent per month where practical), structured across contact types, and ideally complemented by AI-led 100% coverage on a few high-value behaviours. See AI-led vs human QA.
Calibration must be a habit. Without ongoing calibration, every evaluator scores to a private standard and the scores stop being comparable. Calibration done well is fortnightly, structured, and visible. See calibration done well.
Every score has an owner. Not the evaluator. Not the QA lead. The agent’s team leader. The team leader owns the follow-up, the coaching, and the measurable behaviour change. Without ownership the loop stays open and the programme decays.
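To see how thin a 4–6 contact sample is, here is a minimal sketch, assuming a simple pass/fail behaviour and a 95% Wilson score interval; the agents and numbers are hypothetical.

```python
# Illustrative only: how wide is the uncertainty band around a QA pass
# rate at typical monthly sample sizes? Wilson score interval, 95%
# confidence (z = 1.96).
import math

def wilson_interval(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed pass rate of passes/n."""
    p = passes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

# Roughly the same observed pass rate at two monthly sample sizes:
for passes, n in [(4, 5), (10, 12)]:  # hypothetical agents
    lo, hi = wilson_interval(passes, n)
    print(f"{passes}/{n} passed: true rate plausibly {lo:.0%}-{hi:.0%}")
```

At five evaluations, an observed 80% pass rate is consistent with a true rate anywhere from roughly 38% to 96%. At twelve it narrows to roughly 55%–95%: still wide, which is why structured sampling and AI-led full coverage on a few behaviours earn their place alongside the human sample.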
What to put on the form
A meaningful form has 6–10 items, not 25. The discipline of fewer items is what separates serious QA from compliance theatre. The right items satisfy three tests: they map to a customer outcome that matters, they can be scored objectively (or with clear calibrated guidance), and they trigger a known coaching response.
The four categories that usually earn a place: customer outcome (was the issue resolved, was the customer satisfied), process adherence (was the right path followed), compliance (regulatory and risk items that have real consequences), and communication (clarity, empathy, professionalism). For more detail see what to actually score on a quality form.
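As a concrete illustration, here is one hypothetical way to encode the three tests directly in the form’s data model, so that an item which can’t fill all three fields can’t go on the form. The schema, item text, and coaching responses are illustrative assumptions, not a prescribed standard.

```python
# A hypothetical schema for a QA form item, encoding the three tests:
# every item names the customer outcome it maps to, how it is scored,
# and the coaching response a low score triggers. Illustrative only.
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    CUSTOMER_OUTCOME = "customer outcome"
    PROCESS_ADHERENCE = "process adherence"
    COMPLIANCE = "compliance"
    COMMUNICATION = "communication"

@dataclass(frozen=True)
class FormItem:
    behaviour: str          # what the evaluator observes
    category: Category
    customer_outcome: str   # test 1: the outcome that matters
    scoring_guidance: str   # test 2: how to score it objectively
    coaching_response: str  # test 3: what a low score triggers

form = [
    FormItem(
        behaviour="Surfaced and addressed the underlying customer concern",
        category=Category.CUSTOMER_OUTCOME,
        customer_outcome="Issue resolved on first contact",
        scoring_guidance="Score against the calibrated example set",
        coaching_response="Team leader runs a discovery-questions role-play",
    ),
    # ... 5-9 more items; an item that can't fill all three fields comes off the form
]
```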
The cadence that works
Daily, evaluations land in the agent’s view within 48 hours. Weekly, the team leader reviews agent-level trends. Fortnightly, a calibration session keeps scoring consistent. Monthly, a QA pack to operations leadership shows the operation-level pattern. Quarterly, a programme review revisits the form, the sample design, and the calibration set.
Each cadence has a different decision attached. Daily → coaching action. Weekly → performance management input. Fortnightly → calibration adjustment. Monthly → trend analysis and programme adjustment. Quarterly → structural change. Mixing the cadences mixes the decisions and the programme loses focus.
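Where the programme’s reporting is tooled, one way to keep the decisions separate is to make the cadence-to-decision mapping explicit in configuration. The audiences and wording below are illustrative assumptions, not a fixed standard.

```python
# Hypothetical reporting config: each cadence carries exactly one
# decision type, so a report can only trigger the decision it owns.
CADENCE_DECISIONS = {
    "daily":       {"audience": "agent",                  "decision": "coaching action"},
    "weekly":      {"audience": "team leader",            "decision": "performance management input"},
    "fortnightly": {"audience": "evaluators",             "decision": "calibration adjustment"},
    "monthly":     {"audience": "operations leadership",  "decision": "trend analysis and programme adjustment"},
    "quarterly":   {"audience": "QA lead and operations", "decision": "structural change"},
}

def decision_for(cadence: str) -> str:
    """Look up the single decision type attached to a cadence."""
    if cadence not in CADENCE_DECISIONS:
        raise ValueError(f"No decision defined for cadence: {cadence}")
    return CADENCE_DECISIONS[cadence]["decision"]
```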
The operating model
A typical QA operating model has three roles: evaluators (who score, ideally dedicated rather than seconded), team leaders (who coach), and a QA lead (who owns the programme, runs calibration, and reports to operations). In smaller operations the roles compress. In larger ones they expand to include specialist coaches and a calibration owner separate from the QA lead.
The single biggest operating-model mistake is having team leaders score their own teams. Looks efficient; destroys credibility. Independent evaluation is what gives the score weight.
Common pitfalls
The form that scores what’s easy. “Agent said their name within 5 seconds” is easy to score and worthless. “Agent surfaced and addressed the underlying customer concern” is hard to score and important.
Composite scores without decomposition. See composite metrics that hide the truth. The same logic applies to QA: the single weighted score hides everything that matters.
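A small worked example shows the problem; the weights and agent profiles below are invented for illustration.

```python
# Why a single weighted score hides what matters: two hypothetical
# agents with identical composites but opposite problems.
WEIGHTS = {"customer outcome": 0.4, "process": 0.2,
           "compliance": 0.2, "communication": 0.2}

agents = {
    "Agent A": {"customer outcome": 95, "process": 90,
                "compliance": 40, "communication": 90},
    "Agent B": {"customer outcome": 70, "process": 85,
                "compliance": 95, "communication": 90},
}

for name, scores in agents.items():
    composite = sum(WEIGHTS[k] * v for k, v in scores.items())
    print(name, f"composite = {composite:.0f}", scores)
```

Both agents score 82. Agent A has a compliance failure with real consequences attached; Agent B struggles to resolve the customer’s issue. The composite cannot tell them apart, so neither coaching conversation happens.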
Coaching that never lands. If the score doesn’t turn into a within-week coaching conversation, the programme is producing data, not change. See coaching from QA results.
Conclusion
A meaningful QA programme is built around four principles, scored on a small form, run on a disciplined cadence, and owned at the team-leader level. The discipline isn’t in the platform or the scoring methodology; it’s in the operating model that connects evaluation to behaviour change. Operations that get this right see QA scores improve steadily over time; operations that don’t see scores stay flat while customer experience quietly declines.
Pair with what to actually score on a quality form, calibration done well, coaching from QA results, and the QA vendor directory.