Qualitative Metrics: Making Discovery & Demo Quality Coachable

Tim Brömme
Quality, Made Visible — Part 6 of the SE Rockstars PreSales KPI series

TL;DR

  • The skills that decide a deal — discovery depth, demo focus, storytelling — get dismissed as "soft skills," which is code for "we don't measure them, so we don't coach them."
  • Qualitative metrics fix that by scoring how well the work was done on a defined rubric, turning vague craft into an observable, repeatable standard.
  • Five worth tracking: Discovery Quality Score, Demo Quality Score, Sales Satisfaction (an internal AE-to-SE NPS), Demo-to-Meeting Conversion, and Handoff Completeness.
  • A rubric scored 1–10 from call reviews is the engine. Conversation intelligence tools like Gong or KickScale let you run it at scale instead of spot-checking a handful of calls a quarter.
  • These scores predict outcomes earlier than win rate does — which is exactly why they're a coaching instrument, not a report card.

"It always gets labelled as soft skills — for me it's just the craft," a senior solution engineer at a vertical SaaS company in the DACH region told us. "Nobody teaches it to you. You join and think: OK, someone comes in, puts up an hour of bad slides, and leaves. That's just normal, so I guess that's how I have to do it too."

That last sentence should worry every PreSales leader. Bad demo habits don't spread because people are lazy. They spread because nobody named the standard, nobody measured against it, and nobody coached the gap. "Soft skills" is the label we put on the most important things an SE does — and then use as an excuse not to manage them. This article is about taking that label off.

Why call discovery and demo quality "metrics" at all?

Because the alternative is leaving your highest-leverage skills entirely to chance. Leading indicators tell you whether the right artifacts exist. Lagging indicators tell you what closed. Operational metrics tell you whether capacity is deployed. None of them tell you the thing a manager most needs to coach: was the discovery actually good, or did the SE just fill in the MEDDPICC fields?

Qualitative metrics answer that. They assess execution quality — discovery depth, demo effectiveness, collaboration, handoff rigor — on a defined rubric rather than a gut feeling. And because quality shows up before the outcome does, a Discovery Quality Score that's trending down is an early warning you can act on this week, not a post-mortem you read next quarter.

The objection is always the same: quality is subjective, so you can't measure it. You can. Subjectivity is what you get without a rubric. The moment you write down what a 9 looks like versus a 4, two managers reviewing the same call start landing within a point of each other. That's not perfect science. It's a shared standard — which is all coaching has ever needed.
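One way to check whether a rubric is producing that shared standard is to have two managers score the same recordings and count how often they land within a point of each other. The snippet below is a minimal sketch of that check. The function name, the tolerance parameter, and the sample scores are illustrative assumptions, not data from any team.

```python
def within_one_point_rate(scores_a, scores_b, tolerance=1):
    """Share of calls where two reviewers' scores differ by at most `tolerance`."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both reviewers must score the same set of calls")
    if not scores_a:
        raise ValueError("Need at least one scored call")
    agreeing = sum(
        1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tolerance
    )
    return agreeing / len(scores_a)


# Hypothetical scores from two managers reviewing the same five calls.
manager_a = [8, 5, 9, 4, 7]
manager_b = [7, 5, 6, 4, 8]

rate = within_one_point_rate(manager_a, manager_b)
print(f"{rate:.0%} of calls scored within one point")
```

If the rate stays low after a few calibration rounds, that usually says more about the rubric wording than about the reviewers.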

What should you actually measure?

Five qualitative KPIs cover the work an SE owns from first call to handoff. You don't need all five on day one. You need one rubric, applied consistently, and the discipline to grow from there.

1. Discovery Quality Score. A structured 1–10 rating of a discovery call against a fixed rubric: problem clarity, impact quantified, stakeholders mapped, success metrics defined, next step agreed. This is the metric that makes "good discovery" observable — and it predicts POC readiness, technical win rate, and cycle length better than almost anything else you can score early.

What weak discovery looks like is feature-checking dressed up as curiosity. One SE leader at a DACH-based vertical SaaS company named the upgrade he wants from his team:

"I want to bring a kind of value discovery into the mix. Not just the classic functional background. Especially for the big products, I want to move away from feature-function description and toward more industry-specific value discovery. Not just: how does the process run, where does the button need to be, what mobile features are needed — but really with prior research, industry knowledge, and an understanding of typical pain points in the sector."

That's a rubric in disguise. "Prior research done," "industry pain points surfaced," "value quantified, not just process mapped" — those are scoreable line items. The leader already knows what good looks like. A Discovery Quality Score just writes it down so the whole team can be held to it, not only the top performers who do it on instinct.

2. Demo Quality Score. A rubric-based 1–10 assessment of a demo: alignment to the success criteria you uncovered in discovery, narrative and flow, buyer engagement, differentiation, how questions were handled, and whether it ended on a committed next step. The shift this metric forces is from "did we demo?" to "did the demo move the deal forward?" A polished feature tour that earns zero commitment scores low — as it should.

3. Sales Satisfaction (internal AE-to-SE NPS). A quarterly pulse where AEs rate their SE partnership 0–10, plus a couple of specific prompts on deal impact and risk identification. Almost nobody does this, and it's one of the highest-signal things you can run. You don't need a survey tool. A 15-minute coffee chat with each AE, with the same three questions every time (rate last quarter's collaboration 0–10, what worked, what should we improve), surfaces friction faster than any dashboard and builds the trust that makes joint execution actually work.

4. Demo-to-Meeting Conversion. The percentage of demos that produce a concrete next step — workshop, technical deep dive, security review, POC planning — inside a defined window, say 14 days. It's the quantitative shadow of the Demo Quality Score: high conversion means your demos are relevant and value-anchored; low conversion usually points at thin discovery or a demo that toured features instead of solving a problem.

5. Handoff Completeness. A short checklist scoring whether the SE-to-CS handoff contains what's needed to deliver what was sold: scope boundaries, success metrics with baseline and target, stakeholder map, key risks, technical assumptions, and any commitments made during the sale. Run it as a stage gate on higher-ACV deals. Done right, it's a revenue-protection metric — it kills the "promise versus reality" gaps that drive churn and escalations.
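Of the five, Demo-to-Meeting Conversion is the easiest to compute from data you probably already have. The sketch below assumes each demo is logged with its date and, if one happened, the date of the concrete next step. The `Demo` record, its field names, and the sample dates are hypothetical, and the 14-day window is the example value from the definition above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Demo:
    demo_date: date
    # Date of the workshop, deep dive, security review, or POC planning
    # session that followed. None if no concrete next step happened.
    next_step_date: Optional[date] = None


def demo_to_meeting_conversion(demos, window_days=14):
    """Fraction of demos followed by a concrete next step inside the window."""
    if not demos:
        return 0.0
    window = timedelta(days=window_days)
    converted = sum(
        1
        for d in demos
        if d.next_step_date is not None
        and timedelta(0) <= d.next_step_date - d.demo_date <= window
    )
    return converted / len(demos)


# Hypothetical demo log for illustration only.
demos = [
    Demo(date(2024, 3, 4), date(2024, 3, 12)),  # 8 days later: counts
    Demo(date(2024, 3, 5)),                     # no next step
    Demo(date(2024, 3, 6), date(2024, 4, 2)),   # 27 days later: outside window
    Demo(date(2024, 3, 7), date(2024, 3, 21)),  # exactly 14 days: counts
]

conversion = demo_to_meeting_conversion(demos)
print(f"Demo-to-Meeting Conversion: {conversion:.0%}")
```

Treating the window as inclusive is a design choice. Whatever you pick, keep it fixed so quarter-over-quarter numbers stay comparable.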

[Figure: key metrics summary. The core metrics for this chapter at a glance.]

A Discovery Quality Score rubric you can use Monday

Don't overthink the instrument. Five dimensions, scored 0–2, summed to a 10-point scale. Anything that can't be evidenced from the call recording scores zero — "I'm sure they covered it" is not a score.

  • Problem clarity — 0 — Absent: Talked features, never the problem · 1 — Partial: Problem named, not explored · 2 — Strong: Root problem + business consequence surfaced
  • Impact quantified — 0 — Absent: No numbers · 1 — Partial: Vague "it's costly" · 2 — Strong: €/time/risk attached to the pain
  • Stakeholders mapped — 0 — Absent: Single contact, no map · 1 — Partial: Roles named · 2 — Strong: Buying center + decision process understood
  • Success metrics defined — 0 — Absent: None · 1 — Partial: Loose goals · 2 — Strong: Explicit, measurable success criteria agreed
  • Next step committed — 0 — Absent: "We'll follow up" · 1 — Partial: Soft next call · 2 — Strong: Specific next step with date + owner

Benchmark: treat 8–10 as strong, 5–7 as coachable, below 5 as a deal at risk. The goal isn't to grade SEs into a ranking — it's to make the gap visible so the coaching conversation has somewhere to land.
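If you track scores in a spreadsheet or a small script, the arithmetic is simple enough to sketch. The example below sums the five dimensions, treats any dimension without evidence as zero, and maps the total onto the benchmark bands. The identifier names and the sample ratings are assumptions made for illustration.

```python
# The five rubric dimensions, each scored 0, 1, or 2.
DIMENSIONS = (
    "problem_clarity",
    "impact_quantified",
    "stakeholders_mapped",
    "success_metrics_defined",
    "next_step_committed",
)


def discovery_quality_score(ratings):
    """Sum the five dimension ratings into a total out of 10."""
    total = 0
    for dimension in DIMENSIONS:
        # A dimension with no evidence on the recording scores zero.
        value = ratings.get(dimension, 0)
        if value not in (0, 1, 2):
            raise ValueError(f"{dimension} must be 0, 1, or 2, got {value!r}")
        total += value
    return total


def band(score):
    """Map a total score onto the benchmark bands."""
    if score >= 8:
        return "strong"
    if score >= 5:
        return "coachable"
    return "at risk"


# Hypothetical review of one call: no next step was evidenced.
ratings = {
    "problem_clarity": 2,
    "impact_quantified": 1,
    "stakeholders_mapped": 1,
    "success_metrics_defined": 2,
}

score = discovery_quality_score(ratings)
print(score, band(score))  # 6 coachable
```

Keeping the per-dimension ratings alongside the total matters more than the total itself, because the coaching conversation starts from the dimension that scored zero.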

Don't let the rubric become the thing you optimize for

Here's the failure mode nobody warns you about. The moment a review is predictable, people prepare for the review instead of the customer. An SE manager at a vertical SaaS company in the DACH region described exactly how this rots:

"After the tenth dryrun with my managers, I knew exactly what they were going to say. So I started preparing the dryruns specifically for the managers — they'd say: great dryrun, you're super prepared — but the real meeting turned out to be quite different."

That's Goodhart's Law wearing a lanyard. When a measure becomes a target, it stops measuring. The defense is twofold. First, score real calls, not rehearsals — which is precisely where conversation intelligence earns its keep. Second, rotate reviewers and pull in an outside eye who has no patience for internal jargon. The rubric measures the work; it should never become the work.

Where conversation intelligence comes in

Hand-scoring every call doesn't scale, and a manager who reviews four calls a quarter is sampling noise. Conversation intelligence platforms — Gong, KickScale, and the like — record and transcribe calls automatically, so you can apply your rubric across the whole team instead of spot-checking. Some will draft the score itself against your criteria, leaving the manager to validate and coach rather than transcribe.

A word of caution that an SE manager shared with us about his own team: automation is a multiplier, not a replacement. The value of these scores is the coaching conversation they trigger. If the AI scores the call and nobody ever talks about it, you've automated a report nobody reads. Use the tooling to free up time for the human part — the debrief, the "here's the one thing to change next call." That's where the score becomes a skill.

Frequently asked questions

How do you measure something as subjective as demo quality? With a rubric. Define 5–6 dimensions, such as success-criteria alignment, narrative, engagement, differentiation, question handling, and a committed next step, and score each on a fixed scale. Two reviewers using the same rubric on the same recording will typically land within a point of each other. That shared standard, not perfect objectivity, is what makes the metric coachable.

What's a good Discovery Quality Score to aim for? On a 10-point rubric, treat 8–10 as strong, 5–7 as coachable, and below 5 as a deal-risk signal worth a manager's attention. Don't fixate on the team average — watch the trend and the spread. A widening gap between top and bottom performers tells you exactly where enablement should focus next.
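To see why the average alone can mislead, here is a small sketch that reports both the mean and the spread per quarter. It assumes you keep one average Discovery Quality Score per SE per quarter. The quarter labels and scores are made up, and using max minus min as the spread is one simple choice among several.

```python
from statistics import mean


def quarter_summary(scores_by_quarter):
    """Return (quarter, team average, top-to-bottom spread) for each quarter."""
    return [
        (quarter, mean(scores), max(scores) - min(scores))
        for quarter, scores in scores_by_quarter.items()
    ]


# Hypothetical per-SE average scores for a four-person team.
scores_by_quarter = {
    "Q1": [8, 7, 6, 5],
    "Q2": [9, 7, 6, 4],
}

summary = quarter_summary(scores_by_quarter)
for quarter, average, spread in summary:
    print(f"{quarter}: average {average:.1f}, spread {spread}")
```

In this made-up data the team average is identical in both quarters, while the gap between the top and bottom performer widens from 3 to 5 points. That widening gap is the signal the average hides.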

Can AI tools like Gong or KickScale score these automatically? Yes — they record and transcribe calls, and increasingly draft rubric scores against your criteria, which lets you assess the whole team instead of sampling a few calls. But the score is only valuable if it feeds a coaching conversation. Automate the scoring; never automate away the debrief that turns the score into a behavior change.

Isn't an AE-to-SE NPS just internal politics? Done as a popularity contest, yes. Done with specific prompts — deal impact, risk identification, commercial awareness — it surfaces collaboration friction months before it shows up in win rate. The lightweight version is a quarterly 15-minute conversation with each AE asking the same three questions. It builds trust and catches alignment problems early, without adding process.


This is Part 6 of a 10-part series on PreSales performance measurement, drawn from the PreSales KPI Playbook and hundreds of conversations with solution engineering leaders. The Trusted Advisor Academy helps PreSales teams turn rubrics like these into everyday coaching practice — so discovery and demo quality stop being "soft skills" and start being a standard.

About the authors: Tim Brömme and Jan-Erik Jank are the co-founders of SE Rockstars and the Trusted Advisor Academy. Between them they bring 30+ years of enterprise PreSales experience, eight-figure closed deal portfolios, and 350+ solution engineers coached.
