Undefined.
The report is right. The fight in the room isn't about the numbers.
Three teams pull their “loyal customer” numbers. Each one is confident in their count. Each count is different.
Team A defines loyal as anyone who’s made a purchase three or more times in the past year.
Team B defines it as customers who’ve returned in at least eight of the past twelve months.
Team C runs subscriptions - anyone currently active counts, including people who signed up last week.
None of these definitions are wrong. They’re just not the same definition.
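The divergence is easy to demonstrate. Here's a minimal sketch - hypothetical customers and purchase dates, invented for illustration - that applies all three teams' definitions to the same data and gets three different answers:

```python
from datetime import date, timedelta

# Hypothetical purchase history: customer_id -> list of purchase dates
today = date(2024, 12, 31)
purchases = {
    "c1": [today - timedelta(days=30 * m) for m in range(10)],  # buys roughly monthly
    "c2": [today - timedelta(days=d) for d in (10, 200, 320)],  # 3 buys, spread out
    "c3": [today - timedelta(days=5)],                          # signed up last week
}
subscribers_active = {"c3"}  # Team C's subscription roster

year_ago = today - timedelta(days=365)

# Team A: three or more purchases in the past year
team_a = {c for c, ds in purchases.items()
          if sum(d >= year_ago for d in ds) >= 3}

# Team B: purchases in at least eight of the past twelve months
def months_active(ds):
    return len({(d.year, d.month) for d in ds if d >= year_ago})

team_b = {c for c, ds in purchases.items() if months_active(ds) >= 8}

# Team C: anyone currently subscribed, regardless of tenure
team_c = subscribers_active

print(sorted(team_a), sorted(team_b), sorted(team_c))
# → ['c1', 'c2'] ['c1'] ['c3']
```

Same table, three non-overlapping answers to "how many loyal customers do we have" - and every query is correct by its own definition.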
For a while, nobody notices - because the reports were built separately, the teams work separately, and the word “loyal” feels like it means something obvious.
It doesn’t.
This isn’t a data quality problem. The data is fine. Every query is pulling exactly what it was written to pull.
It’s a vocabulary problem - and those are harder to fix because they’re harder to see.
Data quality issues usually show up as errors.
Vocabulary problems usually show up as disagreements that feel like they’re about the numbers.
The meeting gets tense over a chart. Someone pulls a different report. Someone else questions the source. And somewhere in the conversation is a definition that was never written down, holding together an interpretation that not everyone shares.
Ask five people what “lapsed” means and you’ll get at least three answers.
Someone who hasn’t purchased in 90 days?
Six months?
A year?
Someone who unsubscribed, or just someone who stopped engaging?
The word is in the report. The definition isn’t.
AI can’t fix a vocabulary problem. But it can surface one before it turns into a fight.
The prompts below work because they’re designed to surface ambiguity rather than paper over it. That’s a deliberate choice, not a default behavior. Left to its own assumptions, AI will often just answer the question - filling in the blanks you didn’t know were blank. Ask it to look for ambiguity specifically, and you get something different: a list of what your team hasn’t agreed on yet.
That’s the unexpected utility here: not AI as the answer, but AI as the thing that makes you answer before you build.
Two prompts worth keeping
Prompt 1: The definition audit
Use this when you inherit a report, join a new team, or want to pressure-test a set of metrics before a major presentation.
I’m going to share a list of metrics from an existing report.
For each metric, identify:
1. Terms in the metric name or description that could be interpreted differently by different stakeholders
2. What’s missing from the definition - what would you need to know to pull this number consistently?
3. What questions a skeptical stakeholder might raise about this metric
Here are the metrics:
[paste metric names and any existing definitions]
Here’s what the output looks like in practice - this is what comes back when you run “Retained Customers” through the prompt with no additional context provided:
Retained Customers
Ambiguous terms: “Retained” - could mean any customer with at least one repeat purchase ever, or specifically within a defined window.
Missing from definition: Time window (12 months? 24?), channel scope (all channels or a specific one?), minimum purchase threshold.
Stakeholder questions likely: “Retained since when?” / “Does a single repeat purchase count, or do we need a minimum frequency?” / “How does this interact with the lapsed definition?”
The output is a checklist of conversations to have before the report or dashboard goes live. Not after.
Prompt 2: The glossary starter
Use this when your team needs a shared metric definition record and doesn’t have one - or when the existing one is a document nobody opens.
I’m going to share a set of reports and the metric names in them.
For each metric, draft a definition record with the following fields:
- Metric name
- Plain-language description (one sentence)
- Calculation or formula (as specific as possible given what I’ve shared)
- What it includes
- What it explicitly excludes
- Time period or window (if applicable)
- Who typically owns or maintains this definition
- What would trigger a need to revisit this definition
Where there isn’t enough information to fill a field, flag it as "needs clarification" and note what’s required to complete it. Don’t guess.
Here are the reports:
[paste or describe]

Here’s a partial output for the same metric:
Metric name: Retained Customers
Description: (needs clarification - reports reference repeat purchasers but no consistent window or channel scope is defined)
Formula: COUNT of customer_id WHERE purchase_count > 1 - (needs clarification: within what time period?)
What it includes: (needs clarification)
What it excludes: (needs clarification - does this exclude first-time buyers only, or also lapsed customers?)
Time window: NEEDS CLARIFICATION - no window defined in existing reports
Owner: (needs clarification - marketing or CRM team?)
Trigger for revisit: Change in retention strategy, new channel launch, shift in purchase frequency benchmarks
The fields that come back blank are the point.
This isn’t a finished glossary - it’s a working draft and a map of the conversations your team still needs to have.
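One way to keep that working draft findable is to store it as a structured record rather than prose. This is a minimal sketch - the field names mirror the prompt above, but the class and the example values are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field

NEEDS_CLARIFICATION = "needs clarification"

@dataclass
class MetricDefinition:
    name: str
    description: str = NEEDS_CLARIFICATION
    formula: str = NEEDS_CLARIFICATION
    includes: str = NEEDS_CLARIFICATION
    excludes: str = NEEDS_CLARIFICATION
    time_window: str = NEEDS_CLARIFICATION
    owner: str = NEEDS_CLARIFICATION
    revisit_triggers: list[str] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Fields still waiting on a team decision."""
        return [f for f in ("description", "formula", "includes",
                            "excludes", "time_window", "owner")
                if getattr(self, f) == NEEDS_CLARIFICATION]

# Illustrative record for the metric discussed above
retained = MetricDefinition(
    name="Retained Customers",
    formula="COUNT(customer_id) WHERE purchase_count > 1",
    revisit_triggers=["retention strategy change", "new channel launch"],
)
print(retained.open_questions())
# → ['description', 'includes', 'excludes', 'time_window', 'owner']
```

The `open_questions` list is the agenda for the next definitions conversation - the record makes the blanks visible instead of letting them hide in a document nobody opens.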
The difference between this and a generic “define my metrics” prompt is that this one shows you exactly where the work is - which fields are settled, and which still need a decision.
Even with the best intentions, we can find ourselves in a situation where there’s no record of how a definition got decided - or when, or why.
So when the business shifts, when a new stakeholder joins, when a report gets refreshed for a new quarter, the same question surfaces again.
And it gets answered from scratch, sometimes differently than before.
I’ve sat in more than a few of those meetings.
The numbers look fine.
The mood doesn’t.
And it takes longer than it should to realize that everyone is defending a different version of the same word - not the analysis, not the methodology. Just the word.
Building a shared glossary helps.
But the more durable habit is asking what the word means before it goes into production - and keeping the answer somewhere it can be found and updated.
Because “loyal” this year might not be “loyal” next year if the pricing model changed or a new channel launched.
The definition isn’t wrong. It’s just out of date. And nobody flagged it.
Every week in Teach Data with AI, I write about what I'm building and testing in real data work - the prompts, the approaches, what held up and what didn't. Subscribe here.


