Glossary

Evaluation set

A fixed set of tasks, rerun unchanged after each change to an AI system, so that results can be compared across versions.

Why it matters
Business and delivery impact.
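Catches silent regressions: a change that degrades quality on existing tasks shows up as a lower score instead of going unnoticed.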
How to measure
Proof you can track.
Track the score on the set for each version and watch the trend over time.
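
As a rough illustration of tracking score trends, the sketch below computes a pass rate for each run of the set and flags any run that scored lower than the one before it. The names `EvalRun`, `pass_rate`, `find_regressions`, and `tolerance`, along with the sample data, are hypothetical and not taken from any particular tool; it assumes each task yields a simple pass/fail result.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    """One run of the fixed evaluation set against one version (hypothetical shape)."""
    label: str           # version or date label for the run
    passed: list[bool]   # one pass/fail result per task, in a fixed task order


def pass_rate(run: EvalRun) -> float:
    """Fraction of tasks in the set that passed in this run."""
    if not run.passed:
        raise ValueError(f"run {run.label!r} has no task results")
    return sum(run.passed) / len(run.passed)


def find_regressions(runs: list[EvalRun], tolerance: float = 0.0) -> list[tuple[str, str, float]]:
    """Return (previous label, current label, drop) for each run whose
    pass rate fell by more than `tolerance` relative to the run before it."""
    regressions = []
    for prev, curr in zip(runs, runs[1:]):
        drop = pass_rate(prev) - pass_rate(curr)
        if drop > tolerance:
            regressions.append((prev.label, curr.label, drop))
    return regressions


# Illustrative data only: three runs over the same 20 tasks.
history = [
    EvalRun("v1", [True] * 16 + [False] * 4),
    EvalRun("v2", [True] * 18 + [False] * 2),
    EvalRun("v3", [True] * 15 + [False] * 5),
]

for run in history:
    print(f"{run.label}: {pass_rate(run):.0%}")

for prev_label, curr_label, drop in find_regressions(history):
    print(f"regression from {prev_label} to {curr_label}: {drop:.0%} lower")
```

The comparison only means something if the task list stays the same between runs, which is the reason the set is fixed. A nonzero `tolerance` is one way to ignore small run-to-run noise.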
Example
What it looks like in practice.
Twenty standard prompts for design system QA.
Related
Keep the context connected.