Definition

Model evaluation

Model evaluation is the process of measuring whether an AI model or system performs well enough for a specific use. An evaluation can cover accuracy, reliability, bias, robustness, latency, cost, safety, and business impact, both before and after deployment.

Last updated: 25 June 2026

Why it matters

Model evaluation replaces confidence based on demos with evidence tied to the actual workflow and risk level.

Signals to watch

Evaluation data is defined
Metrics match the task
Performance is monitored over time
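
The three signals above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not a prescribed method: the names `Example`, `EVAL_SET`, `toy_model`, `evaluate`, and `has_regressed` are hypothetical, accuracy is used only because the toy task is classification, and the threshold and tolerance values are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    """One labeled item in a fixed evaluation set (hypothetical structure)."""
    text: str
    label: str


# Signal 1: evaluation data is defined. A small, fixed, labeled set (toy data).
EVAL_SET = [
    Example("Good service", "positive"),
    Example("Bad experience", "negative"),
    Example("Really good value", "positive"),
    Example("Not good at all", "negative"),
]


def toy_model(text: str) -> str:
    """Stand-in for a real model: a naive keyword rule, assumed for the demo."""
    return "positive" if "good" in text.lower() else "negative"


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Signal 2: the metric matches the task (accuracy for classification)."""
    if not labels:
        raise ValueError("evaluation set is empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def evaluate(model: Callable[[str], str],
             eval_set: Sequence[Example],
             threshold: float) -> dict:
    """Score the model on the evaluation set and compare against a threshold."""
    predictions = [model(ex.text) for ex in eval_set]
    score = accuracy(predictions, [ex.label for ex in eval_set])
    return {"accuracy": score, "passed": score >= threshold}


def has_regressed(history: Sequence[float],
                  latest: float,
                  tolerance: float = 0.05) -> bool:
    """Signal 3: monitoring over time. Flag a drop below the historical mean."""
    if not history:
        return False  # nothing to compare against yet
    baseline = sum(history) / len(history)
    return latest < baseline - tolerance
```

Calling `evaluate(toy_model, EVAL_SET, threshold=0.7)` scores the keyword rule against the fixed set, and `has_regressed` compares a new score with earlier ones. In a real workflow the metric, threshold, and tolerance would be chosen from the use case and its risk level rather than fixed in code.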

Related definitions

Testing data
AI hallucination
AI ROI