Statement 23: Validate, assess, and update model
Agencies must
- Criterion 84: Set techniques to validate AI trained models.
  - There are multiple qualitative and quantitative techniques and tools for model validation, informed by the AI system success criteria (see Design section), including (a code sketch of quantitative checks follows this list):
    - correct classifications, predictions or forecasts, and factual correctness and relevance
    - distinguishing between positive and negative instances, and between classes
    - benchmarking
    - consistency, clarity and coherence of responses
    - source attribution
    - data-centric validation approaches for GenAI models.
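A minimal sketch of how the quantitative checks above could be scripted, assuming a scikit-learn style binary classifier. The names `model`, `X_val`, `y_val` and the `min_f1` benchmark are illustrative assumptions, not values prescribed by this standard.

```python
# Illustrative validation sketch for a trained binary classifier.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def validate_classifier(model, X_val, y_val, min_f1=0.80):
    """Score held-out data against the success criteria agreed at design time."""
    preds = model.predict(X_val)
    metrics = {
        "accuracy": accuracy_score(y_val, preds),    # correct classifications overall
        "precision": precision_score(y_val, preds),  # flagged positives that are truly positive
        "recall": recall_score(y_val, preds),        # actual positives the model identifies
        "f1": f1_score(y_val, preds),
    }
    # Benchmarking: pass or fail against the agreed success criterion.
    metrics["passed"] = metrics["f1"] >= min_f1
    return metrics
```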
 
- Criterion 85: Evaluate the model against training boundaries.
  - Evaluation considerations include (a monitoring sketch follows this list):
    - poor or degraded performance of the model
    - change of AI context or operational setting
    - data retention policies
    - model retention policies.
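A minimal sketch of detecting the first consideration above, degraded performance, by comparing rolling live accuracy with the accuracy measured at validation time. The window size and tolerance are illustrative assumptions.

```python
# Illustrative performance-degradation monitor; thresholds are assumptions.
from collections import deque

class PerformanceMonitor:
    def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, prediction, actual):
        """Log one labelled production outcome."""
        self.outcomes.append(1 if prediction == actual else 0)

    def degraded(self):
        """True when rolling accuracy falls outside the agreed tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # too few observations to judge
        rolling = sum(self.outcomes) / len(self.outcomes)
        return rolling < self.baseline - self.tolerance
```

A `degraded()` result of True would feed the refinement or retirement decisions under Criterion 87.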
 
- Criterion 86: Evaluate the model for bias, implement and test bias mitigations.
  - This includes (a sketch of one bias test follows this list):
    - using suitable tools that test for and discover unwarranted associations between an algorithm’s protected input features and its output
    - evaluating performance across suitable demographic and intersectional dimensions
    - checking whether bias could be managed by updating the training data (see Statement 18)
    - implementing bias mitigation thresholds that can be configured post-deployment
    - implementing pre-processing or post-processing techniques such as disparate impact remover, equalised odds post-processing, content filtering, and retrieval-augmented generation (RAG).
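A minimal sketch of one test named above, the disparate impact ratio: the favourable-outcome rate for an unprivileged group divided by the rate for the privileged group. The 0.8 cut-off (the "four-fifths rule") is a common convention used here as an assumption, not a threshold this standard prescribes.

```python
# Illustrative disparate impact test across a protected attribute.
def disparate_impact(outcomes, groups, unprivileged, privileged, threshold=0.8):
    """outcomes: 0/1 decisions per record; groups: protected attribute per record."""
    def favourable_rate(group):
        selected = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(selected) / len(selected)

    ratio = favourable_rate(unprivileged) / favourable_rate(privileged)
    return ratio, ratio >= threshold  # below threshold flags potential bias

# Toy example: group "a" is favoured in 1 of 3 decisions, group "b" in 3 of 3,
# so the ratio is about 0.33 and the check fails.
ratio, acceptable = disparate_impact(
    outcomes=[1, 0, 0, 1, 1, 1],
    groups=["a", "a", "a", "b", "b", "b"],
    unprivileged="a",
    privileged="b",
)
```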
 
Agencies should
- Criterion 87: Identify relevant model refinement methods.
  - The evaluations above may trigger model refinement or retirement. Relevant methods can include (a pruning sketch follows this list):
    - model parameter or weight adjustments – further training or re-training the model on a new set of observations, or on additional training data
    - adjusting data pre-processing or post-processing components
    - model pruning – to reduce redundant mathematical calculations and speed up operations.
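A minimal sketch of the pruning method above using PyTorch's built-in pruning utilities; the toy model and the 30% pruning amount are illustrative assumptions. Note that unstructured pruning zeroes weights without shrinking tensors, so real speed-ups usually require structured pruning or sparse execution support.

```python
# Illustrative magnitude-based weight pruning; model and amount are assumptions.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 30% of weights with the smallest L1 magnitude.
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the pruning mask in permanently
```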
 