Train: statements 20 - 25

The train stage covers the creation and selection of models and algorithms. The key activities in this stage include modelling, pre- and post-processing, model refinements, and fine-tuning. It also considers the use of pre-trained models and associated fine-tuning for the operational context.

AI training involves processing large amounts of data to enable AI models to recognise patterns, make predictions, draw inferences, and generate content. This process creates a mathematical model with parameters that can range from a few to trillions. Training adjusts these parameters, so models with more parameters require more processing power and storage.
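
As a rough illustration of how parameter count drives storage, the sketch below multiplies the number of parameters by the bytes used to hold each one. The function names and the example model sizes are hypothetical, and the default of 4 bytes per parameter assumes 32-bit floating point values.

```python
def parameter_storage_bytes(parameter_count: int, bytes_per_parameter: int = 4) -> int:
    """Estimate the bytes needed to store the model parameters alone.

    bytes_per_parameter is an assumption: 4 for 32-bit floats, 2 for 16-bit.
    """
    if parameter_count < 0 or bytes_per_parameter <= 0:
        raise ValueError("parameter_count must be non-negative and bytes_per_parameter positive")
    return parameter_count * bytes_per_parameter


def to_gibibytes(size_in_bytes: int) -> float:
    """Convert bytes to gibibytes (1 GiB = 2**30 bytes)."""
    return size_in_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to span a wide range of parameter counts.
    for name, count in [("small", 10_000), ("medium", 100_000_000), ("large", 7_000_000_000)]:
        size = parameter_storage_bytes(count)
        print(f"{name}: {count:,} parameters, about {to_gibibytes(size):.3f} GiB at 32-bit precision")
```

This covers the parameters only. Training usually needs noticeably more memory than this estimate, because values such as gradients and optimiser state are held alongside the parameters.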

Training a model can be compute-heavy and can rely on expensive infrastructure. The model architecture, including the choice of AI algorithm and learning strategy, together with the size of the model dataset, will influence the infrastructure requirements for the training environment.

The AI Model encapsulates a complex mathematical relationship between input and output data that it derives from patterns in a modelling dataset. AI models can be chained together to provide more complex capabilities.

Pre-processing and post-processing augment the capabilities of the AI model. Application, platform, and infrastructure components are also shown here, as they all contribute to the overall behaviour and performance of the whole AI system.
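
The relationship between pre-processing, the AI model, and post-processing can be sketched as a chain of steps, where each step receives the output of the previous one. Everything in this example is hypothetical: `chain`, `pre_process`, `toy_model`, and `post_process` are illustrative names, and the "model" is a simple word-matching stand-in, not a trained model. The same `chain` helper could equally link several models together.

```python
from typing import Callable, List


def chain(*steps: Callable) -> Callable:
    """Combine steps so the output of each becomes the input of the next."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


def pre_process(text: str) -> List[str]:
    """Normalise raw input into the form the model expects."""
    return text.lower().split()


def toy_model(tokens: List[str]) -> float:
    """Stand-in for a trained model: the share of tokens in a fixed word list."""
    positive_words = {"good", "great", "helpful"}
    if not tokens:
        return 0.0
    return sum(token in positive_words for token in tokens) / len(tokens)


def post_process(score: float) -> str:
    """Turn the raw model output into a label for the application."""
    return "positive" if score >= 0.25 else "not positive"


# The whole system is the composition of the three stages.
system = chain(pre_process, toy_model, post_process)

if __name__ == "__main__":
    print(system("This service is GREAT and helpful"))
```

Keeping each stage as a separate step means a stage can be tested, replaced, or versioned without changing the others.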

Due to the number of mathematical computations involved and the time taken to execute them, training can be a highly intensive stage of the AI lifecycle. How intensive it is depends on the infrastructure resources available, the algorithms used to train the AI model, and the size of the training datasets.

Key considerations during this stage include:

  • the model architecture, including the AI model and how components within the model interact, as well as the use of off-the-shelf or pre-trained models
  • selection and development of the algorithms and learning strategies used to train the AI model
  • an iterative process of implementing model architecture, setting hyperparameters, and training on model datasets
  • model validation tests, supplemented by human evaluation, which assess whether the model is fit for purpose and reliable
  • trained model selection assessments, which streamline development and enhance capabilities by comparing various models for the AI system
  • continuous improvement frameworks, which set processes for measuring model outputs, as well as business and user feedback, to manage model performance.
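
The iterative process, validation, and model selection described above can be sketched with a deliberately small example. It assumes a single-parameter model fitted by gradient descent, a held-out validation set, and a hypothetical success criterion expressed as a maximum validation error. The data, candidate hyperparameters, and function names are all illustrative.

```python
def train(train_data, learning_rate, epochs):
    """Fit y = weight * x by gradient descent on mean squared error."""
    weight = 0.0
    for _ in range(epochs):
        gradient = sum(2 * (weight * x - y) * x for x, y in train_data) / len(train_data)
        weight -= learning_rate * gradient
    return weight


def validation_error(weight, validation_data):
    """Mean squared error on data that was not used for training."""
    return sum((weight * x - y) ** 2 for x, y in validation_data) / len(validation_data)


def select_model(train_data, validation_data, candidates, max_error):
    """Train one model per hyperparameter setting and keep the best on validation."""
    best = None
    for learning_rate, epochs in candidates:
        weight = train(train_data, learning_rate, epochs)
        error = validation_error(weight, validation_data)
        if best is None or error < best["error"]:
            best = {"learning_rate": learning_rate, "epochs": epochs,
                    "weight": weight, "error": error}
    # Compare the best candidate against the agreed success criterion.
    best["meets_criteria"] = best["error"] <= max_error
    return best


if __name__ == "__main__":
    # Illustrative data following y = 2x, split into training and validation sets.
    training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    validation_set = [(4.0, 8.0), (5.0, 10.0)]
    hyperparameter_candidates = [(0.001, 50), (0.01, 50), (0.1, 50)]
    print(select_model(training_set, validation_set, hyperparameter_candidates, max_error=0.001))
```

If `meets_criteria` is still false after several rounds of refinement, that is the signal to revisit the model, the requirements, or whether the model should continue. Automated checks like this supplement human evaluation and do not replace it.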

If, after multiple refinement attempts, the model does not meet requirements or success criteria, a new model may need to be created, the business requirements updated, or the model retired.

See the Design lifecycle stage for details on measuring model outputs, as well as business and user feedback, to manage AI model performance.

See the Apply version control practices statement in the Whole of AI lifecycle section for detail on tracking changes to training models, trained models, algorithms, learning types, and hyperparameters.
