Evaluate: statements 26–30
This stage covers testing, verification, and validation of the whole AI system. It is assumed that agencies have existing capability in test management and in testing traditional software and systems.
Testing should be done continuously for each component across the AI system lifecycle. The level of testing at each stage could be unit, integration, system, or system-of-systems, depending on the scope of development at that stage and the test strategy.
Testing can be divided into formal and informal phases. Formal testing is the phase in which the system under test (SUT) is formally versioned and the outcomes are evaluated to decide whether or not to deploy to production. Statements within the standard apply to the formal testing phase.
Where an AI system or its components have been procured, deployment and integration into the deployer’s environment may need to be completed before formal testing starts.
Testing AI systems differs from testing traditional software because AI systems can be probabilistic and non-deterministic in nature. A probabilistic system does not give you an exact value, and a non-deterministic system may produce different outputs from the same inputs.
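As an illustration only, the sketch below tests a non-deterministic component by asserting on the distribution of repeated outputs rather than on a single exact value. The `model_predict` function, its input, and the acceptance thresholds are hypothetical placeholders, not part of the standard.

```python
import random
import statistics

def model_predict(features):
    # Hypothetical stand-in for a non-deterministic AI component;
    # a real test would call the system under test here.
    return 0.8 + random.gauss(0, 0.05)

def test_output_within_statistical_tolerance():
    # Score the same input repeatedly and assert on the distribution,
    # not on a single exact value.
    outputs = [model_predict({"age": 42}) for _ in range(100)]
    mean = statistics.mean(outputs)
    stdev = statistics.stdev(outputs)
    # Acceptance thresholds would come from the agreed test strategy.
    assert 0.7 <= mean <= 0.9, f"mean {mean:.3f} outside accepted range"
    assert stdev < 0.1, f"output variance too high (stdev {stdev:.3f})"

test_output_within_statistical_tolerance()
```

Fixing the random seed can make such a test repeatable, but a seed-free statistical assertion better reflects how the system behaves in production.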
Other key differences include your approach to regression testing. While a small change to a non-AI system may have limited consequences, a step change in test data or in model parameters can significantly alter an AI system’s behaviour. This means you need to conduct more robust regression testing to mitigate the heightened risk of escaped defects.
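One common pattern, sketched below under assumed names and values, is to record baseline metrics when a version passes formal testing and fail the regression run if a candidate version degrades beyond an agreed tolerance. The metrics, thresholds, and `evaluate_candidate` stub are illustrative assumptions only.

```python
# Hypothetical baseline metrics recorded when the previous version
# passed formal testing; tolerances would come from the test strategy.
BASELINE = {"accuracy": 0.91, "false_positive_rate": 0.04}
HIGHER_IS_BETTER = {"accuracy": True, "false_positive_rate": False}
TOLERANCE = {"accuracy": 0.02, "false_positive_rate": 0.01}

def evaluate_candidate():
    # Hypothetical: score the candidate version on the held-out
    # regression test set and return the same metrics as the baseline.
    return {"accuracy": 0.90, "false_positive_rate": 0.045}

def regression_check():
    current = evaluate_candidate()
    failures = []
    for metric, baseline in BASELINE.items():
        # Degradation means a fall in metrics where higher is better,
        # and a rise in metrics where lower is better.
        degradation = (
            baseline - current[metric]
            if HIGHER_IS_BETTER[metric]
            else current[metric] - baseline
        )
        if degradation > TOLERANCE[metric]:
            failures.append(
                f"{metric}: baseline {baseline}, candidate {current[metric]}")
    return failures

failures = regression_check()
print("PASS" if not failures else "FAIL: " + "; ".join(failures))
```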
Note that an AI system which learns dynamically will change its behaviour without being formally updated. This means changes may occur within the same deployment version and without formal testing, which requires more rigorous continuous monitoring post-deployment. The development team or supplier should confirm whether your AI system has been designed to learn dynamically or statically.
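As a minimal sketch of such monitoring, the example below flags a shift in prediction behaviour on the same deployment version. A production monitor would typically apply a formal drift test (for example, population stability index or Kolmogorov–Smirnov); the data and threshold here are invented for the example.

```python
import statistics

def monitor_prediction_drift(reference_scores, window_scores, max_shift=0.05):
    # Compare the mean prediction score of a recent window against a
    # reference window captured at deployment; alert on a large shift.
    shift = abs(statistics.mean(window_scores)
                - statistics.mean(reference_scores))
    if shift > max_shift:
        print(f"ALERT: mean prediction shifted by {shift:.3f}; "
              "review the model and consider retesting")
    return shift

# Illustrative data: behaviour drifting without any formal redeployment.
reference = [0.52, 0.48, 0.50, 0.51, 0.49]
recent = [0.61, 0.58, 0.63, 0.60, 0.59]
monitor_prediction_drift(reference, recent)
```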