Criterion 79: Set assessment criteria for the AI model, with respect to pre-defined metrics for the AI system.
These criteria should address:
Considerations for modelling include:
Criterion 80: Identify and address situations when AI outputs should not be provided.
These situations include:
For GenAI, implementing techniques such as threshold settings or content filtering could address these situations.
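One such technique can be sketched as a threshold-and-filter check before releasing output. This is a minimal illustration only; the toxicity score, threshold value, and blocked-term list are all hypothetical assumptions, not part of the criterion.

```python
# Hypothetical sketch: withhold a GenAI output when a moderation score
# exceeds a configured threshold, or when a content filter matches.
# All names and values here are illustrative assumptions.

def should_withhold(output_text: str, toxicity_score: float,
                    threshold: float = 0.8) -> bool:
    """Return True when the AI output should not be provided."""
    blocked_terms = {"example-blocked-term"}  # placeholder filter list
    if toxicity_score >= threshold:
        return True
    return any(term in output_text.lower() for term in blocked_terms)
```

In practice the score would come from a moderation model and the term list from agency policy; the shape of the check, not the values, is the point.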
Criterion 81: Apply considerations for reusing existing agency models, off-the-shelf, and pre-trained models.
These include:
Criterion 82: Create or fine-tune models optimised for the target domain environment.
This includes:
Criterion 83: Create and train using multiple model architectures and learning strategies.
Systematically track model training implementations to speed up the discovery and development of models and to support selection of the best-performing trained model.
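The tracking described above can be sketched as a simple in-memory run log; a real project would use a dedicated experiment-tracking tool or database, and the architectures and metrics below are illustrative assumptions.

```python
# Minimal sketch of systematic training-run tracking (in-memory only).
import time

class RunTracker:
    def __init__(self):
        self.runs = []

    def log_run(self, architecture: str, hyperparams: dict, metrics: dict):
        """Record one training run with its configuration and results."""
        self.runs.append({
            "architecture": architecture,
            "hyperparams": hyperparams,
            "metrics": metrics,
            "timestamp": time.time(),
        })

    def best_run(self, metric: str):
        # Assumes higher is better for the chosen metric.
        return max(self.runs, key=lambda r: r["metrics"][metric])

tracker = RunTracker()
tracker.log_run("cnn-small", {"lr": 1e-3}, {"accuracy": 0.91})
tracker.log_run("cnn-large", {"lr": 1e-4}, {"accuracy": 0.94})
best = tracker.best_run("accuracy")
```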
Criterion 84: Set techniques to validate trained AI models.
There are multiple qualitative and quantitative techniques and tools for model validation, informed by the AI system success criteria (see Design section), including:
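One widely used quantitative validation technique is k-fold cross-validation. The sketch below shows only the fold construction; the model, metric, and data are left out and would be supplied by the agency's own pipeline.

```python
# Pure-Python sketch of k-fold cross-validation index splitting.
# Real pipelines would typically use a library implementation.

def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, val_idx) pairs covering all samples once each."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 5))  # 5 folds over 10 samples
```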
Criterion 85: Evaluate the model against training boundaries.
Evaluation considerations include:
Criterion 86: Evaluate the model for bias, implement and test bias mitigations.
This includes:
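One simple quantitative bias check is the demographic parity difference: the gap in positive-outcome rates between groups. The groups, outcomes, and any acceptable threshold below are illustrative assumptions, not values mandated by the criterion.

```python
# Illustrative bias metric: demographic parity difference between
# two groups' positive-outcome rates (data values are hypothetical).

def selection_rate(outcomes):
    """Share of positive (1) outcomes in a group."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_diff(group_a_outcomes, group_b_outcomes):
    """Absolute difference in positive-outcome rates between groups."""
    return abs(selection_rate(group_a_outcomes)
               - selection_rate(group_b_outcomes))

# Group A: 3 of 4 positive (0.75); Group B: 1 of 4 positive (0.25).
dpd = demographic_parity_diff([1, 1, 0, 1], [1, 0, 0, 0])
```

A non-zero difference would trigger investigation and testing of mitigations, with the tolerable gap set by the AI system's pre-defined metrics.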
Criterion 87: Identify relevant model refinement methods.
These considerations may trigger model refinement or retirement and can include:
Criterion 88: Assess a pool of trained models against acceptance metrics to select a model for the AI system.
This involves:
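The selection step above can be sketched as screening candidates against acceptance thresholds and then ranking the survivors. The candidate names, metrics, and thresholds are all illustrative assumptions.

```python
# Sketch: screen a pool of trained models against acceptance metrics,
# then select the best passing candidate (all values are hypothetical).

candidates = [
    {"name": "model-a", "accuracy": 0.92, "latency_ms": 40},
    {"name": "model-b", "accuracy": 0.95, "latency_ms": 120},
    {"name": "model-c", "accuracy": 0.90, "latency_ms": 25},
]
acceptance = {"min_accuracy": 0.91, "max_latency_ms": 100}

passing = [m for m in candidates
           if m["accuracy"] >= acceptance["min_accuracy"]
           and m["latency_ms"] <= acceptance["max_latency_ms"]]
selected = max(passing, key=lambda m: m["accuracy"]) if passing else None
```

Here model-b is rejected on latency and model-c on accuracy, leaving model-a; real acceptance metrics would also cover fairness, robustness, and other criteria from the Design section.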
Criterion 89: Establish interface tools and feedback channels for machines and humans.
This also involves providing appropriate human-machine interface tools for human interrogation and oversight.
Criterion 90: Perform model version control.
AI model versioning and tracking are key to comparing performance over time, identifying factors that affect performance, and updating the model when needed.
AI model tracking and versioning can involve:
Set up rollback options to historical models to provide safety nets in the AI system, reduce the risk of deploying new models, and add deployment flexibility.
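The versioning-with-rollback pattern above can be sketched as a small registry; a production system would persist artefacts and metadata rather than hold them in memory, and the version labels here are illustrative.

```python
# Minimal sketch of model version tracking with a rollback safety net.

class ModelRegistry:
    def __init__(self):
        self.versions = []   # ordered history of deployed versions
        self.current = None

    def deploy(self, version: str, artefact):
        """Record and activate a new model version."""
        self.versions.append((version, artefact))
        self.current = version

    def rollback(self):
        """Revert to the previously deployed version, if one exists."""
        if len(self.versions) < 2:
            return self.current
        self.versions.pop()
        self.current = self.versions[-1][0]
        return self.current

registry = ModelRegistry()
registry.deploy("v1.0", object())
registry.deploy("v1.1", object())
registry.rollback()  # v1.1 misbehaves: fall back to v1.0
```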
Criterion 91: Mitigate bias in the testing process.
This includes:
Criterion 92: Define test criteria approaches.
The test criteria determine whether a test case has passed or failed by comparing an actual output with an expected output.
This includes:
Criterion 93: Define how test coverage will be measured.
This includes:
Criterion 94: Define a strategy to ensure test adequacy.
Achieving full test coverage for AI systems may be challenging or not viable. To maximise test coverage, consider:
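One way to make coverage measurable is to partition the input space into categories and track the share of category combinations the test suite exercises. The categories below are illustrative assumptions only.

```python
# Sketch: coverage as the share of input-category combinations tested
# (the partitioning scheme and categories are hypothetical).
from itertools import product

categories = {
    "age_band": ["<18", "18-65", ">65"],
    "region": ["metro", "regional"],
}
all_combos = set(product(*categories.values()))  # 6 combinations

tested = {("<18", "metro"), ("18-65", "metro"), (">65", "regional")}
coverage = len(tested & all_combos) / len(all_combos)  # 3 of 6
```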
Criterion 95: Undertake human verification of test design and implementation for correctness, consistency, and completeness.
Criterion 96: Conduct functional performance testing to verify the correctness of the system under test (SUT) as per the pre-defined metrics.
This includes testing for fairness and bias to inform affirmative actions.
For off-the-shelf systems, consider benchmark testing using industry-standard benchmark suites and comparisons with competing AI systems.
Criterion 97: Perform controllability testing to verify human oversight and control, and system control requirements.
Criterion 98: Perform explainability and transparency testing as per the requirements.
This involves:
Criterion 99: Perform calibration testing as per the requirements.
This involves:
Criterion 100: Perform logging tests as per the requirements.
This involves verifying that the system records:
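A logging test of this kind can be sketched as a completeness check over required record fields. The field names below are illustrative assumptions, not fields mandated by the criterion.

```python
# Sketch: verify that a system log record contains all required
# fields (field names are hypothetical examples).

REQUIRED_FIELDS = {"timestamp", "input_hash", "model_version",
                   "output", "confidence"}

def log_record_complete(record: dict) -> bool:
    """Return True when every required field is present in the record."""
    return REQUIRED_FIELDS.issubset(record.keys())

record = {
    "timestamp": "2024-01-01T00:00:00Z",
    "input_hash": "abc123",
    "model_version": "v1.1",
    "output": "approve",
    "confidence": 0.87,
}
ok = log_record_complete(record)
```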
Criterion 101: Test the computational performance of the system.
This includes:
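One common computational performance test is a latency check against a budget. The sketch below times repeated calls and reports a 95th-percentile figure; the stand-in inference function and the budget value are assumptions.

```python
# Sketch: measure p95 latency of repeated inference calls.
import time

def p95_latency_ms(fn, runs: int = 100) -> float:
    """Return the 95th-percentile latency of fn over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return samples[int(0.95 * (len(samples) - 1))]

def dummy_inference():
    # Stand-in for a real model call (hypothetical workload).
    sum(range(1000))

latency = p95_latency_ms(dummy_inference)
```

A real test would call the deployed model and compare `latency` against the system's pre-defined latency budget.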
Criterion 102: Test safety measures through negative testing methods, failure testing, and fault injection.
This includes:
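Fault injection can be sketched as wrapping the model call so a failure can be triggered on demand, then checking that the surrounding system degrades safely. The exception type, fallback behaviour, and function names are illustrative assumptions.

```python
# Sketch of fault injection: force a model failure and verify the
# system falls back to a safe default rather than crashing.

class ModelUnavailable(Exception):
    pass

def model_call(x, inject_fault: bool = False):
    if inject_fault:
        raise ModelUnavailable("injected fault")
    return x * 2  # stand-in for real inference

def safe_predict(x, inject_fault: bool = False):
    """System-level wrapper: on model failure, return a safe default."""
    try:
        return model_call(x, inject_fault)
    except ModelUnavailable:
        return None  # safe default; a real system might alert or retry

ok = safe_predict(3)                            # normal path
fallback = safe_predict(3, inject_fault=True)   # negative test
```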
Criterion 103: Test the reliability of AI output through stress testing over an extended period, simulating edge cases, and operating under extreme conditions.
Criterion 104: Undertake adversarial testing (red team testing), attempting to break security and privacy measures to identify weaknesses.
AI-specific attacks can be executed before, during, and after training.
Examples of attacks that can be made before and during training include:
Examples of attacks that can be made after training include:
Criterion 105: Verify compliance with relevant policies, frameworks, and legislation.
Criterion 106: Verify conformance against organisation and industry-specific coding standards.
This includes static and dynamic source code analysis. While agencies may use traditional analysis tools for the whole system, these tools have limitations with respect to AI models; consider tools built specifically for analysing AI model code.
Criterion 107: Perform vulnerability testing to identify any well-known vulnerabilities.
This includes: