Statement 9: Conduct pre-work
Agencies must:
Criterion 27: Define the problem to be solved, its context, intended use, and impacted stakeholders.
This includes:
- analysing the problem through problem-solving frameworks such as root cause analysis, design thinking, and DMAIC (define, measure, analyse, improve, control)
- defining user needs, system goals and the scope of AI in the system
- identifying and documenting stakeholders, including:
- internal or external end-users, such as APS staff or members of the public
- Indigenous Australians (refer to the Framework for Governance of Indigenous Data)
- people with lived experiences, including those defined by religion, ethnicity, or migration status
- data experts, such as owners of the data being used to train and validate the AI system
- subject matter experts, such as internal staff
- the development team, including SROs, architects, and engineers.
- understanding the context of the problem such as interacting processes, data, systems, and the internal and external operating environment
- phrasing the problem in a way that is technology agnostic.
Criterion 28: Assess AI and non-AI alternatives.
This includes:
- starting with the simplest design, experimenting, and iterating
- validating and justifying the need for AI through an objective assessment based on quality evidence
- differentiating parts that could be solved by traditional software from parts that could benefit from AI
- determining why using AI would be more beneficial than non-AI alternatives by comparing KPIs
- considering the interaction of any AI and non-AI components
- considering existing agency solutions, commercial, or open-source off-the-shelf products
- examining capabilities, performance, cost, and limitations of each option
- conducting proof of concept and pilots to assess and validate the feasibility of each option
- considering foundation and frontier models for transformative use cases. Foundation models are versatile, trained on large datasets, and can be fine-tuned for specific contexts. Frontier models are at the forefront of AI research and development, trained on extensive datasets, and may demonstrate creativity or reasoning.
Criterion 29: Assess environmental impact and sustainability.
Developing and using AI systems may involve trade-offs in electricity usage, water consumption, and carbon emissions.
Criterion 30: Perform cost analysis across all aspects of the AI system.
This includes:
- infrastructure, software, and tooling costs for:
- acquiring and processing data for training, validation, and testing
- tuning the AI system to your particular use case and environment
- internally or externally hosting the AI system
- operating, monitoring, and maintaining the AI system.
- cost of human resources with the necessary AI skills and expertise.
Criterion 31: Analyse how the use of AI will impact the solution and its delivery.
This includes:
- identifying the type of AI and classification of data required
- identifying the implications of integrating the AI system with existing departmental systems and data, or as a standalone system
- identifying relevant legislation, regulations, and policies.
Statement 10: Adopt a human-centred approach
Agencies must:
Criterion 32: Identify human values requirements.
Human values represent what people deem important in life such as autonomy, simplicity, tradition, achievement, and social recognition.
This includes:
- using traditional requirement elicitation techniques such as surveys, interviews, group discussions and workshops to capture relevant human values for the AI use case
- translating human values into technical requirements, which may vary depending on the risk level and AI use case
- reviewing feedback to identify human values overlooked in the AI system
- understanding the hierarchy of human values and emphasising those with higher relevance
- considering social, economic, political, ethical, and legal values when designing AI systems
- considering human values that are domain specific and based on the context of the AI system.
Criterion 33: Establish a mechanism to inform users of AI interactions and output, as part of transparency.
Depending on use case this may include:
- incorporating visual cues on the AI product when applicable
- informing users when text, audio or visual messages addressed to them are generated by AI
- including visual watermarks to identify content generated by AI
- providing transparency on whether a user is interacting with a person, or system
- including a disclaimer on the limitations of the system (an illustrative disclosure sketch follows this list)
- displaying the relevance and currency of the information being provided
- persona-level transparency adhering to need-to-know principles
- providing alternate channels where a user chooses not to use the AI system. This may include channels such as a non-AI digital interface, telephony, or paper.
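As an illustration only, the following Python sketch shows one way a generated response could be packaged with a user-facing disclosure notice and basic provenance details. The function name, notice wording, and metadata fields are hypothetical examples, not prescribed by this standard.

```python
# Illustrative sketch only: the notice wording and metadata fields are
# placeholders; agencies would use their own approved disclosure text.
from datetime import datetime, timezone

def with_ai_disclosure(generated_text: str, model_name: str) -> dict:
    """Package AI-generated text with a user-facing notice and provenance
    details, so users know they are reading machine-generated content."""
    return {
        "content": generated_text,
        "notice": "This response was generated by an AI system and may "
                  "contain errors. A staff member can assist you instead "
                  "if you prefer.",
        "generated_by": model_name,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }

print(with_ai_disclosure("Your application status is: received.", "example-model"))
```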
Criterion 34: Design AI systems to be inclusive and ethical, and to meet accessibility standards, using appropriate mechanisms.
This includes:
- identifying affirmative actions or preferential treatment that applies to any person or specific stakeholder groups
- ensuring diversity and inclusion requirements, and guidelines, are met throughout the entire AI lifecycle
- providing justification for situations such as pro-social policy outcomes
- reviewing and revisiting ethical considerations throughout the AI system lifecycle.
Criterion 35: Define feedback mechanisms.
This includes:
- providing options to users on the type of feedback method they prefer
- providing users with the choice to dismiss feedback
- providing users with the option to opt out of the AI system
- ensuring measures to protect personal information and user privacy
- capturing implicit feedback to reflect users' preferences and interactions, such as accepting or rejecting recommendations, usage time, or login frequency
- capturing explicit feedback via surveys, comments, ratings, or written feedback.
Criterion 36: Define human oversight and control mechanisms.
This includes:
- identifying conditions and situations that:
- need to be supervised and monitored by a human
- need to be escalated by the system to a supervisor or operator for further review and approval
- should trigger transfer of control from the AI system to a supervisor or operator (an illustrative routing sketch follows this list)
- defining the system states, errors, and other relevant information that should be observable and comprehensible to an informed human
- defining the pathway for the timely intervention, decision override, or auditable system takeover by authorised internal users
- recording subsets of inputs and outputs that may result in harm, for monitoring, auditing, contesting, or validation. This facilitates reviewing false positives against the inputs that triggered them, and false negatives that result in harm
- identifying situations where a supervising human might become disengaged and designing the system to attract the operator's attention
- mapping human oversight and control requirements to the corresponding risks they mitigate
- identifying required personas and defining their roles
- adhering to privacy and security need-to-know principles.
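As an illustration only, the following Python sketch shows one way oversight conditions could be expressed as confidence thresholds that route an AI output to automatic processing, operator review, or full transfer of control. The thresholds, names, and labels are hypothetical examples and would differ for each use case and risk level.

```python
# Illustrative sketch only: threshold values, names, and the routing labels
# are hypothetical examples, not prescribed configurations.
from dataclasses import dataclass

@dataclass
class Decision:
    label: str          # the AI system's proposed output
    confidence: float   # model confidence in [0, 1]

def route_decision(decision: Decision,
                   auto_threshold: float = 0.95,
                   review_threshold: float = 0.70) -> str:
    """Decide whether an AI output can proceed automatically, must be
    reviewed by a human operator, or must be handed over entirely."""
    if decision.confidence >= auto_threshold:
        return "proceed"                  # still logged for later audit
    if decision.confidence >= review_threshold:
        return "escalate_to_operator"     # human approves or overrides
    return "transfer_control"             # AI output suppressed; human decides

# Example: a low-confidence output is handed to a human rather than actioned.
print(route_decision(Decision(label="reject_claim", confidence=0.62)))
```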
Agencies should:
Criterion 37: Involve users in the design process.
The intention is to promote better outcomes for managing inclusion and accessibility by setting expectations at the beginning of the AI system lifecycle.
This includes:
- considering security guidance and the need-to-know principle
- involving users in defining requirements, evaluating, and trialling systems or products.
Statement 11: Design safety systemically
Agencies must:
Criterion 38: Analyse and assess harms.
This includes:
- utilising functional safety standards that provide frameworks for systematic and robust harms analysis.
Criterion 39: Mitigate harms by embedding mechanisms for prevention, detection, and intervention.
This includes:
- designing the system to avoid the sources of harm
- designing the system to detect the sources of harm
- designing the system to check and filter its inputs and outputs for harm
- designing the system to check for sensitive information disclosure (an illustrative filtering sketch follows this list)
- designing the system to monitor faults in its operation
- designing the system with redundancy
- designing intervention mechanisms such as warnings to users and operators, automatic recovery to a safe state, transfer of control, and manual override
- designing the system to log the harms and faults it detects
- designing the system to disengage safely as per requirements
- for physical systems, designing proper protective equipment and procedures for safe handling
- ensuring the system meets privacy and security requirements and adheres to the need-to-know principle for information security.
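As an illustration only, the following Python sketch shows a simple output filter that redacts sensitive values and records which checks fired so the event can be logged. The patterns and blocked terms are simplified placeholders; production systems would rely on vetted classifiers and agency-approved pattern libraries.

```python
# Illustrative sketch only: the patterns and block list are simplified
# placeholders, not an approved set of detection rules.
import re

SENSITIVE_PATTERNS = {
    "tax_file_number": re.compile(r"\b\d{3}\s?\d{3}\s?\d{3}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}
BLOCKED_TERMS = {"password", "secret key"}  # example harmful/unsafe terms

def filter_output(text: str) -> tuple[str, list[str]]:
    """Redact sensitive values and record which checks fired,
    so the event can be logged for monitoring and audit."""
    findings = []
    for name, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(text):
            findings.append(name)
            text = pattern.sub("[REDACTED]", text)
    for term in BLOCKED_TERMS:
        if term in text.lower():
            findings.append(f"blocked_term:{term}")
    return text, findings

safe_text, findings = filter_output("Contact me at jo@example.com")
print(safe_text, findings)  # redacted text plus the checks that fired
```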
Agencies should:
Criterion 40: Design the system to allow calibration at deployment.
This includes:
- where initial setup parameters are critical to the performance, reliability, and safety of the AI system.
Statement 12: Define success criteria
Agencies must:
Criterion 41: Identify, assess, and select metrics appropriate to the AI system.
Relying on a single metric could lead to false confidence, while tracking irrelevant metrics could lead to false incidents. To mitigate these risks, analyse the capabilities and limitations of each metric, select multiple complementary metrics, and implement methods to test assumptions and to find missing information.
Considerations for metrics include:
- value-proposition metrics – benefits realisation, social outcomes, financial measures, or productivity measures
- performance metrics – precision and recall for classification models, mean absolute error for regression models, bilingual evaluation understudy (BLEU) for text generation tasks such as summarisation, inception score for image generation models, or mean opinion score for audio generation (a minimal metrics sketch follows this list)
- training data metrics – data diversity and data quality related measures
- bias-related metrics – demographic parity to measure group fairness, fairness through awareness to measure individual fairness, counterfactual fairness to measure causality-based fairness
- safety metrics – likelihood of harmful outputs, adversarial robustness, or potential data leakage measures
- reliability metrics – availability, latency, mean time between failures (MTBF), mean time to failure (MTTF), or response time
- citation metrics – measures related to proper acknowledgement and references to direct content and specialised ideas
- adoption-related metrics – adoption rate, frequency of use, daily active users, session length, abandonment rate, or sentiment analysis
- human-machine teaming metrics – total time or effort taken to complete a task, reaction time when human control is needed, or number of times human intervention is needed
- qualitative measures – checking the well-being of the humans operating or using the AI system, or interviewing participants and observing them while using the AI system to identify usability issues
- drift in AI system inputs and outputs – changes in input distribution, outputs, and performance over time.
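As an illustration only, the following Python sketch (using scikit-learn and synthetic data) shows how complementary metrics could be computed: precision and recall as performance metrics, and demographic parity difference as a bias-related metric. The data and the metric selection are examples, not a recommended set for any particular use case.

```python
# Illustrative sketch only: the data is synthetic and the metrics chosen
# are examples of complementary measures, not a prescribed set.
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])  # protected attribute

# Performance metrics: two complementary views of the same predictions.
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))

# Bias-related metric: demographic parity difference is the gap in the
# positive-prediction rate between groups (0 indicates parity).
rate_a = y_pred[group == "a"].mean()
rate_b = y_pred[group == "b"].mean()
print("demographic parity difference:", abs(rate_a - rate_b))
```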
After metrics have been identified, understand and assess the trade-offs between the metrics.
This includes:
- assessing trade-offs between different success criteria
- determining the possible harms with incorrect output, such as a false positive or false negative
- analysing how the output of the AI system could be used. For example, determine which instance would have greater consequences: a false negative that would fail to detect a cyberattack; or a false positive that incorrectly flags a legitimate user as a threat
- assessing the trade-offs among the performance metrics
- understanding the trade-offs with costs, explainability, reliability, and safety
- understanding the limitations of the selected metrics and ensuring mitigating measures are considered when building the AI system, such as when selecting data and training methods
- ensuring trade-offs are documented, understood by stakeholders, and accounted for when selecting AI models and systems
- optimising the metrics appropriate to the use case.
Agencies should:
Criterion 42: Re-evaluate the selection of appropriate success metrics as the AI system moves through the AI lifecycle.
Criterion 43: Continuously verify correctness of the metrics.
Before relying on the metrics, verify the following:
- metrics accurately reflect cases where the AI system does not have enough information
- metrics correctly reflect errors, failures, and successful task performance.
Statement 13: Establish data supply chain management processes
Agencies must:
Criterion 44: Create and collect data for the AI system and identify the purpose for its use.
It is important to identify:
- what data will be used and is fit-for-purpose for the AI system
- the sensitivity of the data, such as personal, protected, or otherwise sensitive
- consent provided on usage including when to retain or destroy data, ensuring the proposed uses in the AI system align with the original limits of the consent
- speed and mode of the data supply
- how the data will be used at each stage of the AI system
- where the data will be stored at each stage of the AI system
- changes to the data at different points of the AI system
- methods to manage and monitor data access
- methods to manage any real-time data changes
- data retention policies
- cross-agency or cross-border data governance, if relevant
- any risks and challenges associated with data elements of off-the-shelf AI models, products, or services in the AI system
- cyber supply chain management
- data quality monitoring and remediation
- comprehensive documentation at each stage of the AI system to facilitate traceability and accountability
- adherence to relevant legislation.
The consent framework for use of data across the AI system should satisfy the following:
- the framework is clear
- the framework is kept up to date
- individuals have provided informed consent for how their data will be used
- a dedicated team owns and maintains a register of how data is being used and can show compliance with the terms of the consent.
The data should be thought of in groupings or packages, including:
- the data within the organisation
- the data surrounding the algorithm, APIs, and user interface
- the data used to train the AI system
- the data used for testing and integration
- the data input at regular intervals for monitoring
- the data used at deployment, including input and output data from and to users.
Criterion 45: Plan for data archival and destruction.
Consider the following:
- whether data will be made available for future use, and what data
- what restrictions and access controls are in place
- whether data will be restricted until a specific date
- file formats to ensure data remains available during the archival period
- alignment with data sharing arrangements
- arrangements for data used to train and test AI models, and associated model management arrangements
- clear criteria for data archival and destruction for the data used at each stage of the AI lifecycle
- guidelines in the Information management for records created using Artificial Intelligence (AI) technologies | naa.gov.au.
Agencies should:
Criterion 46: Analyse data for use by mapping the data supply chain and ensuring traceability.
Mapping the data supply chain to the AI system involves capturing how data will be stored, shared, and processed, particularly at the training and testing stages, which involve regular injections of data. When mapping the data, account for:
- how data was sourced
- what data is required by the system, ensuring that excess data or data irrelevant to the functioning of the system is not consumed by the system
- the amount and type of data the system will use
- what could affect the reliable accessibility of data
- how data will be fused and transformed
- how the data will be secured at rest and in transit
- how the data will be used by the system.
Ensuring traceability entails maintaining awareness of the flow of data across the AI system.
This includes:
- data sovereignty controls and considerations including legal implications for geographic locations for data (including its metadata and logs) when at rest, in transit, or in use. For classified data processing on cloud platforms, it is recommended to use cloud service providers and cloud services located in Australia, as per Cloud assessment and authorisation | Cyber.gov.au
- providing the level of detail for debugging data errors and troubleshooting
- enforcing organisational policies on information management
- enhancing visibility over changes to the data occurring during migrations, system updates, or other errors
- supporting users to identify and fix data issues with a clear information audit trail (an illustrative lineage-record sketch follows this list)
- supporting diagnosis for bias
- managing the quality of data to maintain availability and consistency.
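As an illustration only, the following Python sketch shows one way a traceability record could be captured for each step in the data supply chain. The field names and hashing approach are assumptions; agencies would align such records with their own information management policies.

```python
# Illustrative sketch only: field names and the hashing approach are
# assumptions used to show a minimal traceability record.
import hashlib
import json
from datetime import datetime, timezone

def lineage_record(dataset_name: str, source: str, step: str, payload: bytes) -> dict:
    """Create a traceability record for one step in the data supply chain."""
    return {
        "dataset": dataset_name,
        "source": source,                                   # where the data was obtained
        "step": step,                                       # e.g. ingest, transform, train
        "sha256": hashlib.sha256(payload).hexdigest(),      # detect unintended changes
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

record = lineage_record("claims_2024", "agency_warehouse", "ingest", b"raw bytes")
print(json.dumps(record, indent=2))  # append to an audit log or metadata store
```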
Criterion 47: Implement practices to maintain and reuse data.
This involves determining ongoing mechanisms for ensuring data is protected, accessible, and available for use in line with the original consent parameters.
Any changes in data scope, including expansion in scope and usage patterns, would need to be monitored and addressed.
Statement 14: Implement data orchestration processes
Agencies must:
Criterion 48: Implement processes to enable data access and retrieval, encompassing the sharing, archiving, and deletion of data.
Considerations include:
- security classifications and permissions of the data
- speed or mode of the data, such as streaming or batch data
- alignment to Guidelines for data transfers | Cyber.gov.au.
Agencies should:
Criterion 49: Establish standard operating procedures for data orchestration.
This includes:
- defining responsibilities between business areas and identifying mutual outcomes to be managed across teams. This is particularly important for business areas that are owners of datasets
- considering inclusion of infrastructure arrangements and use of cloud arrangements for data storage or processing.
Practices to be defined include:
- data governance
- data testing
- security and access controls.
Criterion 50: Configure integration processes to integrate data in increments.
This includes:
- enabling agencies to better manage incident identification and intervention during data integration
- ensuring risks of creating personal identifiable information from data integration are managed appropriately.
Criterion 51: Implement automation processes to orchestrate the reliable flow of data between systems and platforms.
Criterion 52: Perform oversight and regular testing of task dependencies.
This should involve having comprehensive backup plans in place to handle potential outages or incidents.
The following should be considered:
- regular backups of critical data
- failover mechanisms
- detailed recovery procedures to minimise downtime and data loss.
Criterion 53: Establish and maintain data exchange processes.
This includes:
- how often data will need to be accessed by the system
- at what points the frequency, magnitude, or speed of access will change
- how security processes will adapt when data is exposed to new risks across the AI system
- how data will be monitored for changes to accessibility or completeness
- whether the sensitivity of the data will change once processed or analysed
- how to validate data trust and authenticity.
Statement 15: Implement data transformation and feature engineering practices
Agencies should:
Criterion 54: Establish data cleaning procedures to manage any data issues.
Data cleaning involves appropriately treating data errors, inconsistencies, or missing values to improve performance of the AI system. Data cleaning should be documented, and possibly included in the metadata, each time it is conducted to manage issues such as the following (an illustrative cleaning sketch follows this list):
- blanks, nulls, or trailing spaces
- structural errors or unwanted formatting
- missing data
- spelling mistakes
- repetition of words
- irrelevant characters
- content or observations irrelevant to the purpose of the AI system.
For open-source data, or data that has not yet been validated or cannot yet be trusted, consider using a sandbox environment.
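As an illustration only, the following Python sketch (using pandas) applies several of the cleaning steps above to a small synthetic dataset. Column names and cleaning rules are hypothetical; each step applied to real data should be documented.

```python
# Illustrative sketch only: column names and cleaning rules are examples;
# every cleaning step should be documented alongside the dataset's metadata.
import pandas as pd

df = pd.DataFrame({
    "suburb": ["  Parkes", "parkes", None, "Parkes!!"],
    "amount": ["100", "110", "", "105"],
})

cleaned = (
    df.assign(
        suburb=df["suburb"]
            .str.strip()                                      # leading/trailing spaces
            .str.replace(r"[^A-Za-z\s]", "", regex=True)      # irrelevant characters
            .str.title(),                                     # consistent formatting
        amount=pd.to_numeric(df["amount"], errors="coerce"),  # blanks become NaN
    )
    .dropna(subset=["suburb"])                                # handle missing data explicitly
    .drop_duplicates()                                        # repeated records
)
print(cleaned)
```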
Criterion 55: Define data transformation processes to convert and optimise data for the AI system.
This could leverage existing Extract, Transform and Load (ETL) or Extract, Load and Transform (ELT) processes.
Consider the following data transformation techniques (an illustrative sketch applying several of them follows the list):
- data standardisation – convert data from various sources into a consistent format
- data reorganisation – organise data to make it easier to query and analyse
- data integration – combine data from different sources for a single unified view
- discretisation – convert continuous data into discrete intervals
- missing value imputation – analyse what values need to be imputed and the method
- conversion – convert data from one form to another, such as a log transformation
- smoothing – to even out fluctuations
- convert unstructured data to structured data
- Optical Character Recognition (OCR) – convert images of text into machine readable format
- object labelling and tracking – in images, audio, and video
- signal processing and transformation
- point-in-time data – a snapshot of data at a specific point in time.
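As an illustration only, the following Python sketch (using scikit-learn and synthetic data) applies several of the transformations listed above: missing value imputation, log transformation, standardisation, and discretisation.

```python
# Illustrative sketch only: synthetic data and example transformations,
# not a prescribed processing pipeline.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import KBinsDiscretizer, StandardScaler

values = np.array([[1.0], [10.0], [100.0], [np.nan]])

imputed = SimpleImputer(strategy="median").fit_transform(values)   # missing value imputation
logged = np.log1p(imputed)                                         # log transformation
standardised = StandardScaler().fit_transform(logged)              # consistent scale
bins = KBinsDiscretizer(n_bins=3, encode="ordinal",
                        strategy="uniform").fit_transform(logged)  # discretisation

print(standardised.ravel(), bins.ravel())
```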
Criterion 56: Map the points where transformation occurs between datasets and across the AI system.
Consider:
- security checks.
Criterion 57: Identify fit-for-purpose feature engineering techniques.
Feature engineering techniques include the following (an illustrative sketch follows the list):
- feature creation and extraction – deriving features from existing data to help the AI system produce better quality outputs
- feature selection – selecting attributes or fields that provide relevant context to the AI model
- encoding – converting data into a format that can be better used in AI algorithms
- binning – grouping data into categories
- specific conversion – changing data from one format to another for AI compatibility
- scaling – mapping all data to a specific range to help improve AI outputs.
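As an illustration only, the following Python sketch (using pandas and a recent scikit-learn release) shows feature creation, encoding, and scaling on a small hypothetical dataset. The dataset and the derived feature are assumptions for illustration.

```python
# Illustrative sketch only: the dataset and derived feature are hypothetical;
# it demonstrates feature creation, encoding, and scaling from the list above.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

df = pd.DataFrame({
    "channel": ["online", "phone", "online", "counter"],
    "wait_minutes": [5, 42, 11, 27],
    "resolved": [1, 0, 1, 1],
})

# Feature creation: derive a flag from existing data.
df["long_wait"] = (df["wait_minutes"] > 30).astype(int)

# Encoding: convert a categorical field into a numeric representation.
encoded = OneHotEncoder(sparse_output=False).fit_transform(df[["channel"]])

# Scaling: map a numeric field to the range [0, 1].
scaled = MinMaxScaler().fit_transform(df[["wait_minutes"]])

print(df["long_wait"].tolist(), encoded.shape, scaled.ravel())
```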
Criterion 58: Apply consistent data transformation and feature engineering methods to support data reuse and extensibility.
Consider:
- metadata and tagging of the data
- data transformation not limited to AI models and processes.
Statement 16: Ensure data quality is acceptable