• Agencies must:

    • Criterion 6: Identify and assign AI roles to ensure a diverse team of business and technology professionals with specialised skills.

      Specialist roles may include, noting that an individual may perform one or more of these roles:

      • AI accountable official: A senior executive accountable for their agency’s implementation of the Policy for the responsible use of AI in government
      • Data scientists and analysts: Professionals who collect, process, and analyse datasets to inform AI models. They will have expertise in statistical analysis which supports the development of reliable AI systems
      • AI integration engineers: Professionals responsible for planning, designing and implementing all components requiring integration in an AI system. The role includes reviewing client needs, developing and testing specifications and documenting outputs
      • AI and machine learning engineers: Specialists who design, build, and maintain AI models and algorithms. They work closely with data scientists to implement scalable AI systems
      • AI test engineers: Specialists who verify and validate AI systems against business and technical requirements
      • Ethics and compliance officers: Specialists who ensure that AI systems adhere to legal standards and ethical guidelines, mitigating risks associated with AI systems
      • Domain experts: Individuals with specialised knowledge in specific fields, such as healthcare or finance, who provide context and insights to ensure that AI systems are relevant and effective within their respective domain.
    • Criterion 7: Build and maintain AI capabilities by undertaking regular training and education of end users, staff, and stakeholders.

      This may involve:

      • providing regular training programs to keep staff updated on the latest tools, methodologies, ethical guidelines and regulatory requirements
      • considering how to tailor training to the knowledge requirements of each role, and providing staff involved in procurement, design, development, testing, and deployment of AI systems with specialised training. For example, individuals responsible for managing and operating AI decision-making systems should undergo specific AI ethics training
      • considering how to tailor training for people with disability
      • considering interactive workshops, simulations, case study walk-throughs and computing sandpit environments to provide more immersive and real-world-like experiences, especially for more complex aspects of AI.

    Agencies should:

    • Criterion 8: Mitigate staff over-reliance on, under-reliance on, and aversion to AI.

      This may involve:

      • performing periodic technology-specific training, performance assessments, peer reviews, or random audits
      • implementing a regular feedback loop for incorrect AI outcomes.
         
  • Statement 4: Enable AI auditing

  • Agencies must: 

    • Criterion 9: Provide end-to-end auditability.

      End-to-end AI auditability refers to the ability to trace and inspect the decisions and processes involved in the AI system lifecycle. This enables internal and external scrutiny. Publishing audit results enables public accountability, transparency, and trust.

      This may include:

      • establishing documentation across the AI system lifecycle as agreed with the accountable official. This should demonstrate conformance with the AI technical standard, and compliance with relevant legislation and regulations.
      • establishing traceability of decisions and changes from requirements through to operational impacts
      • ensuring accessibility, availability, and explainability of technical and non-technical information to assist audits
      • ensuring audit logging of AI tools and systems is configured appropriately (a minimal logging sketch follows this list)

        This may include:

        • enabling or disabling the capture of system inputs and outputs
        • detecting and recording modifications to the system’s operation or performance
        • recording who made the modification, under what authority, and the rationale for the modification
        • recording the system version and any other critical system information.
      • reviewing audit logs
      • ensuring independence and avoiding conflict of interest when undertaking AI audits.
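
      As a minimal sketch, an appropriately configured audit log could write structured records like the one below. The field names (system_version, actor, authority, rationale) and the JSON-lines format are illustrative assumptions, not requirements of this standard.

      ```python
      import json
      import logging
      from datetime import datetime, timezone

      # Append structured audit records to a dedicated log file.
      audit_logger = logging.getLogger("ai_audit")
      audit_logger.setLevel(logging.INFO)
      audit_logger.addHandler(logging.FileHandler("ai_audit.log"))

      def log_audit_event(event_type, system_version, payload,
                          actor=None, authority=None, rationale=None):
          """Write one audit record as a JSON line."""
          record = {
              "timestamp": datetime.now(timezone.utc).isoformat(),
              "event_type": event_type,          # e.g. "inference" or "config_change"
              "system_version": system_version,  # version of the deployed AI system
              "payload": payload,                # captured inputs/outputs or change details
              "actor": actor,                    # who made the modification
              "authority": authority,            # under what authority
              "rationale": rationale,            # why the modification was made
          }
          audit_logger.info(json.dumps(record))

      # Example: record a change to the system's logging configuration.
      log_audit_event(
          event_type="config_change",
          system_version="1.4.2",
          payload={"setting": "capture_inputs", "old": False, "new": True},
          actor="jane.citizen",
          authority="approved by the AI accountable official",
          rationale="enable input capture ahead of a scheduled audit",
      )
      ```
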
    • Criterion 10: Perform ongoing data-specific checks across the AI lifecycle.

      This should address:

      • data quality for AI training, capabilities, and limitations
      • how data was evaluated for bias (a representativeness check sketch follows this list)
      • controls to detect and manage data poisoning
      • legislative compliance.
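
      One data-specific check that agencies could automate is a representativeness comparison between training data and the target population. The sketch below is illustrative only; the attribute name, reference shares and the 5 percentage point threshold are assumptions for the example.

      ```python
      import pandas as pd

      def representation_gap(training_df, population_shares, attribute):
          """Compare subgroup shares in the training data with reference population shares."""
          training_shares = training_df[attribute].value_counts(normalize=True)
          rows = []
          for group, population_share in population_shares.items():
              training_share = float(training_shares.get(group, 0.0))
              rows.append({
                  "group": group,
                  "training_share": training_share,
                  "population_share": population_share,
                  "gap": training_share - population_share,
              })
          return pd.DataFrame(rows)

      # Example: flag any age band under-represented by more than 5 percentage points.
      report = representation_gap(
          training_df=pd.DataFrame({"age_band": ["18-34", "18-34", "35-54", "55+"]}),
          population_shares={"18-34": 0.30, "35-54": 0.35, "55+": 0.35},
          attribute="age_band",
      )
      print(report[report["gap"] < -0.05])
      ```
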
    • Criterion 11: Perform ongoing model-specific checks across the AI lifecycle.

      This should address:

      • tracking and maintaining experiments with new models and algorithms to ensure reproducibility, achieving similar model performance with the same dataset (a record-keeping sketch follows this list)
      • output flaws such as factually incorrect, nonsensical, or misleading information, which may be referred to as AI hallucinations
      • bias and potential harms, such as ensuring fair treatment of all demographic groups
      • model explainability
      • controls to detect and manage model poisoning
      • legislative compliance.
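
      Reproducibility depends on recording enough information to rerun an experiment. The sketch below shows one possible experiment record; the fields, file layout and hashing choice are assumptions rather than a prescribed format, and dedicated experiment-tracking tools can serve the same purpose.

      ```python
      import hashlib
      import json
      from pathlib import Path

      def record_experiment(run_id, dataset_path, hyperparameters, random_seed,
                            metrics, registry_dir="experiments"):
          """Persist the details needed to reproduce and compare a model experiment."""
          dataset_hash = hashlib.sha256(Path(dataset_path).read_bytes()).hexdigest()
          record = {
              "run_id": run_id,
              "dataset_sha256": dataset_hash,   # ties results to an exact dataset
              "hyperparameters": hyperparameters,
              "random_seed": random_seed,
              "metrics": metrics,
          }
          out_dir = Path(registry_dir)
          out_dir.mkdir(exist_ok=True)
          out_path = out_dir / f"{run_id}.json"
          out_path.write_text(json.dumps(record, indent=2))
          return out_path

      # Example usage (paths and values are illustrative):
      # record_experiment("run-042", "data/train.csv",
      #                   {"learning_rate": 0.01, "max_depth": 6},
      #                   random_seed=42, metrics={"accuracy": 0.91})
      ```
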
  • Statement 5: Provide explainability based on the use case

  • Agencies must:

    • Criterion 12: Explain the AI system and technology used, including the limitations and capabilities of the system.

      AI algorithms and technologies, such as deep learning models, are often seen as 'black boxes'. This can make it difficult to understand how they work and the factors that generate outcomes. Providing clear and understandable explanations of AI outputs helps maintain trust and transparency with AI systems.

      Explainability in the specific context of the use case ensures a clear understanding of the reasoning behind AI system outputs. This supports accountability, trust, and ethical considerations.

      This may include:

      • explaining the AI system, such as:
        • consideration of trade-offs such as cost and performance
        • what changes are made with AI system updates
        • how feedback is used in improving AI system performance
        • whether the AI system is static or learns from user behaviour
        • whether alternative AI techniques would provide clearer explanations and validate AI actions and decisions.
      • explaining use cases that are impacted by legislation, regulation, rules, or third-party involvement
      • explaining how the system operates, including situations that require human intervention
      • explaining the technical and governance mechanisms that ensure ethical outcomes from the use of an AI system
      • informing stakeholders when changes are made to the system
      • providing persona-level explainability, adhering to need-to-know principles (an illustrative feature-importance sketch follows this list).
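
      A common way to make a 'black box' model more explainable is to report which input features most influence its outputs. The sketch below uses permutation importance as one such technique; the model, dataset and feature names are placeholders, and the approach chosen should suit the AI technique and use case.

      ```python
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.inspection import permutation_importance

      # Placeholder data and model standing in for a real AI system.
      X, y = make_classification(n_samples=500, n_features=4, random_state=0)
      feature_names = ["income", "age", "tenure", "region_code"]  # hypothetical features

      model = RandomForestClassifier(random_state=0).fit(X, y)
      result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

      # Rank features by how much shuffling each one degrades performance.
      ranked = sorted(zip(feature_names, result.importances_mean),
                      key=lambda pair: pair[1], reverse=True)
      for name, importance in ranked:
          print(f"{name}: mean importance {importance:.3f}")
      ```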

    Agencies should:

    • Criterion 13: Explain outputs made by the AI system to end users.

      This typically includes:

      • explaining:
        • AI outputs that have serious consequences
        • how outputs are based on the data used
        • consequences of system actions and user interactions
        • errors
        • high-risk situations
      • avoiding explanations that are confusing or misleading
      • using a variety of methods to explain outputs.
    • Criterion 14: Explain how data is used and shared by the AI system.

      This includes:

      • how personal and organisational data is used and shared between the AI system and other applications
      • who can access the data
      • where identified data has been used, or will be used, for AI system training.
         
  • Statement 6: Manage system bias

  • Managing bias in an AI system, and its potential harms, is critical to ensuring compliance with federal anti-discrimination legislation. Australia’s anti-discrimination law states:

    …it is unlawful to discriminate on the basis of a number of protected attributes including age, disability, race, sex, intersex status, gender identity and sexual orientation in certain areas of public life, including education and employment. 

    Certain forms of bias, such as affirmative measures for disadvantaged or vulnerable groups, play a constructive role in aligning AI systems to human values, intentions, and ethical principles. At the same time, it’s important to identify and address biases that may lead to unintended or harmful consequences. A balanced approach to bias management ensures that beneficial biases are preserved while minimising the impact of problematic ones.

    When integrating off-the-shelf AI products, it’s essential to ensure they deliver fair and equitable outcomes in the target operating environment. Conducting thorough bias evaluations becomes especially important when documentation or supporting evidence is limited.

    Agencies must:

    • Criterion 15: Identify how bias could affect people, processes, data, and technologies involved in the AI system lifecycle.

      Systemic biases are rooted in societal and organisational culture, procedures, or practices that disadvantage or benefit specific cohorts. These biases manifest in datasets and in the processes throughout the AI lifecycle.

      Human bias can affect design decisions, data collection, labelling, test selection, or any process that requires judgment throughout the AI lifecycle. It can be conscious (explicit) or unconscious (implicit).

      Statistical and computational bias occurs when data used to train an AI system is not representative of the population. This is explored in more depth in the data section.

      This includes: 

      • establishing a bias management plan, outlining how bias will be identified, assessed, and managed across the AI system lifecycle
      • checking for systemic bias, which is rooted in societal and organisational culture, procedures, and practices that disadvantage or benefit specific cohorts. These biases manifest in datasets and in the processes throughout the AI lifecycle
      • checking for algorithmic bias in decision-making systems, where an output from an AI system might produce incorrect, unfair or unjustified results
      • checking for human bias, which can be conscious or unconscious biases in design decisions, data collection, labelling, test selection, or any process that requires judgment throughout the AI lifecycle
      • checking for statistical and computational bias, which can occur when data used to train an AI system is not representative of the population
      • checking for bias based on the application of AI, such as identifying cognitive bias in a computer vision system
      • considering intended bias, such as identifying specific circumstances for a person or a group
      • considering inherent bias when reusing pre-trained AI models.
         
      • Examples of sources of bias include:
        • Cognitive bias – systematic inclinations in human reasoning, such as subconscious judgements based on an individual's current norms. It arises from how people interpret and understand information in their surroundings, such as only using data that reinforces an individual's belief
        • Authority bias – tendency to provide greater weighting or consideration of information from an authority source
        • Availability bias – tendency to provide undue weighting to information or processes that a person is, or has been, actively involved with
        • Confirmation bias – tendency to interpret, favour, or seek out information that reinforces a personal belief, value or understanding, such as a political alignment
        • Contextual bias – reliance upon unnecessary or irrelevant information which may unduly influence a decision
        • In-group or labelling bias – preferential treatment is provided to those who belong to the same group. Conversely, out-group bias is where unfavourable treatment is provided to those who belong to other groups
        • Stereotype bias – generalisations about an individual or group of people based on shared characteristics, such as age, gender, or ethnicity
        • Anchoring bias – tendency to rely heavily on the first piece of information received
        • Group think – tendency for people to strive for consensus within a group
        • Automation bias – tendency to rely on automated systems and to ignore contradictory information from non-automated sources.
    • Criterion 16: Assess the impact of bias on your use case.

      This typically involves:

      • identifying stakeholders and the potential harms to them
      • identifying existing countermeasures and assessing their effectiveness
      • engaging with diverse and multi-disciplinary stakeholders in assessing the potential impacts of bias
      • using bias assessment tools relevant to your use case.
    • Criterion 17: Manage identified bias across the AI system lifecycle.

      For off-the-shelf products, AI deployers should ensure that the AI system provides fair outcomes. Evaluating for bias will be critical where the off-the-shelf AI model supplier provides insufficient documentation.

      This involves:

      • engaging multi-disciplinary skillsets and diverse perspectives, including:
        • policy owners, legal, architecture, data, IT experts, program managers, service delivery professionals, subject matter experts
        • people with lived experience, for example people with disability, gender or sexual diversity and people who are culturally and linguistically diverse.
      • implementing multiple approaches to reduce automation bias and monitor to detect unwanted bias that might emerge
      • identifying bias-specific documentation requirements such as data and model provenance records:
        • document selection criteria for selecting stakeholders, metrics, and other design-related decisions
        • document any discarded requirement, design, data, model, or tests with corresponding rationale
        • document biases that resulted in decommissioning the data, the model, the application, or the system
      • performing periodic context-based bias awareness training for teams
      • considering lifecycle stage-specific mitigations, including:
        • identify and validate root causes of bias before addressing them
        • identify corrective and preventive actions corresponding to the root causes of bias
        • identify fairness metrics at design. Performance metrics, such as accuracy and precision, aggregated over the entire dataset could hide bias. For example, a cancer-detecting device with 90 per cent accuracy averaged across the entire dataset could hide underperformance on a minority population. Disaggregating performance metrics into suitable attributes can detect whether a system performs fairly across demographics, environmental conditions, and other risk factors (a disaggregation sketch follows this list)
        • analyse data for bias and fix issues in the data. See the Model and Context dataset section for more information
        • apply a test independence strategy, functional performance testing, fairness testing, and user acceptance testing
        • configure, calibrate, and monitor bias-related metrics during phased roll-out
        • monitor bias-related metrics and unintended consequences during operations. Provide mechanisms for end-users to report and escalate experiences of bias.
        • audit for how risks of bias are identified, assessed, and mitigated throughout the lifecycle.
        • find and use suitable tools that discover and test for unwarranted associations between the AI system outputs and protected input features
        • implement bias mitigation techniques after harmful bias has been identified
        • implement bias mitigation thresholds that can be configured post-deployment to ensure equity for cohorts, such as people with lived experience.
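
      The sketch below illustrates disaggregating a single performance metric (accuracy) by a protected attribute, so that underperformance on a particular cohort is not hidden by the overall figure. The column names, data and metric are assumptions for the example; in practice the fairness metrics should be those identified at design.

      ```python
      import pandas as pd
      from sklearn.metrics import accuracy_score

      def accuracy_by_group(df, attribute):
          """Accuracy per subgroup; a large spread can indicate unfair performance."""
          return df.groupby(attribute).apply(
              lambda g: accuracy_score(g["y_true"], g["y_pred"])
          )

      # Example: the overall accuracy hides much weaker performance on group "B".
      results = pd.DataFrame({
          "y_true": [1, 0, 1, 1, 0, 1, 0, 1],
          "y_pred": [1, 0, 1, 1, 0, 0, 1, 0],
          "group":  ["A", "A", "A", "A", "A", "B", "B", "B"],
      })
      print("overall accuracy:", accuracy_score(results["y_true"], results["y_pred"]))
      print(accuracy_by_group(results, "group"))
      ```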
           
  • Statement 7: Apply version control practices

  • Version control is a process that tracks and manages changes to information such as data, models, and system code. This allows business and technical stakeholders to identify the state of an AI system when decisions are made, restore previous versions, and restore deleted or overwritten files.

    AI system versioning can extend beyond traditional coding practices, which manage a package of identifiable code or configuration information. Version control for information such as training data, models, and hyperparameters will need to be considered.

    Information across the AI lifecycle that was used to generate a decision or outcome must be captured. This applies to all AI products, including low-code or no-code third-party tools.

    Agencies must:

    • Criterion 18: Apply version management practices to the end-to-end development lifecycle.

      Australian Government API guidelines mandate the use of semantic versioning. They should be enhanced to cater for AI-related information and processes.

      Version standards should clearly document the difference between production and non-production data, models and code.

      This involves applying version management practices to:

      • the model, training and operational datasets, data in the AI system, the training algorithm, and hyperparameters (a version manifest sketch is provided at the end of this criterion)
      • design documentation outlining the end-to-end AI system state, maintained in line with existing organisational control mechanisms
      • point-in-time dates and timestamps for data and any changes to data
      • authorship, relevant licensing details, and changes since the last version
      • approvals from accountable officials for workflow and model reviews, datasets used for training, and relevant hyperparameters
      • the management of any data poisoning and AI poisoning
      • data versioning supporting AI interoperability, which should include the following:
        • consistency: data structures, exchanges, and formats across different sources are well-defined
        • integration: data sourced from different sources can be integrated in a seamless manner
      • all documents relating to the establishment, design, and governance of an AI-implemented solution, which must be retained as per the Archives Act 1983.
    • This does not apply to:
      • third-party software products, which are subject to existing controls.
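      As a minimal sketch, version management across these elements could be captured in a single point-in-time manifest. The manifest fields, the use of SHA-256 hashes and the semantic version value are illustrative assumptions rather than a prescribed format.

      ```python
      import hashlib
      import json
      from pathlib import Path

      def file_sha256(path):
          """Fingerprint a file so a manifest can pin the exact model or dataset used."""
          return hashlib.sha256(Path(path).read_bytes()).hexdigest()

      def build_manifest(system_version, model_path, training_data_path,
                         hyperparameters, author, licence, changes):
          """Assemble a point-in-time record of the versioned AI system components."""
          return {
              "system_version": system_version,   # e.g. semantic version "2.1.0"
              "model_sha256": file_sha256(model_path),
              "training_data_sha256": file_sha256(training_data_path),
              "hyperparameters": hyperparameters,
              "author": author,
              "licence": licence,
              "changes_since_last_version": changes,
          }

      # Example usage (paths and values are placeholders):
      # manifest = build_manifest("2.1.0", "models/classifier.pkl", "data/train.csv",
      #                           {"learning_rate": 0.01}, "Agency ML team",
      #                           "internal use", "retrained on updated quarterly data")
      # Path("manifest-2.1.0.json").write_text(json.dumps(manifest, indent=2))
      ```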

    Agencies should:

    • Criterion 19: Use metadata in version control to distinguish between production and non-production data, models, and code.

      This includes:

      • a simple and transparent way for all users of the system to understand the version of each component at the time a decision was made
      • the use of tags in the version number to provide a visual representation of non-production versions without needing direct access to data or source control toolsets
      • the use of metadata to distinguish between different control states where outputs can vary but the core functionality of the system has not changed (a tag-parsing sketch follows this list).
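
      One lightweight way to surface this metadata is a pre-release tag in the semantic version string. The '-nonprod' tag convention below is an assumption for illustration; agencies may adopt whatever labelling their version standards define.

      ```python
      # Treat any semantic version carrying a pre-release tag as non-production.
      def is_production(version: str) -> bool:
          return "-" not in version

      for version in ["2.1.0", "2.2.0-nonprod.3", "2.2.0-rc.1"]:
          label = "production" if is_production(version) else "non-production"
          print(f"{version}: {label}")
      ```
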
    • Criterion 20: Use a version control toolset to improve usability for users.

      Version control toolsets improve usability for service delivery and business users, addressing activities such as appeals, Ministerial correspondence, executive briefs, court cases, audit, assurance, privacy, and legislative reviews.

      This includes:

      • using purpose-built in-house or commercial version management products
      • storing sufficient information to allow rollback to a previous system state
      • considering archival requirements of training data used in a test environment. 
    • Criterion 21: Record version control information in audit logs.

      This includes:

      • using a commit hash to identify the control state of all elements, reducing the volume and complexity of audit log data (see the sketch after this list)
      • recording AI predictions and actions taken
      • proactive data analytics processed against the audit logs, to monitor and assess ongoing AI system performance
      • recording version control information where low-code or no-code third-party tools are used.
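
      A minimal sketch of stamping audit log entries with a commit hash is shown below. It assumes the AI system's code and configuration are held in a git repository and that audit records are written as JSON; both are assumptions for the example.

      ```python
      import json
      import subprocess
      from datetime import datetime, timezone

      def current_commit_hash():
          """Return the commit hash identifying the current control state of the repository."""
          return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

      def audit_entry(prediction, action_taken):
          """Build one audit record that ties an AI prediction to a versioned system state."""
          return json.dumps({
              "timestamp": datetime.now(timezone.utc).isoformat(),
              "commit": current_commit_hash(),  # stands in for the full list of component versions
              "prediction": prediction,
              "action_taken": action_taken,
          })

      # Example usage:
      # print(audit_entry(prediction="approve", action_taken="routed to human review"))
      ```
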
  • Statement 8: Apply watermarking techniques

  • AI watermarking can be used to embed visual or hidden markers into generated content, so that its creation details can be identified. It provides transparency, authenticity, and trust to content consumers.

    Visual watermarks or disclosures provide a simple way for someone to know they are viewing content created by, or interacting with, an AI system. This may include generated media content or GenAI systems.

    The Coalition for Content Provenance and Authenticity (C2PA) is developing an open technical standard for publishers, creators, and consumers to establish the origin and edits of digital content. Advice on the use of C2PA is out of scope for the standard.

    Agencies must:

    • Criterion 22: Apply visual watermarks and metadata to generated media content to provide transparency and provenance, including authorship.

      This will only apply where AI-generated content may directly impact a user. For instance, using AI to generate a team logo would not need to be watermarked.
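
      The sketch below shows one way to apply both a visible watermark and basic provenance metadata to an AI-generated image, using the Pillow library. The watermark wording, metadata keys and values are illustrative assumptions; agencies should align them with their own provenance requirements.

      ```python
      from PIL import Image, ImageDraw
      from PIL.PngImagePlugin import PngInfo

      def watermark_image(in_path, out_path, author):
          """Add a visible AI-generation notice and provenance metadata to an image."""
          image = Image.open(in_path).convert("RGB")

          # Visible marker so viewers can see the content is AI-generated.
          draw = ImageDraw.Draw(image)
          draw.text((10, image.height - 20), "AI-generated content", fill=(255, 255, 255))

          # Provenance metadata embedded alongside the pixels.
          metadata = PngInfo()
          metadata.add_text("ai_generated", "true")
          metadata.add_text("author", author)
          metadata.add_text("generator", "agency GenAI service")  # illustrative value
          image.save(out_path, format="PNG", pnginfo=metadata)

      # Example usage (paths are placeholders):
      # watermark_image("generated.png", "generated_watermarked.png", author="Example Agency")
      ```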

    • Criterion 23: Apply watermarks and metadata that are WCAG compatible, where relevant.

    • Criterion 24: Apply visual and accessible content to indicate when a user is interacting with an AI system.
      For example, this may include adding text to a GenAI interface so that users are aware they are interacting with an AI system rather than a human.

    Agencies should:

    • Criterion 25: For hidden watermarks, use watermarking tools based on the use case and content risk.

      This includes:

      • including provenance and authorship information
      • encrypting watermarks for high-risk content
      • using an existing tool or technique when practicable
      • embedding watermarks at the AI training stage to improve their effectiveness and allow additional information, such as content modification, to be included (a simplified hidden-watermark sketch follows this list)
      • verifying that the watermark does not impact the quality or efficiency of content generation, such as image degradation or text readability
      • including data sources, such as publicly available content used for AI training to manage copyright risks, and product details such as versioning information.
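
      To illustrate the concept of a hidden watermark, the toy sketch below embeds a short provenance string in the least significant bits of image pixel values. It is deliberately simplified and offers no robustness or security; an established watermarking tool should be used in practice, and the embedded message is an assumption for the example.

      ```python
      import numpy as np

      def embed_bits(pixels, message):
          """Write the message bits into the least significant bit of each pixel value."""
          bits = np.unpackbits(np.frombuffer(message.encode(), dtype=np.uint8))
          flat = pixels.flatten()
          flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
          return flat.reshape(pixels.shape)

      def extract_bits(pixels, n_chars):
          """Recover an n_chars-long message from the least significant bits."""
          bits = pixels.flatten()[: n_chars * 8] & 1
          return np.packbits(bits).tobytes().decode()

      # Example: embed and recover a provenance string in a random greyscale image.
      image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
      secret = "origin: agency GenAI v2.1.0"
      marked = embed_bits(image, secret)
      print(extract_bits(marked, n_chars=len(secret)))
      ```
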
    • Criterion 26: Assess watermarking risks and limitations.

      This includes:

      • ensuring users understand there is a risk of third parties replicating a visual watermark, and that they should not over-rely on watermarks, such as when sourcing content from external sources
      • preventing third-party use of watermarking algorithms to create their own content and act as the original content creator
      • considering situations where watermarking is not beneficial, for example where it is visually distracting for decision makers or overused in low-risk applications
      • considering situations where malicious actors might remove or replicate the watermark to reproduce content generated by AI
      • managing copyright or trademark risks related to externally sourced data.
