Findings and recommendations

Introduction

This report outlines the findings from the pilot of the draft Australian Government artificial intelligence (AI) assurance framework (the pilot framework) and supporting guidance, conducted by the Digital Transformation Agency (DTA) from September to November 2024.  

The pilot framework contains a draft AI impact assessment tool designed to help Australian Government agencies identify, assess and manage potential AI use case impacts and risks. This report refers to the pilot framework as the AI impact assessment tool to better reflect the purpose of the document. The pilot involved staff from 21 Australian Government agencies, listed at Appendix A, that volunteered to test the pilot framework’s AI impact assessment tool.

AI systems operate with elevated complexity, speed and scale, which can amplify potential harmful outcomes in ways that existing technology governance frameworks and processes may not fully address. AI system opacity, bias, unpredictability, novelty and rapid evolution can further compound these challenges. Robust governance and assurance processes are essential to ensuring that AI systems operate as intended, uphold ethical principles and manage risks effectively throughout their lifecycle. By addressing specific AI challenges, these responsible AI practices serve as critical enablers for AI innovation.

In June 2024, the Australian Government and state and territory governments announced the National framework for the assurance of AI in government (the national framework). This provided the first nationally consistent, principles-based approach to assurance of government AI use, aligned with Australia's AI Ethics Principles. The impact assessment process tested in this pilot provides a tool for agencies to demonstrate that their use of AI is consistent with the national framework and the Ethics Principles.

AI accountability and transparency requirements at the agency level for all in-scope agencies were introduced in September 2024, with the Policy for the responsible use of AI in government (the AI policy). However, at the use case level, each agency must decide its own AI governance settings. While aspects of established approaches may be useful, the governance of AI systems remains an emerging discipline, and agencies face challenges navigating complex decisions without the benefit of established expertise or time-tested frameworks.

Inconsistent approaches to AI use case governance can lead to gaps in AI risk identification and mitigation, which can hamper efforts to secure public trust and confidence in government’s safe and responsible adoption of innovative technologies. This in turn may limit government AI adoption, particularly in more complex areas. These are precisely the areas where elevated risks, with appropriate mitigations, may be justified by greater potential benefits for government services and efficiency. Managing these risks to realise these benefits safely and responsibly requires robust governance and assurance. The impact assessment process tested through this pilot is a key first step towards achieving these goals.

The AI impact assessment process tested in this pilot aims to provide agencies with a consistent, structured approach to assessing an AI use case’s alignment with Australia's AI Ethics Principles. The draft impact assessment process comprised 11 sections:

  • The first 3 sections ask the assessing officer to document the purpose and expected benefits of the AI use case, along with a high-level assessment of key risks and planned mitigation measures. If all risks are rated low, the assessing officer can seek executive endorsement to conclude the assessment at section 3 and proceed with the use case.  
  • If any of the section 3 threshold risks are rated medium or high, the assessment should proceed through the remaining sections 4 to 11. These sections require assessing officers to document how they will address key risks and ensure their AI use is safe and responsible.  

The supporting guidance mirrors the assessment’s 11-section structure, with advice for completing each section. In October 2024, during the pilot period, the DTA published the pilot AI assurance framework and guidance.

Since the pilot concluded, the DTA has published further resources to support agency AI adoption that complement the impact assessment tool, including the AI technical standard, AI procurement advice, model AI contract clauses and an AI contract template.

Key insights

Close to two-thirds of pilot survey respondents said the draft assessment tool helped them identify risks that existing processes would not have captured.

Close to 90% considered the guidance helpful or very helpful for completing the assessment.

Around 70% of survey responses indicated the assessment questions were clear and easy to understand.

Establishing clear, consistent AI governance practices would help lift confidence in exploring AI innovation.

Agencies remain cautious about adopting AI

Most participants only tested low-risk, less complex AI use cases using the draft assessment tool during the pilot period. This meant many of the assessments concluded at the initial threshold assessment stage (sections 1-3) and did not proceed to the extended assessment (sections 4-11) required for use cases with elevated risk. Most use cases were in the early exploratory stages – only a handful of participants reported assessing in-production use cases as part of the pilot.  

Key data constraints included:

  • the small pool of participants  
  • limited number of extended assessments beyond the section 3 threshold assessment
  • incomplete survey response set
  • short pilot period – participants reported this did not allow enough time for comprehensive assessments, while other urgent priorities diverted agency resources away from this non-mandatory pilot exercise  
  • divergent feedback on some aspects of the assessment process, reflecting varied experiences and perspectives, which may be challenging to address to the satisfaction of all stakeholders.  

While other agencies that did not participate in the pilot may be exploring or already deploying more complex AI use cases, pilot participants reported their agencies were reluctant to pursue higher-risk AI adoption. Participants cited factors including resource constraints and uncertainty around AI-specific governance and risk management processes. This cautious approach was also influenced by concerns that a misstep could result in unintended harm or expose the agency to reputational damage.

With appropriate mitigations, higher-risk adoption could deliver the greatest value. Addressing this reluctance requires an integrated, strategic approach, with impact assessment being just one of the tools required for achieving safe, responsible and successful innovation. To support agencies, the DTA has developed additional resources, such as the AI technical standard, AI procurement advice, model AI contract clauses and an AI contract template, and will update the Policy for the responsible use of AI in government. These efforts aim to build a robust foundation for responsible AI adoption, providing agencies with tools and guidance to navigate the complexities of AI implementation.

Agencies are calling for clear parameters and practical guidance  

Close to two-thirds of pilot survey respondents said the draft assessment tool helped them identify risks that existing processes would not have captured. Around half said they found the assessment process useful for ensuring responsible use of AI. Close to 90% considered the guidance helpful or very helpful for completing the assessment. Further insights are provided under Survey data in the Context, data and rationale section of this report. 

Pilot participants generally welcomed the draft assessment tool, noting it helped build trust and confidence that AI projects are managing risks and impacts safely and responsibly. Securing this trust and confidence – both internally, with agency staff and leaders, and with relevant external stakeholders – is crucial for the successful rollout of AI solutions.  

With no mandatory requirements on AI use case governance in the Australian Government, pilot agencies reported low confidence in AI adoption. Agencies also appeared reluctant to invest resources in complex AI projects without clear criteria to verify and publicly demonstrate their AI use is safe. Publishing an updated impact assessment tool will set the consistent governance expectations needed to support confidence in AI adoption, working in tandem with other DTA resources published since the pilot concluded, including the AI technical standard.  

Among the agencies that had already adopted AI, it was clear that governance practices were inconsistent and not always comprehensive. This inconsistency may lead to gaps in AI risk identification and management, which could result in unintended negative outcomes that undermine public trust.

Greater flexibility will help meet different agency needs and contexts  

Some agencies reported the draft assessment tool complemented and strengthened their existing governance processes. They found integrating it into their operations straightforward.  

Others reported that parts of the assessment appeared to duplicate existing agency processes. For example, some larger operational agencies already have extensive governance, risk and assurance processes, supported by dedicated resources. However, even these agencies said the pilot assessment tool helped them identify risks not captured by existing processes.

To accommodate diverse agency needs, a flexible approach to adopting the assessment tool is outlined in Recommendation 1.  

Identifying and assessing AI risk remains a challenge  

A key challenge pilot participants identified in feedback interviews and survey responses was identifying and assessing AI risk. Strengthening the risk assessment process to include more objective criteria is an area of focus for the next phase of updates.  

Of all the sections in the pilot assessment, the initial threshold risk assessment step (section 3.1) received the most comments and suggestions for improvement, both in the feedback interviews and survey responses. When asked if any sections were particularly challenging, 43% of survey responses referenced the risk assessment. 

Section 3.1 asks assessing officers to provide risk ratings in response to a series of open-ended questions, requiring consideration of a wide range of potential use case impacts – some of which are abstract or indirect – including social, ethical, legal, and reputational risks. The pilot draft instructs assessing officers to record risk ratings ‘accounting for any risk mitigations and treatments’, rather than simply assessing inherent, pre-mitigation risk levels. This requirement can complicate the assessment process, increasing the likelihood of subjective or inconsistent ratings, as officers must assess both the risks and the effectiveness of any treatments applied.

Participants noted that officers completing the assessment ‘don’t know what they don’t know’ and at times were not aware of the ways AI could introduce new risks or amplify existing risks of harm. This highlights the importance of involving colleagues with diverse expertise in assessments to ensure potential risks are identified and assessed accurately and consistently.  

Pilot participants with experience in fields such as risk management, data governance and ICT generally understood the rationale behind the risk questions. However, those with less exposure to these topics found it more challenging to interpret the questions and apply them to their AI use case.  In general, participants called for more guidance to support risk identification, assessment and mitigation. Recommendation 2 outlines an approach to address this feedback, including clarifying the risk assessment questions, adding more explanatory guidance and focusing on inherent risk. 

Securing legal review was another major hurdle

Participants who conducted extended assessments, beyond the section 3 threshold assessment, reported significant challenges completing the legal review at section 11.1. Participants reported that their legal teams sought greater clarity on the specific legal aspects of the AI use case they were being asked to sign off on, highlighting the importance of clearly defining the scope and purpose of each section.

Legal teams usually require at least several weeks to provide advice, and even longer for complex matters. Some participants also reported their internal legal teams would need to procure external legal advice to complete this section. The updated assessment tool will seek to address these concerns and support effective consideration of the legal aspects of each AI use case, as outlined in Recommendation 3.

Other updates will help to clarify and streamline aspects of the assessment

In addition to the key insights above, pilot participants provided valuable feedback and practical suggestions to improve the assessment tool, summarised under Key feedback themes and proposed responses in the Context, data and rationale section of this report. The DTA will consult other relevant experts and consider other developments in the AI policy landscape to inform further updates. This is addressed in Recommendation 4.

