The statements below are intended as an addendum to the AI technical standard for Australian Government. These updates build upon the current framework to address the specific considerations associated with agentic AI. All existing statements, criteria, and general guidance outlined in the AI technical standard still apply. Some criteria in this standard may also apply to non-agentic forms of AI. Agencies exploring or using agentic AI should use both standards.
Whole of AI lifecycle
Statement AGT.1: Establish governance and safeguards to ensure responsible and ethical development and deployment of agentic AI systems
Agencies must:
Criterion AGT.1.1: Identify and assign responsibilities for agents and human accountability for decisions resulting from agentic AI systems
In an agentic system, agents are tasked with actioning responsibilities, while a human should be assigned accountability for the decisions made by these agents.
This may include:
- clearly defining what actions and decisions each agent is responsible for, ensuring proper documentation is maintained at each stage of deployment
- determining the agents responsible for providing updated actions
- assigning accountability to a human for agent behaviour, decisions, and outcomes, even where decisions are made autonomously across multiple steps
- assigning human accountability for outcomes generated by an agentic AI system within complex multi-agent or multi-agentic systems
- ensuring accountability can be traced back to agent actions, and supported by documented records that can be reviewed and audited
- clearly defining accountability with third-party or external systems for data inputs and outputs. Examples may include unstructured data such as website information, structured data through an API call, or data flows between other external agents or agentic systems.
Criterion AGT.1.2: Identify effective human oversight and intervene when necessary
Oversight must be maintained through a human-in-the-loop or human-on-the-loop governance model. Coordination between AI agents and humans must be clearly defined. Agencies should monitor the full agentic AI system, including where agents or tools in third-party or external systems are being used.
- embedding oversight over autonomous agentic AI systems through a human-in-the-loop or human-on-the-loop model
- defining clear real-time monitoring processes for agent behaviour and agentic workflows
- implementing human-review processes at key stages to verify outcomes remain safe, accurate, and reliable
- enabling human intervention processes for irreversible or high-risk actions
- documenting escalation pathways and accountability for autonomous decision-making
- ensuring system interactions and escalations with humans are accessible.
Criterion AGT.1.3: Explain agentic system decision-making processes
This may include:
- ensuring systems with AI agents provide explanations for decision-making processes
- documenting the reasoning process when using LLMs, for example, use reasoning models and methods such as chain-of-thought (CoT) and including prompts to ‘think step-by-step’ to capture the output of the agent’s thinking process
- ensuring explanations from internal agentic reasoning does not expose personal or sensitive data
- using auditable decision-making record explanations when reasoning or thinking processes may reveal personal or sensitive information
- using modelling and graphs to show statistical knowledge
- gathering metadata from AI agents in each lifecycle phase at runtime to enhance explainability, auditability, and interpretability.
Agencies should:
Criterion AGT.1.4: Establish an orchestration layer
The orchestration layer or orchestration agent oversees the entire agentic system. It delegates tasks to agents, enables agents to interact with tools and the environment, assigns roles, and manages issues that may arise in the agentic workflow.
Orchestration patterns can include sequential execution, parallel execution, and conditional branching.
Orchestration may include:
- structuring agentic workflows
- task and event management, such as delegating tasks to agents
- API management and governance
- API governance may require extensions to existing management tools
- management of new standards, such as model context protocol (MCP), Agent2Agent (A2A), agent communication protocol (ACP) and agent network protocol (ANP)
- memory retrieval
- model, agent, tool, or data orchestration
- accounting for multi-agent coordination in multi-agent systems to include tracking progress of all agents, their assigned tasks, and coordinating protocols
- accounting for multi-agent interoperability to ensure agents communicate and collaborate effectively while safely exchanging information
- monitoring agent interactions and the use of observability tools, such as a control tower
- managing errors
- communication and feedback loops.
Statement AGT.2: Establish memory management mechanisms
Agencies should:
Criterion AGT.2.1: Establish robust memory management mechanisms
This includes:
- ensuring agent memory is governed and auditable
- defining when memory is used, what may be stored, retention periods, hosting constraints, and audit and purge mechanisms
- ensuring agents can effectively retain, recall, and use information
- implementing techniques to prevent memory duplication by updating or replacing out-of-date information
- providing safeguard measures and operational controls to protect memory from leakage and poisoning
- implementing secure measures to ensure memory is protected
- setting requirements for memory retention, disposal, and minimisation
- ensuring conflict resolution rules are defined for handling contradicting memory issues. These rules help agents avoid retaining inconsistent, outdated, or mutually conflicting information in memory
- providing mechanisms to ensure outdated or irrelevant information is removed to maintain agent performance and accuracy over time
- ensuring all information required for record keeping purposes is captured, including prompts created by agents, and inputs and outputs throughout the workflow.
Different methods that could be used to retain information:
- State: refers to information that an AI model retains during the execution of a task. Agent states are temporary and cleared after the task ends. When using GenAI, most models are stateless systems by default, handling each prompt on its own without retaining memory of earlier interactions. The use of structured workflows can help manage the agents state, guiding the agents next action based on the current state. States can include agent instructions, pending or executed tools, prompts, and results.
- Short-term memory: management processes enable agentic workflows to maintain context across multiple tasks during a single session. This approach allows the agent to recall prior interactions using short-term memory, remaining active only for the duration of the session. Simulating short-term memory can include maintaining the full conversation which requires more storage and tokens, including only the most recent messages which reduces cost but may result in loss of context and quality, and summarising messages which would require effective summarising techniques.
- Long-term memory: Long-term memory is available across multiple sessions and tasks can be stored in databases and vector stores. Apply long-term memory frameworks according to specific use case requirements. Examples of long-term memory includes episodic which allows agents to recall previous actions, semantic which encodes domain knowledge, and vector which allows for similarity-based retrieval.