Agentic AI Addendum statements: Whole of AI lifecycle

The statements below are intended as an addendum to the AI technical standard for Australian Government. These updates build upon the current framework to address the specific considerations associated with agentic AI. All existing statements, criteria, and general guidance outlined in the AI technical standard still apply. Some criteria in this standard may also apply to non-agentic forms of AI. Agencies exploring or using agentic AI should use both standards.

Whole of AI lifecycle

Statement AGT.1: Establish governance and safeguards to ensure responsible and ethical development and deployment of agentic AI systems

Agencies must:

Criterion AGT.1.1: Identify and assign responsibilities for agents and human accountability for decisions resulting from agentic AI systems

In an agentic system, agents are tasked with actioning responsibilities, while a human should be assigned accountability for the decisions made by these agents.

This may include:

clearly defining what actions and decisions each agent is responsible for, ensuring proper documentation is maintained at each stage of deployment
determining the agents responsible for providing updated actions
assigning accountability to a human for agent behaviour, decisions, and outcomes, even where decisions are made autonomously across multiple steps
assigning human accountability for outcomes generated by an agentic AI system within complex multi-agent or multi-agentic systems
ensuring accountability can be traced back to agent actions, and supported by documented records that can be reviewed and audited
clearly defining accountability with third-party or external systems for data inputs and outputs. Examples may include unstructured data such as website information, structured data through an API call, or data flows between other external agents or agentic systems.

Criterion AGT.1.2: Identify effective human oversight and intervene when necessary

Oversight must be maintained through a human-in-the-loop or human-on-the-loop governance model. Coordination between AI agents and humans must be clearly defined. Agencies should monitor the full agentic AI system, including where agents or tools in third-party or external systems are being used.

embedding oversight over autonomous agentic AI systems through a human-in-the-loop or human-on-the-loop model
defining clear real-time monitoring processes for agent behaviour and agentic workflows
implementing human-review processes at key stages to verify outcomes remain safe, accurate, and reliable
enabling human intervention processes for irreversible or high-risk actions
documenting escalation pathways and accountability for autonomous decision-making
ensuring system interactions and escalations with humans are accessible.

Criterion AGT.1.3: Explain agentic system decision-making processes

This may include:

ensuring systems with AI agents provide explanations for decision-making processes
documenting the reasoning process when using LLMs, for example, use reasoning models and methods such as chain-of-thought (CoT) and including prompts to ‘think step-by-step’ to capture the output of the agent’s thinking process
ensuring explanations from internal agentic reasoning does not expose personal or sensitive data
using auditable decision-making record explanations when reasoning or thinking processes may reveal personal or sensitive information
using modelling and graphs to show statistical knowledge
gathering metadata from AI agents in each lifecycle phase at runtime to enhance explainability, auditability, and interpretability.

Agencies should:

Criterion AGT.1.4: Establish an orchestration layer

The orchestration layer or orchestration agent oversees the entire agentic system. It delegates tasks to agents, enables agents to interact with tools and the environment, assigns roles, and manages issues that may arise in the agentic workflow.

Orchestration patterns can include sequential execution, parallel execution, and conditional branching.

Orchestration may include:

structuring agentic workflows
task and event management, such as delegating tasks to agents
API management and governance
API governance may require extensions to existing management tools
management of new standards, such as model context protocol (MCP), Agent2Agent (A2A), agent communication protocol (ACP) and agent network protocol (ANP)
memory retrieval
model, agent, tool, or data orchestration
accounting for multi-agent coordination in multi-agent systems to include tracking progress of all agents, their assigned tasks, and coordinating protocols
accounting for multi-agent interoperability to ensure agents communicate and collaborate effectively while safely exchanging information
monitoring agent interactions and the use of observability tools, such as a control tower
managing errors
communication and feedback loops.

Statement AGT.2: Establish memory management mechanisms

Agencies should:

Criterion AGT.2.1: Establish robust memory management mechanisms

This includes:

ensuring agent memory is governed and auditable
defining when memory is used, what may be stored, retention periods, hosting constraints, and audit and purge mechanisms
ensuring agents can effectively retain, recall, and use information
implementing techniques to prevent memory duplication by updating or replacing out-of-date information
providing safeguard measures and operational controls to protect memory from leakage and poisoning
implementing secure measures to ensure memory is protected
setting requirements for memory retention, disposal, and minimisation
ensuring conflict resolution rules are defined for handling contradicting memory issues. These rules help agents avoid retaining inconsistent, outdated, or mutually conflicting information in memory
providing mechanisms to ensure outdated or irrelevant information is removed to maintain agent performance and accuracy over time
ensuring all information required for record keeping purposes is captured, including prompts created by agents, and inputs and outputs throughout the workflow.

Different methods that could be used to retain information:

State: refers to information that an AI model retains during the execution of a task. Agent states are temporary and cleared after the task ends. When using GenAI, most models are stateless systems by default, handling each prompt on its own without retaining memory of earlier interactions. The use of structured workflows can help manage the agents state, guiding the agents next action based on the current state. States can include agent instructions, pending or executed tools, prompts, and results.
Short-term memory: management processes enable agentic workflows to maintain context across multiple tasks during a single session. This approach allows the agent to recall prior interactions using short-term memory, remaining active only for the duration of the session. Simulating short-term memory can include maintaining the full conversation which requires more storage and tokens, including only the most recent messages which reduces cost but may result in loss of context and quality, and summarising messages which would require effective summarising techniques.
Long-term memory: Long-term memory is available across multiple sessions and tasks can be stored in databases and vector stores. Apply long-term memory frameworks according to specific use case requirements. Examples of long-term memory includes episodic which allows agents to recall previous actions, semantic which encodes domain knowledge, and vector which allows for similarity-based retrieval.

Next statement

Design