A2A Integration How-To: Build a Python Agent, Delegate a Task, and Add Memory
Agent2Agent (A2A) is an agent interoperability protocol for task delegation between opaque agents — collaboration that, by the live specification, does not require access to another agent's internal state, memory, or tools (A2A specification). Google announced A2A on April 9, 2025, and donated it to the Linux Foundation on June 23, 2025, under the Apache-2.0 license.
This guide is the practical path. It assumes you already know why A2A exists. For protocol concepts and primitives, read A2A protocol explained. For where agent-to-agent delegation stops and tool access begins, read A2A vs MCP.
TL;DR
- Use a2a-sdk 1.x for this guide. The official Python SDK supports client and server code, JSON-RPC, HTTP+REST, gRPC, SSE streaming, and webhook push (a2a-python).
- A minimal A2A flow is: publish an Agent Card, run a server, let a client discover the card, then delegate a task with send_message(...).
- Streamed results require capabilities.streaming=true in the Agent Card. A2A streams status and artifact updates over SSE (streaming and async).
- A2A does not carry another agent's memory. If the delegated agent needs context across tasks or sessions, add a separate, explicit, domain-scoped memory layer.
Install the A2A SDK for Python
The A2A Python SDK is the official Python implementation for building A2A clients and servers, distributed on PyPI as the a2a-sdk package (a2a-python).
Install the SDK:
pip install a2a-sdk uvicorn
The official package is Apache-2.0 licensed and requires Python 3.10+ (a2a-python). This article targets a2a-sdk 1.x and the live A2A specification at a2a-protocol.org/latest, because SDK APIs changed substantially across the 0.2, 0.3, and 1.0 releases (for example, the client creation path and the interface model). Always verify class and field names against the live SDK before you ship.
Before building your own agent, read the official examples:
- The smallest real sample is samples/python/agents/helloworld in a2a-samples.
- The official eight-step Python tutorial starts with an echo agent and progresses to LangGraph streaming (tutorial introduction).
The rest of this article follows the same shape as the official tutorial, compressed into the integration path you usually need first.
Publish an A2A Agent Card
An A2A Agent Card is the discovery document that tells clients what an agent is, where to reach it, which transports it supports, and which skills it exposes (Agent Card tutorial).
For the current A2A discovery path, serve the card at:
/.well-known/agent-card.json
The A2A discovery topic documents well-known URI discovery (RFC 8615), registry discovery, and direct configuration; use /.well-known/agent-card.json for the current well-known path (agent discovery). Older material may show /.well-known/agent.json; do not copy that path for new integrations.
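As a small illustration of the well-known path, the sketch below derives the card URL from any base URL of the agent. The helper name agent_card_url is hypothetical (it is not part of a2a-sdk), and it assumes the card lives at the origin root, which is where RFC 8615 places /.well-known/.

```python
from urllib.parse import urlsplit, urlunsplit

# Current well-known path named in this guide.
AGENT_CARD_PATH = "/.well-known/agent-card.json"


def agent_card_url(base_url: str) -> str:
    """Return the well-known Agent Card URL for the origin of base_url."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {base_url!r}")
    # RFC 8615 places /.well-known/ at the origin root, so any path, query,
    # or fragment on base_url is dropped.
    return urlunsplit((parts.scheme, parts.netloc, AGENT_CARD_PATH, "", ""))


print(agent_card_url("http://127.0.0.1:9999"))
```

In real client code the SDK's card resolver handles this step; the sketch only makes the path rule explicit.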
At how-to level, the Agent Card needs:
- name
- description
- version
- at least one interface or transport: the endpoint lives in supported_interfaces, a list of AgentInterface entries, each with a url and a protocol_binding
Core optional fields include:
- capabilities, such as streaming and pushNotifications
- skills[]
- securitySchemes
- default input and output modes
The official Python tutorial builds the card by defining an AgentSkill and an AgentCard (Agent Card tutorial). Keep the card accurate. Clients use it to decide whether they can call your agent.
Illustrative shape — align exact fields with a2a-sdk 1.x and the live tutorial:
# Illustrative: align exact fields with a2a-sdk 1.x and the live tutorial.
from a2a.types import AgentCard, AgentSkill, AgentCapabilities, AgentInterface
skill = AgentSkill(
    id="echo",
    name="Echo",
    description="Returns the submitted text.",
    tags=["example"],
    examples=["hello"],
)
agent_card = AgentCard(
    name="Echo Agent",
    description="A minimal A2A agent for testing client delegation.",
    version="0.1.0",
    capabilities=AgentCapabilities(streaming=True),
    supported_interfaces=[
        AgentInterface(protocol_binding="JSONRPC", url="http://localhost:9999"),
    ],
    skills=[skill],
    default_input_modes=["text/plain"],
    default_output_modes=["text/plain"],
)
This is not a full schema reference. Use the live specification for the complete Agent Card model (A2A specification).
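To make the how-to-level field list concrete, here is a minimal pre-publish check over a card held as a plain dict. It is a sketch, not an SDK validator: missing_card_fields is a hypothetical helper, and the key names follow the snake_case names used in this guide's Python example, which may differ from the names used on the wire.

```python
REQUIRED_TEXT_FIELDS = ("name", "description", "version")


def missing_card_fields(card: dict) -> list[str]:
    """List the how-to-level fields that are absent or empty in a card dict."""
    missing = [field for field in REQUIRED_TEXT_FIELDS if not card.get(field)]
    interfaces = card.get("supported_interfaces") or []
    # An interface is only usable with both an endpoint URL and a binding.
    usable = [
        item
        for item in interfaces
        if isinstance(item, dict) and item.get("url") and item.get("protocol_binding")
    ]
    if not usable:
        missing.append("supported_interfaces")
    return missing


card = {
    "name": "Echo Agent",
    "description": "A minimal A2A agent for testing client delegation.",
    "version": "0.1.0",
    "supported_interfaces": [
        {"protocol_binding": "JSONRPC", "url": "http://localhost:9999"},
    ],
}
print(missing_card_fields(card))
```

A check like this fits in a unit test or a deploy gate, so an incomplete card fails before clients ever fetch it.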
Run an A2A server in Python
An A2A server is the service endpoint that accepts A2A requests, executes agent work, and returns task status or artifacts through the protocol (server tutorial).
In the official Python server tutorial, the server path is:
- implement an agent executor,
- create a DefaultRequestHandler with agent_executor=..., task_store=InMemoryTaskStore(), and agent_card=...,
- build the routes with create_agent_card_routes(...) and create_jsonrpc_routes(...) from a2a.server.routes,
- compose a Starlette(routes=...) app,
- run it with uvicorn (server tutorial).
Note: a2a-sdk v1.0 removed A2AStarletteApplication. Build the Starlette app from the route builders, as the live helloworld sample does.
Illustrative server skeleton — align exact symbols with a2a-sdk 1.x and the live tutorial:
# Illustrative: based on the official a2a-sdk 1.x server tutorial / helloworld sample.
import uvicorn
from a2a.server.request_handlers import DefaultRequestHandler
from a2a.server.routes import (
    create_agent_card_routes,
    create_jsonrpc_routes,
)
from a2a.server.tasks import InMemoryTaskStore
from starlette.applications import Starlette
# EchoAgentExecutor is a placeholder for your own executor class, which
# implements the work the agent performs. agent_card is the AgentCard built
# in the previous section.
agent_executor = EchoAgentExecutor()
request_handler = DefaultRequestHandler(
    agent_executor=agent_executor,
    task_store=InMemoryTaskStore(),
    agent_card=agent_card,
)
routes = []
routes.extend(create_agent_card_routes(agent_card))
routes.extend(create_jsonrpc_routes(request_handler, "/"))
app = Starlette(routes=routes)
if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=9999)
See the official helloworld sample for the complete, runnable wiring.
Use InMemoryTaskStore() for a local quickstart. For production, treat task storage, authentication, streaming capacity, and webhook validation as explicit engineering decisions. The official tutorial demonstrates the wiring; it does not remove those deployment concerns (server tutorial, streaming and async).
Discover an A2A Agent Card and delegate a task
An A2A client is the caller that discovers an agent's card, creates a protocol client, and delegates work through A2A operations (client tutorial).
The official Python client flow is:
- use A2ACardResolver to fetch the Agent Card,
- call create_client(...),
- build a SendMessageRequest and call send_message(...) to delegate work (client tutorial).
send_message takes a SendMessageRequest object, not a raw dict. Build the message with the new_text_message helper. The exact helper and request symbols are shown in the official tutorial; verify them before you ship.
Illustrative client skeleton — note the card fetch and client creation are asynchronous:
# Illustrative: based on the official a2a-sdk 1.x client tutorial.
import asyncio
import httpx
from a2a.client import A2ACardResolver, ClientConfig, create_client
from a2a.helpers import new_text_message
from a2a.types.a2a_pb2 import Role, SendMessageRequest
async def main():
    async with httpx.AsyncClient() as httpx_client:
        resolver = A2ACardResolver(
            httpx_client=httpx_client,
            base_url="http://127.0.0.1:9999",
        )
        card = await resolver.get_agent_card()  # async: await it
        config = ClientConfig(streaming=False)
        client = await create_client(agent=card, client_config=config)
        message = new_text_message("hello", role=Role.ROLE_USER)
        request = SendMessageRequest(message=message)
        async for chunk in client.send_message(request):
            print(chunk)
asyncio.run(main())
The exact message model should follow the live SDK and tutorial. The integration point that matters is discovery first, then delegation. Do not hard-code capabilities the card does not declare.
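One way to avoid hard-coding capabilities is to derive the call mode from the fetched card. The sketch below works on a card represented as a plain dict; can_stream and choose_send_mode are hypothetical helpers, not SDK functions, and the dict layout is an assumption that mirrors the capabilities.streaming flag described in this guide.

```python
def can_stream(card: dict) -> bool:
    """True only when the card explicitly declares capabilities.streaming."""
    capabilities = card.get("capabilities") or {}
    return capabilities.get("streaming") is True


def choose_send_mode(card: dict, prefer_streaming: bool = True) -> str:
    """Pick 'stream' or 'send' from what the card declares."""
    if prefer_streaming and can_stream(card):
        return "stream"
    return "send"


print(choose_send_mode({"capabilities": {"streaming": True}}))
print(choose_send_mode({"name": "No capabilities declared"}))
```

A missing or non-boolean flag falls back to the plain send path, so the client never assumes a stream the agent did not promise.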
A2A defines RPC operations for both quick calls and long-running work. The common methods:
| Method | Purpose |
|---|---|
| message/send | Submit a message and create a task. |
| message/stream | Open an SSE stream for status and artifact updates. |
| tasks/get | Poll the current state of an active or historical task. |
| tasks/cancel | Abort a running task. |
| tasks/resubscribe | Re-establish a dropped stream for a running task. |
| tasks/pushNotificationConfig/* | Configure webhook endpoints for push updates. |
For short tasks, message/send may be enough. For long-running tasks, plan for task polling or resubscription (A2A specification).
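For readers who want to see what travels over the JSON-RPC binding, the sketch below wraps a method name from the table in a standard JSON-RPC 2.0 request envelope. The envelope fields (jsonrpc, id, method, params) come from JSON-RPC 2.0 itself; the params payloads and the task id are placeholders, so take the real parameter schema from the live specification.

```python
import itertools
import json

_request_ids = itertools.count(1)


def jsonrpc_request(method: str, params: dict) -> dict:
    """Wrap params in a JSON-RPC 2.0 request envelope with a fresh id."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


# Placeholder params: the task id and its key name are illustrative only.
poll = jsonrpc_request("tasks/get", {"id": "task-123"})
cancel = jsonrpc_request("tasks/cancel", {"id": "task-123"})
print(json.dumps(poll))
```

With the SDK you normally never build this envelope by hand; it is shown only to make the method table tangible.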
Stream an artifact with A2A SSE
An A2A Artifact is a result object made of Parts, returned by the delegated agent as work progresses or finishes (A2A specification).
Streaming is not implicit. The Agent Card must declare:
{
  "capabilities": {
    "streaming": true
  }
}
When streaming is enabled, A2A uses SSE to emit status and artifact updates (streaming and async). Clients should handle reconnects. Servers should account for the resource cost of long-lived SSE connections.
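To show what consuming an SSE stream involves at the wire level, here is a minimal parser for the text/event-stream framing: events end at a blank line, lines starting with a colon are comments, and multiple data lines are joined with newlines. The function name iter_sse_data and the JSON payloads in the example are assumptions for illustration; the actual A2A event bodies are defined by the specification, and the SDK client handles this parsing for you.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield "\n".join(data)
                data = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "data":
            data.append(value)
    # An event without a terminating blank line is never dispatched.


stream = [
    'data: {"state": "WORKING"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"state":\n',
    'data: "COMPLETED"}\n',
    "\n",
]
events = [json.loads(payload) for payload in iter_sse_data(stream)]
print(events)
```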
The task lifecycle matters for non-trivial work. A2A task states include SUBMITTED, WORKING, COMPLETED, FAILED, CANCELED, REJECTED, INPUT_REQUIRED, and AUTH_REQUIRED (A2A specification). If the client disconnects or the job continues beyond the first response, use tasks/get or tasks/resubscribe rather than assuming the original request is still active.
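A client usually needs to turn a task state into a decision: keep waiting, respond to the agent, or stop. The sketch below encodes one reading of the states listed above. The grouping into terminal states and states that need caller action is an assumption of this sketch, as are the names TaskState and next_step; confirm the lifecycle rules against the live specification.

```python
from enum import Enum


class TaskState(str, Enum):
    SUBMITTED = "SUBMITTED"
    WORKING = "WORKING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"
    REJECTED = "REJECTED"
    INPUT_REQUIRED = "INPUT_REQUIRED"
    AUTH_REQUIRED = "AUTH_REQUIRED"


# Assumed grouping: no further updates are expected after these states.
TERMINAL = frozenset(
    {TaskState.COMPLETED, TaskState.FAILED, TaskState.CANCELED, TaskState.REJECTED}
)
# Assumed grouping: the task is paused until the caller supplies something.
NEEDS_CALLER = frozenset({TaskState.INPUT_REQUIRED, TaskState.AUTH_REQUIRED})


def next_step(state: TaskState) -> str:
    """Map a task state to a coarse client action."""
    if state in TERMINAL:
        return "stop"
    if state in NEEDS_CALLER:
        return "respond"
    return "wait"


print(next_step(TaskState.WORKING), next_step(TaskState.INPUT_REQUIRED))
```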
For asynchronous completion through webhooks, treat push notifications as an attack surface. The A2A async guidance covers push notification configuration and calls out protections such as signature verification with JWT, HMAC, or mTLS, JWKS rotation, callback-URL validation against SSRF, and single-use nonces (streaming and async). The A2A security analysis likewise treats webhook security as a first-class concern (arXiv:2504.16902).
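Of the protections listed above, HMAC verification is the simplest to sketch with the standard library. The example assumes a shared secret and a hex-encoded SHA-256 signature delivered alongside the request; how the signature is transported (header name, encoding) is an assumption here, not something this guide or the sketch defines for A2A.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())


secret = b"example-shared-secret"  # illustrative value only
body = b'{"taskId": "task-123", "state": "COMPLETED"}'
signature = sign_body(secret, body)
print(verify_signature(secret, body, signature))
```

Verify against the raw bytes received, before any JSON parsing, since re-serializing the body can change the bytes and break the signature.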
A2A integration gotchas: auth, discovery, streaming, and long tasks
A small A2A example can run locally in minutes. A safe integration needs a checklist.
Authentication belongs in transport headers
Declare supported schemes in the Agent Card's securitySchemes. A2A supports schemes including API key, Bearer, OAuth2, OIDC, and mTLS (A2A specification).
Credentials travel in transport headers, never inside the JSON-RPC body. Treat SSE as its own authenticated channel: the browser EventSource API cannot set custom request headers, so handle authorization during the initial HTTP handshake, and do not assume the discovery request authenticates every stream event.
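The header-versus-body rule can be sketched as a request builder that keeps the credential out of the serialized payload. auth_headers and build_call are hypothetical helpers, the scheme labels are local to this sketch, and the X-API-Key header name is an assumption; a real integration should read the header name and scheme from the card's securitySchemes.

```python
import json


def auth_headers(scheme: str, credential: str, api_key_header: str = "X-API-Key") -> dict[str, str]:
    """Build transport headers for a bearer token or an API key."""
    if scheme == "bearer":
        return {"Authorization": f"Bearer {credential}"}
    if scheme == "api_key":
        return {api_key_header: credential}
    raise ValueError(f"unsupported scheme in this sketch: {scheme!r}")


def build_call(payload: dict, scheme: str, credential: str) -> tuple[dict[str, str], str]:
    """Return (headers, body); the credential appears only in the headers."""
    headers = {"Content-Type": "application/json", **auth_headers(scheme, credential)}
    return headers, json.dumps(payload)


headers, body = build_call(
    {"jsonrpc": "2.0", "id": 1, "method": "message/send", "params": {}},
    "bearer",
    "tok-123",
)
print(sorted(headers))
```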
Discovery must use the current well-known path
Use:
/.well-known/agent-card.json
A2A discovery supports well-known URI discovery, registries, and direct configuration (agent discovery). For public or partner-facing agents, the well-known path gives clients a stable place to fetch the card. For internal systems, direct config or a registry may fit better.
Streaming requires an explicit capability
Streaming works only when the card explicitly lists capabilities.streaming=true. Streaming uses SSE, so clients need reconnection behavior and servers need capacity controls for long-lived connections (streaming and async).
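Reconnection behavior usually starts with a bounded retry schedule. The sketch below computes capped exponential backoff delays; the function name and the default base and cap values are assumptions chosen for illustration, not values recommended by the A2A documentation.

```python
def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Return capped exponential delays, in seconds, for successive reconnects."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(attempts)]


print(backoff_delays(5))
```

In production it is common to add random jitter to each delay so that many clients do not reconnect at the same moment after a server restart.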
Long-running work needs task management
Do not model every delegation as a request-response call. Use task state, tasks/get, cancellation, and resubscription where the work can exceed a single interaction (A2A specification).
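A polling loop is the simplest form of task management. In the sketch below, fetch_state is a hypothetical async callable that you would implement on top of tasks/get; the set of terminal states is an assumption based on the task states named in this guide, and wait_for_task is not an SDK function.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed terminal states; confirm against the live specification.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED", "REJECTED"}


async def wait_for_task(
    fetch_state: Callable[[str], Awaitable[str]],
    task_id: str,
    interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Poll fetch_state until the task reaches a terminal state or times out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        state = await fetch_state(task_id)
        if state in TERMINAL_STATES:
            return state
        if loop.time() + interval > deadline:
            raise TimeoutError(f"task {task_id} still {state} after {timeout}s")
        await asyncio.sleep(interval)


# Stand-in for a real tasks/get call, used only to exercise the loop.
_states = iter(["SUBMITTED", "WORKING", "COMPLETED"])


async def fake_fetch(task_id: str) -> str:
    return next(_states)


final_state = asyncio.run(wait_for_task(fake_fetch, "task-123", interval=0.0))
print(final_state)
```

On timeout the caller can decide whether to cancel the task or keep it running and check again later.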
Webhook push needs verification
If you enable push notifications, verify signatures, validate callback URLs, use HTTPS, handle key rotation, and prevent replay with nonces. The A2A async topic and the A2A security paper both treat webhook security as a first-class concern (streaming and async, arXiv:2504.16902).
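Two of these checks, callback-URL validation and replay prevention, can be sketched with the standard library. The approach below is an allowlist, which is one conservative option rather than the method prescribed by the A2A documentation; is_safe_callback and NonceRegistry are hypothetical names, and the in-memory nonce set would need expiry and shared storage in a real deployment.

```python
import ipaddress
from urllib.parse import urlsplit


def is_safe_callback(url: str, allowed_hosts: set[str]) -> bool:
    """Accept only HTTPS callbacks whose hostname is on an explicit allowlist."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https" or not host:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, continue to the allowlist check
    else:
        return False  # raw IP literals are rejected outright
    return host in allowed_hosts


class NonceRegistry:
    """Remember nonces already seen so a replayed notification is refused."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def accept(self, nonce: str) -> bool:
        if nonce in self._seen:
            return False
        self._seen.add(nonce)
        return True


allowed = {"hooks.example.com"}
print(is_safe_callback("https://hooks.example.com/a2a", allowed))
```

An allowlist does not by itself defend against DNS changes after validation, so treat this as a first filter and keep network-level egress controls in place.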
Add persistent memory to an A2A workflow
Persistent agent memory is a separate state layer that lets agents read, write, and verify context across tasks or sessions instead of relying only on the current A2A message.
A2A moves tasks and messages between opaque agents. The specification's boundary is intentional: agents collaborate without needing access to each other's internal state, memory, or tools (A2A specification). That means durable memory is not something to smuggle into the protocol. It is a complementary layer. The moment a delegated agent must remember a user preference, a project context, or a previous decision across tasks, you need it.
A practical pattern, without coupling the protocol to a specific store:
- The A2A task carries an explicit domain, such as project:acme.
- The receiving agent reads memory for that domain before doing work.
- The receiving agent writes findings back to the same domain after completion.
- Other approved agents pointed at the same domain can use those findings later.
The flow stays clean because memory lives beside the protocol, not inside it:
┌────────────────────┐                          ┌────────────────────┐
│    Client Agent    │                          │  Delegated Agent   │
└─────────┬──────────┘                          └─────────┬──────────┘
          │                                               │
          │ 1. Delegate task (A2A JSON-RPC, carries       │
          │    domain="project:acme")                     │
          ├──────────────────────────────────────────────>│
          │                                               │ 2. memory_read(domain)
          │                                               │    → prior context
          │                                               │
          │                                               │ 3. Do the work
          │                                               │
          │                                               │ 4. memory_write(domain)
          │                                               │    → findings persist
          │ 5. Return Artifact (A2A JSON-RPC)             │
          │<──────────────────────────────────────────────┤
          │                                               │
Mnemoverse documents this shared-domain pattern for agent frameworks: point agents at the same domain and one agent's findings become available to the others (agent frameworks). The multi-tenant model keeps memory domain-scoped and per tenant, with no cross-organization leakage and no auto-propagation of private state (multi-tenant). The design rationale is covered in shared memory for multi-agent systems.
The Mnemoverse persistent-memory API fits this pattern. The REST base is https://core.mnemoverse.com/api/v1 with an X-Api-Key header (keys are prefixed mk_live_); the Python package is mnemoverse, and the MCP server @mnemoverse/mcp-memory-server exposes memory_read, memory_write, and related tools. The free tier allows 1,000 queries per day and 10,000 atoms. Manage keys and domains at console.mnemoverse.com.
Illustrative placement — this is a boundary pattern, not an A2A SDK API:
# Illustrative boundary pattern, not an A2A SDK API.
# memory_read, memory_write, and do_agent_work are placeholders for your own
# memory client and agent logic.
async def handle_a2a_task(task):
    domain = task.metadata["domain"]  # explicit scope, e.g. "project:acme"
    prior_context = memory_read(
        domain=domain,
        query="What should I know before handling this task?",
    )
    artifact = await do_agent_work(task, prior_context)
    memory_write(
        domain=domain,
        content="Finding from the delegated task...",
        source="a2a-task",
    )
    return artifact
This does not mean Mnemoverse implements A2A or acts as an A2A agent. It means A2A can delegate work while memory remains explicit, scoped, and auditable. If your agents also use MCP tools, see memory over MCP and the protocol boundary in A2A vs MCP.
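To try the read-work-write pattern locally without any external service, the sketch below swaps in an in-memory stand-in. InMemoryDomainStore and handle_task are hypothetical names for this example only; they are neither the Mnemoverse API nor an A2A SDK API, and the store simply keeps a list of notes per domain.

```python
import asyncio
from collections import defaultdict


class InMemoryDomainStore:
    """Toy memory layer: one list of notes per domain, nothing shared across domains."""

    def __init__(self) -> None:
        self._notes: dict[str, list[str]] = defaultdict(list)

    def read(self, domain: str) -> list[str]:
        return list(self._notes.get(domain, []))

    def write(self, domain: str, content: str) -> None:
        self._notes[domain].append(content)


async def handle_task(store: InMemoryDomainStore, domain: str, text: str) -> str:
    """Read prior notes for the domain, do the work, then write a finding back."""
    prior = store.read(domain)
    result = f"{text} (with {len(prior)} prior notes)"
    store.write(domain, f"handled: {text}")
    return result


store = InMemoryDomainStore()
first = asyncio.run(handle_task(store, "project:acme", "task A"))
second = asyncio.run(handle_task(store, "project:acme", "task B"))
other = asyncio.run(handle_task(store, "project:other", "task C"))
print(first, second, other, sep="\n")
```

The second task in project:acme sees the first task's finding, while the task in project:other starts empty, which is the domain-scoping behavior the pattern relies on.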
Common questions
How do I get started with A2A in Python?
Install a2a-sdk 1.x with pip, follow the official Python tutorial, define an AgentSkill and AgentCard, build a Starlette server from the a2a.server.routes builders with a DefaultRequestHandler, then use A2ACardResolver, create_client(...), and send_message(...) from a client.
Where is the A2A Agent Card served?
The current A2A discovery path is /.well-known/agent-card.json (RFC 8615). The Agent Card declares the agent name, description, url or endpoint, version, and at least one interface or transport, plus optional capabilities, skills, security schemes, and default input or output modes.
How does an A2A client delegate a task?
The client resolves the Agent Card with A2ACardResolver, creates a client with create_client(...), and sends work with send_message(...). For long-running work, the client can poll tasks/get or call tasks/resubscribe.
How does A2A streaming work?
A2A streaming is available when the Agent Card declares capabilities.streaming=true. The protocol uses SSE for status and artifact updates, and clients should handle reconnection and the resource cost of long-lived streams.
Where does persistent memory fit with A2A?
A2A moves tasks and messages between opaque agents; persistent memory is a separate layer. A practical pattern is to pass an explicit domain in the A2A task, read memory for that domain, write findings back, and keep the scope per tenant.
Related
- A2A protocol explained — the conceptual guide to A2A and its core model.
- A2A vs MCP — where agent-to-agent delegation stops and tool access begins.
- Shared memory for multi-agent systems — why delegated agents need a memory layer outside the message bus.
- Agent framework memory use cases — how to point agents at the same memory domain.
- Multi-tenant memory — domain-scoped isolation for teams, tenants, and projects.
- Memory MCP — how memory tools fit when agents use MCP.
Mnemoverse is a persistent-memory engine for AI agents: store, retrieve, and verify knowledge across sessions without making A2A responsible for another agent's internal state.
