Agentic Patterns
The Orchestrated Brain: Advanced Architectural Patterns in Multi-Agent AI
The architectural landscape of artificial intelligence is undergoing a fundamental shift. We are moving beyond the era of the monolithic, single-prompt interface toward a sophisticated ecosystem of interconnected, specialized agents. In this new paradigm, managing intelligence becomes as important as intelligence itself.
This transition mirrors the evolution of distributed computing: systemic value comes less from the capabilities of individual nodes and more from the orchestrated interactions of the collective.
1. The Microservices Moment for AI
The shift from single-agent deployments to orchestrated collectives is driven by the “single-agent bottleneck.” General-purpose Large Language Models (LLMs) often struggle with high-stakes, multi-domain objectives that require long-horizon reasoning.
Just as microservices broke monolithic software into manageable units, Multi-Agent Systems (MAS) decompose complex objectives into specialized subcomponents. Each agent operates autonomously with a specific toolset, enabling the system to achieve outcomes that exceed the reasoning limits of any individual model.
2. Comparative Analysis of Patterns
Choosing an orchestration pattern is a business-aligned architectural decision that affects cost, latency, and reliability.
| Pattern | Primary Mechanism | Best Use Case | Coordination Overhead |
|---|---|---|---|
| Sequential | Step-by-step processing (Chains) | Workflows with clear dependencies. | Low |
| Concurrent | Multiple agents work in parallel | Brainstorming and quorum-based decisions. | Moderate |
| Supervisor | A central manager delegates and reviews | Enterprise Standard. Structured workflows. | High |
| Hierarchical | Multi-layered delegation | Large-scale enterprise workflows. | Very High |
While sequential patterns are simpler, the Supervisor Pattern remains the preferred framework for high-stakes applications because it provides a centralized safety net against stochastic failure.
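To make the first two rows of the table concrete, here is a minimal sketch (not drawn from the reference implementation) contrasting the coordination structure of a sequential chain with a concurrent fan-out. The `call_agent` coroutine is a hypothetical stand-in for a real model invocation.

```python
import asyncio

async def call_agent(agent: str, prompt: str) -> str:
    # Stub standing in for an LLM call; replace with a real model invocation.
    return f"[{agent}] processed: {prompt}"

async def sequential_chain(query: str, agents: list[str]) -> str:
    """Sequential pattern: each agent consumes the previous agent's output."""
    output = query
    for agent in agents:
        output = await call_agent(agent, output)  # strict dependency chain
    return output

async def concurrent_fanout(query: str, agents: list[str]) -> list[str]:
    """Concurrent pattern: all agents work on the same query in parallel."""
    return await asyncio.gather(*(call_agent(agent, query) for agent in agents))
```

The trade-off in the table falls out directly: the chain has minimal coordination overhead but serial latency, while the fan-out finishes faster at the cost of reconciling multiple parallel outputs afterwards.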
3. The Supervisor Pattern: Centralized Governance
At the heart of modern AI orchestration is the Supervisor. This lead model does not merely execute tasks; it governs the flow of information through a stateful graph.
Orchestrator loop (Supervisor pattern): plan → TRL → dispatch → collect → synthesise → verify
```python
def run(query: str) -> str:
    """Full orchestration loop: plan → TRL → dispatch → collect → synthesise → verify."""
    job_id = f"{socket.gethostname()}-{id(query)}"
    client = redis_streams.get_client()

    print(f"[orchestrator/analytical] Planning sub-tasks for: {query!r}")
    tasks = plan(query)
    print(f"[orchestrator/analytical] {len(tasks)} sub-tasks: {tasks}")

    trl = build_trl(query, tasks)
    print(f"[orchestrator/analytical] TRL: {len(trl.key_facts_to_verify)} facts to verify")

    dispatch(client, tasks, job_id)
    print(f"[orchestrator/analytical] Dispatched {len(tasks)} tasks (job_id={job_id})")

    results = collect_results(client, len(tasks), job_id)
    print(f"[orchestrator/analytical] Collected {len(results)}/{len(tasks)} results")

    print("[orchestrator/creative] Synthesising report...")
    draft = synthesise(query, results, trl)

    print("[orchestrator/analytical] Verifying draft...")
    verification = verify(draft, results)
    if verification.approved:
        print("[orchestrator/analytical] Draft approved.")
        return draft
    else:
        print(f"[orchestrator/analytical] Issues found: {verification.issues}")
        return verification.revised_draft or draft
```
The Four Functional Pillars
Decomposition: Breaking a complex user intent into discrete, manageable subtasks.
Delegation: Routing tasks to workers based on domain expertise and tool availability (a minimal routing sketch follows this list).
Governance: Reviewing worker outputs for logical consistency and factual accuracy.
Aggregation: Synthesizing disparate outputs into a cohesive final deliverable.
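As a concrete illustration of the Delegation pillar, here is a minimal, hypothetical routing sketch. The worker names and capability keywords are assumptions for illustration only; the reference implementation in this article routes every sub-task through a single Redis stream instead.

```python
# Hypothetical capability registry: worker name -> domains it can handle.
WORKER_CAPABILITIES: dict[str, set[str]] = {
    "research_worker": {"search", "summarise", "cite"},
    "code_worker": {"python", "sql", "debug"},
    "finance_worker": {"pricing", "forecast", "budget"},
}

def route_task(task: str, default: str = "research_worker") -> str:
    """Delegation: pick the worker whose declared capabilities best match the task."""
    words = set(task.lower().split())
    best_worker, best_overlap = default, 0
    for worker, capabilities in WORKER_CAPABILITIES.items():
        overlap = len(words & capabilities)
        if overlap > best_overlap:
            best_worker, best_overlap = worker, overlap
    return best_worker
```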
4. Cognitive State Bifurcation: Logic vs. Synthesis
A critical failure mode in early agent systems was “semantic bleeding,” where the model’s desire to tell a fluent story caused it to gloss over factual gaps. To counter this, we implement Cognitive State Bifurcation, forcing the system to move through two distinct phases.
Phase A: The Analytical State (The Logic Phase)
In this state, the Supervisor operates as a pure logician. The goal is to ground all worker outputs in “The Truth.” It identifies Factual Divergence (conflicting data points) and Contextual Drift (agents focusing on different segments of a problem without realizing it).
```python
def plan(query: str) -> list[str]:
    """Analytical Hub: decompose query into sub-tasks."""
    raw = gemini_client.generate_analytical(query, system_instruction=PLAN_SYSTEM)
    try:
        tasks = json.loads(raw)
        if isinstance(tasks, list):
            return [str(t) for t in tasks]
    except json.JSONDecodeError:
        pass
    return [query]


def build_trl(query: str, tasks: list[str]) -> TechnicalRequirementList:
    """Analytical Hub: generate a Technical Requirement List for the planned tasks."""
    payload = json.dumps({"query": query, "tasks": tasks})
    raw = gemini_client.generate_analytical(payload, system_instruction=TRL_SYSTEM)
    try:
        data = json.loads(raw)
        raw_tasks = data.get("tasks", tasks)
        parsed_tasks = []
        for i, t in enumerate(raw_tasks):
            if isinstance(t, dict):
                parsed_tasks.append(SubTask(index=t.get("index", i), task=t["task"]))
            else:
                parsed_tasks.append(SubTask(index=i, task=str(t)))
        return TechnicalRequirementList(
            query=data["query"],
            tasks=parsed_tasks,
            key_facts_to_verify=data.get("key_facts_to_verify", []),
        )
    except Exception:
        return TechnicalRequirementList(
            query=query,
            tasks=[SubTask(index=i, task=t) for i, t in enumerate(tasks)],
            key_facts_to_verify=[],
        )
```
Phase B: The Creative State (The Synthesis Phase)
Once the logical blueprint is verified and locked, the Supervisor transitions into a narrator role. Crucially, in this state, the model is forbidden from retrieving new data. It acts as an editor-in-chief, turning the “Frozen State” of facts into a human-readable narrative.
```python
def synthesise(query: str, results: list[str], trl: TechnicalRequirementList) -> str:
    """Creative Hub: combine findings into a narrative report guided by the TRL."""
    findings = "\n\n---\n\n".join(
        f"Finding {i + 1}:\n{r}" for i, r in enumerate(results)
    )
    facts = "\n".join(f"- {f}" for f in trl.key_facts_to_verify)
    prompt = (
        f"Original query: {query}\n\n"
        f"Key facts that MUST be covered:\n{facts}\n\n"
        f"Research findings:\n{findings}"
    )
    return gemini_client.generate_creative(prompt, system_instruction=SYNTHESIS_SYSTEM)


def verify(draft: str, results: list[str]) -> VerificationResult:
    """Analytical Hub: verify the creative draft against raw findings."""
    findings = "\n\n---\n\n".join(
        f"Finding {i + 1}:\n{r}" for i, r in enumerate(results)
    )
    prompt = f"Draft report:\n{draft}\n\nRaw research findings:\n{findings}"
    raw = gemini_client.generate_analytical(prompt, system_instruction=VERIFY_SYSTEM)
    try:
        data = json.loads(raw)
        return VerificationResult(**data)
    except Exception:
        return VerificationResult(approved=True, issues=[], revised_draft=None)
```
5. Fighting “Agent Drift” and the Refinement Loop
“Agent Drift” is the phenomenon where performance collapses over long-horizon tasks. Research shows that as context grows, models can stop solving the right problem (Goal Drift) or let logs crowd out the original signal (Context Drift).
The Refinement Loop is the primary defense. By forcing the Supervisor to periodically summarize progress and distill raw conversational history into compact “beliefs” (e.g., “authentication requires a Bearer token”), the system maintains focus. Systems using explicit memory distillation show roughly 21% higher stability than those relying on raw history.
Practical Refinement: Belief Distillation Example
The following implementation addresses both Context Drift (by reducing token bloat) and Goal Drift (by enforcing alignment with technical requirements).
```python
def refine_beliefs(history: list[dict], current_beliefs: list[str]) -> list[str]:
    """
    Distills raw history into compact 'beliefs'.
    Counters Context Drift (bloat) and Goal Drift (mission divergence).
    """
    distillation_prompt = (
        f"Current Beliefs: {current_beliefs}\n\n"
        f"Recent Execution Logs: {history}\n\n"
        "Update the list of beliefs. Remove contradictions and keep only "
        "hard technical facts required for final synthesis. Ensure new "
        "beliefs remain aligned with the primary mission objective."
    )
    raw_response = gemini_client.generate_analytical(
        distillation_prompt,
        system_instruction="You are a Memory Distiller. Output a JSON list of strings.",
    )
    try:
        new_beliefs = json.loads(raw_response)
        # Guard against non-list JSON (e.g. a dict or bare string) before trusting it.
        if isinstance(new_beliefs, list):
            return [str(b) for b in new_beliefs]
    except json.JSONDecodeError:
        pass
    return current_beliefs
```
6. Performance Engineering: The Coordination Tax
Orchestration introduces a “Coordination Tax”: multi-agent systems consume significantly more tokens than a single-agent call and add latency at every hand-off between the Supervisor and its workers.
Coordination points: dispatch + collect over Redis streams
```python
def dispatch(client: _redis.Redis, tasks: list[str], job_id: str) -> None:
    """Push sub-tasks onto the Redis tasks stream."""
    for i, task in enumerate(tasks):
        redis_streams.push_task(
            client,
            STREAM_TASKS,
            {"job_id": job_id, "task_index": i, "task": task},
        )


def collect_results(
    client: _redis.Redis, expected: int, job_id: str, timeout_s: int = 300
) -> list[str]:
    """Wait for `expected` results for this job_id from the results stream."""
    gathered: list[str] = []
    raw = redis_streams.read_results(client, STREAM_RESULTS, expected * 2, timeout_s)
    for item in raw:
        if item.get("job_id") == job_id:
            gathered.append(item.get("result", ""))
            if len(gathered) >= expected:
                break
    return gathered
```
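The worker side of this exchange is not shown in the snippets above. The following is a minimal sketch of what it could look like using redis-py's stream primitives directly; the stream names and field names mirror the orchestrator code, while the placeholder result and the assumption of a `decode_responses=True` client are illustrative. A production worker would typically use consumer groups (XREADGROUP) for at-least-once delivery.

```python
import redis

STREAM_TASKS = "agent:tasks"      # assumed stream names, mirroring the orchestrator
STREAM_RESULTS = "agent:results"

def worker_loop(client: redis.Redis, worker_id: str) -> None:
    """Worker side of the coordination: consume sub-tasks and push results back."""
    last_id = "$"  # only read tasks that arrive after this worker starts
    while True:
        # Block up to 5 seconds waiting for the next task entry on the stream.
        entries = client.xread({STREAM_TASKS: last_id}, count=1, block=5000)
        if not entries:
            continue
        for _stream, messages in entries:
            for message_id, fields in messages:
                last_id = message_id
                # Placeholder for the real model call the worker would make here.
                result = f"[{worker_id}] completed: {fields['task']}"
                client.xadd(
                    STREAM_RESULTS,
                    {"job_id": fields["job_id"], "result": result},
                )
```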
Time-to-First-Token (TTFT): Often higher in Supervisor architectures, because the manager must plan and delegate before any worker output can stream back to the user.
Inter-Token Latency (ITL): The time between successive output tokens, which determines the perceived smoothness of the final stream.
Model Tiering: A common optimization strategy is to use a high-reasoning model (e.g., Gemini 1.5 Pro) for the Supervisor and smaller, faster models (e.g., Gemini 1.5 Flash) for workers; a minimal sketch of such a tiered client follows this list.
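The `gemini_client` module referenced throughout the snippets is not shown. The sketch below illustrates how it could implement Model Tiering with the google-generativeai SDK; the function names mirror the calls above, but the temperatures and the choice to concatenate the system instruction into the prompt are assumptions rather than a canonical implementation.

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Model Tiering: the Supervisor gets the high-reasoning model, workers get the
# faster, cheaper model. Temperatures reflect the Cognitive State Bifurcation.
SUPERVISOR_MODEL = genai.GenerativeModel("gemini-1.5-pro")
WORKER_MODEL = genai.GenerativeModel("gemini-1.5-flash")

def generate_analytical(prompt: str, system_instruction: str = "") -> str:
    """Supervisor, Analytical State: deterministic planning and verification."""
    response = SUPERVISOR_MODEL.generate_content(
        f"{system_instruction}\n\n{prompt}",
        generation_config={"temperature": 0.1},
    )
    return response.text

def generate_creative(prompt: str, system_instruction: str = "") -> str:
    """Supervisor, Creative State: fluent synthesis over the frozen fact set."""
    response = SUPERVISOR_MODEL.generate_content(
        f"{system_instruction}\n\n{prompt}",
        generation_config={"temperature": 0.7},
    )
    return response.text

def generate_worker(prompt: str) -> str:
    """Worker tier: fast, narrow sub-task execution."""
    return WORKER_MODEL.generate_content(prompt).text
```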
7. The Divergence/Convergence Paradox
A major bottleneck in agentic AI is the tension between exploration and execution.
Divergence: Agents must explore unique, high-entropy reasoning paths to solve hard problems.
Convergence: Agents must eventually agree on a single, safe truth.
Failure here can lead to Sycophancy (agents agreeing to avoid conflict) or Escalation (agents spiraling into arguments). Advanced architectures use a three-tier governance structure: Local Consensus, Inter-Cluster Coordination, and Global Orchestration.
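As a hedged sketch of the lowest governance tier, Local Consensus can be as simple as a majority vote over worker answers, with failure to reach quorum signalling that the disagreement should escalate to the next tier. The function name, normalisation, and escalation signal here are illustrative assumptions.

```python
from collections import Counter

def local_consensus(answers: list[str], quorum: float = 0.5) -> str | None:
    """Local Consensus tier: accept an answer only if a strict majority agrees.

    Returns None when no answer clears the quorum, signalling that the
    disagreement should be escalated to inter-cluster coordination.
    """
    if not answers:
        return None
    counts = Counter(a.strip().lower() for a in answers)
    top_answer, top_votes = counts.most_common(1)[0]
    if top_votes / len(answers) > quorum:
        return top_answer
    return None
```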
8. Standards and Interoperability: The A2A Protocol
As we move beyond single-vendor platforms, the Agent-to-Agent (A2A) standard enables autonomous agents to discover, authenticate, and interact across boundaries. Built on JSON-RPC and OAuth 2.0, A2A allows a hiring agent from one platform to coordinate securely with a background-check agent from another.
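To ground the protocol description, here is a sketch of the JSON-RPC request shape an A2A client might send to a remote agent after discovering it via its Agent Card. The endpoint, method name, and payload fields are illustrative and should be checked against the current A2A specification rather than treated as canonical.

```python
import requests

# Illustrative only: endpoint, method name, and fields show the JSON-RPC shape,
# not canonical A2A specification values.
A2A_ENDPOINT = "https://agents.example.com/a2a"
ACCESS_TOKEN = "<oauth2-access-token>"  # obtained via a standard OAuth 2.0 flow

rpc_request = {
    "jsonrpc": "2.0",
    "id": "req-001",
    "method": "message/send",  # task-submission method; verify against the spec
    "params": {
        "message": {
            "role": "user",
            "parts": [{"text": "Run a background check for the shortlisted candidate"}],
        }
    },
}

response = requests.post(
    A2A_ENDPOINT,
    json=rpc_request,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
print(response.json())
```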
Summary for AI Architects
Enforce Cognitive State Bifurcation: Separate the Analytical State (logic and verification) from the Creative State (narrative synthesis) to prevent "semantic bleeding."
Mitigate Agent Drift: Implement periodic Refinement Loops to distill execution logs into structured "beliefs," preventing both Context Drift and Goal Drift.
Optimize for the Coordination Tax: Use Model Tiering to balance high-reasoning costs with worker speed, and manage stateful communication through asynchronous streams like Redis.
Architect for Convergence: Build governance structures that reconcile agent divergence into a unified, verifiable "Frozen State" before final output delivery.

