<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Software-Engineering on Corebaseit — POS · EMV · Payments · AI</title><link>https://corebaseit.com/tags/software-engineering/</link><description>Recent content in Software-Engineering on Corebaseit — POS · EMV · Payments · AI</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><managingEditor>contact@corebaseit.com (Vincent Bevia)</managingEditor><webMaster>contact@corebaseit.com (Vincent Bevia)</webMaster><lastBuildDate>Tue, 14 Apr 2026 10:00:00 +0100</lastBuildDate><atom:link href="https://corebaseit.com/tags/software-engineering/index.xml" rel="self" type="application/rss+xml"/><item><title>I Spent Years on Adaptive Filters. I Was Already Training Neural Networks.</title><link>https://corebaseit.com/corebaseit_posts_in_review/lms-adaptive-filters-and-neural-network-training/</link><pubDate>Tue, 14 Apr 2026 10:00:00 +0100</pubDate><author>contact@corebaseit.com (Vincent Bevia)</author><guid>https://corebaseit.com/corebaseit_posts_in_review/lms-adaptive-filters-and-neural-network-training/</guid><description>&lt;p>&lt;strong>I spent years implementing LMS-based equalizers and echo cancellers in telecommunications. Only later did I fully appreciate what I had been doing mathematically: the same family of update rules that powers neural network training today.&lt;/strong>&lt;/p>
&lt;p>Not as a loose analogy — as the same structure of optimization. Widrow and Hoff formalized the Least Mean Squares (LMS) algorithm in 1960 for the Adaline. Rumelhart, Hinton, and Williams scaled related ideas through multi-layer networks with backpropagation in 1986. The vocabulary changed from &lt;em>adaptive filtering&lt;/em> to &lt;em>deep learning&lt;/em>, but the core idea — adjust parameters in the direction that reduces error, one small step at a time — is continuous across both worlds.&lt;/p>
&lt;p>This post is my attempt to make that lineage explicit: what LMS actually is, why it is structurally the same rule as stochastic gradient descent on a linear model, how the engineering trade-offs line up, and why non-stationarity remains the hard problem in both domains.&lt;/p>
&lt;hr>
&lt;h2 id="lms-is-not-a-metaphor-for-training--it-is-the-algorithm">LMS Is Not a Metaphor for Training — It Is the Algorithm
&lt;/h2>&lt;p>The LMS update for a linear combiner (FIR filter or single Adaline) is:&lt;/p>
&lt;p>$$
\mathbf{w}(n+1) = \mathbf{w}(n) + \mu \, e(n) \, \mathbf{x}(n)
$$&lt;/p>
&lt;p style="text-align: center;">
&lt;img src="https://corebaseit.com/diagrams/LMS_SGD_structural_equivalence_diagram.png" alt="LMS = SGD structural equivalence diagram" style="max-width: 900px; width: 100%;" />
&lt;/p>
&lt;p>Here (\mathbf{w}(n)) is the weight vector at time (n), (\mathbf{x}(n)) is the input vector (tap-delay line or feature vector), (e(n) = d(n) - y(n)) is the error between the desired response (d(n)) and the output (y(n) = \mathbf{w}^\top(n)\mathbf{x}(n)), and (\mu) is the step size.&lt;/p>
&lt;p>That is &lt;strong>stochastic gradient descent&lt;/strong> on the instantaneous squared error (\frac{1}{2}e^2(n)) with respect to (\mathbf{w}). The gradient of (\frac{1}{2}(d - \mathbf{w}^\top\mathbf{x})^2) with respect to (\mathbf{w}) is (-e\,\mathbf{x}). Walking in the opposite direction of the gradient (or equivalently, in the direction (+e\,\mathbf{x}) when you define the update as above) is exactly the LMS rule.&lt;/p>
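&lt;p>To make the equivalence concrete, here is a minimal NumPy sketch of the update above, used for toy system identification. The filter length, step size, and test channel are illustrative choices, not recommendations:&lt;/p>

```python
import numpy as np

def lms_identify(x, d, num_taps=4, mu=0.05):
    """Adapt an FIR filter with the LMS rule w(n+1) = w(n) + mu * e(n) * x(n)."""
    w = np.zeros(num_taps)
    for n in range(num_taps - 1, len(x)):
        x_n = x[n - num_taps + 1:n + 1][::-1]  # tap-delay line: [x(n), ..., x(n-M+1)]
        e = d[n] - w @ x_n                     # e(n) = d(n) - w^T(n) x(n)
        w += mu * e * x_n                      # step proportional to error times input
    return w

# Toy example: recover an "unknown" channel from its noisy output.
rng = np.random.default_rng(0)
h = np.array([0.8, -0.4, 0.2, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w = lms_identify(x, d)                         # w converges toward h
```

&lt;p>Rename &lt;code>mu&lt;/code> to &lt;code>lr&lt;/code> and the inner loop is recognizably one SGD step on a linear model with squared-error loss.&lt;/p>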
&lt;p>So if you have ever shipped an LMS equalizer or echo canceller, you have implemented the foundational learning rule that underlies a huge fraction of modern machine learning: &lt;strong>small steps proportional to error times input&lt;/strong>. The notation in Haykin&amp;rsquo;s &lt;em>Adaptive Filter Theory&lt;/em> differs from PyTorch docs; the mathematics does not.&lt;/p>
&lt;p>Multi-layer networks add the chain rule (backpropagation) to compute how error propagates to earlier layers, but the &lt;strong>local&lt;/strong> update at a linear layer trained with mean squared error is still the same structural move: adjust weights in proportion to error and activations. Everything else — momentum, Adam, adaptive learning rates — is engineering on top of that spine.&lt;/p>
&lt;hr>
&lt;h2 id="the-engineering-trade-offs-are-the-same-trade-offs">The Engineering Trade-Offs Are the Same Trade-Offs
&lt;/h2>&lt;p>In telecommunications, the step size (\mu) controls the classic compromise: &lt;strong>convergence speed versus steady-state misadjustment&lt;/strong>. Too large — the filter can diverge or oscillate. Too small — the filter cannot track a fast-fading channel or a moving echo path. Entire chapters of adaptive filtering textbooks are devoted to stability bounds on (\mu) (often expressed in terms of input power and filter length) and to variants that fix the worst-case behavior.&lt;/p>
&lt;p style="text-align: center;">
&lt;img src="https://corebaseit.com/diagrams/Step_size_learning_rate_trade-off_diagram.png" alt="Step size / learning rate trade-off diagram" style="max-width: 900px; width: 100%;" />
&lt;/p>
&lt;p>In deep learning, the learning rate (\eta) plays the same role at a higher level: too high and training diverges or chatters around a minimum; too low and you underfit or burn compute without making progress. The community talks about learning-rate schedules, warm-up, and cosine decay — different names for the same instinct: &lt;strong>the right step size depends on the landscape and may need to change over time&lt;/strong>.&lt;/p>
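&lt;p>The trade-off is easy to reproduce in a few lines. This toy single-tap experiment (all values are illustrative, not a stability analysis) produces slow convergence, fast convergence, and outright divergence purely as a function of the step size:&lt;/p>

```python
import numpy as np

def lms_distance(mu, steps=2000, seed=0):
    """Track |w - w_opt| over time for a single-tap LMS run at step size mu."""
    rng = np.random.default_rng(seed)
    w_opt, w = 0.5, 0.0
    dist = np.zeros(steps)
    for n in range(steps):
        x = rng.standard_normal()
        d = w_opt * x + 0.01 * rng.standard_normal()  # desired response + noise
        w += mu * (d - w * x) * x                     # LMS update
        dist[n] = abs(w - w_opt)
    return dist

slow = lms_distance(mu=0.001)  # stable, but still far from w_opt after 2000 steps
fast = lms_distance(mu=0.1)    # converges quickly to a small residual error
wild = lms_distance(mu=4.0)    # above the stability bound: blows up
```

&lt;p>For unit-power input the classic mean-convergence bound here is (\mu &amp;lt; 2/E[x^2] = 2); the third run violates it and diverges, which is exactly the behavior a too-high learning rate produces in training.&lt;/p>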
&lt;p>&lt;strong>Normalized LMS (NLMS)&lt;/strong> scales the update by the inverse of the input energy (\|\mathbf{x}(n)\|^2) (with a small regularizer to avoid division by zero). The goal is stable convergence when input power varies — the same motivation that shows up in adaptive optimizers that normalize updates by running statistics of gradients (RMSProp-style normalization is not identical to NLMS, but the &lt;em>intent&lt;/em> — tame the step when the signal scale changes — is shared). The DSP community spent decades refining these ideas for real-time hardware; ML rediscovered many of the same pressures when training became unstable at scale.&lt;/p>
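&lt;p>The NLMS step itself is a one-line change. In this sketch (constants are illustrative) the same settings stay stable while the input power swings by six orders of magnitude — the situation that forces constant re-tuning of plain LMS:&lt;/p>

```python
import numpy as np

def nlms_step(w, x_n, d_n, mu=0.5, eps=1e-8):
    """One NLMS update: the step is scaled by the inverse of the input energy."""
    e = d_n - w @ x_n
    w = w + (mu / (eps + x_n @ x_n)) * e * x_n  # eps guards against division by zero
    return w, e

# Alternate between very quiet and very loud input blocks.
rng = np.random.default_rng(1)
w_true = np.array([0.9, -0.3])
w = np.zeros(2)
for n in range(3000):
    scale = 0.01 if n % 2 else 10.0             # input power varies wildly
    x_n = scale * rng.standard_normal(2)
    w, _ = nlms_step(w, x_n, w_true @ x_n)
```

&lt;p>A fixed (\mu) tuned for the quiet samples would crawl on the loud ones (or, tuned the other way, diverge); the normalization removes that dependence for (0 &amp;lt; \mu &amp;lt; 2).&lt;/p>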
&lt;hr>
&lt;h2 id="non-stationarity-was-always-the-real-problem--and-still-is">Non-Stationarity Was Always the Real Problem — and Still Is
&lt;/h2>&lt;p>Adaptive filters were built for &lt;strong>non-stationary&lt;/strong> environments: multipath fading, time-varying echoes, drifting noise floors. The “true” optimal weights are not fixed; they move. The filter is not supposed to converge once and freeze — it is supposed to &lt;strong>track&lt;/strong>. That mindset is closer to production ML than a static batch fit on a fixed dataset.&lt;/p>
&lt;p>Modern systems face the same phenomenon under different labels: &lt;strong>distribution shift&lt;/strong>, &lt;strong>concept drift&lt;/strong>, stale features, changing user behavior, adversarial drift in inputs. The model that was optimal last month is not guaranteed to be optimal this month. Retraining on a schedule, online updates, monitoring, and guardrails are the engineering response — conceptually in the same family as “never assume the channel is static.”&lt;/p>
&lt;p>Research on in-context learning in linear models (for example Akyürek et al., 2022) even investigates which learning algorithms are implicitly approximated by transformers under simplified settings — another reminder that the boundary between classical adaptive signal processing and contemporary ML is thinner than course catalogs suggest.&lt;/p>
&lt;hr>
&lt;h2 id="the-bigger-picture">The Bigger Picture
&lt;/h2>&lt;p style="text-align: center;">
&lt;img src="https://corebaseit.com/diagrams/Historical_lineage_timeline_diagram.png" alt="Historical lineage / timeline diagram" style="max-width: 900px; width: 100%;" />
&lt;/p>
&lt;p>For engineers who came up through &lt;strong>telecommunications and signal processing&lt;/strong>, the move into AI is often described as a career pivot. In my experience, it is closer to a &lt;strong>change of vocabulary&lt;/strong> on top of a continuous mathematical thread: error-driven updates, step-size discipline, stability under non-stationarity, and the centrality of second-order statistics (explicitly in LMS, implicitly in much of modern training).&lt;/p>
&lt;p>The boundary between DSP and machine learning was never as sharp as the literature implied. If you understand LMS, you already understand a piece of what every deep learning framework is doing when it steps the weights. The rest is scale, architecture, and tooling — important, but not magic.&lt;/p>
&lt;hr>
&lt;h2 id="references">References
&lt;/h2>&lt;ul>
&lt;li>Widrow, B., &amp;amp; Hoff, M. E. &amp;ldquo;Adaptive switching circuits.&amp;rdquo; &lt;em>IRE WESCON Convention Record&lt;/em>, 4, 96–104, 1960.&lt;/li>
&lt;li>Haykin, S. &lt;em>Adaptive Filter Theory&lt;/em> (4th ed.). Prentice Hall, 2002.&lt;/li>
&lt;li>Rumelhart, D. E., Hinton, G. E., &amp;amp; Williams, R. J. &amp;ldquo;Learning representations by back-propagating errors.&amp;rdquo; &lt;em>Nature&lt;/em>, 323, 533–536, 1986.&lt;/li>
&lt;li>Akyürek, E. et al. &amp;ldquo;What learning algorithm is in-context learning? Investigations with linear models.&amp;rdquo; 2022. &lt;a class="link" href="https://arxiv.org/abs/2211.15661" target="_blank" rel="noopener"
>arxiv.org/abs/2211.15661&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="further-reading">Further reading
&lt;/h2>&lt;ul>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts/stochastic-entropy-ai/" >Stochastic, Entropy &amp;amp; AI: From Thermodynamics to Information Theory to Modern Machine Learning&lt;/a> — related thread on probability, information, and ML foundations&lt;/li>
&lt;li>&lt;em>The Obsolescence Paradox: Why the Best Engineers Will Thrive in the AI Era&lt;/em> — engineering judgment as tools and vocabulary change&lt;/li>
&lt;/ul></description></item><item><title>Multi-Agent Systems Scale Vertically. They Need to Scale Horizontally.</title><link>https://corebaseit.com/corebaseit_posts_in_review/series/multi-agent-systems-scale-vertically_part3/</link><pubDate>Fri, 03 Apr 2026 10:00:00 +0100</pubDate><author>contact@corebaseit.com (Vincent Bevia)</author><guid>https://corebaseit.com/corebaseit_posts_in_review/series/multi-agent-systems-scale-vertically_part3/</guid><description>&lt;p>&lt;em>This post continues the ideas explored in &lt;a class="link" href="https://corebaseit.com/posts_in_review/super-agents-multi-agent-communication/" >Part I: Super Agents and Multi-Agent Communication&lt;/a> and &lt;a class="link" href="https://corebaseit.com/posts_in_review/swarm-intelligence-opposite-architectural-bet/" >Part II: Swarm Intelligence&lt;/a>. Those posts covered how agents coordinate within a workflow. This one asks what happens after the workflow ends.&lt;/em>&lt;/p>
&lt;hr>
&lt;p>&lt;strong>After spending time with the orchestrator pattern and the swarm pattern, I kept running into the same gap — one that the field has not been honest enough about.&lt;/strong>&lt;/p>
&lt;p>Agents can communicate within a workflow. They can share state, hand off tasks, and coordinate through structured message protocols. I covered all of that in the previous posts, and all of that is solved. What is not solved is this: once the run completes and the agents figure out how to handle a complex workflow, that knowledge stays isolated. The next run starts cold.&lt;/p>
&lt;p>That is the vertical scaling trap. And the more I read — across Reflexion, ERL, Letta&amp;rsquo;s stateful agent work, and Google Research&amp;rsquo;s recent findings on scaling agent systems — the more I realized this is the most important unsolved problem in multi-agent architecture today.&lt;/p>
&lt;hr>
&lt;h2 id="what-vertical-scaling-actually-means">What Vertical Scaling Actually Means
&lt;/h2>&lt;p>The industry has concentrated its investment on making individual agents more capable in isolation — longer context windows, stronger reasoning models, richer tool sets, more compute per inference call. This is vertical scaling: more depth, more power, more intelligence concentrated in a single node.&lt;/p>
&lt;p>Vertical scaling has delivered real gains. Modern LLM-based agents can handle significantly longer reasoning chains, maintain larger working memories, and invoke more complex tool sequences than agents from two years ago. The benchmark numbers confirm this.&lt;/p>
&lt;p>But vertical scaling has a ceiling, and that ceiling is architectural, not computational. No matter how capable a single agent becomes, a system of agents that starts each run from a blank slate cannot accumulate collective intelligence over time. Every execution is, in a meaningful sense, the first time that system has encountered the problem.&lt;/p>
&lt;p>That is the definition of a system that does not learn.&lt;/p>
&lt;hr>
&lt;h2 id="the-statefulness-illusion">The Statefulness Illusion
&lt;/h2>&lt;p>This was the part that clarified the problem most for me. LLM agents are stateless by design. The model itself has no memory between API calls — every inference starts fresh, bounded by what exists inside the current context window. What looks like agent memory in most production frameworks is actually infrastructure built around the model: conversation history injected into the prompt, vector stores queried at retrieval time, workflow state persisted in an external database.&lt;/p>
&lt;p>The agent does not remember. The infrastructure remembers. And the agent only knows what the infrastructure decides to surface at inference time.&lt;/p>
&lt;p>This distinction matters because it exposes the scope of what is currently being solved. Stateful agent frameworks — LangGraph, MemGPT/Letta, Amazon Bedrock AgentCore Memory, and others — address continuity &lt;em>within&lt;/em> a workflow and &lt;em>within&lt;/em> a user session. They do not address what happens between runs, across agent instances, or across different executions of the same workflow by different users.&lt;/p>
&lt;p>Each agent run, regardless of the framework, is still largely isolated from the accumulated experience of every run that came before it.&lt;/p>
&lt;hr>
&lt;h2 id="the-horizontal-scaling-problem">The Horizontal Scaling Problem
&lt;/h2>&lt;p>Horizontal scaling in multi-agent systems means something different from what the term usually implies in infrastructure. It is not about running more agent instances in parallel — that is a load distribution problem, and it is solved. The horizontal scaling problem I&amp;rsquo;m describing is about propagating learned competence across agents and across runs.&lt;/p>
&lt;p>When I mapped the gap concretely, it looked like this:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Capability&lt;/th>
&lt;th>Current State&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Agents share state within a run&lt;/td>
&lt;td>Solved&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Agents communicate within a workflow&lt;/td>
&lt;td>Solved&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Agent learns within a run (self-reflection)&lt;/td>
&lt;td>Partial — Reflexion&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Successful strategy propagates to next run&lt;/td>
&lt;td>Not solved&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Knowledge discovered by one agent available to others&lt;/td>
&lt;td>Not solved&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Collective intelligence accumulates over time without retraining&lt;/td>
&lt;td>Not solved&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The bottom three rows represent the horizontal scaling gap. It is not a matter of framework maturity — it is an architectural primitive that does not yet exist in production multi-agent systems.&lt;/p>
&lt;hr>
&lt;h2 id="what-the-field-has-built-as-workarounds">What the Field Has Built as Workarounds
&lt;/h2>&lt;p>Research and engineering teams have made partial progress, and it&amp;rsquo;s worth naming what exists honestly.&lt;/p>
&lt;p>&lt;strong>Shared episodic memory stores.&lt;/strong> Agents can write successful reasoning traces or strategy summaries to a vector database that future agent instances retrieve via RAG. This is useful, but the memory is static once written. It does not update based on outcomes, and retrieval quality determines whether the right experience surfaces at the right moment.&lt;/p>
&lt;p>&lt;strong>Reflexion and its descendants.&lt;/strong> Reflexion (Shinn et al., NeurIPS 2023) introduced a framework where agents verbally reflect on task feedback and store those reflections in an episodic memory buffer to improve decision-making in subsequent trials — without modifying model weights. This is a genuine step forward, and it&amp;rsquo;s the work that first made me think seriously about this problem. But Reflexion is fundamentally a within-run or within-session mechanism. The reflective memory does not propagate across agent instances or persist as a shared resource across independent runs.&lt;/p>
&lt;p>&lt;strong>ExpeL and Experiential Reflective Learning.&lt;/strong> More recent work, including ExpeL (Zhao et al., 2024) and ERL (2025), extracts reusable heuristics by comparing successful and failed trajectories, then injects the most relevant heuristics into future agent contexts via retrieval. This is directionally correct. ERL reports a +7.8% improvement over a ReAct baseline on complex agentic benchmarks precisely because failure-derived heuristics provide negative constraints that prune ineffective strategies. But even here, the experience pool is curated offline, retrieval is still prompt injection, and the feedback loop is not real-time.&lt;/p>
&lt;p>&lt;strong>Prompt distillation and fine-tuning.&lt;/strong> Successful agent runs can generate training data that feeds a fine-tuning pipeline. This is horizontally scalable in principle — the knowledge of one run eventually improves the base model that all agents use. But the feedback loop is slow, expensive, requires human curation, and operates offline. It is not collective learning; it is deferred knowledge consolidation.&lt;/p>
&lt;p>&lt;strong>Workflow libraries and pattern registries.&lt;/strong> Teams manually curate successful workflow templates. This is human-mediated knowledge transfer, not agent-mediated. It does not scale.&lt;/p>
&lt;p>None of these close the gap. They are engineered workarounds for the absence of a proper horizontal learning primitive.&lt;/p>
&lt;hr>
&lt;h2 id="what-is-actually-missing">What Is Actually Missing
&lt;/h2>&lt;p>The architectural primitive that does not yet exist is a persistent, agent-writable, outcome-weighted knowledge layer — one where agents contribute strategy signals after a run completes, and those signals influence future agent behavior without requiring a full retraining cycle or human curation.&lt;/p>
&lt;p>The biological analogy came back to me here from the swarm intelligence research I covered in Part II: pheromone trails in ant colonies are not just a communication mechanism — they are a distributed, incrementally updated knowledge store. Shorter, higher-quality paths accumulate stronger signals through positive feedback. Failed paths evaporate. The swarm&amp;rsquo;s collective intelligence is encoded in the medium itself, not in any individual. No central controller decides which trails are &amp;ldquo;good.&amp;rdquo; The outcome does.&lt;/p>
&lt;p>What that looks like for LLM-based multi-agent systems is still an open design problem, but the requirements I&amp;rsquo;ve been able to identify are:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Outcome-weighted writes.&lt;/strong> Agent runs that complete successfully contribute to the shared knowledge layer with positive weight; failed runs contribute negative constraints. Both are useful — ERL&amp;rsquo;s results show that failure-derived heuristics often outperform success-derived ones on search tasks.&lt;/li>
&lt;li>&lt;strong>Decentralized propagation.&lt;/strong> The update mechanism cannot require a human in the loop or an offline batch process. Strategy signals need to propagate in something close to real time across agent instances.&lt;/li>
&lt;li>&lt;strong>Relevance-gated retrieval.&lt;/strong> Future agents need to surface relevant prior experience without injecting everything into context. This is partially addressed by LLM-based retrieval scoring, but remains unsolved at scale.&lt;/li>
&lt;li>&lt;strong>No weight updates required.&lt;/strong> The mechanism needs to operate within the context engineering layer, not through gradient descent. Retraining is too slow and too expensive for real-time collective learning.&lt;/li>
&lt;/ul>
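&lt;p>To make those requirements tangible, here is a deliberately simplified sketch of what such a primitive could look like. Everything in it is hypothetical — the class name, the decay scheme, the scoring — it exists only to show outcome-weighted writes and pheromone-style evaporation expressed in code, not to propose an API:&lt;/p>

```python
import time

class StrategyTrail:
    """Hypothetical pheromone-style knowledge layer (a design sketch, not a product).
    Outcome-weighted writes, time-based evaporation, rank-based retrieval."""

    def __init__(self, half_life_s=86_400.0):
        self.scores = {}                # strategy key -> (signal, last_update_time)
        self.half_life_s = half_life_s

    def _decayed(self, signal, ts, now):
        # Evaporation: a signal halves every half_life_s unless reinforced.
        return signal * 0.5 ** ((now - ts) / self.half_life_s)

    def report(self, key, succeeded, weight=1.0, now=None):
        # Outcome-weighted write: successful runs reinforce, failed runs penalize.
        now = time.time() if now is None else now
        signal, ts = self.scores.get(key, (0.0, now))
        delta = weight if succeeded else -weight
        self.scores[key] = (self._decayed(signal, ts, now) + delta, now)

    def top(self, k=3, now=None):
        # A real system would gate this by task relevance; the sketch ranks
        # purely by decayed outcome signal.
        now = time.time() if now is None else now
        ranked = sorted(self.scores,
                        key=lambda key: self._decayed(*self.scores[key], now),
                        reverse=True)
        return ranked[:k]
```

&lt;p>The mapping to the ant-colony mechanism is direct: &lt;code>report&lt;/code> is the pheromone deposit, &lt;code>_decayed&lt;/code> is evaporation, and &lt;code>top&lt;/code> is trail-following. The hard, unsolved parts — what a &amp;ldquo;strategy key&amp;rdquo; actually is, and how retrieval is gated by relevance — are exactly the parts this sketch leaves out.&lt;/p>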
&lt;hr>
&lt;h2 id="why-the-industry-has-not-solved-it">Why the Industry Has Not Solved It
&lt;/h2>&lt;p>The more I thought about it, the more I realized the incentive structure explains the gap more than the technical difficulty does.&lt;/p>
&lt;p>Vertical scaling — a bigger model, a stronger benchmark score, a longer context window — has a clear commercial lever. It is attributable to a specific product release and easy to market. Horizontal knowledge propagation is architecturally harder, requires runtime infrastructure that does not exist yet, and the value it generates is distributed across runs and users rather than attributable to a single capability upgrade.&lt;/p>
&lt;p>Google Research&amp;rsquo;s recent work on scaling agent systems found that adding more agents does not consistently improve performance — multi-agent coordination yields substantial gains on parallelizable tasks but can actually degrade performance on sequential workflows. More agents is not the answer. Smarter knowledge transfer is. But that is a harder problem to benchmark and a harder story to sell.&lt;/p>
&lt;hr>
&lt;h2 id="the-architectural-opportunity">The Architectural Opportunity
&lt;/h2>&lt;p>The systems that will win over the next two to three years will not be the ones with the largest individual agents. They will be the ones that figure out how to make collective experience accumulate efficiently across runs, across users, and across agent instances — without requiring a human editor or an offline training cycle to make it useful.&lt;/p>
&lt;p>This is, in a meaningful sense, the missing layer of agentic AI infrastructure. The orchestration layer exists — I covered it in Part I. The communication protocols exist. The shared state store exists. The swarm coordination patterns exist — I covered those in Part II. What does not exist is a production-grade mechanism for collective learning that operates at runtime.&lt;/p>
&lt;p>The research directions are beginning to converge on this problem — Reflexion, ERL, Collaborative Memory — but none has produced a general-purpose primitive that production systems can adopt. That gap is both the honest state of the art and the most interesting open problem in multi-agent architecture today.&lt;/p>
&lt;hr>
&lt;h2 id="references">References
&lt;/h2>&lt;ul>
&lt;li>Letta. &amp;ldquo;Stateful Agents: The Missing Link in LLM Intelligence.&amp;rdquo; &lt;a class="link" href="https://www.letta.com/blog/stateful-agents" target="_blank" rel="noopener"
>letta.com&lt;/a>&lt;/li>
&lt;li>Shinn, N. et al. &amp;ldquo;Reflexion: Language Agents with Verbal Reinforcement Learning.&amp;rdquo; NeurIPS 2023. &lt;a class="link" href="https://arxiv.org/abs/2303.11366" target="_blank" rel="noopener"
>arxiv.org/abs/2303.11366&lt;/a>&lt;/li>
&lt;li>Rezazadeh, M. et al. &amp;ldquo;Collaborative Memory: Multi-User Memory Sharing in LLM Agents with Dynamic Access Control.&amp;rdquo; 2025. &lt;a class="link" href="https://arxiv.org/abs/2505.18279" target="_blank" rel="noopener"
>arxiv.org/abs/2505.18279&lt;/a>&lt;/li>
&lt;li>&amp;ldquo;Experiential Reflective Learning for Self-Improving LLM Agents.&amp;rdquo; 2025. &lt;a class="link" href="https://arxiv.org/abs/2603.24639" target="_blank" rel="noopener"
>arxiv.org/abs/2603.24639&lt;/a>&lt;/li>
&lt;li>Google Research. &amp;ldquo;Towards a Science of Scaling Agent Systems: When and Why Agent Systems Work.&amp;rdquo; 2026.&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts_in_review/super-agents-multi-agent-communication/" >Part I: Super Agents and Multi-Agent Communication&lt;/a> — orchestration, structured communication, and the single source of truth&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts_in_review/swarm-intelligence-opposite-architectural-bet/" >Part II: Swarm Intelligence — The Opposite Architectural Bet&lt;/a> — decentralized coordination and emergent intelligence&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts/reasoning-models-deep-reasoning-llms/" >Reasoning Models and Deep Reasoning in LLMs&lt;/a> — the reasoning strategies that power individual agents&lt;/li>
&lt;li>&lt;em>The Obsolescence Paradox: Why the Best Engineers Will Thrive in the AI Era&lt;/em> — engineering judgment in the age of autonomous AI systems&lt;/li>
&lt;/ul></description></item><item><title>Swarm Intelligence: The Opposite Architectural Bet</title><link>https://corebaseit.com/corebaseit_posts_in_review/series/swarm-intelligence-opposite-architectural-bet_part2/</link><pubDate>Sat, 28 Mar 2026 10:00:00 +0100</pubDate><author>contact@corebaseit.com (Vincent Bevia)</author><guid>https://corebaseit.com/corebaseit_posts_in_review/series/swarm-intelligence-opposite-architectural-bet_part2/</guid><description>&lt;p>&lt;em>This is Part II of a two-part series on multi-agent AI architecture. &lt;a class="link" href="https://corebaseit.com/posts_in_review/super-agents-multi-agent-communication/" >Part I&lt;/a> covered the super agent pattern: centralized orchestration, structured communication, and a single source of truth. This post explores the opposite approach.&lt;/em>&lt;/p>
&lt;hr>
&lt;p>&lt;strong>Everything I described in Part I assumes a central orchestrator that owns workflow visibility and decision authority. Swarm intelligence is the opposite architectural bet — and understanding the contrast changed how I think about multi-agent design.&lt;/strong>&lt;/p>
&lt;p>When I started reading about swarm intelligence after writing the orchestrator post, I expected a niche optimization technique. What I found instead was a fundamentally different philosophy of coordination — one where global competence emerges from local interactions, with no central controller and no global plan. The more I dug in, the more I realized this isn&amp;rsquo;t just an alternative pattern. It&amp;rsquo;s a direct challenge to some of the assumptions I laid out in Part I, and understanding where each approach wins (and fails) is what separates a good multi-agent architecture from an overengineered one.&lt;/p>
&lt;hr>
&lt;h2 id="what-is-swarm-intelligence">What Is Swarm Intelligence?
&lt;/h2>&lt;p>Swarm intelligence is the study and engineering of collective behavior that emerges from many simple agents interacting locally, with no central controller and no global plan. Each agent operates on partial information and follows simple local rules. Global-level competence — efficient foraging, optimal routing, adaptive task allocation — emerges from those local interactions rather than being imposed from above.&lt;/p>
&lt;p>What struck me about this definition is how directly it inverts the super agent model. In Part I, I described a system where the orchestrator is the only node with full workflow visibility, and specialist agents receive scoped inputs and produce scoped outputs. In a swarm, &lt;em>no&lt;/em> agent has full visibility. There is no orchestrator. And yet the collective solves problems that exceed the capability of any individual member.&lt;/p>
&lt;p>Three properties define the pattern:&lt;/p>
&lt;p>&lt;strong>Decentralization.&lt;/strong> There is no leader node. No single agent has full workflow visibility, and none can issue authoritative commands to others. Coordination is a byproduct of local interaction, not a product of centralized planning. This is the property that makes swarms inherently fault-tolerant — remove any individual agent and the system continues functioning, because no agent was indispensable to begin with.&lt;/p>
&lt;p>&lt;strong>Self-organization.&lt;/strong> Coherent global patterns arise spontaneously from local rules. No agent is told &amp;ldquo;build this structure&amp;rdquo; or &amp;ldquo;follow this path.&amp;rdquo; The structure and the paths emerge from thousands of independent decisions, each one simple, each one local, each one informed only by the agent&amp;rsquo;s immediate environment. The global order was never specified — it assembled itself.&lt;/p>
&lt;p>&lt;strong>Emergent intelligence.&lt;/strong> The collective solves problems that exceed the capability of any individual agent. This is the part that I found genuinely surprising when I started looking at the research: the group is, in a meaningful sense, smarter than its members. Not because the agents secretly share a global model, but because local interactions produce feedback loops that concentrate collective effort on high-quality solutions over time.&lt;/p>
&lt;hr>
&lt;h2 id="from-biology-to-algorithms">From Biology to Algorithms
&lt;/h2>&lt;p>The canonical biological examples are not just illustrations — they directly inspired the computational methods in use today. Understanding the biology helps explain why the algorithms work.&lt;/p>
&lt;p>&lt;strong>Ant colonies&lt;/strong> are the most studied example. An individual ant has no map, no plan, and no knowledge of the colony&amp;rsquo;s global state. It follows simple rules: wander randomly, and when you find food, return to the nest while depositing pheromone. Other ants are biased toward following stronger pheromone trails. Shorter paths between food and nest get traversed more frequently, accumulate more pheromone, and attract more ants — creating a positive feedback loop that converges on efficient routes. Meanwhile, pheromone evaporates over time, which means abandoned or suboptimal paths fade naturally. The colony&amp;rsquo;s routing network self-assembles from thousands of individual deposit-and-evaporate decisions.&lt;/p>
&lt;p>What I found remarkable is how robust this is. Block a path, and the colony reroutes within minutes — not because any ant &amp;ldquo;knows&amp;rdquo; the path is blocked, but because pheromone stops accumulating on the blocked segment and alternative routes gain relative strength. The system adapts to disruption without any agent being aware of the disruption at a global level.&lt;/p>
&lt;p>&lt;strong>Bee colonies&lt;/strong> use a different coordination mechanism: the waggle dance. Scout bees evaluate potential food sources or nest sites, then return to the hive and communicate their findings through a dance whose angle encodes the direction of the source, whose duration encodes its distance, and whose vigor signals its quality. Other bees probabilistically follow the more enthusiastic dancers. Over rounds of scouting and reporting, the colony converges on the best available option — a decentralized decision process that has been shown to rival the accuracy of optimal mathematical models.&lt;/p>
&lt;p>&lt;strong>Bird flocks and fish schools&lt;/strong> demonstrate a third variant: alignment-based coordination. Each individual follows three simple rules — separation (don&amp;rsquo;t crowd), alignment (match direction with neighbors), and cohesion (stay close to the group). The stunning visual coherence of a starling murmuration or a sardine ball emerges entirely from these local rules. No bird leads. No fish coordinates. The collective pattern is an emergent property of individual behavior.&lt;/p>
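&lt;p>The three flocking rules are simple enough to state directly in code. A minimal sketch (the radii and weights are arbitrary illustrative values; a real boids simulation would also limit speed):&lt;/p>

```python
import numpy as np

def boids_step(pos, vel, r_sep=1.0, r_near=5.0,
               w_sep=0.05, w_align=0.05, w_coh=0.01, dt=1.0):
    """One update of the three local flocking rules: separation, alignment,
    cohesion. Each boid sees only neighbors within r_near."""
    new_vel = vel.copy()
    for i in range(len(pos)):
        offsets = pos - pos[i]
        d = np.linalg.norm(offsets, axis=1)
        near = (d < r_near) & (d > 0)              # local neighborhood, excluding self
        if not near.any():
            continue
        steer = np.zeros(2)
        too_close = (d < r_sep) & (d > 0)
        if too_close.any():
            steer -= w_sep * offsets[too_close].sum(axis=0)        # separation
        steer += w_align * (vel[near].mean(axis=0) - vel[i])       # alignment
        steer += w_coh * (pos[near].mean(axis=0) - pos[i])         # cohesion
        new_vel[i] = vel[i] + dt * steer
    return pos + dt * new_vel, new_vel

# A small flock: each boid reacts only to nearby boids, yet group motion emerges.
rng = np.random.default_rng(2)
pos = rng.uniform(0.0, 6.0, (20, 2))
vel = rng.standard_normal((20, 2))
for _ in range(100):
    pos, vel = boids_step(pos, vel)
```

&lt;p>Nothing in &lt;code>boids_step&lt;/code> references the flock as a whole — the coherent group motion is a byproduct of twenty independent local updates.&lt;/p>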
&lt;p>These aren&amp;rsquo;t metaphors. They are the direct inspiration for the algorithms.&lt;/p>
&lt;hr>
&lt;h2 id="the-two-dominant-algorithms">The Two Dominant Algorithms
&lt;/h2>&lt;p>Two metaheuristics dominate applied swarm AI, and both map directly from the biological mechanisms above.&lt;/p>
&lt;h3 id="ant-colony-optimization-aco">Ant Colony Optimization (ACO)
&lt;/h3>&lt;p>ACO, introduced by Marco Dorigo in 1992, translates the ant foraging model into a general-purpose optimization algorithm. Artificial agents (&amp;ldquo;ants&amp;rdquo;) traverse a solution space, typically modeled as a graph, and deposit virtual pheromone on the edges they use. The pheromone strength on each edge influences the probability that subsequent ants will choose that edge. Better solutions accumulate stronger pheromone over time through positive feedback, while evaporation ensures the algorithm doesn&amp;rsquo;t lock permanently onto early suboptimal solutions.&lt;/p>
&lt;p>The algorithm is straightforward:&lt;/p>
&lt;ol>
&lt;li>Initialize pheromone levels uniformly across all edges&lt;/li>
&lt;li>Each ant constructs a complete solution by traversing the graph, with transition probabilities biased by pheromone strength and a heuristic desirability function&lt;/li>
&lt;li>After all ants complete their tours, update pheromone: deposit proportional to solution quality, evaporate a fixed fraction globally&lt;/li>
&lt;li>Repeat for a fixed number of iterations or until convergence&lt;/li>
&lt;/ol>
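&lt;p>The four steps above can be sketched in a few dozen lines. What follows is an illustrative Python implementation for a tiny symmetric Traveling Salesman instance; the distance matrix and the parameter values (pheromone weights, evaporation rate, deposit constant) are arbitrary choices made for the sketch, not canonical settings.&lt;/p>

```python
import random

# Small symmetric distance matrix (illustrative assumption, 4 cities).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
N = len(DIST)

def tour_length(tour):
    # Sum edge lengths, closing the loop back to the start city.
    return sum(DIST[tour[i]][tour[(i + 1) % N]] for i in range(N))

def run_aco(n_ants=10, iterations=50, alpha=1.0, beta=2.0, rho=0.5, q=1.0):
    # Step 1: initialize pheromone uniformly across all edges.
    pher = [[1.0] * N for _ in range(N)]
    best_tour, best_len = None, float("inf")

    for _ in range(iterations):
        tours = []
        # Step 2: each ant builds a complete tour; edge choice is biased by
        # pheromone strength and a heuristic desirability (1 / distance).
        for _ in range(n_ants):
            tour = [random.randrange(N)]
            while len(tour) != N:
                cur = tour[-1]
                options = [c for c in range(N) if c not in tour]
                weights = [
                    (pher[cur][c] ** alpha) * ((1.0 / DIST[cur][c]) ** beta)
                    for c in options
                ]
                tour.append(random.choices(options, weights=weights)[0])
            tours.append(tour)

        # Step 3: evaporate a fixed fraction globally, then deposit
        # pheromone proportional to each solution's quality.
        for i in range(N):
            for j in range(N):
                pher[i][j] *= (1.0 - rho)
        for tour in tours:
            length = tour_length(tour)
            if best_len > length:
                best_tour, best_len = tour, length
            for i in range(N):
                a, b = tour[i], tour[(i + 1) % N]
                pher[a][b] += q / length
                pher[b][a] += q / length

    # Step 4: the loop above runs a fixed number of iterations.
    return best_tour, best_len

tour, length = run_aco()
print(tour, length)
```

&lt;p>On this four-city instance the shortest closed tour has length 18, and the positive-feedback loop concentrates pheromone on its edges within a handful of iterations.&lt;/p>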
&lt;p>ACO has been applied successfully to the Traveling Salesman Problem, vehicle routing, network routing, job-shop scheduling, and protein folding. What I found interesting from an engineering perspective is that ACO handles dynamic problems well — if the graph changes during execution (a link goes down, a cost changes), the pheromone distribution naturally adapts over subsequent iterations without requiring a restart.&lt;/p>
&lt;h3 id="particle-swarm-optimization-pso">Particle Swarm Optimization (PSO)
&lt;/h3>&lt;p>PSO, introduced by Kennedy and Eberhart in 1995, takes inspiration from bird flocking and fish schooling rather than ant foraging. Each &amp;ldquo;particle&amp;rdquo; in the swarm represents a candidate solution in a continuous search space. Each particle has a position and a velocity, and it maintains two pieces of memory: its own best-known position (&lt;code>pbest&lt;/code>) and the global best position found by any particle in the swarm (&lt;code>gbest&lt;/code>).&lt;/p>
&lt;p>At each iteration, each particle updates its velocity as a weighted combination of three forces:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Inertia&lt;/strong> — continue in the current direction&lt;/li>
&lt;li>&lt;strong>Cognitive pull&lt;/strong> — move toward &lt;code>pbest&lt;/code> (the agent&amp;rsquo;s own best experience)&lt;/li>
&lt;li>&lt;strong>Social pull&lt;/strong> — move toward &lt;code>gbest&lt;/code> (the collective&amp;rsquo;s best experience)&lt;/li>
&lt;/ul>
&lt;p>The balance between cognitive and social pull determines the exploration-exploitation trade-off. Heavy cognitive pull means particles explore independently; heavy social pull means the swarm converges quickly on the current best. Tuning these weights is the primary design decision in PSO.&lt;/p>
&lt;p>PSO is widely used in continuous optimization, neural network training, feature selection, and engineering design optimization. Unlike ACO, PSO operates in continuous space rather than on graphs, which makes it a natural fit for problems where solutions are represented as real-valued vectors.&lt;/p>
&lt;p>What I found appealing about both algorithms is their simplicity. The core logic of ACO or PSO fits in a few dozen lines of code. The intelligence doesn&amp;rsquo;t come from the complexity of the individual agent — it comes from the interaction dynamics of the population.&lt;/p>
&lt;hr>
&lt;h2 id="a-minimal-pso-example">A Minimal PSO Example
&lt;/h2>&lt;p>To make this as concrete as I did for the orchestrator pattern in Part I, here&amp;rsquo;s a minimal PSO implementation. The swarm searches for the minimum of a simple 2D function:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">import&lt;/span> random
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">objective&lt;/span>(position: list[float]) &lt;span style="color:#f92672">-&amp;gt;&lt;/span> float:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> x, y &lt;span style="color:#f92672">=&lt;/span> position
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">return&lt;/span> x &lt;span style="color:#f92672">**&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#f92672">+&lt;/span> y &lt;span style="color:#f92672">**&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">Particle&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">__init__&lt;/span>(self, bounds: list[tuple[float, float]]):
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> self&lt;span style="color:#f92672">.&lt;/span>position &lt;span style="color:#f92672">=&lt;/span> [random&lt;span style="color:#f92672">.&lt;/span>uniform(lo, hi) &lt;span style="color:#66d9ef">for&lt;/span> lo, hi &lt;span style="color:#f92672">in&lt;/span> bounds]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> self&lt;span style="color:#f92672">.&lt;/span>velocity &lt;span style="color:#f92672">=&lt;/span> [random&lt;span style="color:#f92672">.&lt;/span>uniform(&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">1&lt;/span>) &lt;span style="color:#66d9ef">for&lt;/span> _ &lt;span style="color:#f92672">in&lt;/span> bounds]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> self&lt;span style="color:#f92672">.&lt;/span>best_position &lt;span style="color:#f92672">=&lt;/span> list(self&lt;span style="color:#f92672">.&lt;/span>position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> self&lt;span style="color:#f92672">.&lt;/span>best_score &lt;span style="color:#f92672">=&lt;/span> objective(self&lt;span style="color:#f92672">.&lt;/span>position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">run_pso&lt;/span>(
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> n_particles: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">20&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> bounds: list[tuple[float, float]] &lt;span style="color:#f92672">=&lt;/span> [(&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>, &lt;span style="color:#ae81ff">10&lt;/span>), (&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>, &lt;span style="color:#ae81ff">10&lt;/span>)],
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> iterations: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">50&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> w: float &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.7&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> c1: float &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1.5&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> c2: float &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1.5&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>) &lt;span style="color:#f92672">-&amp;gt;&lt;/span> tuple[list[float], float]:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> particles &lt;span style="color:#f92672">=&lt;/span> [Particle(bounds) &lt;span style="color:#66d9ef">for&lt;/span> _ &lt;span style="color:#f92672">in&lt;/span> range(n_particles)]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> global_best &lt;span style="color:#f92672">=&lt;/span> min(particles, key&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">lambda&lt;/span> p: p&lt;span style="color:#f92672">.&lt;/span>best_score)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> gbest &lt;span style="color:#f92672">=&lt;/span> list(global_best&lt;span style="color:#f92672">.&lt;/span>best_position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> gbest_score &lt;span style="color:#f92672">=&lt;/span> global_best&lt;span style="color:#f92672">.&lt;/span>best_score
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">for&lt;/span> _ &lt;span style="color:#f92672">in&lt;/span> range(iterations):
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">for&lt;/span> p &lt;span style="color:#f92672">in&lt;/span> particles:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">for&lt;/span> i &lt;span style="color:#f92672">in&lt;/span> range(len(bounds)):
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> r1, r2 &lt;span style="color:#f92672">=&lt;/span> random&lt;span style="color:#f92672">.&lt;/span>random(), random&lt;span style="color:#f92672">.&lt;/span>random()
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> p&lt;span style="color:#f92672">.&lt;/span>velocity[i] &lt;span style="color:#f92672">=&lt;/span> (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> w &lt;span style="color:#f92672">*&lt;/span> p&lt;span style="color:#f92672">.&lt;/span>velocity[i]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#f92672">+&lt;/span> c1 &lt;span style="color:#f92672">*&lt;/span> r1 &lt;span style="color:#f92672">*&lt;/span> (p&lt;span style="color:#f92672">.&lt;/span>best_position[i] &lt;span style="color:#f92672">-&lt;/span> p&lt;span style="color:#f92672">.&lt;/span>position[i])
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#f92672">+&lt;/span> c2 &lt;span style="color:#f92672">*&lt;/span> r2 &lt;span style="color:#f92672">*&lt;/span> (gbest[i] &lt;span style="color:#f92672">-&lt;/span> p&lt;span style="color:#f92672">.&lt;/span>position[i])
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> )
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> p&lt;span style="color:#f92672">.&lt;/span>position[i] &lt;span style="color:#f92672">+=&lt;/span> p&lt;span style="color:#f92672">.&lt;/span>velocity[i]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> p&lt;span style="color:#f92672">.&lt;/span>position[i] &lt;span style="color:#f92672">=&lt;/span> max(bounds[i][&lt;span style="color:#ae81ff">0&lt;/span>], min(bounds[i][&lt;span style="color:#ae81ff">1&lt;/span>], p&lt;span style="color:#f92672">.&lt;/span>position[i]))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> score &lt;span style="color:#f92672">=&lt;/span> objective(p&lt;span style="color:#f92672">.&lt;/span>position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">if&lt;/span> score &lt;span style="color:#f92672">&amp;lt;&lt;/span> p&lt;span style="color:#f92672">.&lt;/span>best_score:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> p&lt;span style="color:#f92672">.&lt;/span>best_score &lt;span style="color:#f92672">=&lt;/span> score
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> p&lt;span style="color:#f92672">.&lt;/span>best_position &lt;span style="color:#f92672">=&lt;/span> list(p&lt;span style="color:#f92672">.&lt;/span>position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">if&lt;/span> score &lt;span style="color:#f92672">&amp;lt;&lt;/span> gbest_score:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> gbest_score &lt;span style="color:#f92672">=&lt;/span> score
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> gbest &lt;span style="color:#f92672">=&lt;/span> list(p&lt;span style="color:#f92672">.&lt;/span>position)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">return&lt;/span> gbest, gbest_score
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>best_pos, best_score &lt;span style="color:#f92672">=&lt;/span> run_pso()
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>print(&lt;span style="color:#e6db74">f&lt;/span>&lt;span style="color:#e6db74">&amp;#34;Best position: &lt;/span>&lt;span style="color:#e6db74">{&lt;/span>best_pos&lt;span style="color:#e6db74">}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>print(&lt;span style="color:#e6db74">f&lt;/span>&lt;span style="color:#e6db74">&amp;#34;Best score: &lt;/span>&lt;span style="color:#e6db74">{&lt;/span>best_score&lt;span style="color:#e6db74">}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Twenty particles, each starting at a random position, each pulled toward its own best experience and the swarm&amp;rsquo;s collective best. No particle knows the objective function&amp;rsquo;s landscape. No particle directs the others. Yet within 50 iterations, the swarm converges on the minimum — not because any individual found it deliberately, but because the interaction dynamics between personal memory and social influence concentrate the swarm&amp;rsquo;s exploration on progressively better regions of the space.&lt;/p>
&lt;p>Compare this to the orchestrator pattern from Part I: there, a coordinator explicitly assigned work to specialist agents and tracked the workflow state. Here, there is no coordinator. The &amp;ldquo;coordination&amp;rdquo; is an emergent property of the velocity update rule. Both patterns produce useful collective behavior — through fundamentally different mechanisms.&lt;/p>
&lt;hr>
&lt;h2 id="swarm-vs-orchestrator-the-architectural-trade-off">Swarm vs. Orchestrator: The Architectural Trade-Off
&lt;/h2>&lt;p>This is the comparison I kept coming back to as I read through both bodies of literature:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Property&lt;/th>
&lt;th>Super Agent (Orchestrator)&lt;/th>
&lt;th>Swarm&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Control&lt;/strong>&lt;/td>
&lt;td>Centralized&lt;/td>
&lt;td>Decentralized&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>State visibility&lt;/strong>&lt;/td>
&lt;td>Full (single source of truth)&lt;/td>
&lt;td>Partial (local only)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Coordination&lt;/strong>&lt;/td>
&lt;td>Explicit assignment and gating&lt;/td>
&lt;td>Emergent from local rules&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Failure mode&lt;/strong>&lt;/td>
&lt;td>Orchestrator is a single point of failure&lt;/td>
&lt;td>Robust to individual agent loss&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Predictability&lt;/strong>&lt;/td>
&lt;td>High — deterministic workflow graph&lt;/td>
&lt;td>Lower — emergent behavior&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Debuggability&lt;/strong>&lt;/td>
&lt;td>High — inspect the state store&lt;/td>
&lt;td>Harder — behavior is a collective property&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Best suited for&lt;/strong>&lt;/td>
&lt;td>Complex workflows with strict ordering and accountability&lt;/td>
&lt;td>Search, optimization, and exploration under uncertainty&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The orchestrator pattern wins when you need auditability, sequential dependencies, and defined handoffs — a software delivery pipeline, a compliance workflow, a multi-step API integration. When someone asks &amp;ldquo;what happened and why,&amp;rdquo; you can trace the answer through the state store and the orchestrator&amp;rsquo;s decision log. That&amp;rsquo;s essential in regulated domains like payments, healthcare, or finance, where I spend most of my time.&lt;/p>
&lt;p>The swarm pattern wins when the problem is fundamentally one of parallel exploration, where no single agent can know the right answer in advance and the solution space is too large for a directed search. Routing optimization, hyperparameter tuning, resource allocation under dynamic constraints, adversarial search — these are problems where the strength of the swarm is that it doesn&amp;rsquo;t commit to a single path early. It explores broadly, converges gradually, and adapts to changes in the landscape without requiring a central replanning step.&lt;/p>
&lt;p>The failure modes are equally instructive. An orchestrator system that loses its coordinator loses everything — the workflow stops, the state becomes ambiguous, and recovery requires restarting from a checkpoint. A swarm that loses 20% of its agents barely notices — the remaining agents continue interacting, and the collective behavior degrades gracefully rather than collapsing. On the other hand, a swarm that converges on a suboptimal solution can be hard to diagnose, because the &amp;ldquo;decision&amp;rdquo; was never made by any single agent — it emerged from the collective dynamics, and there&amp;rsquo;s no decision log to inspect.&lt;/p>
&lt;hr>
&lt;h2 id="the-hybrid-where-both-patterns-meet">The Hybrid: Where Both Patterns Meet
&lt;/h2>&lt;p>What I found most interesting — and most relevant to real-world systems — is that the best architectures don&amp;rsquo;t choose one pattern exclusively. They combine both.&lt;/p>
&lt;p>The emerging production pattern looks like this: a super agent orchestrates the high-level workflow and enforces policy, while swarm-style sub-networks handle search, ranking, or optimization sub-problems where emergent behavior is an asset rather than a liability.&lt;/p>
&lt;p>Consider a concrete example: a multi-agent system for automated code review. The orchestrator (super agent) manages the workflow — receive a pull request, assign analysis tasks, collect results, enforce quality gates, produce a final report. That&amp;rsquo;s a sequential, auditable pipeline. But within the analysis stage, you might deploy a swarm of lightweight agents, each examining the code from a different angle — style, security, performance, correctness, test coverage — with their findings aggregated through a voting or ranking mechanism rather than a centralized decision. The orchestrator owns the workflow. The swarm owns the search.&lt;/p>
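&lt;p>A minimal sketch of that division of labor, with hypothetical agent names and an arbitrary two-vote threshold (nothing here comes from a specific framework): the orchestrator owns the sequential, auditable pipeline, while the analysis stage is a set of independent agents whose findings are aggregated by vote rather than by a central decision.&lt;/p>

```python
# Hypothetical sketch of the hybrid: an orchestrator owning the workflow,
# with the analysis stage delegated to independent "swarm" agents whose
# findings are aggregated by vote. Agents here are trivial string checks
# standing in for real analyzers; names and threshold are illustrative.

def style_agent(diff):
    return ("style", "tabs mixed with spaces" in diff)

def security_agent(diff):
    return ("security", "eval(" in diff)

def perf_agent(diff):
    return ("performance", "sleep(" in diff)

ANALYZERS = [style_agent, security_agent, perf_agent]

def review_pull_request(diff):
    # Orchestrator: receive the PR, fan out analysis, collect results.
    findings = [agent(diff) for agent in ANALYZERS]
    flagged = [name for name, hit in findings if hit]
    # Decentralized gate: block only when enough agents agree (a vote),
    # rather than letting any single agent decide the outcome.
    verdict = "block" if len(flagged) >= 2 else "approve"
    # Orchestrator: produce the final, auditable report.
    return {"verdict": verdict, "flagged": flagged}

report = review_pull_request("eval(user_input); time.sleep(10)")
print(report)
```

&lt;p>The structure is the point: the workflow steps are fixed and traceable, but the verdict emerges from the agents&amp;rsquo; collective findings, not from any one of them.&lt;/p>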
&lt;p>This hybrid is not theoretical. It shows up in retrieval-augmented generation (RAG) pipelines where an orchestrator manages the query-retrieve-generate flow while a swarm of retrieval agents explores different index partitions in parallel. It shows up in automated trading systems where a central risk engine enforces position limits while swarm-based signal generators explore the market independently. It shows up in robotics where a planner coordinates high-level task sequences while swarm algorithms handle local path planning and obstacle avoidance.&lt;/p>
&lt;p>The architectural insight is that orchestration and emergence are not competing philosophies — they are complementary tools for different layers of the same system. The orchestrator provides structure, accountability, and policy enforcement. The swarm provides exploration, resilience, and adaptive search. Using both, at the right layers, gives you something that neither alone can achieve.&lt;/p>
&lt;hr>
&lt;h2 id="what-i-took-away-from-all-of-this">What I Took Away from All of This
&lt;/h2>&lt;p>Across both posts, the thread that connects everything is that &lt;strong>multi-agent AI is fundamentally a systems engineering problem.&lt;/strong> Whether you&amp;rsquo;re building a centralized orchestrator with a shared state store or a decentralized swarm with emergent coordination, the design questions are the same ones that distributed systems engineers have been wrestling with for decades: how do agents communicate? Who owns state? How do you handle failure? How do you debug collective behavior?&lt;/p>
&lt;p>The super agent pattern gives you control, auditability, and predictability. The swarm pattern gives you resilience, adaptability, and the ability to solve problems that are too large or too dynamic for a directed search. The best systems use both — orchestration where you need accountability, emergence where you need exploration.&lt;/p>
&lt;p>If Part I was about understanding how to make agents work &lt;em>together&lt;/em> under a coordinator, this post is about understanding when to let agents work &lt;em>independently&lt;/em> — and trusting that the collective behavior will be smarter than any individual plan.&lt;/p>
&lt;p>The models handle the reasoning. The architecture handles the reliability. And the choice between orchestration and emergence determines the shape of that architecture.&lt;/p>
&lt;hr>
&lt;h2 id="references">References
&lt;/h2>&lt;ul>
&lt;li>Wikipedia. &amp;ldquo;Swarm Intelligence.&amp;rdquo; &lt;a class="link" href="https://en.wikipedia.org/wiki/Swarm_intelligence" target="_blank" rel="noopener"
>en.wikipedia.org&lt;/a>&lt;/li>
&lt;li>Vation Ventures. &amp;ldquo;Swarm Intelligence: Definition, Explanation, and Use Cases.&amp;rdquo; &lt;a class="link" href="https://vationventures.com/resources/swarm-intelligence" target="_blank" rel="noopener"
>vationventures.com&lt;/a>&lt;/li>
&lt;li>Scholarpedia. &amp;ldquo;Swarm Intelligence.&amp;rdquo; &lt;a class="link" href="http://www.scholarpedia.org/article/Swarm_intelligence" target="_blank" rel="noopener"
>scholarpedia.org&lt;/a>&lt;/li>
&lt;li>HPE. &amp;ldquo;What is Swarm Intelligence?&amp;rdquo; &lt;a class="link" href="https://www.hpe.com/us/en/what-is/swarm-intelligence.html" target="_blank" rel="noopener"
>hpe.com&lt;/a>&lt;/li>
&lt;li>Ultralytics. &amp;ldquo;Swarm Intelligence in Vision AI.&amp;rdquo; &lt;a class="link" href="https://www.ultralytics.com/glossary/swarm-intelligence" target="_blank" rel="noopener"
>ultralytics.com&lt;/a>&lt;/li>
&lt;li>ScienceDirect Topics. &amp;ldquo;Swarm Intelligence.&amp;rdquo; &lt;a class="link" href="https://www.sciencedirect.com/topics/computer-science/swarm-intelligence" target="_blank" rel="noopener"
>sciencedirect.com&lt;/a>&lt;/li>
&lt;li>Dorigo, M. &amp;ldquo;Optimization, Learning and Natural Algorithms.&amp;rdquo; PhD Thesis, Politecnico di Milano, 1992.&lt;/li>
&lt;li>Kennedy, J. &amp;amp; Eberhart, R. &amp;ldquo;Particle Swarm Optimization.&amp;rdquo; IEEE International Conference on Neural Networks, 1995.&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts_in_review/super-agents-multi-agent-communication/" >Part I: Super Agents and Multi-Agent Communication&lt;/a> — the orchestrator pattern, communication mechanisms, and a minimal Python implementation&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts/reasoning-models-deep-reasoning-llms/" >Reasoning Models and Deep Reasoning in LLMs&lt;/a> — the reasoning strategies that power individual agents in both patterns&lt;/li>
&lt;li>&lt;em>The Obsolescence Paradox: Why the Best Engineers Will Thrive in the AI Era&lt;/em> — engineering judgment in the age of autonomous AI systems&lt;/li>
&lt;/ul></description></item><item><title>Super Agents and Multi-Agent Communication: Architecture That Actually Scales</title><link>https://corebaseit.com/corebaseit_posts_in_review/series/super-agents-multi-agent-communication_part1/</link><pubDate>Fri, 27 Mar 2026 22:00:00 +0100</pubDate><author>contact@corebaseit.com (Vincent Bevia)</author><guid>https://corebaseit.com/corebaseit_posts_in_review/series/super-agents-multi-agent-communication_part1/</guid><description>&lt;p>&lt;em>This is Part I of a two-part series on multi-agent AI architecture. This post covers centralized orchestration. &lt;a class="link" href="https://corebaseit.com/posts_in_review/swarm-intelligence-opposite-architectural-bet/" >Part II&lt;/a> explores the opposite approach: swarm intelligence.&lt;/em>&lt;/p>
&lt;hr>
&lt;p>&lt;strong>I&amp;rsquo;ve been reading a lot about &amp;ldquo;super agents&amp;rdquo; lately — and once I got past the marketing noise, I found a genuinely useful architectural pattern underneath.&lt;/strong>&lt;/p>
&lt;p>The term gets thrown around loosely, but the more I dug into it — across AWS documentation, IBM&amp;rsquo;s multi-agent research, LangGraph&amp;rsquo;s implementation guides, and a handful of practical engineering write-ups — the more I realized it maps cleanly onto problems that single-model, turn-by-turn systems simply cannot solve reliably: multi-step workflows with branching logic, delegated expertise, and external system integration. The concept is not new — multi-agent coordination has decades of research behind it — but LLMs have made it practically viable in ways that weren&amp;rsquo;t possible three years ago.&lt;/p>
&lt;p>This post is my attempt to organize what I&amp;rsquo;ve learned: what the term actually means, how agents communicate in practice, and a minimal Python implementation I put together to make the pattern concrete before reaching for a framework.&lt;/p>
&lt;hr>
&lt;h2 id="what-is-a-super-agent">What Is a Super Agent?
&lt;/h2>&lt;p>The clearest definition I found across the literature: a super agent is an autonomous AI system capable of interpreting a high-level goal, decomposing it into sub-tasks, orchestrating tools and specialist agents, and executing a multi-step workflow with minimal human intervention. That&amp;rsquo;s the architectural distinction that separates it from a standard chatbot — a chatbot responds turn-by-turn; a super agent plans, delegates, acts, and adapts.&lt;/p>
&lt;p>What struck me when I started pulling the concept apart is how concrete the capabilities actually are:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Decompose goals&lt;/strong> — translate a high-level objective (&amp;ldquo;Audit our Q2 pipeline and notify the reps&amp;rdquo;) into a sequenced set of executable tasks.&lt;/li>
&lt;li>&lt;strong>Orchestrate tools and sub-agents&lt;/strong> — coordinate search, code execution, external APIs, CRM writes, and domain-specific agents as a unified workflow.&lt;/li>
&lt;li>&lt;strong>Maintain long-horizon context&lt;/strong> — preserve memory of the user, the project state, and intermediate results across multiple reasoning steps.&lt;/li>
&lt;li>&lt;strong>Act in external systems&lt;/strong> — send emails, update records, generate documents, and book reservations — not just describe how to do those things.&lt;/li>
&lt;li>&lt;strong>Support human-in-the-loop&lt;/strong> — pause for confirmation, accept corrections, and revise plans accordingly.&lt;/li>
&lt;/ul>
&lt;p>The framing that resonated most with me is that a super agent functions as a digital &lt;strong>teammate&lt;/strong> that can plan, decide, and act — not a passive assistant that generates single responses.&lt;/p>
&lt;hr>
&lt;h2 id="do-agents-actually-talk-to-each-other">Do Agents Actually Talk to Each Other?
&lt;/h2>&lt;p>This was the question that pulled me deeper into the topic. The answer is yes — and the way they do it is where the architecture gets interesting. In multi-agent systems, agents communicate via structured messages to coordinate work, share intermediate results, and negotiate task ownership.&lt;/p>
&lt;h3 id="communication-mechanisms">Communication Mechanisms
&lt;/h3>&lt;p>From what I found, three mechanisms dominate in practice:&lt;/p>
&lt;p>&lt;strong>Message passing.&lt;/strong> Agents exchange typed messages (request, result, status, feedback) over a bus, queue, or shared memory store. The message structure includes sender, receiver, intent, payload, and timestamp, so both sides can route and act on messages reliably. This is the most flexible mechanism and the one that most closely resembles traditional distributed systems communication — which, coming from a systems engineering background, immediately made sense to me.&lt;/p>
&lt;p>&lt;strong>Shared state.&lt;/strong> Rather than direct peer-to-peer calls, agents read from and write to a single authoritative state object. This is the foundation of LangGraph-style graphs and is the pattern most relevant to in-process agent systems. The state object becomes both the communication channel and the coordination mechanism — agents don&amp;rsquo;t need to know about each other, only about the state contract.&lt;/p>
&lt;p>&lt;strong>Natural language over a structured envelope.&lt;/strong> LLM-based agents can exchange plain-text prompts and responses, but production systems wrap those in a JSON schema or DSL to reduce ambiguity and enable deterministic parsing. The natural language carries the semantic content; the envelope carries the routing and type information that machines need to act on it reliably.&lt;/p>
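&lt;p>As an illustration of such an envelope, here is a minimal sketch in Python. The field names mirror the message structure described above (sender, receiver, intent, payload, timestamp); the &lt;code>Message&lt;/code> class itself is hypothetical, not taken from any particular framework.&lt;/p>

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class Message:
    # Envelope fields carry the routing and type information machines need;
    # the payload carries the semantic (often natural-language) content.
    sender: str
    receiver: str
    intent: str      # e.g. "request", "result", "status", "feedback"
    payload: dict
    timestamp: float = field(default_factory=time.time)

    def to_json(self):
        # Deterministic serialization for a bus, queue, or shared store.
        return json.dumps(asdict(self))

msg = Message(
    sender="planner",
    receiver="writer",
    intent="request",
    payload={"task": "draft the Q2 summary"},
)
print(msg.to_json())
```

&lt;p>Because the envelope serializes deterministically, a bus or queue can route on &lt;code>receiver&lt;/code> and dispatch on &lt;code>intent&lt;/code> without ever parsing the natural-language payload.&lt;/p>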
&lt;h3 id="coordination-patterns">Coordination Patterns
&lt;/h3>&lt;p>The coordination patterns I kept seeing across the literature include request–response, broadcast, task announcement and bidding, and peer-to-peer collaboration where agents refine each other&amp;rsquo;s outputs. The coordination role is explicit: either a planner agent delegates to workers, or agents operate in a fully collaborative graph where outputs flow through defined contracts.&lt;/p>
&lt;p>What I found particularly useful to think about is how the choice of coordination pattern has direct architectural consequences. A centralized planner is simpler to reason about and debug, but creates a single point of failure. A fully distributed collaboration graph is more resilient but harder to monitor and control. Most production systems seem to land somewhere in between — a planner that delegates to autonomous agents, with guardrails and fallback logic at the orchestration layer.&lt;/p>
&lt;hr>
&lt;h2 id="a-minimal-in-process-pattern">A Minimal In-Process Pattern
&lt;/h2>&lt;p>To make this concrete for myself, I put together a minimal example. The cleanest starting point I could find for understanding agent-to-agent communication requires only three components: a shared state object, two agent functions, and a lightweight orchestrator that sequences them.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">from&lt;/span> dataclasses &lt;span style="color:#f92672">import&lt;/span> dataclass, field
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">from&lt;/span> typing &lt;span style="color:#f92672">import&lt;/span> List, Dict
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">@dataclass&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">State&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> user_goal: str
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> messages: List[Dict[str, str]] &lt;span style="color:#f92672">=&lt;/span> field(default_factory&lt;span style="color:#f92672">=&lt;/span>list)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> draft: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> review: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">writer_agent&lt;/span>(state: State) &lt;span style="color:#f92672">-&amp;gt;&lt;/span> &lt;span style="color:#66d9ef">None&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state&lt;span style="color:#f92672">.&lt;/span>draft &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">f&lt;/span>&lt;span style="color:#e6db74">&amp;#34;Draft for goal: &lt;/span>&lt;span style="color:#e6db74">{&lt;/span>state&lt;span style="color:#f92672">.&lt;/span>user_goal&lt;span style="color:#e6db74">}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state&lt;span style="color:#f92672">.&lt;/span>messages&lt;span style="color:#f92672">.&lt;/span>append({
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;from&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;writer&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;to&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;reviewer&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;draft&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;content&amp;#34;&lt;/span>: state&lt;span style="color:#f92672">.&lt;/span>draft,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> })
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">reviewer_agent&lt;/span>(state: State) &lt;span style="color:#f92672">-&amp;gt;&lt;/span> &lt;span style="color:#66d9ef">None&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> incoming &lt;span style="color:#f92672">=&lt;/span> state&lt;span style="color:#f92672">.&lt;/span>messages[&lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#ae81ff">1&lt;/span>][&lt;span style="color:#e6db74">&amp;#34;content&amp;#34;&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state&lt;span style="color:#f92672">.&lt;/span>review &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">f&lt;/span>&lt;span style="color:#e6db74">&amp;#34;Reviewed version of: &lt;/span>&lt;span style="color:#e6db74">{&lt;/span>incoming&lt;span style="color:#e6db74">}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state&lt;span style="color:#f92672">.&lt;/span>messages&lt;span style="color:#f92672">.&lt;/span>append({
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;from&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;reviewer&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;to&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;writer&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;review&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;content&amp;#34;&lt;/span>: state&lt;span style="color:#f92672">.&lt;/span>review,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> })
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">run_workflow&lt;/span>(goal: str) &lt;span style="color:#f92672">-&amp;gt;&lt;/span> State:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state &lt;span style="color:#f92672">=&lt;/span> State(user_goal&lt;span style="color:#f92672">=&lt;/span>goal)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> writer_agent(state)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> reviewer_agent(state)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">return&lt;/span> state
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>state &lt;span style="color:#f92672">=&lt;/span> run_workflow(&lt;span style="color:#e6db74">&amp;#34;Create a short API integration summary&amp;#34;&lt;/span>)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>print(state&lt;span style="color:#f92672">.&lt;/span>messages)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>print(state&lt;span style="color:#f92672">.&lt;/span>review)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>writer_agent()&lt;/code> produces a draft and appends a typed message targeted at the reviewer. &lt;code>reviewer_agent()&lt;/code> reads that message and writes its response back into the same structure. Both agents live in the same process, yet the message list enforces a clean protocol boundary — which is exactly what makes the design debuggable and extensible.&lt;/p>
&lt;h3 id="why-this-pattern-scales">Why This Pattern Scales
&lt;/h3>&lt;p>What I like about this design is that the agents are loosely coupled: they do not invoke each other&amp;rsquo;s business logic directly; they communicate through state and message contracts. That separation makes it straightforward to insert a supervisor, add retries, inject validation, or introduce checkpointing without rewriting each agent&amp;rsquo;s core responsibility.&lt;/p>
&lt;p>When I later looked at LangGraph, I found this same idea formalized as graph nodes that receive state and return a &lt;code>Command&lt;/code> specifying which node runs next and what state updates to apply. The plain Python example above maps directly to &lt;code>START → writer → reviewer → END&lt;/code>, with shared state as the communication channel. Building the minimal version first helped me understand what the framework is actually abstracting.&lt;/p>
&lt;hr>
&lt;h2 id="the-super-agent-as-orchestrator">The Super Agent as Orchestrator
&lt;/h2>&lt;p>One pattern came up consistently across everything I read: in production multi-agent systems, the super agent is the &lt;strong>orchestrator&lt;/strong> — not another worker. This distinction matters more than it might sound.&lt;/p>
&lt;p>The orchestrator does not perform domain work. It decomposes the user goal and assigns sub-tasks to specialist agents. It tracks workflow state, evaluates intermediate results, and decides on next steps, retries, or fallbacks. It enforces policies, cost boundaries, and safety checks at a single control point. Every specialist agent has a scoped responsibility; the orchestrator has workflow-level visibility.&lt;/p>
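&lt;p>A minimal sketch of that control loop, with hypothetical stub agents and a trivial gate function standing in for real policy and quality checks:&lt;/p>

```python
from typing import Callable, Dict, List

# Illustrative orchestrator loop: assign each task to a specialist agent,
# gate the result, retry on failure, and fall back after exhausting retries.
# The agent and gate signatures are my own stubs, not any framework's API.
def orchestrate(
    tasks: List[str],
    agents: Dict[str, Callable[[str], str]],
    gate: Callable[[str, str], bool],
    max_retries: int = 2,
) -> Dict[str, str]:
    results: Dict[str, str] = {}
    for task in tasks:
        agent = agents[task]                  # task assignment
        for _attempt in range(max_retries + 1):
            output = agent(task)
            if gate(task, output):            # policy / quality gate
                results[task] = output
                break
        else:
            results[task] = "ESCALATED"       # fallback after retries
    return results

# Usage: two specialist stubs and a gate that accepts any non-empty output.
agents = {
    "requirements": lambda t: f"spec for {t}",
    "test": lambda t: f"tests for {t}",
}
result = orchestrate(["requirements", "test"], agents, lambda t, o: bool(o))
```

&lt;p>Note that the specialists never see each other: retries, gates, and fallbacks all live in the loop, which is what keeps the control point single.&lt;/p>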
&lt;p>I sketched out two diagrams to think through how this works in practice. The first illustrates a software delivery context: a single Super Agent at the top of the hierarchy delegates to five specialized agents — Requirements, Coder, Refactor, Test, and Documentation — each with a clearly scoped responsibility and no direct coupling to the others.&lt;/p>
&lt;p style="text-align: center;">
&lt;img src="https://corebaseit.com/diagrams/SuperAgentCodeSoftware.png" alt="Super Agent orchestrating a software delivery pipeline — each specialist agent owns a single stage while the orchestrator owns the sequence, gates, and handoffs" style="max-width: 700px; width: 100%;" />
&lt;/p>
&lt;p>The second diagram scales the same pattern to a broader engineering context. Here the orchestrator coordinates six agents covering the full stack — Requirements, Architecture, Frontend, Backend, Test, and Security — and what I noticed is that the hierarchy holds regardless of how many specialists you introduce.&lt;/p>
&lt;p style="text-align: center;">
&lt;img src="https://corebaseit.com/diagrams/superAI.png" alt="Super Agent orchestrating a full-stack engineering workflow — adding specialist agents does not change the orchestration contract, only the assignment table grows" style="max-width: 700px; width: 100%;" />
&lt;/p>
&lt;p>What stays constant across both diagrams — and what I think is the key insight — is that the orchestrator is the only node with full workflow visibility. Specialist agents receive scoped inputs and produce scoped outputs. They do not need to know what the other agents are doing. That coordination burden belongs entirely to the super agent.&lt;/p>
&lt;p>The practical three-layer production pattern that I kept seeing emerge:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Layer&lt;/th>
&lt;th>Role&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Orchestrator / super agent&lt;/strong>&lt;/td>
&lt;td>Owns the workflow graph, task assignment, and gate logic&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Shared context store&lt;/strong>&lt;/td>
&lt;td>Versioned state or artifacts (DB, files, or structured in-memory state) — the single source of truth&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Specialist agents&lt;/strong>&lt;/td>
&lt;td>Read from the store, produce outputs into it, never assume hidden state&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>This layering felt immediately familiar to me. It mirrors how well-designed distributed systems have always worked: a coordinator with global visibility, workers with local scope, and a shared data layer that keeps everyone honest.&lt;/p>
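&lt;p>The middle layer is the one I find easiest to underestimate, so here is a minimal sketch of a versioned context store. The API is my own illustration, not taken from any particular framework:&lt;/p>

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# Minimal sketch of the shared-context-store layer: every write is appended
# to a versioned log, so the orchestrator can later inspect exactly which
# agent wrote what, and in which order.
@dataclass
class ContextStore:
    _log: List[Tuple[int, str, str, Any]] = field(default_factory=list)
    _state: Dict[str, Any] = field(default_factory=dict)

    def write(self, agent: str, key: str, value: Any) -> int:
        version = len(self._log) + 1
        self._log.append((version, agent, key, value))
        self._state[key] = value      # single source of truth
        return version

    def read(self, key: str) -> Any:
        return self._state.get(key)

    def history(self, key: str) -> List[Tuple[int, str, Any]]:
        # One place to debug the whole workflow for a given artifact.
        return [(v, a, val) for v, a, k, val in self._log if k == key]
```

&lt;p>Specialist agents only ever call &lt;code>read()&lt;/code> and &lt;code>write()&lt;/code>; the append-only log is what gives the orchestrator its single pane of glass.&lt;/p>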
&lt;hr>
&lt;h2 id="single-source-of-truth-non-negotiable">Single Source of Truth: Non-Negotiable
&lt;/h2>&lt;p>One thing that stood out across nearly every resource I read: multi-agent systems fail when each agent builds its own version of reality. The mature architectures all anchor the entire system to a &lt;strong>single source of truth&lt;/strong> — whether that is a shared in-process state object, a central database, or a versioned artifact store.&lt;/p>
&lt;p>The benefits are concrete, and they&amp;rsquo;re the same benefits I&amp;rsquo;ve seen in any well-designed distributed system:&lt;/p>
&lt;p>&lt;strong>Consistency.&lt;/strong> No diverging world-views across agents running in parallel. When the coder agent writes a function and the test agent writes assertions against it, both are working from the same artifact — not from separate memories of what the specification said.&lt;/p>
&lt;p>&lt;strong>Debuggability.&lt;/strong> One place to inspect current state across the entire workflow. When something goes wrong — and in multi-agent systems, something always goes wrong — you need a single pane of glass to understand what each agent saw, what it produced, and where the chain broke.&lt;/p>
&lt;p>&lt;strong>Clean handoffs.&lt;/strong> Agents know exactly which fields or artifacts they are responsible for updating. They do not invent state. They do not carry assumptions from a previous run. They read, process, and write — through the central store.&lt;/p>
&lt;p>Agents may maintain local working memory or intermediate caches for their own reasoning steps, but they must reconcile through the central truth store before producing outputs that other agents depend on. This is the difference between a system that works reliably and one that works until the agents&amp;rsquo; internal models diverge — which, without a single source of truth, they eventually will.&lt;/p>
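&lt;p>The reconcile rule can be sketched in a few lines. The function and key names here are hypothetical; the point is only that the published output derives from a fresh read of the store, never from the local cache alone:&lt;/p>

```python
from typing import Any, Dict

# Illustrative reconcile-before-publish: an agent may cache intermediate
# reasoning locally, but it re-reads the truth store before producing any
# output that other agents depend on, and writes back through the store.
def publish(agent_cache: Dict[str, Any], store: Dict[str, Any], key: str) -> str:
    spec = store[key]                      # reconcile: re-read the truth store
    draft = agent_cache.get("draft", "")   # local working memory is advisory
    output = f"{draft} (against spec: {spec})"
    store[f"{key}.output"] = output        # publish back through the store
    return output

store = {"spec": "v2"}
cache = {"draft": "implementation notes", "spec": "v1"}  # stale local copy
out = publish(cache, store, "spec")
# The output binds to the store's v2, not the cache's stale v1.
```
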
&lt;hr>
&lt;h2 id="the-bigger-picture">The Bigger Picture
&lt;/h2>&lt;p>After going through all of this, my takeaway is that the super agent concept is not hype — if you ground it in architecture. The key properties are clear: a goal-decomposing orchestrator, loosely coupled specialist agents, structured inter-agent communication, and a single authoritative state store. The Python pattern in this post is deliberately minimal — I wanted to see the essential reasoning surface before layering on a framework.&lt;/p>
&lt;p>If you are building toward a LangGraph or similar implementation, the concepts translate directly: nodes map to agents, edges map to message contracts, and the graph state is your single source of truth. The abstraction is different. The architecture is the same.&lt;/p>
&lt;p>The broader realization I came away with is that the hard problem in agentic AI is not making individual agents smarter. It is making multiple agents coordinate reliably — which is, fundamentally, a systems engineering problem. The same principles that make distributed systems work — clear contracts, shared state, scoped responsibility, centralized coordination — are exactly the principles that make multi-agent systems work.&lt;/p>
&lt;p>The models handle the reasoning. The architecture handles the reliability.&lt;/p>
&lt;p>But centralized orchestration is not the only way to coordinate agents. In &lt;a class="link" href="https://corebaseit.com/posts_in_review/swarm-intelligence-opposite-architectural-bet/" >Part II&lt;/a>, I explore the opposite architectural bet — &lt;strong>swarm intelligence&lt;/strong> — where there is no orchestrator, no global plan, and global competence emerges from local interactions. Understanding when each pattern wins is what makes the difference between a good multi-agent design and an overengineered one.&lt;/p>
&lt;hr>
&lt;h2 id="references">References
&lt;/h2>&lt;ul>
&lt;li>Attention.com. &amp;ldquo;Introducing Super Agent: Your AI Teammate for Revenue Execution.&amp;rdquo; 2025.&lt;/li>
&lt;li>IBM Think. &amp;ldquo;What is a Multi-Agent System.&amp;rdquo; &lt;a class="link" href="https://www.ibm.com/think/topics/multiagent-system" target="_blank" rel="noopener"
>ibm.com&lt;/a>&lt;/li>
&lt;li>AWS Prescriptive Guidance. &amp;ldquo;Agentic AI: Multi-Agent Collaboration Patterns.&amp;rdquo; &lt;a class="link" href="https://docs.aws.amazon.com/prescriptive-guidance/latest/agentic-ai-multi-agent-collaboration-patterns/introduction.html" target="_blank" rel="noopener"
>docs.aws.amazon.com&lt;/a>&lt;/li>
&lt;li>GeeksforGeeks. &amp;ldquo;Multi-Agent System in AI.&amp;rdquo; &lt;a class="link" href="https://www.geeksforgeeks.org/multi-agent-system-in-ai/" target="_blank" rel="noopener"
>geeksforgeeks.org&lt;/a>&lt;/li>
&lt;li>SmythOS. &amp;ldquo;Agent Communication in Multi-Agent Systems.&amp;rdquo; &lt;a class="link" href="https://smythos.com/ai-agents/multi-agent-systems/agent-communication/" target="_blank" rel="noopener"
>smythos.com&lt;/a>&lt;/li>
&lt;li>ApXML. &amp;ldquo;Communication Protocols for LLM Agents.&amp;rdquo; 2025.&lt;/li>
&lt;li>DigitalOcean. &amp;ldquo;Agent Communication Protocols Explained.&amp;rdquo; &lt;a class="link" href="https://www.digitalocean.com/resources/articles/agent-communication" target="_blank" rel="noopener"
>digitalocean.com&lt;/a>&lt;/li>
&lt;li>LangChain. &amp;ldquo;LangGraph Multi-Agent Systems Overview.&amp;rdquo; &lt;a class="link" href="https://langchain-ai.github.io/langgraph/concepts/multi_agent/" target="_blank" rel="noopener"
>langchain-ai.github.io&lt;/a>&lt;/li>
&lt;li>LangChain. &amp;ldquo;Multi-Agent Collaboration Tutorial.&amp;rdquo; &lt;a class="link" href="https://langchain-ai.github.io/langgraph/tutorials/multi_agent/multi-agent-collaboration/" target="_blank" rel="noopener"
>langchain-ai.github.io&lt;/a>&lt;/li>
&lt;li>VentureBeat. &amp;ldquo;How Architectural Design Drives Reliable Multi-Agent Orchestration.&amp;rdquo; 2025.&lt;/li>
&lt;li>IBM Community. &amp;ldquo;Agentic Multi-Cloud Infrastructure Orchestration.&amp;rdquo; 2025.&lt;/li>
&lt;li>Latenode Community. &amp;ldquo;How Separate Agents Share a Single Memory.&amp;rdquo; 2025.&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts_in_review/swarm-intelligence-opposite-architectural-bet/" >Part II: Swarm Intelligence — The Opposite Architectural Bet&lt;/a> — decentralized coordination, emergent intelligence, and when to choose swarm over orchestrator&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts/ai-sycophancy/" >AI Sycophancy&lt;/a> — why confident-looking AI output still requires verification, even from autonomous agents&lt;/li>
&lt;li>&lt;a class="link" href="https://corebaseit.com/posts/reasoning-models-deep-reasoning-llms/" >Reasoning Models and Deep Reasoning in LLMs&lt;/a> — the reasoning strategies that power individual agents&lt;/li>
&lt;li>&lt;em>The Obsolescence Paradox: Why the Best Engineers Will Thrive in the AI Era&lt;/em> — engineering judgment in the age of autonomous AI systems&lt;/li>
&lt;/ul></description></item></channel></rss>