Self-organizing memory systems: how ART, SOM, and GNG adapt without starting over
By Edward Izgorodin
Self-organizing memory systems all face the same hard tradeoff. If they adapt too fast, new input can distort what they already store. If they adapt too slowly, they stop tracking the world.
Stephen Grossberg framed this as the stability-plasticity dilemma: a learning system must remain plastic enough to absorb new patterns, yet stable enough to preserve older ones (Grossberg, 1987). That tension still matters for AI agents that keep memory across long-lived sessions, not just one training run.
Three classic self-organizing families approach the problem in different ways.
TL;DR
- ART learns categories online and protects older categories with a vigilance test; it updates memory only when input and stored pattern resonate (Carpenter & Grossberg, 1987).
- SOM maps high-dimensional input onto a fixed low-dimensional grid; nearby inputs land near each other because the winner and its neighbors update together (Kohonen, 1982; Kohonen, 1990).
- GNG does not keep a fixed map. It grows its own graph with competitive Hebbian learning, edge aging, and node insertion near high-error regions (Fritzke, 1995).
- GNG-U adds utility-based pruning, which removes low-utility nodes so the network can follow non-stationary input distributions over time (Fritzke, 1997).
The useful comparison is not which one is "best." It is what each one treats as adaptive. ART adapts categories. SOM adapts prototypes on a fixed grid. GNG adapts the graph itself.
Adaptive Resonance Theory (ART)
Adaptive Resonance Theory (ART) is a family of neural models that learns new categories online while protecting older categories through a vigilance test that controls when learning is allowed.
Gail Carpenter and Stephen Grossberg introduced ART as a way to learn from a stream of inputs without collapsing old categories into new ones (Carpenter & Grossberg, 1987). Grossberg's broader framing in the same period was explicit: stable learning needs a mechanism that avoids catastrophic interference while still admitting novelty (Grossberg, 1987).
The central ART mechanism is vigilance. Vigilance is a threshold on how closely an input must match an existing category. If vigilance is high, the network forms more, narrower categories. If vigilance is low, it accepts broader matches and keeps fewer categories (Carpenter & Grossberg, 1987).
That threshold matters because ART does not learn by default. The architecture runs a closed loop between bottom-up input and a top-down expectation: a winning category projects a stored prototype back toward the input, and the two are compared. Writing the input as I, the prototype as w, and the vigilance threshold as ρ, the match test has the form |I ∧ w| / |I| ≥ ρ. If the test passes, the system reaches resonance and updates that category. If it fails, the winning node is reset and the system searches for another category. If none qualifies, it commits a new category instead (Carpenter & Grossberg, 1987).
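The search cycle described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original architecture: it borrows the fuzzy AND (element-wise minimum), the choice function, and the learning rule from Fuzzy ART (Carpenter, Grossberg & Rosen, 1991), omits complement coding, and assumes inputs are non-negative with at least one non-zero component. The names `art_step`, `beta`, and `alpha` are hypothetical.

```python
def fuzzy_and(a, b):
    """Element-wise minimum; on binary vectors this is the logical AND."""
    return [min(x, y) for x, y in zip(a, b)]


def art_step(categories, I, rho, beta=1.0, alpha=1e-3):
    """Present one input I and return the index of the category that learned it.

    categories: list of prototype vectors w, modified in place.
    rho: vigilance threshold in [0, 1].
    beta, alpha: learning rate and choice parameter (assumed Fuzzy ART style).
    """
    # Bottom-up competition: rank categories by |I ^ w| / (alpha + |w|).
    def choice(j):
        w = categories[j]
        return sum(fuzzy_and(I, w)) / (alpha + sum(w))

    for j in sorted(range(len(categories)), key=choice, reverse=True):
        w = categories[j]
        overlap = fuzzy_and(I, w)
        # Top-down match test: |I ^ w| / |I| >= rho.
        if sum(overlap) / sum(I) >= rho:
            # Resonance: only now is the stored prototype allowed to change.
            categories[j] = [beta * o + (1 - beta) * wi for o, wi in zip(overlap, w)]
            return j
        # Otherwise this category is reset and the search moves to the next one.

    # No category passed the test: commit a new one that copies the input.
    categories.append(list(I))
    return len(categories) - 1


inputs = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
for rho in (0.6, 0.9):
    categories = []
    labels = [art_step(categories, x, rho) for x in inputs]
    print(rho, labels, len(categories))
```

With these three inputs, a vigilance of 0.6 yields two categories and a vigilance of 0.9 yields three: raising the threshold makes the second input fail the match test against the first category, so it gets a category of its own.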
This is the key design choice. ART is match-based rather than purely error-based. It does not force a poor fit into an existing category just to reduce immediate error. By allocating a new category for a dissimilar input instead of forcing an existing one to accommodate it, ART resists catastrophic forgetting (Carpenter & Grossberg, 1987).
For memory engineers, that makes ART notable for two reasons.
First, it is built for online learning. Inputs arrive one at a time. The system does not depend on a separate offline phase to rebuild memory from scratch (Carpenter & Grossberg, 1987).
Second, category granularity is explicit. The vigilance parameter is a direct control over how coarse or fine the memory index becomes. That is a deliberate answer to the stability-plasticity problem, not an accidental side effect.
The family then branches by input type and task:
- ART-1 works with binary input patterns (Carpenter & Grossberg, 1987).
- ART-2 extends the approach to continuous or analog inputs (Carpenter & Grossberg, 1987).
- Fuzzy ART uses fuzzy-set operations to handle analog patterns (Carpenter, Grossberg & Rosen, 1991).
- ARTMAP is a supervised extension that maps input categories to output labels (Carpenter, Grossberg & Reynolds, 1991).
The shortest useful summary: ART changes stored structure only after a good enough match.
Self-Organizing Maps (SOM)
A Self-Organizing Map (SOM) is a neural network that projects high-dimensional input onto a low-dimensional grid while preserving topology, so nearby inputs tend to map to nearby nodes.
Teuvo Kohonen introduced SOM as a way to form topologically ordered feature maps from input data (Kohonen, 1982). The later synthesis is consistent across his work: SOM combines competitive learning with neighborhood-based adaptation to preserve local structure during mapping (Kohonen, 1990; Kohonen, 2001).
The mechanics are straightforward.
Each node in the map holds a weight vector. When an input arrives, nodes compete to find the closest one. The winner is the best-matching unit, or BMU. For an input x, the BMU is the node c whose weight vector w_j is closest to the input: c = argmin_j ||x − w_j||, the node minimizing the Euclidean distance over all nodes j (Kohonen, 1990; Kohonen, 2001).
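The BMU search is a plain nearest-neighbor lookup over the node weights. A minimal sketch using only the standard library, where `find_bmu` is a hypothetical helper name and ties are broken in favor of the lowest index:

```python
import math


def find_bmu(weights, x):
    """Return the index c = argmin_j ||x - w_j|| over all nodes j."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))


# Four nodes with 2-D weight vectors; the input lies closest to node 1.
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(find_bmu(weights, [0.9, 0.2]))  # prints 1
```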
SOM then updates more than the winner alone. It also updates nodes near the BMU on the grid. This is the neighborhood function. Early in training the neighborhood is broad. Over time it typically shrinks, along with the learning rate (Kohonen, 1990; Kohonen, 2001).
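One training step can be sketched as follows. The Gaussian neighborhood kernel and the exponential decay schedules are common choices but are assumptions here, not something the cited papers mandate; `som_update`, `lr0`, `sigma0`, and `decay` are hypothetical names.

```python
import math


def som_update(weights, grid, x, t, n_steps, lr0=0.5, sigma0=2.0, decay=3.0):
    """Apply one SOM update for input x at step t and return the BMU index.

    weights: list of weight vectors, modified in place.
    grid: list of grid coordinates, one per node (fixed for the whole run).
    """
    # Learning rate and neighborhood width both shrink as training proceeds.
    frac = t / n_steps
    lr = lr0 * math.exp(-decay * frac)
    sigma = sigma0 * math.exp(-decay * frac)

    # Competition: find the best-matching unit in weight space.
    c = min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))

    # Cooperation: every node moves toward x, scaled by its grid distance to the BMU.
    for j, w in enumerate(weights):
        d2 = math.dist(grid[j], grid[c]) ** 2
        h = math.exp(-d2 / (2 * sigma ** 2))
        weights[j] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return c


# A three-node chain: the BMU's grid neighbor moves further than the far node.
grid = [(0,), (1,), (2,)]
weights = [[0.0], [0.5], [1.0]]
som_update(weights, grid, [0.0], t=0, n_steps=10)
print(weights)
```

Note that the distance inside the kernel is measured on the grid, not in weight space. That is what ties the learned prototypes to the map layout.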
The neighborhood update is what gives SOM its topological behavior, and shrinking it over time lets the map settle from coarse global ordering into fine local detail. Similar inputs pull nearby grid locations in related directions, so neighborhood relations in the input space are reflected on the map. Kohonen described this as the formation of topologically correct feature maps (Kohonen, 1982).
Another way to say it is that SOM performs vector quantization with structure. Each node becomes a prototype for some region of the input space, but those prototypes are arranged on a visible grid rather than as an unstructured list (Kohonen, 1990; Kohonen, 2001).
That has two practical implications.
One is interpretability. A low-dimensional grid can be inspected as a map of how inputs cluster and relate, which made SOM a common tool for dimensionality reduction and visualization (Kohonen, 1990; Kohonen, 2001).
The other is an architectural limit that matters for the comparison with GNG. In a classical SOM, the grid topology is fixed in advance. The nodes move in weight space, but the map does not invent new adjacency relations beyond the grid chosen at the start (Kohonen, 1990).
That fixed topology is the sharp contrast. SOM adapts prototypes, not the underlying graph. Variants such as Growing SOM and hierarchical SOM relax this in part, but standard SOM starts from a predefined map geometry, and learning mainly changes weights within it (Kohonen, 1990; Kohonen, 2001).
Growing Neural Gas (GNG)
Growing Neural Gas (GNG) is a self-organizing network that learns the topology of an input distribution by incrementally adding nodes and edges, rather than fitting inputs onto a fixed grid.
Bernd Fritzke introduced GNG as a network that learns topologies by growth (Fritzke, 1995). The design changes the key assumption SOM makes. Instead of choosing the map structure first, GNG lets the structure emerge during learning, starting from a minimal two-node graph.
The process still starts with competition. For each input, the network finds the two nearest nodes. The system creates or refreshes an edge between them. Fritzke describes this as competitive Hebbian learning: co-activated units become linked in the graph (Fritzke, 1995).
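The linking step can be sketched with two dictionaries. The data layout is an assumption made for illustration: `nodes` maps a node id to its position, and `edges` maps an unordered pair of ids (a `frozenset`) to an age. The helper name `connect_two_nearest` is hypothetical.

```python
import math


def connect_two_nearest(nodes, edges, x):
    """Find the two nodes nearest to x and create or refresh the edge between them.

    Returns (s1, s2): the nearest and second-nearest node ids.
    """
    ranked = sorted(nodes, key=lambda n: math.dist(x, nodes[n]))
    s1, s2 = ranked[0], ranked[1]
    # A new edge starts at age 0; an existing edge is refreshed back to age 0.
    edges[frozenset((s1, s2))] = 0
    return s1, s2


nodes = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [5.0, 5.0]}
edges = {}
print(connect_two_nearest(nodes, edges, [0.4, 0.0]))  # prints (0, 1)
print(len(edges))  # prints 1
```

Using an unordered pair as the key means that an input for which node 1 wins and node 0 comes second refreshes the same edge rather than creating a duplicate.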
That graph does not stay static.
Edges carry an age. When an edge is refreshed by new co-activation, its age resets to zero. When it is not refreshed, its age increases, and edges past a maximum age are removed (Fritzke, 1995). This gives the network a direct mechanism for dropping stale local relationships.
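A sketch of the aging step, using the same assumed layout as before (`nodes` maps ids to positions, `edges` maps a `frozenset` pair to an age). It follows the usual reading of Fritzke (1995), in which only the edges touching the current winner are aged on each input and nodes left without any edge are dropped. The function name and `max_age` parameter are hypothetical.

```python
def age_and_prune_edges(nodes, edges, s1, s2, max_age):
    """Age the winner's edges, refresh the s1-s2 edge, and drop stale structure."""
    # Every edge touching the winner s1 gets one step older.
    for e in list(edges):
        if s1 in e:
            edges[e] += 1
    # Co-activation of s1 and s2 resets (or creates) their edge at age 0.
    edges[frozenset((s1, s2))] = 0
    # Edges past the maximum age are removed.
    for e in [e for e, age in edges.items() if age > max_age]:
        del edges[e]
    # Nodes that no longer have any edge are removed as well.
    connected = set().union(*edges) if edges else set()
    for n in [n for n in nodes if n not in connected]:
        del nodes[n]


nodes = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [5.0, 5.0]}
edges = {frozenset((0, 1)): 0, frozenset((0, 2)): 2}
age_and_prune_edges(nodes, edges, s1=0, s2=1, max_age=2)
print(sorted(nodes))  # prints [0, 1]
```

In the example, the edge between nodes 0 and 2 ages from 2 to 3, exceeds `max_age`, and is removed, which leaves node 2 isolated and therefore removed too.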
GNG also tracks accumulated error at nodes. Periodically it inserts a new node near the region with the highest accumulated error (Fritzke, 1995). In effect, the network grows where its current representation is poorest.
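The insertion step can be sketched as below. It follows the common description of Fritzke's procedure: place the new node halfway between the highest-error node and its highest-error neighbor, rewire the edge through it, and scale down the error of both endpoints. The layout (`nodes`, `edges`, and an `error` dictionary keyed by integer node id), the function name, and the `alpha` parameter name are assumptions, and the sketch assumes the highest-error node has at least one neighbor.

```python
def insert_node(nodes, edges, error, alpha=0.5):
    """Insert a node between the worst node q and its worst neighbor f; return its id."""
    # q: the node with the largest accumulated error.
    q = max(error, key=error.get)
    # f: the neighbor of q with the largest accumulated error.
    neighbors = [n for e in edges if q in e for n in e if n != q]
    f = max(neighbors, key=error.get)

    # The new node r sits at the midpoint between q and f.
    r = max(nodes) + 1
    nodes[r] = [(a + b) / 2 for a, b in zip(nodes[q], nodes[f])]

    # Replace the edge q-f with the two edges q-r and r-f.
    del edges[frozenset((q, f))]
    edges[frozenset((q, r))] = 0
    edges[frozenset((r, f))] = 0

    # Scale down the error of q and f, then initialise r from q's new error.
    error[q] *= alpha
    error[f] *= alpha
    error[r] = error[q]
    return r


nodes = {0: [0.0, 0.0], 1: [2.0, 0.0], 2: [0.0, 2.0]}
edges = {frozenset((0, 1)): 0, frozenset((0, 2)): 0}
error = {0: 4.0, 1: 2.0, 2: 1.0}
r = insert_node(nodes, edges, error)
print(r, nodes[r])  # prints 3 [1.0, 0.0]
```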
Those two mechanisms work together:
- edge aging removes stale structure
- node insertion adds capacity where the fit is weak
This is why GNG is a closer match than SOM to problems where the underlying structure is not known in advance. It can expand and reshape its graph as the data distribution reveals itself.
GNG-U and utility-based pruning
Fritzke later extended this idea with GNG-U, a version of GNG that adds a utility measure for each node (Fritzke, 1997).
The motivation is concrete. Classical GNG inserts nodes near high-error regions, so it keeps adding nodes wherever data is currently active. But when a region of the input space goes quiet, the nodes already there no longer accumulate error, so they stay in place and hold capacity in an area the data has left behind.
GNG-U addresses that directly. Utility quantifies how much a node contributes to reducing overall error. Nodes whose utility falls below a threshold can be removed entirely, even if they still possess edges (Fritzke, 1997).
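One way to sketch this is shown below. It reflects a common reading of GNG-U rather than a verified transcription of Fritzke (1997): the winner's utility grows by how much worse the error would have been had the second-nearest node answered instead, and the lowest-utility node is removed when the largest accumulated error exceeds `k` times that utility. The function names, the threshold parameter `k`, and the dictionary layout are assumptions, and the sketch leaves out the decay that would normally be applied to utility over time.

```python
def update_utility(utility, s1, dist_s1, dist_s2):
    """Credit the winner s1 with the error it saved compared to the runner-up."""
    # dist_s2 >= dist_s1, so the increment is never negative.
    utility[s1] += dist_s2 ** 2 - dist_s1 ** 2


def prune_lowest_utility(nodes, edges, error, utility, k):
    """Remove the least useful node if it is cheap relative to the worst error.

    Returns the removed node id, or None if nothing was removed.
    """
    if len(nodes) <= 2:
        return None  # keep at least the minimal two-node graph
    i = min(utility, key=utility.get)
    # Equivalent to max_error / utility[i] > k, without dividing by zero.
    if max(error.values()) <= k * utility[i]:
        return None
    # Remove the node even though it may still have edges, and drop those edges.
    for e in [e for e in edges if i in e]:
        del edges[e]
    del nodes[i], error[i], utility[i]
    return i


utility = {0: 0.0}
update_utility(utility, 0, dist_s1=1.0, dist_s2=2.0)
print(utility[0])  # prints 3.0
```

Comparing utility against the current maximum error, rather than against a fixed constant, makes the threshold relative: a node is retired only when the capacity it holds would plausibly do more good somewhere else.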
That is the utility-based pruning mechanism.
The consequence matters for changing environments. Fritzke presented GNG-U as a network that can follow non-stationary distributions (Fritzke, 1997). When the data shifts away from an old region, low-utility nodes in that region can be pruned, and capacity can move toward regions that now matter.
For AI agent memory, utility-based pruning addresses a practical requirement: a long-lived system must reclaim space occupied by structures that no longer contribute. Durable memory should not only add structure. It should also retire structure that stops earning its keep.
Stability vs plasticity: the common thread
These architectures look different on the surface, but they share a small set of design ideas.
The first is winner-take-all competition. Each system identifies a best current match to the input, whether that is a category in ART, a BMU in SOM, or the nearest nodes in GNG (Carpenter & Grossberg, 1987; Kohonen, 1990; Fritzke, 1995).
The second is Hebbian-style adaptation. Co-activation changes what is stored. In GNG, that is explicit in edge creation and refresh. In SOM, the winner and its neighbors move toward the input. In ART, matching input updates the resonant category (Carpenter & Grossberg, 1987; Kohonen, 1990; Fritzke, 1995).
The third is an explicit answer to the stability-plasticity dilemma. The table below summarizes how each family treats structure and what holds it stable.
| Dimension | ART | SOM | GNG / GNG-U |
|---|---|---|---|
| What adapts | Categories, allocated one at a time (Carpenter & Grossberg, 1987) | Node weights on a fixed grid (Kohonen, 1990) | The graph itself: nodes and edges (Fritzke, 1995) |
| Stability mechanism | Vigilance test gates when a prototype may change (Carpenter & Grossberg, 1987) | Shrinking neighborhood and decaying learning rate (Kohonen, 1990) | Edge aging, error-driven insertion, and utility-based pruning in GNG-U (Fritzke, 1995; Fritzke, 1997) |
| Structural plasticity | New categories committed on demand; no predefined grid required (Carpenter & Grossberg, 1987) | None in the classical model; grid is fixed in advance (Kohonen, 1990) | Continuous growth and pruning of nodes and edges (Fritzke, 1995; Fritzke, 1997) |
| Input modality | Binary (ART-1), continuous (ART-2, Fuzzy ART) (Carpenter & Grossberg, 1987; Carpenter, Grossberg & Rosen, 1991) | Continuous vectors (Kohonen, 1990) | Continuous vectors over an unconstrained graph (Fritzke, 1995) |
| Non-stationary data | Adds categories without erasing old ones (Carpenter & Grossberg, 1987) | Fixed grid does not retire stale regions on its own (Kohonen, 1990) | GNG-U prunes low-utility nodes to follow shifting distributions (Fritzke, 1997) |
That comparison also clarifies where each one fits.
If the problem is category formation without overwriting older categories, ART is the cleanest statement of the issue.
If the problem is topology-preserving projection and interpretable mapping, SOM is the canonical design.
If the problem is learning structure when the topology itself should adapt over time, GNG is the more direct answer.
For a broader memory-science view, it helps to place these systems next to older associative approaches such as Hopfield associative memory and cluster-level overviews such as the AI memory landscape. They address different mechanisms, but the same larger question: how should memory retain useful structure while still adapting to new experience?
Why this matters for AI agent memory
Agent memory is not static. Useful facts recur. Context drifts. Old associations go cold. New patterns appear that do not fit the current index.
That is why these older self-organizing systems still matter. They offer clear design primitives for adaptive memory:
- competition to find the current best match
- local update rules instead of full retraining
- structure growth where representation is weak
- pruning where structure no longer contributes
- an explicit control over the stability-plasticity tradeoff
Those primitives are more informative than vague claims about retrieval quality. They specify how a memory can change over time.
These ideas inform adaptive agent memory in a modest, practical sense. The Mnemoverse engine does not implement ART, SOM, or GNG directly. But Hebbian association, clustering, and utility-based pruning are the kinds of mechanisms that a long-lived agent memory needs when it must keep useful structure and shed dead structure across sessions.
Common questions
What are self-organizing memory systems?
Self-organizing memory systems are neural architectures that adapt their internal categories, prototypes, or graph structure as new inputs arrive. ART, SOM, and GNG all address the same core problem: how to learn online without overwriting useful older structure.
What is adaptive resonance theory?
Adaptive Resonance Theory is a family of neural models that learns categories online while protecting prior categories through a vigilance test. A category updates only when the input and a stored prototype match well enough to produce resonance; otherwise the system searches again or commits a new category.
What is a self-organizing map?
A self-organizing map is a neural network that projects high-dimensional input onto a low-dimensional grid while preserving topology. The best-matching unit and its grid neighbors move toward each input, so nearby inputs tend to map to nearby nodes.
What is growing neural gas?
Growing Neural Gas is a self-organizing network that learns topology by adding nodes and edges over time instead of using a fixed grid. It uses competitive Hebbian learning, edge aging, and node insertion near high-error regions to fit its graph to the input distribution.
What is utility-based pruning in GNG-U?
Utility-based pruning in GNG-U adds a utility measure to each node and removes nodes whose contribution to reducing error is low. This lets the network reclaim capacity from stale regions and follow non-stationary input distributions.
What is the stability-plasticity dilemma?
The stability-plasticity dilemma is the problem of staying plastic enough to learn new patterns while staying stable enough to retain older ones. Grossberg framed this tension directly, and ART, SOM, and GNG each handle it with a different control mechanism.
Sources
- Carpenter, G. A. & Grossberg, S. (1987). "A massively parallel architecture for a self-organizing neural pattern recognition machine." Computer Vision, Graphics, and Image Processing.
- Grossberg, S. (1987). "Competitive learning: from interactive activation to adaptive resonance." Cognitive Science.
- Carpenter, G. A., Grossberg, S. & Rosen, D. B. (1991). "Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system." Neural Networks.
- Carpenter, G. A., Grossberg, S. & Reynolds, J. H. (1991). "ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network." Neural Networks 4(5).
- Kohonen, T. (1982). "Self-organized formation of topologically correct feature maps." Biological Cybernetics 43.
- Kohonen, T. (1990). "The self-organizing map." Proceedings of the IEEE 78(9).
- Kohonen, T. (2001). Self-Organizing Maps. Springer.
- Fritzke, B. (1995). "A growing neural gas network learns topologies." Advances in Neural Information Processing Systems (NIPS).
- Fritzke, B. (1997). "A self-organizing network that can follow non-stationary distributions." ICANN'97 (Lecture Notes in Computer Science 1327).
Related
- Hopfield associative memory
- Kinds of memory
- Working memory
- Schema formation
- Episodic and semantic memory
- AI memory landscape
Published in the Mnemoverse Library — research and documentation for persistent memory in AI systems.
