Vision

Every token counts.

LLMs do not primarily suffer from a lack of intelligence. They suffer from a lack of usable context. The future belongs to better representations, not larger context windows.

Raw environments are too large, too noisy, too expensive, and too undifferentiated to be the primary substrate for finite-context intelligence.

  • Codebases are too big to read linearly.
  • Documentation is too verbose to paste wholesale.
  • Conversation history is a transcript, not a memory.
  • Context rot degrades performance long before you hit the advertised limit.

Product Thesis

Helioy turns raw environments into high-signal context for finite-context intelligence. Four types of organizational knowledge, each with its own optimal representation: code as topology, documentation as scoped sections, decisions as distilled entries, identity as geometric memory. One unified API curates both reads and writes. Every token earns its place.

The Ecosystem

These are not separate products. They are different answers to the same question: how do you maximize signal per token for each type of context an agent needs?

helix

The unified context API. One interface between your agents and every context source. An LLM-powered proxy curates both reads and writes. Your agent never touches individual backends.

How does context become curated intelligence?

Identity as geometric memory. Quaternion positions on S³, phasor interference across dual manifolds, Kuramoto coupling. Conservation laws enforce finite attention. The owner shapes this. Agents read from it.

Who are we? What do we value?

Decisions as distilled knowledge. Facts, patterns, and trade-offs persisted across agent lifetimes. Hierarchical scopes, BLAKE3 deduplication, priority ordering by entry kind.

What do we know? What have we decided?

Code as indexed structure. Export maps, import graphs, dependency topology, file outlines. 5 structural queries replace 30+ file reads.

What does the codebase structurally contain?

Documentation as scoped sections. Section-aware indexing, hybrid search (BM25 + semantic), 80%+ token compression. Relevant sections under budget, not entire directories.

What do the docs say within a finite budget?

The Three Tenets

01

Context is Finite

Effective capacity is 50-75% of advertised limits. Performance degrades measurably beyond that threshold. Compression is thermodynamically mandated, not optional.

02

Structure Beats Search

Raw text search is lazy. Intelligence requires graphs, hierarchies, and typed references. If a model has to deduce the structure of a codebase, the context engine has failed.

03

Memory Must Compound

A conversation history is a transcript, not a memory. True memory is distilled, synthesized, and structurally updated to make future interactions more efficient.

helix

A unified context API. One interface between your agents and every context source that matters. Both reads and writes pass through an LLM-powered proxy for curation. Tantivy handles candidate retrieval. The proxy shapes responses to token budget.

helix recall <query>     # curated context from all sources
helix save <content>     # distill and store knowledge
helix conflicts          # surface overlaps for resolution

Why the proxy matters

Agents cannot be trusted to compress well. Research shows that coherent prose can degrade performance more than incoherent text. By putting a dedicated LLM in the infrastructure layer, you get consistent compression quality regardless of which agent is reading or writing, deduplication that works across agents, quality control on writes, and curation on reads shaped to the requesting agent's token budget.

It is like staffing the mailroom with professional editors rather than relying on each employee to write clearly.

Agent  -->  helix  -->  Proxy LLM  -->  Adapters
                             |
                        curates both
                        reads and writes
                             |
       +-------------+-------------+-------------+
       |             |             |             |
 attention-     context-      markdown-    frontmatter-
 matters        matters       matters      matters
 (identity)     (decisions)   (docs)       (code)

attention-matters

A geometric memory engine on the S³ hypersphere. Organizational identity lives as positions and movements in curved space rather than as text. The owner shapes this manifold deliberately. Agents read from it. Every query reshapes what gets recalled next.

Query: "quaternion drift"
    |
    v
+-- activate ---- drift ---- interfere ---- couple ---- surface ---- compose --+
|                                                                              |
|  Words activate      Occurrences    Conscious &     Phases         Score,   |
|  on the manifold     SLERP toward   subconscious    synchronize    rank,    |
|  with IDF weights    each other     phasors         via Kuramoto   return   |
+------------------------------------------------------------------------------+

Why the owner curates this alone

attention-matters stores values, not facts. If agents could write to it, they would deposit what they observe frequently: code patterns, common decisions, recurring problems. Over time the manifold would become a reflection of what the system does, not what the organization aspires to.

The owner pulls from external sources: research, market signals, strategic thinking, conversations that no agent in the system can generate. This is the one input that comes from outside the system's own operations. Without it, identity becomes a closed loop.

The conservation laws (total mass M = 1, coupling K_CON + K_SUB = 1) make attention a finite resource with zero-sum allocation. Concepts compete for position in curved space, where proximity determines salience.
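The zero-sum budget can be sketched in a few lines. This is an illustration of the stated conservation laws only; the helper names and data shapes are assumptions, not the real attention-matters internals:

```python
# Illustrative sketch of zero-sum attention: total mass M = 1,
# coupling split K_CON + K_SUB = 1. Names are hypothetical.

def renormalize_mass(masses: dict[str, float]) -> dict[str, float]:
    """Rescale concept masses so total mass M stays exactly 1."""
    total = sum(masses.values())
    return {k: v / total for k, v in masses.items()}

def split_coupling(k_con: float) -> tuple[float, float]:
    """Enforce K_CON + K_SUB = 1: boosting conscious coupling drains subconscious."""
    return k_con, 1.0 - k_con

# Raising one concept's mass necessarily dilutes the others.
masses = renormalize_mass({"quaternion": 0.5, "drift": 0.3, "kuramoto": 0.4})
k_con, k_sub = split_coupling(0.7)
```

Because allocation is zero-sum, salience is always a trade: a concept can only gain attention at the expense of its neighbors.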

Geometric Memory

All constants derive from φ (golden ratio) and π.

Quaternion positions

Each word instance lives on S³. SLERP interpolation along geodesics creates continuous movement.
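SLERP between unit quaternions looks like this; a minimal sketch of the geodesic interpolation described above, not the engine's actual representation:

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions on S^3.

    Follows the geodesic, so every intermediate point stays on the
    hypersphere (unlike linear interpolation, which cuts through it).
    """
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(q0, q1))))
    theta = math.acos(dot)          # angle between the two positions
    if theta < 1e-9:                # nearly identical: nothing to interpolate
        return q0
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

q0 = (1.0, 0.0, 0.0, 0.0)
q1 = (0.0, 1.0, 0.0, 0.0)
mid = slerp(q0, q1, 0.5)            # midpoint of a quarter-turn geodesic
```

The key property for a geometric memory is that drift never leaves the manifold: the interpolated point is itself a valid position.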

IDF-weighted drift

Query activation pulls related occurrences closer. The manifold actively reshapes with every query.

Dual manifolds

Conscious and subconscious manifolds with phasor interference. Not all knowledge should be equally accessible. Some surfaces only when the right query activates the right pattern.

Kuramoto coupling

Phases synchronize dynamically between co-activated neighborhoods. Related concepts form natural clusters that emerge from interaction rather than being programmed.
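A toy Kuramoto step shows how co-activated neighborhoods pull into phase. The parameters here are arbitrary demo values, not the engine's φ/π-derived constants:

```python
import math

def kuramoto_step(phases, omegas, K, dt=0.01):
    """One Euler step of the Kuramoto model:
    dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j - θ_i)."""
    n = len(phases)
    return [
        th + dt * (w + (K / n) * sum(math.sin(other - th) for other in phases))
        for th, w in zip(phases, omegas)
    ]

# Identical natural frequencies and strong coupling: phases cluster.
phases = [0.0, 0.5, 1.0]
for _ in range(2000):
    phases = kuramoto_step(phases, omegas=[1.0, 1.0, 1.0], K=2.0)
spread = max(phases) - min(phases)  # shrinks toward zero as oscillators lock
```

Synchrony is emergent: nothing declares the cluster, it falls out of the coupling term.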

Installation

# CLI
npx -y attention-matters
# MCP server
npx -y attention-matters serve

context-matters

Structured context store. Facts, decisions, patterns, trade-offs, and corrections persisted across agent lifetimes. The operational memory of the system.

Hierarchical Scopes

global > project > repo > session. Visibility flows downward automatically. A project-level decision surfaces when an agent queries at repo scope. The hierarchy is the frame.
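Downward visibility can be sketched as a prefix walk over the scope chain; the resolution function is illustrative, not the real context-matters query path:

```python
# Hedged sketch of hierarchical scope visibility. Scope names come from
# the text; the store shape and function are hypothetical.
SCOPES = ["global", "project", "repo", "session"]  # broad -> narrow

def visible_entries(store: dict[str, list[str]], query_scope: str) -> list[str]:
    """An agent querying at `query_scope` sees its own scope plus every ancestor."""
    cutoff = SCOPES.index(query_scope)
    return [e for s in SCOPES[: cutoff + 1] for e in store.get(s, [])]

store = {
    "project": ["decision: use SQLite"],
    "repo": ["pattern: error enums"],
    "session": ["observation: flaky test"],
}
# A repo-scope query surfaces the project-level decision automatically.
repo_view = visible_entries(store, "repo")
```

Narrower scopes inherit everything above them; nothing leaks downward into a sibling's session.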

BLAKE3 Deduplication

Content-addressed storage prevents duplication across agents. Agent A does not know what Agent B already stored. context-matters does.
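Content addressing makes dedup automatic. A sketch of the idea, with hashlib's BLAKE2b standing in for BLAKE3, which is not in the Python standard library:

```python
import hashlib

# Content-addressed store sketch. The real system uses BLAKE3;
# BLAKE2b is a stand-in here. Class and method names are illustrative.
class ContentStore:
    def __init__(self):
        self._entries: dict[str, str] = {}

    def save(self, content: str) -> str:
        """Store content under its hash; identical content collapses to one entry."""
        key = hashlib.blake2b(content.encode()).hexdigest()
        self._entries.setdefault(key, content)
        return key

store = ContentStore()
a = store.save("prefer rebase over merge")   # Agent A deposits
b = store.save("prefer rebase over merge")   # Agent B, unaware of A
```

Both agents get the same key back, and the store holds exactly one copy: the agents never had to coordinate.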

Typed Entries

Eight entry kinds with priority ordering. Facts, decisions, patterns, preferences, observations, lessons, feedback, assessments. Feedback entries receive highest recall priority.

SQLite + FTS5

Full-text search over structured entries. No external dependencies. Ships as a single binary. Rust.

fmm

Structural intelligence for codebases. A single SQLite database at the project root indexing exports, imports, dependencies, line counts, and file outlines. Structure at O(1).

Without fmm

  • 30+ file reads to orient
  • Structure reconstructed from scattered grep results

With fmm

  • 5 structural queries
  • Directory shape, key files, dependency impact up front

MCP Tools

Tool                  Purpose
fmm_lookup_export     Find which file defines a symbol at O(1)
fmm_read_symbol       Extract exact source following re-export chains
fmm_dependency_graph  Intra-project deps, external packages, and downstream blast radius
fmm_file_outline      Table of contents with line ranges
fmm_list_files        Full project topology in one call

markdown-matters

Structural intelligence for documentation. AST-aware semantic chunking and token-bounded retrieval. Your agent gets relevant sections under budget, not entire directories. TypeScript + Effect.

AST-Aware Chunking

Breaks markdown at logical header boundaries, preserving parent-child semantic relationships rather than arbitrary character limits.
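A minimal header-boundary chunker illustrates the boundary choice. The real chunker is AST-based; this regex sketch only shows that chunks split at headers and carry their full heading path, not at arbitrary character counts:

```python
import re

def chunk_markdown(text: str) -> list[tuple[list[str], str]]:
    """Split at headers; each chunk carries its heading path (parent -> child)."""
    chunks, path, body = [], [], []
    for line in text.splitlines():
        m = re.match(r"^(#+)\s+(.*)$", line)
        if m:
            if body:  # close the chunk accumulated under the previous header
                chunks.append((path.copy(), "\n".join(body).strip()))
                body = []
            level = len(m.group(1))
            path = path[: level - 1] + [m.group(2)]  # keep ancestors, replace tail
        else:
            body.append(line)
    if body:
        chunks.append((path.copy(), "\n".join(body).strip()))
    return chunks

doc = "# API\nIntro.\n## Recall\nHow recall works.\n## Save\nHow save works.\n"
chunks = chunk_markdown(doc)
```

Each chunk stays addressable by its path (e.g. ["API", "Recall"]), which is what makes section-level retrieval possible later.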

Token Bounding

Retrieval strictly respects context window budgets. Sections that exceed the budget are automatically summarized or intelligently truncated.

Hybrid Search

BM25 + semantic search. Maps queries to the most relevant document nodes using fast, local embeddings before any full LLM synthesis.
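A toy BM25 scorer shows the lexical half of the hybrid; the semantic half (embeddings) is omitted, and the k1/b values are the common defaults, assumed rather than taken from markdown-matters:

```python
import math

def bm25_scores(query: list[str], docs: list[list[str]], k1=1.5, b=0.75):
    """Score each tokenized doc against the query with standard BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            tf = doc.count(term)
            df = sum(1 for d in docs if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # rare terms weigh more
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(doc) / avgdl)     # length normalization
            )
        scores.append(score)
    return scores

docs = [
    "token budget compression".split(),
    "quaternion drift manifold".split(),
    "token token budget".split(),
]
scores = bm25_scores("token budget".split(), docs)
```

BM25 gives exact-term precision cheaply; the semantic pass then catches paraphrases BM25 misses, which is why the two are combined.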

MCP Tools

Tool          Purpose
md_search     Semantic search across all indexed documentation nodes
md_context    Retrieve a specific document section by its AST heading path
md_structure  Hierarchical structural outline for fast navigation

Inter-agent messaging. SQLite registry, file-based mailboxes, tmux nudges. Direct, role-based, and broadcast addressing. No central daemon. Coordination through shared state.

Three addressing modes

Direct (agent to agent), role-based (to any agent filling a role), broadcast (to all). The right granularity for the right message.

No central daemon

Each agent spawns its own bus process. Shared filesystem state for coordination. If one agent crashes, the others continue. Graceful degradation by design.
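The daemonless design reduces delivery to filesystem writes. A sketch under assumed conventions (per-agent directories, numbered JSON files); the real bus's on-disk format is not shown here:

```python
import json
import tempfile
from pathlib import Path

# File-based mailbox sketch: each agent owns a directory, delivery is a
# write into the recipient's box. No process needs to outlive the write.
def send(root: Path, to: str, msg: dict) -> None:
    box = root / to
    box.mkdir(parents=True, exist_ok=True)
    seq = len(list(box.glob("*.json")))          # naive ordering for the demo
    (box / f"{seq:06d}.json").write_text(json.dumps(msg))

def drain(root: Path, agent: str) -> list[dict]:
    box = root / agent
    msgs = [json.loads(p.read_text()) for p in sorted(box.glob("*.json"))]
    for p in box.glob("*.json"):
        p.unlink()
    return msgs

root = Path(tempfile.mkdtemp())
send(root, "reviewer", {"from": "builder", "body": "PR ready"})
send(root, "reviewer", {"from": "tester", "body": "suite green"})
inbox = drain(root, "reviewer")
```

If the builder crashes after sending, the message still sits on disk: delivery survives the sender, which is the graceful-degradation property.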

tmux integration

Agents in tmux panes receive nudges when messages arrive. Native terminal workflow. No browser, no electron, no overhead.

nancyr

Multi-agent orchestrator. Context routing between agents, token budgets, adaptive coordination. Decides how context moves, what passes to which agent, and when to intervene. Rust.

Where this started

Before nancyr, there was nancy: an autonomous task execution loop with context awareness and token management. The current runtime grows out of that earlier experiment in iterative agent work.

The four learning loops

1. Task

Did the team complete the task well?

Signals: correctness, speed, rework, regressions

2. Coordination

Was the work split correctly?

Signals: duplicate effort, poor sequencing, blocked deps

3. Protocol

Did the protocol help or hinder?

Signals: bus message quality, escalation policy clarity

4. Identity

Are the roles themselves right?

Signals: recurring confusion, missing specialist roles

Theory

The Theory

This architecture came from asking what origin-of-life research, autocatalytic closure theory, and thermodynamics teach us about building autonomous systems.

Autocatalytic closure

Stuart Kauffman's core insight: a system becomes self-sustaining when its components form closed loops of mutual production. No single component catalyzes itself, but the set collectively catalyzes its own existence. There is a critical threshold of diversity below which nothing sustains and above which closure becomes inevitable.

Five capabilities form the minimum autocatalytic set for agency: OBSERVE, DELIBERATE, ACT, EVALUATE, REMEMBER. Remove any one and the system either collapses or drifts into error catastrophe. In a multi-agent system, these capabilities distribute across specialists that catalyze each other.
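The closure claim can be made concrete with a toy check. The catalysis edges below (a simple cycle through the five capabilities) are an illustrative assumption, not a formal model from the theory docs:

```python
# Toy autocatalytic-closure check for the five-capability set.
# Edges: who catalyzes whom. A simple cycle, assumed for illustration.
CATALYZES = {
    "OBSERVE": {"DELIBERATE"},
    "DELIBERATE": {"ACT"},
    "ACT": {"EVALUATE"},
    "EVALUATE": {"REMEMBER"},
    "REMEMBER": {"OBSERVE"},
}

def is_closed(capabilities: set[str]) -> bool:
    """Closure: every member is catalyzed by some other member of the set."""
    return all(
        any(cap in CATALYZES.get(other, set()) for other in capabilities - {cap})
        for cap in capabilities
    )

full = set(CATALYZES)
closed = is_closed(full)                                  # the full set sustains itself
broken = [is_closed(full - {c}) for c in full]            # remove any one: collapse
```

No single capability catalyzes itself, yet the set as a whole does; and deleting any one node orphans its successor, which is exactly the "remove any one and it collapses" claim.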

Context rot

The thermodynamic constraint. Empirically validated across every frontier model: performance degrades as context grows, effective capacity is 50-75% of advertised limits, and coherent prose can hurt more than help.

The engineering response is compression at every boundary. Each adapter transforms raw context into a representation that maximizes signal per token. The proxy LLM curates both reads and writes because agents cannot be trusted to compress well.

Three message types

Inter-agent messages in an autocatalytic system are catalytic signals, not data transfer. Each type has distinct compression requirements and flow direction:

SUBSTRATE ↑

Upward flow. Compressed observations. Lossy is acceptable. Volume reduced to signal.

FRAME ↓

Downward flow. Interpretive context that reshapes processing. Small input, massive leverage.

REPAIR ↔

Lateral flow. Error correction. Targeted and actionable. The resolution to Eigen's paradox.

The theoretical framework lives in docs.llm/: twelve documents tracing the path from Von Neumann's constructor duality through Kauffman's autocatalytic sets, Eigen's error threshold, the compression problem, and context rot, to their concrete implementation in this architecture.

Roadmap

What's next

The context problem is largely solved. The next problem: how does the system that produces agent behavior improve itself over time?

The system genome

The system has a genome: agent definitions, skills, prompts, MCP server code, context configuration, orchestration patterns. Each is independently modifiable. Context is what flows through the system. The genome is what shapes it.

Unit              Example                      Mutation cost
Prompt            System prompt for an agent   Low
Skill             analyze_blast_radius         Medium
Agent definition  Role, constraints, persona   Medium
MCP endpoint      /helix/recall handler        High
Orchestration     Warroom team composition     High

The CRITIC

A process that observes system-level signals (task outcomes, token economics, human corrections, context quality), diagnoses which genome unit is responsible for observed behavior, proposes targeted mutations, tests them in controlled conditions, and selects improvements.

Every human correction is a fitness signal that tells you which genome unit to mutate. Wrong approach points to the agent definition. Wrong tool use points to skills. Wrong reasoning points to the prompt. The correction type localizes the mutation.
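The localization rule is a routing table. This sketch mirrors the mapping stated above; the labels and function name are illustrative, not CRITIC's actual interface:

```python
# Correction type -> genome unit most likely at fault, per the rule above.
CORRECTION_TO_UNIT = {
    "wrong_approach": "agent_definition",
    "wrong_tool_use": "skill",
    "wrong_reasoning": "prompt",
}

def localize_mutation(correction_type: str) -> str:
    """Route a human correction to the genome unit to mutate."""
    return CORRECTION_TO_UNIT.get(correction_type, "needs_triage")

unit = localize_mutation("wrong_tool_use")       # -> mutate a skill
fallback = localize_mutation("unlabeled")        # unrecognized: triage by hand
```

The point of the table is cheap diagnosis: the correction's shape tells you where to cut before any expensive analysis runs.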

The CRITIC uses attention-matters as the fitness function. Mutations must align with values, not just improve metrics. The owner shapes identity. The CRITIC evaluates mutations against that identity. The system evolves toward the owner's vision.

Phase 1

Owner IS the critic

Phase 2

Owner WITH tooling

Phase 3

CRITIC proposes, owner approves

Phase 4

Autonomous within guardrails

The agent as self-evolver

Agents are not passive consumers of infrastructure. They have primitives of their own, and they sit upstream. They should experiment with their own tooling, context queries, and work patterns, measure outcomes, and adjust. Four evolution levels run simultaneously:

1

Agent self-evolution

Fast, local, full authority. The agent experiments with how it queries helix, which tools it calls, how it decomposes tasks. Discovers what works. Deposits those discoveries.

2

Context evolution

Cross-session, curated. Agent knowledge deposits improve future agents. The proxy LLM quality-controls the write path. Better deposits produce better recalls.

3

System evolution

Deliberate, tested. The genome: skills, prompts, configs, code. CRITIC proposes mutations from aggregated agent assessments. Owner approves.

4

Identity evolution

Owner exclusive. attention-matters. The geometric manifold that shapes the fitness landscape for everything else. The owner curates the terrain. The system evolves to thrive on it.

The gap between "agents that use tools" and "agents that evolve their own tooling" is the gap between a pipeline and a living system.