Read/Write Operations

The atomic primitives for HAAK's file system. Everything else — methods, policies, skills, cross-references — sits on top of these two operations.

#Problem

When agents write files without updating indices, the system forgets. The Library Theorem's O(log N) retrieval guarantee depends on index-content consistency. Every unmediated write risks collapsing indexed access back toward an O(N) linear scan. Architecture 04 (drift resistance) identifies this failure mode; this document defines the solution.

#The Index Structure

Each index.md is a B-tree node with two-phase progressive disclosure:

Contents — one line per child. Enough to route (decide which child to descend into). The agent scans this section first. Cost: O(B) one-liners where B = number of children.

Details — paragraph per child. Enough to decide without opening the file. The agent reads a specific Details entry only when the Contents line is ambiguous. Cost: O(1) per entry consulted.

This follows the SKILL.md pattern: the description: frontmatter field is the one-liner (routing); the skill body is the detail (engagement). The same progressive disclosure, applied to directories.

The two costs that control summary length:

- Scan cost per level: O(B × s) where s = summary length. Shorter is faster.
- Encapsulation leakage: if the summary answers the question without opening the file, the file becomes dead weight and the index becomes the data.

The Contents/Details split optimizes both: Contents minimizes scan cost; Details provides depth without polluting the scan path.
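As a minimal sketch of the two-phase node, the parser below reads one index.md into separate Contents and Details maps. The on-disk layout is an assumption made for illustration (a `## Contents` section of `- name: one-liner` bullets followed by a `## Details` section with one `### name` entry per child); the source does not specify the actual format, and `IndexNode` and `parse_index` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class IndexNode:
    contents: dict[str, str] = field(default_factory=dict)  # child -> one-liner (routing)
    details: dict[str, str] = field(default_factory=dict)   # child -> paragraph (deciding)


def parse_index(text: str) -> IndexNode:
    """Parse the ASSUMED layout: '## Contents' bullets, then '## Details' entries."""
    node = IndexNode()
    section = None   # which of the two phases we are currently inside
    current = None   # child whose Details paragraph is being collected
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "## Contents":
            section, current = "contents", None
        elif stripped == "## Details":
            section, current = "details", None
        elif section == "contents" and stripped.startswith("- ") and ": " in stripped:
            name, summary = stripped[2:].split(": ", 1)
            node.contents[name] = summary
        elif section == "details" and stripped.startswith("### "):
            current = stripped[4:]
            node.details[current] = ""
        elif section == "details" and current and stripped:
            # Join wrapped lines of one Details paragraph with single spaces.
            node.details[current] = (node.details[current] + " " + stripped).strip()
    return node
```

An agent would scan `node.contents` first and look up `node.details[name]` only for the entries it cannot route on, which keeps Details out of the scan path.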

Read algorithm:

1. Identify the top-level directory (patterns/, projects/, foundations/, etc.)
2. Read its index.md Contents section
3. Find the target or identify which child to descend into
4. If ambiguous between children, read the relevant Details entry
5. Descend into the chosen child directory
6. Repeat from step 2 until the target file is found
7. Read the file

This IS B-tree traversal. Cost: O(h) index reads where h = directory depth. The alternative — Glob/Grep scanning all files — is O(N).
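The traversal can be sketched as a loop that performs one index read per level. This is an illustration under stated assumptions, not HAAK's implementation: `indices` is a hypothetical in-memory stand-in for the index.md files, directory entries are assumed to end in `/`, and the routing decision (including any consultation of Details) is delegated to a caller-supplied `choose` function.

```python
from typing import Callable

# Hypothetical stand-in for index.md files:
# directory path -> {child name: one-line Contents summary}
Indices = dict[str, dict[str, str]]


def navigate(indices: Indices, top: str,
             choose: Callable[[dict[str, str]], str]) -> tuple[str, int]:
    """Descend from `top` until `choose` picks a file; return (path, index reads)."""
    path, reads = top, 0
    while True:
        contents = indices[path]      # read this level's Contents section
        reads += 1                    # exactly one index read per level
        child = choose(contents)      # route using the one-liners only
        path = path + child
        if not child.endswith("/"):   # assumed convention: directories end in "/"
            return path, reads        # target file found; the caller reads it
```

The returned read count equals the depth of the target, which is the O(h) cost stated above; no file outside the chosen path is ever opened.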

When to use Read vs Grep: Read navigates structure (what exists, how it's organized, finding by topic). Grep searches content (known text, specific patterns). Read teaches the hierarchy; Grep bypasses it.

#Write: File + Local Index Update

Algorithm:

1. Validate path: directory exists, filename follows conventions from the target directory's index.md
2. Validate content: frontmatter has required fields for document type (per frontmatter schema)
3. Write the file
4. Append to local index.md: one-line Contents entry + Details paragraph

Local update only. No parent index, no sibling index, no global registry. This is the Self-Description Lemma (Theorem 5): each node describes only its own children. Maintenance cost per change: O(B). Inter-node synchronization cost: zero.

If the containing directory has no index.md, the write still proceeds but emits a warning, since the directory may need structural setup first.
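A minimal sketch of the four write steps, including the warn-and-proceed case. Several things here are assumptions rather than anything the source specifies: the function name `mediated_write`, the `REQUIRED_FIELDS` set, the simplified `key: value` frontmatter rendering, and the index layout (a `## Contents` bullet list followed by a `## Details` section with `### name` entries). Filename-convention checks are omitted.

```python
import warnings
from pathlib import Path

# Hypothetical required fields; the real list is per document type in the frontmatter schema.
REQUIRED_FIELDS = {"title", "created"}


def mediated_write(path: Path, frontmatter: dict[str, str], body: str,
                   one_liner: str, details: str) -> None:
    directory = path.parent
    # Step 1: validate path (filename-convention checks omitted in this sketch).
    if not directory.is_dir():
        raise FileNotFoundError(f"directory does not exist: {directory}")
    # Step 2: validate content.
    missing = REQUIRED_FIELDS - set(frontmatter)
    if missing:
        raise ValueError(f"missing frontmatter fields: {sorted(missing)}")
    # Step 3: write the file.
    header = "\n".join(f"{key}: {value}" for key, value in frontmatter.items())
    path.write_text(f"---\n{header}\n---\n{body}\n", encoding="utf-8")
    # Step 4: update the local index only; parent and sibling indices are never touched.
    index = directory / "index.md"
    if not index.exists():
        warnings.warn(f"{directory} has no index.md; file written but not indexed")
        return
    head, marker, tail = index.read_text(encoding="utf-8").partition("\n## Details")
    if not marker:
        raise ValueError(f"{index} has no '## Details' section")
    index.write_text(
        head.rstrip("\n") + f"\n- {path.name}: {one_liner}\n"      # new Contents line
        + marker + tail.rstrip("\n") + f"\n### {path.name}\n{details}\n",  # new Details entry
        encoding="utf-8",
    )
```

Because the function only ever opens `directory / "index.md"`, the inter-node synchronization cost stays at zero by construction.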

#Update: Modify + Automatic Index Re-evaluation

For existing files. Read the full file first (required for context), apply the edit, then re-evaluate the index entry. If the Contents one-liner or Details paragraph no longer accurately represents the file, update them. If they still fit, leave them.

The re-evaluation is free — the file content is already in context from the read step. The cost of indexed access is in the reading, not the summarizing. Making re-evaluation automatic eliminates a judgment call ("did this change enough?") that is exactly the kind of drift-prone decision Architecture 04 warns about.

#Move: Relocate + Index Maintenance

When a file moves between directories, its index entries move with it: remove from the source index.md, add to the destination index.md. The summary transfers as-is unless the move changes the file's context enough to warrant re-evaluation. Folder moves (mvdir) and cross-reference maintenance (link) are /write operations; scan is a /read operation. The two skills share ownership of all file operations, split by read vs mutate, eliminating coordination gaps.
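The index side of a move can be sketched as transferring one entry between two nodes. The `Index` class and `move_entry` function are hypothetical names for illustration; the sketch covers only the index maintenance, not the relocation of the file itself or any cross-reference fixes.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    contents: dict[str, str] = field(default_factory=dict)  # child -> one-liner
    details: dict[str, str] = field(default_factory=dict)   # child -> paragraph


def move_entry(name: str, source: Index, destination: Index) -> None:
    """Remove `name` from the source index and add it, unchanged, to the destination."""
    if name not in source.contents:
        raise KeyError(f"{name} is not indexed in the source directory")
    if name in destination.contents:
        raise FileExistsError(f"{name} is already indexed in the destination")
    # The summary transfers as-is; re-evaluation is a separate decision.
    destination.contents[name] = source.contents.pop(name)
    if name in source.details:
        destination.details[name] = source.details.pop(name)
```

Only the two affected nodes are touched, so a move keeps the same locality property as a write.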

#Cost Optimization

The cost of indexed access is in reading, not summarizing. Three implications:

- Frontmatter-only updates need only read the frontmatter (~20 lines), not the full file. Status changes, link additions, and metadata edits are cheap.
- Index re-evaluation on update is near-zero marginal cost, because the file content is already in context from the required read step.
- Mechanical cross-reference fixes (find+sed after renames) go to haiku sub-agents; don't burn expensive context on pattern replacement.

#What Is Not Mediated

- downloads/: transit area, not HAAK content
- .claude/: infrastructure (skills, agents, scripts)
- console/: app code
- Root files: CLAUDE.md, ZACH_TODO.md (managed by dedicated processes)

#Validation Rules

Path validation: document type determines valid directory. A foundation goes in foundations/, a project in projects/, a source note in <destination>/sources/. Writing to the wrong directory is blocked.

Frontmatter validation: document type determines required fields. See readwrite/references/frontmatter-schema.md. Immutable fields (created:, source:) are write-once — updates that change them are blocked.

Footer validation: content docs get footers (haak · created ... · author). Generated files (index.md) do not.
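The path and frontmatter rules can be sketched as a single check that returns a list of blocking errors. The directory mapping and the immutable fields come from the rules above; the `REQUIRED` field lists and the `validate` function itself are hypothetical, since the real schema lives in frontmatter-schema.md. Footer validation is not covered.

```python
from pathlib import PurePosixPath
from typing import Optional

TOP_LEVEL = {"foundation": "foundations", "project": "projects"}  # from the rules above
REQUIRED = {                                                       # hypothetical field lists
    "foundation": {"created"},
    "project": {"created"},
    "source": {"created", "source"},
}
IMMUTABLE = ("created", "source")                                  # write-once fields


def validate(doc_type: str, path: str, new: dict[str, str],
             old: Optional[dict[str, str]] = None) -> list[str]:
    """Return blocking errors for a write (old is None) or an update (old given)."""
    errors: list[str] = []
    parts = PurePosixPath(path).parts
    # Path validation: document type determines the valid directory.
    if doc_type == "source":
        if len(parts) < 3 or parts[-2] != "sources":
            errors.append("source notes belong in <destination>/sources/")
    elif not parts or parts[0] != TOP_LEVEL[doc_type]:
        errors.append(f"{doc_type} documents belong in {TOP_LEVEL[doc_type]}/")
    # Frontmatter validation: document type determines required fields.
    missing = REQUIRED[doc_type] - set(new)
    if missing:
        errors.append(f"missing required fields: {sorted(missing)}")
    # Immutable fields may be set once but never changed by an update.
    if old is not None:
        for name in IMMUTABLE:
            if name in old and new.get(name) != old[name]:
                errors.append(f"immutable field changed: {name}")
    return errors
```

An empty list means the operation may proceed; any entry blocks it.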

#Connection to the Library Theorem

The Library Theorem formalizes why indexed access beats linear scan. HAAK's index.md files are the B-tree nodes. Read is traversal. Write is node maintenance. The two-phase index (Contents/Details) is the pointer/data split within each node. Drift (Architecture 04) is index corruption: stale nodes that misdirect navigation.

This architecture makes drift locally detectable: if a directory's contents don't match its index.md, the index is stale. No global scan needed — the same locality principle that makes retrieval fast also makes corruption detection fast.
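A minimal sketch of the local check, assuming the directory listing and the index's Contents names have already been collected as sets of names in the same normalized form. `detect_drift` is a hypothetical helper, and it catches only presence drift (missing or extra entries), not summaries that have gone stale.

```python
def detect_drift(on_disk: set[str], indexed: set[str]) -> dict[str, set[str]]:
    """Compare one directory's children with its own index; no other node is read."""
    present = on_disk - {"index.md"}     # the index does not list itself
    return {
        "unindexed": present - indexed,  # written without an index update
        "dangling": indexed - present,   # index entry whose file is gone
    }
```

In practice `on_disk` could be gathered with `{p.name for p in Path(directory).iterdir()}` from the standard library; both result sets being empty means the node is consistent with its directory.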

haak · created 2026-02-22 · zach + claude

Architecture 10 — Read/Write Operations — 2026 — Zachary F. Mainen / HAAK