The HAAK repo has two layers: a system layer (how to do science with AI) and a data layer (one group's actual science). The system layer should be forkable — someone clones the repo, deletes the data directories, and has a working system for their own projects.
#The two layers
System (portable, shareable):
patterns/— foundations, policies, methods, features, architecture, apps.claude/— agents, skills, hooks, scripts, referencespersonas/— researcher personas (shared templates)console/— web interfaceCLAUDE.md,README.md
Data (instance-specific):
projects/— manuscripts, reviews, opinions, proposalsdownloads/— transit area
#Design constraint
Changes to the system layer should not assume anything about the data layer beyond the type system. A process that hardcodes projects/ms-2026-02-inscription/ is coupled to instance data. A process that operates on projects/<any-slug>/ is portable.
Concretely:
- Skills and processes reference types and patterns, not specific projects
- File-paths.md defines conventions, not inventories
/run-audit rebuildgenerates indices from filesystem state, not hardcoded lists- Personas are system-layer (reusable templates) even though they accumulate project-specific review history in frontmatter
#Personas: system or data?
Personas sit at the boundary. The persona definition (field expertise, writing style, review tendencies) is system — useful to any fork. But review history (scopes: in frontmatter) is instance data. Current design keeps them in the system layer because the definition is the valuable part. A fork would clear the scope history.
#What this means for new work
When adding a process, skill, or policy: ask "would this work in a fresh fork with no projects?" If yes, it belongs in the system layer. If it references specific project content, it belongs in the data layer or needs parameterization.
#Flat tags over hierarchy
Prefer tags and flat structure over deep directory nesting. Agent definitions and agent I/O should mirror each other by name/tag:
agents/ → a1, a2, a3 (definitions — system layer)
agent-io/ → a1, a2, a3 (inputs/outputs — data layer)
The mirroring makes the split concrete: the agents directory is public/forkable, the I/O directory is project-specific. Tags (frontmatter domains:) handle cross-cutting classification that hierarchy can't.
#Relationship to multi-repo
Forkability is the single-repo version. Multi-repo (separate repos for system vs data, or per-project repos) is the next step — see architecture/multi-repo/.
#Origin
Handwritten note (Feb 19 2026, IMG_8979): "separate agents/patterns from output docs so that the code can be easily forked." The split already existed in practice; this document names it as a design constraint.
haak · created 2026-02-22 · zach + claude
Architecture 02 — Forkability — 2026 — Zachary F. Mainen / HAAK