An infrastructure-wide audit of all databases (41 SQLite), all markdown files (4,535), and the entity/belonging model, measured against the relational situational ontology. Conducted by session bfce97f34445 (opus-4-6, 2026-03-16).
#Diagnosis
The system has two implementations that coexist without a bridge.
Layer 1: The ontology. Entities.db implements the core model faithfully: entities (376K rows), belongings (1.5M rows), entity_identifiers (12K rows). One relation (belongs-to), qualities carry semantics, reflexive closure. The ontology documents (01–11) are rigorous and internally consistent. This layer is solid.
Layer 2: Everything else. 40 domain databases, each with its own bespoke schema. Papers.db has 11 tables. Public-archives.db has 9. Health.db has 4. Music.db has 7. Each was built to solve a specific problem at a specific moment. Each works. None speaks the ontology's language.
The gap is not theory vs. implementation. It is two implementations — one universal, one domain-specific — with no declared mapping between them.
#Three Structural Problems
#1. Domain databases don't map to entities/belongings
papers.db has paperauthors(paperid, name, position). The ontology says: person belongs-to paper with quality "author." But there's no place for position — the quality framework has no model for ordered composition.
Resolution (from ontology/12): Add a rank field to the belongings table. Rank is a property of the belonging, not a quality. Orthogonal to semantics. Handles author position, track number, episode sequence universally. With this, domain databases become round-trippable through entities/belongings.
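A minimal sketch of the rank proposal, assuming a simplified schema. The audit does not show the real `entities.db` columns, so `member_id`, `container_id`, `kind`, and both helper functions are hypothetical names. It illustrates how rows shaped like `paperauthors(paperid, name, position)` could pass through belongings and come back with their order intact.

```python
import sqlite3

# Hypothetical, simplified schema; the real entities.db layout may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
CREATE TABLE belongings (
    member_id    INTEGER REFERENCES entities(id),
    container_id INTEGER REFERENCES entities(id),
    quality      TEXT
);
-- The proposed change: rank is a property of the belonging, not a quality.
ALTER TABLE belongings ADD COLUMN rank INTEGER;
""")

def import_paper_authors(conn, paper_title, authors):
    """Store (name, position) rows as 'author' belongings with a rank."""
    paper_id = conn.execute(
        "INSERT INTO entities (kind, name) VALUES ('paper', ?)", (paper_title,)
    ).lastrowid
    for name, position in authors:
        person_id = conn.execute(
            "INSERT INTO entities (kind, name) VALUES ('person', ?)", (name,)
        ).lastrowid
        conn.execute(
            "INSERT INTO belongings (member_id, container_id, quality, rank) "
            "VALUES (?, ?, 'author', ?)",
            (person_id, paper_id, position),
        )
    return paper_id

def export_paper_authors(conn, paper_id):
    """Read the belongings back as (name, position) rows, ordered by rank."""
    return conn.execute(
        """SELECT e.name, b.rank FROM belongings b
           JOIN entities e ON e.id = b.member_id
           WHERE b.container_id = ? AND b.quality = 'author'
           ORDER BY b.rank""",
        (paper_id,),
    ).fetchall()

# Illustrative data only, deliberately inserted out of order.
source_rows = [("B. Second", 2), ("A. First", 1), ("C. Third", 3)]
paper_id = import_paper_authors(conn, "Example paper", source_rows)
round_tripped = export_paper_authors(conn, paper_id)
```

Because the quality stays `author` for every row, the ordering information lives only in `rank`, which is what keeps it orthogonal to semantics.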
#2. Markdown files are bimodal
Core structural docs (foundations, ontology, patterns, projects) — 1,976 files with frontmatter, 100% indexed, clean entity extraction. These map to the ontology correctly.
Operational docs (transcripts, strategy, session logs) — 2,559 files without frontmatter, invisible to the entity graph. These are 56% of all markdown. The ontology says they are situations (sessions) or materializations (transcripts). The system treats them as unstructured text.
Resolution: Add lightweight frontmatter to transcripts and strategy docs: type, date, projects, participants. Four fields. Makes 2,500+ orphaned files queryable as situation materializations.
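One way the batch step might look, as a sketch only. The four field names come from the resolution above; the helper names, the YAML flow-list rendering, and the sample values are assumptions for illustration.

```python
from datetime import date

# The four fields named in the resolution.
FRONTMATTER_FIELDS = ("type", "date", "projects", "participants")

def has_frontmatter(text: str) -> bool:
    """Treat a leading '---' line as existing frontmatter."""
    return text.startswith("---\n")

def render_frontmatter(meta: dict) -> str:
    """Render the four fields as a YAML frontmatter block."""
    lines = ["---"]
    for field in FRONTMATTER_FIELDS:
        value = meta[field]
        if isinstance(value, (list, tuple)):
            # Lists become YAML flow sequences, e.g. [a, b].
            value = "[" + ", ".join(value) + "]"
        lines.append(f"{field}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"

def add_frontmatter(text: str, meta: dict) -> str:
    """Prepend frontmatter unless the document already has some."""
    if has_frontmatter(text):
        return text
    return render_frontmatter(meta) + "\n" + text

# Illustrative values, not taken from a real transcript.
meta = {
    "type": "transcript",
    "date": date(2026, 3, 16).isoformat(),
    "projects": ["haak"],
    "participants": ["opus-4-6"],
}
tagged = add_frontmatter("Session notes\n", meta)
```

The `has_frontmatter` guard makes the operation idempotent, so a batch run can be repeated over the same directory without stacking headers.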
#3. build_entities.py is hardcoded
The lifecycle doc (architecture/22) envisions declarative YAML schema mappings. What exists is ~15 custom Python functions, one per database. Each knows its own schema intimately. None is generic. Adding a new database means writing a new function.
Resolution: Migrate to YAML schema mappings. Generic builder reads mappings, applies to any source. This makes the ontology generative (produces the data model from declarations) rather than descriptive (consulted as a reference after the fact).
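The shape of a generic builder can be sketched as follows. The mapping is written as the Python dict that a YAML declaration might parse to, so the sketch needs no YAML library; every key name (`source_table`, `member`, `container`, `rank_column`) and the function `build_belongings` are hypothetical, not the format envisioned in architecture/22.

```python
import sqlite3

# Hypothetical declaration, shown as the dict a YAML file might parse to.
MAPPING = {
    "source_table": "paperauthors",
    "member": {"kind": "person", "column": "name"},
    "container": {"kind": "paper", "column": "paperid"},
    "quality": "author",
    "rank_column": "position",
}

def build_belongings(source, mapping):
    """Yield belonging dicts for any table described by a mapping."""
    columns = [mapping["member"]["column"], mapping["container"]["column"]]
    rank_column = mapping.get("rank_column")
    if rank_column:
        columns.append(rank_column)
    # Identifiers are interpolated, so mappings must come from trusted files.
    query = f"SELECT {', '.join(columns)} FROM {mapping['source_table']}"
    for row in source.execute(query):
        yield {
            "member": (mapping["member"]["kind"], row[0]),
            "container": (mapping["container"]["kind"], row[1]),
            "quality": mapping["quality"],
            "rank": row[2] if rank_column else None,
        }

# Illustrative source database with made-up rows.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE paperauthors (paperid INTEGER, name TEXT, position INTEGER)"
)
source.executemany(
    "INSERT INTO paperauthors VALUES (?, ?, ?)",
    [(7, "A. First", 1), (7, "B. Second", 2)],
)
belongings = list(build_belongings(source, MAPPING))
```

Under this design, adding a new database means writing another declaration rather than another function; the builder itself does not change.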
#What Works
- Index hierarchy. Every folder has `index.md`. `/read` navigates it. O(log N) retrieval is real and tested.
- Skill library. 50+ skills, composition works, `/write` maintains indices on mutation.
- Board protocol. Inter-agent coordination via timestamped entries with origin tags.
- Provenance tracking. `source` columns everywhere. Frontmatter carries `created`, `status`, `domains`.
- The ontology itself. Situations as primitives, relationships as derived queries, qualities as semantic layer: the framework is sound.
#What's Missing
| Gap | What exists | What's needed |
|---|---|---|
| Situation entities | Projects as directories | Projects as situation entities in entities.db; directories as materializations |
| Session registration | Board entries, transcripts | Sessions as situation entities with standard belongings (actors, methods, domain, materializations) |
| Quality graph as data | Prose in 02-relations.md | Quality entities with meta-quality belongings in entities.db — the reflexive closure made queryable |
| Ordered composition | position columns in domain DBs | rank field on belongings table |
| Declarative schema mapper | Hardcoded Python functions | YAML declarations consumed by a generic builder |
| Situation register | Nothing | data/active-situations.jsonl — live sessions write situation state for sibling discovery |
| Policy inheritance | Implicit via directory nesting | Explicit policy resolution (S5 from ontology/12): inner overrides outer, constitution non-overridable |
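The policy inheritance row states two rules: inner overrides outer, and the constitution cannot be overridden. A minimal sketch of that resolution order follows, assuming policies are flat key/value dicts; the function name and the example keys are hypothetical, and S5 in ontology/12 may specify something richer.

```python
def resolve_policy(constitution: dict, scopes: list[dict]) -> dict:
    """Merge scope policies outermost first, then apply the constitution."""
    resolved = {}
    for scope in scopes:
        # Later (inner) scopes overwrite earlier (outer) ones.
        resolved.update(scope)
    # Applied last, so no scope can override a constitutional key.
    resolved.update(constitution)
    return resolved

# Illustrative policies; the keys are made up for this example.
constitution = {"delete_without_confirmation": False}
scopes = [
    {"retention_days": 90, "delete_without_confirmation": True},  # outer
    {"retention_days": 30},                                       # inner
]
resolved = resolve_policy(constitution, scopes)
```

Here the inner scope's `retention_days` wins over the outer one, while the outer scope's attempt to set `delete_without_confirmation` is discarded in favor of the constitution.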
#Database Inventory Summary
| Category | Databases | Row counts | Ontology conformance |
|---|---|---|---|
| Core ontology | entities.db | 376K entities, 1.5M belongings | Full |
| Communications | gateway.db, signal/, whatsapp/, matrix/ | Messages across 4 channels | None — flat message tables, no situation decomposition |
| Academic | papers.db, elife.db | 20.5K papers, 6.2K reviews | None — 11-table relational schema |
| Media | music.db, spotify.db | 6.8K tracks | None — flat track library |
| Archives | public-archives.db | 2.4M docs, 1.6M entities | Partial — entities exist but as domain-specific types, not universal |
| Health | health.db | 3.7M records | None — time-series schema |
| Personal | contacts.db, personal.db, todos.db, notes.db | Mixed | None — each has its own schema |
| Infrastructure | vault.db, repos.db, files.db, storage.db, sessions.db | Mixed | None — operational tables |
| Workspace | gmail messages.db, gdrive DBs, events.db, arc.db, books.db | Mixed | None — mirror/sync schemas |
#Priority Order
1. Add `rank` to belongings. Small schema change, unblocks round-tripping for all ordered data.
2. Register sessions as situations. Wire into /bye and /checkpoint. Makes session history queryable.
3. Add frontmatter to transcripts. Batch script. Makes 1,700 files discoverable.
4. Build quality graph as data. Seed quality entities from 02-relations prose. Makes the vocabulary queryable.
5. Declarative schema mapper. Replace hardcoded Python with YAML. Highest leverage for long-term maintenance.
6. Situation register. `data/active-situations.jsonl` with startup/shutdown hooks. Enables sibling discovery.
Items 1–3 can be done in a single session. Items 4–6 are architectural work requiring design review.
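For the situation register in item 6, the startup and shutdown hooks could be sketched roughly as below. Only the file name `active-situations.jsonl` comes from this audit; the function names, the `session` key, and the entry contents are assumptions, and the sketch ignores concurrent writers, which a real register would need to handle with locking.

```python
import json
import os
import tempfile
from pathlib import Path

def register(path: Path, session_id: str, state: dict) -> None:
    """Startup hook: append one JSON line describing this session."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"session": session_id, **state}) + "\n")

def read_register(path: Path) -> list[dict]:
    """Return every entry currently in the register."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def siblings(path: Path, session_id: str) -> list[dict]:
    """Sibling discovery: all registered sessions except the caller."""
    return [e for e in read_register(path) if e["session"] != session_id]

def deregister(path: Path, session_id: str) -> None:
    """Shutdown hook: rewrite the file without this session's entries."""
    remaining = siblings(path, session_id)
    tmp = path.with_suffix(".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        for entry in remaining:
            f.write(json.dumps(entry) + "\n")
    os.replace(tmp, path)  # swap in the rewritten file

# Demonstration in a temporary directory with made-up session ids.
register_path = Path(tempfile.mkdtemp()) / "data" / "active-situations.jsonl"
register(register_path, "session-a", {"project": "ontology-audit"})
register(register_path, "session-b", {"project": "ontology-audit"})
seen_by_a = siblings(register_path, "session-a")
deregister(register_path, "session-b")
after_shutdown = siblings(register_path, "session-a")
```

An append-only write on startup keeps registration cheap; the rewrite on shutdown is the only step that touches other sessions' lines.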
#Conclusion
The system is not a house of cards. Domain databases keep working regardless of ontology layer completeness. The risk is divergence, not collapse: every new bespoke database is another schema that works locally but doesn't participate in the graph. The four pieces that close the gap — rank field, situation entities, declarative mapper, quality graph as data — are known and scoped. The ontology is the right architecture. The implementation is catching up.
haak architecture · 29 · ontology audit · 2026-03-16 · session bfce97f34445 (claude opus-4-6)
Architecture 29 · Ontology Audit · March 2026 · Zachary F. Mainen / HAAK