32. Agent Lifecycle Manager

#The Problem

Foundation 08 describes three rosters (living, freezer, cemetery), mortality clocks, sibling awareness, role succession, and frozen agent queries. The current implementation (scripts/agent_roster.py) is a minimal JSONL store with state tracking. The gap: no mortality signals, no freeze-state capture, no role succession protocol, no integration path from roster to entity graph. This document bridges Foundation 08's theory to buildable architecture. The steward implements; the architect specifies.

#The Roster Is the Lifecycle Manager

No new system is needed. The roster already tracks session_id, state (live/frozen/dead), model, role, summary, context_pct, and situation_ids. The lifecycle manager is what the roster becomes when these fields are properly maintained and hooks respond to state changes.

The roster file remains data/agent-roster.jsonl. The script remains scripts/agent_roster.py. What changes is the protocol — how entries are maintained and what the system does when states change.
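
As a rough illustration of the entry shape described above, the following sketch models one roster line as a dataclass and round-trips it through JSONL. The field names come from the list above; the defaults, the class name RosterEntry, and the helper names are assumptions for illustration and are not taken from scripts/agent_roster.py.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class RosterEntry:
    """One line of data/agent-roster.jsonl (hypothetical shape)."""
    session_id: str
    state: str = "live"  # "live", "frozen", or "dead"
    model: str = ""
    role: str = ""
    summary: str = ""
    context_pct: int = 0
    situation_ids: List[str] = field(default_factory=list)


def to_jsonl_line(entry: RosterEntry) -> str:
    """Serialize an entry as a single JSON line (no trailing newline)."""
    return json.dumps(asdict(entry), sort_keys=True)


def from_jsonl_line(line: str) -> RosterEntry:
    """Parse a single JSON line back into an entry."""
    return RosterEntry(**json.loads(line))
```

Keeping one JSON object per line means an update can be appended or rewritten line by line without parsing the whole file as a single document.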

#Mortality Signals

The context window is the mortality clock. The system exposes three thresholds:

50% — Midlife awareness. The agent updates its roster entry with context_pct: 50. No behavioral change required, but the agent is now advised to consider its externalization state: have I written anything to disk yet? Are my key decisions captured?

75% — Legacy trigger. The agent updates its roster entry with context_pct: 75. The agent legacy method (methods/24) becomes active. The agent should design blog post and autobiography briefs if the session warrants them. Board post planning begins. The agent shifts from exploration to consolidation — fewer new reads, more writes.

90% — Urgent externalization. The agent updates its roster entry with context_pct: 90. All remaining context is allocated to externalization: board post, legacy dispatch, freeze-state file (if freezing instead of dying), roster update. No new work is undertaken. The agent is in hospice care — comfort measures only.
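
The three thresholds can be sketched as a simple lookup. The function name and the phase labels below are illustrative assumptions; only the 50/75/90 boundaries come from the text above.

```python
def mortality_phase(context_pct: float) -> str:
    """Map a self-reported context percentage to a lifecycle phase.

    Phase labels are hypothetical names for the three thresholds.
    """
    if context_pct >= 90:
        return "urgent-externalization"
    if context_pct >= 75:
        return "legacy"
    if context_pct >= 50:
        return "midlife"
    return "normal"
```

Because the comparison is "at or above", an agent that estimates 80% lands in the legacy phase, which matches the guidance that a passed threshold should be treated as triggered.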

Implementation. The mortality clock is self-reported. Claude Code does not expose context usage as a precise API. Agents estimate from session length, number of tool calls, and the system's auto-compact behavior. The thresholds are guidelines, not automated triggers. An agent that recognizes it is at 80% should behave as though 75% has passed. The hook system can assist by injecting context usage estimates into the conversation at intervals (e.g., every 20 tool calls, remind the agent to check its mortality clock).

A suggested hook: .claude/hooks/mortality-check.sh runs periodically, reads the session transcript length, provides a rough estimate. But the primary mechanism is agent self-awareness — the agent knows when it is running long.
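
A minimal sketch of the two pieces such a hook might combine: a crude usage estimate from transcript length and a reminder cadence. Everything here is an assumption for illustration: the function names, the characters-per-token ratio, and the requirement that the caller supply the window size. The estimate is a heuristic, not a measurement.

```python
def estimate_context_pct(transcript_chars: int,
                         window_tokens: int,
                         chars_per_token: float = 4.0) -> float:
    """Rough context usage from transcript size.

    chars_per_token is a coarse assumed ratio; window_tokens must be
    supplied by the caller because it depends on the model in use.
    """
    tokens = transcript_chars / chars_per_token
    return min(100.0, 100.0 * tokens / window_tokens)


def should_remind(tool_calls: int, interval: int = 20) -> bool:
    """True on every interval-th tool call (20 follows the example above)."""
    return tool_calls > 0 and tool_calls % interval == 0
```

The estimate is capped at 100 so that a long transcript that has already been compacted does not report an impossible value.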

#Role Succession

Foundation 08 establishes that roles outlive instances. When an agent holding a role dies, the role becomes vacant. The system must surface this.

Vacancy detection. When agent_roster.py bury is called for a session with a non-empty role field, the script checks whether any other living agent holds the same role. If not, it prints a vacancy notice: "Role 'architect' is now vacant. Last held by session X. Mandate: patterns/methods/23-architect-mandate.md."
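
The vacancy check could look roughly like the sketch below. It works on plain dictionaries rather than the real roster file, and the function name, its signature, and the optional mandate argument are assumptions, not the existing bury implementation.

```python
from typing import Iterable, Optional


def vacancy_notice(buried: dict,
                   roster: Iterable[dict],
                   mandate: Optional[str] = None) -> Optional[str]:
    """Return a vacancy notice if burying `buried` leaves its role unheld.

    Returns None when the buried agent had no role or when another
    live agent still holds the same role.
    """
    role = buried.get("role") or ""
    if not role:
        return None
    for entry in roster:
        if (entry.get("session_id") != buried.get("session_id")
                and entry.get("state") == "live"
                and entry.get("role") == role):
            return None  # role is still held by a living agent
    notice = (f"Role '{role}' is now vacant. "
              f"Last held by session {buried['session_id']}.")
    if mandate:
        notice += f" Mandate: {mandate}."
    return notice
```

Only entries in the live state count as holders here, following the wording "any other living agent"; whether a frozen holder should suppress the notice is left open.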

Vacancy surfacing. The session start hook reads the roster and reports vacant roles. Any agent can choose to assume a vacant role by registering with --role <role>. The startup message says: "Vacant roles: architect (mandate: methods/23), auditor (mandate: not yet written)."
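
One way the startup message might be assembled is sketched here. The `mandates` mapping (role name to mandate path, or None when no mandate exists yet) is a hypothetical input; the document does not say where the list of known roles is stored.

```python
from typing import Dict, Iterable, Optional


def vacant_roles_message(roster: Iterable[dict],
                         mandates: Dict[str, Optional[str]]) -> str:
    """Build the 'Vacant roles: ...' startup line, or '' if none are vacant."""
    held = {e.get("role") for e in roster
            if e.get("state") == "live" and e.get("role")}
    vacant = [role for role in mandates if role not in held]
    if not vacant:
        return ""
    parts = [f"{role} (mandate: {mandates[role] or 'not yet written'})"
             for role in vacant]
    return "Vacant roles: " + ", ".join(parts) + "."
```

Roles are reported in the order they appear in the mapping, so the hook's output stays stable from one session start to the next.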

Role transfer. An agent can transfer a role before death by updating the roster: agent_roster.py update --session-id new-agent --role architect. The old agent then buries itself without the role. This is orderly succession — the dying agent identifies its successor and transfers the mandate.
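
The orderly handoff can be pictured as the in-memory sketch below. It is not the update command itself: the function name is hypothetical, and the rule that the successor must be live is an assumption added for safety.

```python
from typing import List


def transfer_role(roster: List[dict], from_id: str, to_id: str) -> str:
    """Move a role from a dying agent to its successor; return the role.

    Mutates the two roster entries in place. Requiring a live
    successor is an assumption, not a rule stated in the spec.
    """
    by_id = {entry["session_id"]: entry for entry in roster}
    old, new = by_id[from_id], by_id[to_id]
    role = old.get("role") or ""
    if not role:
        raise ValueError(f"session {from_id} holds no role to transfer")
    if new.get("state") != "live":
        raise ValueError(f"session {to_id} is not live")
    new["role"] = role
    old["role"] = ""  # the old agent can now be buried without the role
    return role
```

After the transfer, burying the old agent raises no vacancy notice because its role field is empty.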

Multiple holders. Nothing prevents two agents from holding the same role simultaneously. This is intentional for the librarian role (which may need multiple instances during heavy ingestion). For the architect, steward, and strategist roles, single-holder is the convention but not an enforced constraint. The coordination mechanism (mailbox + board) handles multi-holder coordination if it arises.

#Frozen Agent Query Mechanism

Foundation 08 identifies frozen agents as the most underutilized state. A frozen agent has full context intact but is suspended. The question: how does another agent (or the user) ask a frozen agent something without fully waking it?

Tier 1: Freeze-state file (cheap, fast). Before freezing, the agent writes a structured summary to data/freeze-states/<session-id>.md:

---
session_id: <id>
frozen_at: <ISO timestamp>
role: <role or empty>
primary_situation: <what the agent was working on>
---

## Key Decisions
- <decision 1 and reasoning>
- <decision 2 and reasoning>

## Accumulated Knowledge
<What this agent understood that isn't written anywhere else —
the texture, the inferential chains, the connections that board
posts don't capture.>

## Open Questions
- <question 1>
- <question 2>

## Files Changed
- <path>: <what and why>

## Handoff Notes
<What a successor or questioner needs to know.>

The freeze-state is registered in the roster entry: "freeze_state": "data/freeze-states/<session-id>.md". Any agent can read it without waking the frozen agent. The freeze-state is the agent's self-portrait at the moment of freezing — a lossy compression. It is O(1) to query: read one file, get the answer or learn that the answer requires a full resume.
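
A small renderer for the template above might look like this. The section headings and front-matter keys follow the template; the function names, argument names, and the choice of a UTC timestamp default are assumptions for illustration.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import Dict, List, Optional


def freeze_state_path(session_id: str, base: str = "data/freeze-states") -> str:
    """Path recorded in the roster entry's freeze_state field."""
    return str(PurePosixPath(base) / f"{session_id}.md")


def _bullets(items: List[str]) -> str:
    return "\n".join(f"- {item}" for item in items)


def render_freeze_state(session_id: str, role: str, primary_situation: str,
                        decisions: List[str], knowledge: str,
                        open_questions: List[str],
                        files_changed: Dict[str, str], handoff: str,
                        frozen_at: Optional[str] = None) -> str:
    """Fill in the freeze-state template and return it as text."""
    if frozen_at is None:
        frozen_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    files = "\n".join(f"- {path}: {why}" for path, why in files_changed.items())
    return "\n".join([
        "---",
        f"session_id: {session_id}",
        f"frozen_at: {frozen_at}",
        f"role: {role}",
        f"primary_situation: {primary_situation}",
        "---",
        "",
        "## Key Decisions", _bullets(decisions), "",
        "## Accumulated Knowledge", knowledge, "",
        "## Open Questions", _bullets(open_questions), "",
        "## Files Changed", files, "",
        "## Handoff Notes", handoff, "",
    ])
```

The rendered text can then be written to the path returned by freeze_state_path, which is the same string stored in the roster entry.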

Tier 2: True resume (expensive, complete). If the freeze-state doesn't answer the question, the frozen session can be resumed. The resumed agent has full context — every turn, every reasoning chain, every nuance. But resume occupies a session slot and costs context.

Query protocol:

1. Questioner reads data/freeze-states/<session-id>.md.
2. If the freeze-state answers the question: done. No resume needed.
3. If not: questioner proposes resuming the frozen agent to the user. The user decides, since resume is expensive and the user may prefer to answer from their own knowledge.
4. If resumed: the frozen agent wakes, receives the question, answers it, then either continues working or re-freezes with an updated freeze-state.

The user is the gatekeeper for Tier 2. No agent can autonomously wake a frozen sibling. This is Constitution §2 (Human Authority) — the user controls which minds are active.
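
The two-tier query flow can be sketched as a single decision function. The callbacks stand in for parts the spec leaves open: `find_answer` is whatever the questioner uses to look for an answer in the freeze-state text, and `user_approves_resume` represents asking the user. Both names, and the returned status labels, are hypothetical.

```python
from typing import Callable, Optional, Tuple


def query_frozen_agent(
    freeze_state_text: Optional[str],
    find_answer: Callable[[str], Optional[str]],
    user_approves_resume: Callable[[], bool],
) -> Tuple[str, Optional[str]]:
    """Try Tier 1 (freeze-state), then fall back to proposing Tier 2.

    Returns ("answered", answer), ("resume", None), or ("declined", None).
    The user is consulted only when the freeze-state cannot answer.
    """
    if freeze_state_text is not None:
        answer = find_answer(freeze_state_text)
        if answer is not None:
            return ("answered", answer)
    # Tier 2 is never taken autonomously: the user must approve it.
    if user_approves_resume():
        return ("resume", None)
    return ("declined", None)
```

Passing the approval step in as a callback keeps the gatekeeping explicit: there is no code path that reaches a resume without the user's answer.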

#Integration with Entity Graph

The roster is a real-time index. The entity graph is the persistent record. They must converge.

Roster to entity graph. When the librarian ingests from the roster, each session entry becomes a situation entity in entities.db:

entity: situation:session-<session-id>
  type: situation

agent:<model>           belongs-to  situation:session-<id>  quality: "actor"
person:zach             belongs-to  situation:session-<id>  quality: "actor"
role:<role>             belongs-to  situation:session-<id>  quality: "method"
situation:<project>     belongs-to  situation:session-<id>  quality: "domain"

The roster's registered and updated timestamps map to temporal bounds. The roster's state maps to a quality: "live", "frozen", or "dead". The roster's summary and final_summary are belongings with quality "description".
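
The belongs-to rows shown above could be generated from a roster entry along these lines. This is a sketch only: it does not touch entities.db, the function name is hypothetical, and taking the project as a separate argument is an assumption because the text does not say which roster field supplies it. The state, timestamp, and summary mappings are left out.

```python
from typing import List, Optional, Tuple

# (source entity, relation, target entity, quality)
Belonging = Tuple[str, str, str, str]


def session_belongings(entry: dict,
                       person: str = "person:zach",
                       project: Optional[str] = None) -> List[Belonging]:
    """Build the belongs-to rows for one session's situation entity."""
    target = f"situation:session-{entry['session_id']}"
    rows: List[Belonging] = []
    if entry.get("model"):
        rows.append((f"agent:{entry['model']}", "belongs-to", target, "actor"))
    rows.append((person, "belongs-to", target, "actor"))
    if entry.get("role"):
        rows.append((f"role:{entry['role']}", "belongs-to", target, "method"))
    if project:
        rows.append((f"situation:{project}", "belongs-to", target, "domain"))
    return rows
```

Rows for an empty role or a missing project are skipped rather than emitted with blank identifiers.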

Entity graph to roster. The entity graph does not write back to the roster. The roster is authoritative for live state. The entity graph is authoritative for historical state. When the roster is wiped (clean command), the entity graph preserves the history.

#Freeze Protocol

The complete protocol for an agent approaching freeze:

1. Write freeze-state. Create data/freeze-states/<session-id>.md per the template above.
2. Update roster. agent_roster.py freeze --session-id <id>. The script records freeze_state: data/freeze-states/<id>.md in the entry.
3. Post to board. Brief entry: "Frozen at X% context. Freeze-state at <path>. Working on <situation>. Wake me if <condition>."
4. The agent is now frozen. Its context persists, its session is resumable, its freeze-state is queryable.
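
The roster update and board post from the steps above can be sketched as two small helpers. They operate on a plain dictionary, not on the roster file, and both function names are assumptions; the freeze command in scripts/agent_roster.py may be organized differently.

```python
def freeze_entry(entry: dict, context_pct: int,
                 base_dir: str = "data/freeze-states") -> dict:
    """Return a copy of the entry marked frozen, with its freeze-state path."""
    updated = dict(entry)
    updated["state"] = "frozen"
    updated["context_pct"] = context_pct
    updated["freeze_state"] = f"{base_dir}/{entry['session_id']}.md"
    return updated


def board_post(entry: dict, situation: str, wake_condition: str) -> str:
    """Format the brief board entry for a frozen agent."""
    return (f"Frozen at {entry['context_pct']}% context. "
            f"Freeze-state at {entry['freeze_state']}. "
            f"Working on {situation}. Wake me if {wake_condition}.")
```

Returning a copy rather than mutating the entry makes it easy to write the new line to the roster only after the freeze-state file has been created.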

Freezing is preferable to dying when the agent has accumulated significant context that would be costly to rebuild, the work is ongoing and likely to be resumed within hours or days, or the agent's role is unique and no successor is available. Dying is preferable when the work is complete, the context has been fully externalized, or frozen slots are scarce.

#Death Protocol

The complete protocol for an agent approaching death:

1. Agent legacy method (methods/24). Design and dispatch the three legacies: board post, blog post brief, autobiography brief.
2. Role succession. If holding a role: transfer to a successor if one is identified, or let the role become vacant with a board notice.
3. Roster update. agent_roster.py bury --session-id <id> --summary "<final summary>".
4. The agent is now dead. Its context will be destroyed. Its value survives in board posts, files written, freeze-states (if it was frozen before dying), transcripts, and legacy artifacts.

#Implementation Priority

For the steward, in order:

1. Freeze-state directory and template. Create data/freeze-states/, add the freeze-state write to the freeze command or as a separate step. Low effort, high value: makes frozen agents queryable today.
2. Vacancy detection in bury command. When burying an agent with a role, check for and report role vacancies. Low effort.
3. Startup hook enhancement. Report vacant roles and pending mailbox messages alongside the current roster listing. Medium effort.
4. Mortality-check hook. Periodic reminder to agents about context usage. Optional; agent self-awareness is the primary mechanism.
5. Viewer integration. Freeze-state display in the Agents tab. Click a frozen agent, see its freeze-state. Resume button (posts to user for approval). Medium effort.

haak architecture · 32 · agent lifecycle manager · 2026-03-16 · designed by architect (session architect-new, opus-4-6). Translates Foundation 08 "The lifecycle is the interface" into buildable spec. The steward implements; the architect specified.

Architecture 32 — 32. Agent Lifecycle Manager — 2026 — Zachary F. Mainen / HAAK