ADR-OA01.1: TNH-Conductor - Provenance-Driven AI Workflow Coordination (v2)¶
Strategic architecture for coordinating external AI agents through bounded, auditable, human-supervised workflows.
- Status: Accepted
- Type: Strategy ADR
- Date: 2026-01-29
- Owner: Aaron Solomon
- Author: Aaron Solomon, GPT 5.2, Claude Opus 4.5
- Supersedes: ADR-OA01
Context¶
TNH-Scholar is a long-lived system for AI-assisted study, translation, and analysis of the teachings of Thich Nhat Hanh and the Plum Village tradition. The project explicitly embraces AI leverage not only for content work, but for building and evolving the system itself.
It is possible to coordinate AI coding agents exposed via official CLI interfaces (e.g., Claude Code CLI, Codex CLI) in a way that:
- Enables semi-autonomous progress
- Preserves human authority and review
- Avoids brittle "autonomous agent" designs
- Produces auditable, intelligible work products
- Fits naturally into an engineering workflow (git, branches, reviews)
These agents are treated not as APIs to be wrapped, but as already-agentic systems invoked through stable command-line surfaces, consistent with standard developer tooling.
The Problem: "Agents are Hard"¶
Early experimentation and industry analysis (see Armin Ronacher's "Agents are Hard") show that two common ways of treating LLMs both fail at scale:
- as purely conversational assistants, or
- as fully autonomous agents.
Autonomous agents fail because they:
- Lose the "vibe": Context drift over long sessions
- Get stuck in loops: Recursive hallucination without external grounding
- Lack visibility: No way to understand why decisions were made
- Cannot recover: No rollback when things go wrong
Traditional agent systems store state in chat history, making debugging impossible and recovery unreliable.
The Opportunity: Provenance-First Coordination¶
TNH-Scholar's existing tnh-gen infrastructure provides the hardest prerequisite for reliable agent coordination: provenance tracking. Every transformation is logged with why, what, who, and what changed.
This makes TNH-Scholar uniquely positioned to introduce a coordinating component whose role is interpretation, supervision, and sequencing, not execution or autonomy.
The Insight: "You Might Not Need MCP"¶
Mario Zechner's analysis ("What if you don't need MCP?") identifies a complementary principle: prefer repo-local CLI tools over heavyweight tool servers as the agent integration surface.
- Tool servers (like MCP) add lifecycle complexity, schema maintenance, and runtime dependencies
- CLI commands are composable, versioned in git, self-documenting via `--help`, and testable like normal software
- CLI invocations + generated artifacts become the durable, auditable contract, not server state
This insight shapes how tnh-conductor integrates with sub-agents: Claude Code CLI and Codex CLI are invoked as command-line tools, and their transcripts, diffs, and artifacts form the stable handoff for supervision, evaluation, and provenance, rather than long-lived tool servers or UI-bound integrations.
Core Valuation: The System is Written in English¶
tnh-conductor is a minimal enforcement kernel that executes versioned prompt-programs. The system's behavior is defined in English; code exists only for capture, enforcement, and execution.
This valuation optimizes for:
- Bootstrapped semi-autonomy: Prompt libraries enable rapid iteration without code changes
- Fast experimentation: New workflows, policies, and evaluation criteria are prompt edits, not deployments
- Auditability: Humans can read and understand system behavior by reading prompts
- Leverage: AI agents can help write and improve the prompts that govern them
What This Means in Practice¶
Traditional orchestration systems encode behavior in code: routing logic, validation rules, decision trees, output formatting. This creates a high barrier to iteration and makes the system opaque to non-programmers.
tnh-conductor inverts this: behavior lives in versioned prompts; code provides the execution substrate.
| Traditional Orchestration | Prompt-Program Runtime |
|---|---|
| Routing logic in code | Triage prompts select workflows |
| Hardcoded validation rules | Policy prompts define allowed/forbidden behaviors |
| Decision trees in conditionals | Evaluation prompts determine status and next steps |
| Template-based output | Generation prompts produce journals and reports |
| Fixed agent selection | Capability prompts guide agent routing |
The kernel becomes small (300 to 1,000 lines) and stable. The prompt library becomes the "standard library" of system behavior: versioned, auditable, and evolvable.
Tools as Commands, Not Servers¶
"Prefer a small, composable CLI opcode surface (repo-local tools) over tool servers; treat CLI invocations + artifacts as the durable contract."
The same principle extends to agent tooling. Rather than building heavyweight tool servers (like MCP), we prefer repo-local CLI commands as the agent-facing tool surface:
| Property | Description |
|---|---|
| Composable | Pipe-friendly, Unix-style; can chain with other tools |
| Versioned | Tools live in-repo and evolve with the codebase |
| Low token overhead | Agent doesn't need full tool schemas, just CLI help text |
| Stable and testable | Normal CLI software with tests, not ephemeral server state |
This is the "bash equivalent" of MCP: tools exist as code + help text, not as long-lived server interfaces. Tools are documented via --help, not separate schema files.
Decision¶
We introduce a new strategic component called tnh-conductor.
tnh-conductor is a provenance-driven workflow coordinator that supervises bounded AI agents, evaluates their outputs (including conversational transcripts), and advances work through explicit, human-reviewable steps.
It is not an autonomous agent. It does not "do the work." It does not replace human judgment.
Core Thesis¶
We will not ask an LLM to "do the work." We will ask an LLM to emit a plan, execute that plan via CLI tools (Claude Code, Codex), and validate the outputs.
This separation of Conductor (supervisor) and Sub-Agent (performer) is critical. The conductor reasons about intent, sequences steps, evaluates results, and records everything, while humans remain responsible authors of the system.
Design Intent¶
tnh-conductor Does¶
- Coordinate AI agents as skilled but bounded performers
- Execute predefined workflows step-by-step
- Capture full conversational transcripts and workspace effects
- Evaluate outcomes using a trusted planner model
- Queue work for periodic human review
- Record all actions and decisions in provenance (tnh-gen)
tnh-conductor Does NOT¶
- Perform open-ended task discovery
- Maintain conversational memory across runs
- Make irreversible decisions
- Guarantee reproducibility (we claim "auditable," not "deterministic")
- Bypass git-based review and rollback
- Own project-level decisions
- Build or maintain heavyweight tool servers (prefer CLI opcodes)
- Attempt to capture agent UI panes (VS Code panels, Claude UI)
Kernel vs Prompt-Program Layer¶
The system splits cleanly into two layers with distinct responsibilities:
Kernel Layer (Code: Minimal, Stable)¶
The kernel handles hard requirements that cannot be expressed in prompts:
| Responsibility | Description |
|---|---|
| Work-branch management | Create/switch branches; prevent commits to main |
| Transcript capture | stdout/stderr capture (primary); PTY fallback for interactivity |
| Workspace diff/status | Capture git state before/after each step |
| Policy enforcement | Post-hoc diff checks against allowed/forbidden paths (code-based) |
| Validator execution | Run tests, lint, typecheck; capture results |
| Event/provenance writes | Record all actions to tnh-gen ledger |
| Schema validation | Validate workflow, prompt, and policy definitions |
| Opcode execution | Execute workflow steps sequentially per definition |
The kernel is enforcement and capture. It does not decide; it executes and records.
Prompt-Program Layer (English: Expressive, Evolvable)¶
All system intelligence lives in versioned prompts:
| Prompt Type | Purpose |
|---|---|
| Task prompts | Instructions for sub-agents (the "work") |
| Policy prompts | Allowed/forbidden behaviors, workspace constraints |
| Evaluation prompts | Planner criteria for success/partial/blocked/needs_human |
| Triage prompts | Route tasks to appropriate workflows |
| Risk assessment prompts | Classify changes by risk level |
| Journal prompts | Generate human-readable daily summaries |
| Agent capability prompts | Describe agent strengths for routing decisions |
Prompts are versioned artifacts stored in-repo. Workflows reference prompts by id.version. The planner can only select next steps from the allowed set; there is no open-ended branching.
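As an illustration of the id.version reference convention, the sketch below splits a reference such as task.implement_adr.v2 into its prompt id and integer version. The `PromptRef` type, the `parse_prompt_ref` helper, and the exact character rules in the regular expression are assumptions made for this example; the ADR does not specify a parser.

```python
import re
from typing import NamedTuple


class PromptRef(NamedTuple):
    """Hypothetical parsed form of a prompt reference."""
    prompt_id: str
    version: int


# Assumed shape: dotted lowercase namespace + name, then ".v<integer>".
_REF_PATTERN = re.compile(r"^(?P<id>[a-z_]+(?:\.[a-z_0-9]+)+)\.v(?P<version>\d+)$")


def parse_prompt_ref(ref: str) -> PromptRef:
    """Split 'task.implement_adr.v2' into ('task.implement_adr', 2)."""
    match = _REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"not a versioned prompt reference: {ref!r}")
    return PromptRef(match["id"], int(match["version"]))
```

Rejecting unversioned references at parse time keeps every workflow step pinned to an explicit prompt version, which is what makes the provenance record meaningful.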
The Split in Action¶
    User submits task: "Implement ADR-AT03"
                │
                ▼
    ┌───────────────────────────────────────────────────────────────┐
    │  PROMPT-PROGRAM LAYER                                         │
    │                                                               │
    │  triage.route_task.v1      → selects workflow: implement_adr  │
    │  planner.evaluate_step.v1  → determines status                │
    │  risk.assess_changes.v1    → flags breaking API change        │
    │  journal.generate_daily.v1 → produces review summary          │
    └───────────────────────────────────────────────────────────────┘
                │
                ▼
    ┌───────────────────────────────────────────────────────────────┐
    │  KERNEL LAYER                                                 │
    │                                                               │
    │  Creates work branch: task/adr-at03-impl                      │
    │  Executes opcode: RUN_AGENT(claude-code, prompt_id)           │
    │  Captures: transcript.md, git diff, test results              │
    │  Enforces: diff only touches allowed paths                    │
    │  Records: all events to provenance ledger                     │
    └───────────────────────────────────────────────────────────────┘
Architecture¶
High-Level Flow¶
    Intent / ADR / Task
            ↓
    tnh-conductor (workflow coordinator)
            ↓
    Sub-Agent Invocation (Claude Code / Codex / Gemini)
            ↓
    Captured Outputs (Transcript + Workspace Effects)
            ↓
    Planner Evaluation
            ↓
    Provenance Ledger (tnh-gen)
            ↓
    Daily / Periodic Human Review
Component Diagram¶
    ┌─────────────────────────────────────────────────────────────────┐
    │                          tnh-conductor                          │
    │  ┌───────────────┐  ┌───────────────┐  ┌───────────────────┐    │
    │  │   Workflow    │  │    Prompt     │  │      Planner      │    │
    │  │  Definitions  │  │    Library    │  │     Evaluator     │    │
    │  └───────────────┘  └───────────────┘  └───────────────────┘    │
    │                                │                                │
    │  ┌─────────────────────────────┴─────────────────────────────┐  │
    │  │                      Protocol Layer                       │  │
    │  │    (stdout/stderr capture, git diff/status, progress)     │  │
    │  └───────────────────────────────────────────────────────────┘  │
    └────────────────────────────────┬────────────────────────────────┘
                                     │
             ┌───────────────────────┼───────────────────────┐
             ▼                       ▼                       ▼
    ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
    │   Claude Code   │     │      Codex      │     │     Gemini      │
    │   (Sub-Agent)   │     │   (Sub-Agent)   │     │   (Sub-Agent)   │
    └─────────────────┘     └─────────────────┘     └─────────────────┘
             │                       │                       │
             └───────────────────────┴───────────────────────┘
                                     │
                 ┌───────────────────┴───────────────────┐
                 ▼                                       ▼
    ┌─────────────────────────┐             ┌─────────────────────────┐
    │    Validation Layer     │             │    Provenance Ledger    │
    │ (tests, lints, checks)  │             │        (tnh-gen)        │
    └─────────────────────────┘             └─────────────────────────┘
Component Responsibilities¶
| Component | Responsibility |
|---|---|
| Conductor | Workflow sequencing, step coordination, human escalation |
| Planner Evaluator | Interprets sub-agent outputs, determines status, proposes next steps |
| Protocol Layer | Captures transcript + workspace effects; emits progress events |
| Sub-Agents | Perform bounded tasks; emit artifacts (diffs, reviews, transcripts) |
| Validation Layer | Runs tests/lints; feeds errors to Planner (not sub-agent) |
| Provenance Ledger | Records everything; enables audit, replay, and rollback |
Core Concept: Dual-Channel Sub-Agent Output¶
Sub-agents (Claude Code, Codex) are treated as already-agentic systems. Their output is therefore not reduced to file diffs alone.
Each sub-agent run produces two first-class outputs:
Transcript Channel¶
- Full conversational log
- Self-reported success or failure
- Discovered blockers
- Suggested side paths
- Risk or uncertainty statements
This channel is treated as semantic signal, not noise.
Workspace Channel¶
- Git diff (patch)
- New or modified files
- Tool and test output
The system never assumes success based on either channel alone.
The Planner Evaluator interprets both channels to determine actual status.
Contradiction Detection¶
The Planner Evaluator must explicitly check for consistency between channels:
| Contradiction Type | Example | Classification |
|---|---|---|
| Transcript claims success, diff is empty | "Completed refactor" but no files changed | partial or unsafe |
| Transcript claims tests run, no validation artifacts | "All tests pass" but no test output captured | partial with risk_flags |
| Workspace shows changes outside stated scope | Diff includes files not mentioned in transcript | unsafe or needs_human |
| Transcript reports error, workspace shows commits | "Encountered blocker" but files modified | needs_human with risk_flags |
Rule: When the Planner detects contradictions between transcript claims and workspace reality, it must not report success. It must set the status to partial, unsafe, or needs_human (per the table above) and emit risk_flags describing the discrepancy.
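The four contradiction types in the table can be sketched as plain checks over a few extracted facts. In the ADR this judgment belongs to the Planner Evaluator (a model), so the code below is only an illustration of the logic. The `ChannelFacts` structure, its field names, and the flag strings are hypothetical, and it assumes some earlier step has already extracted these facts from the transcript and workspace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChannelFacts:
    """Hypothetical summary of what each channel reported for one step."""
    transcript_claims_success: bool = False
    transcript_claims_tests_run: bool = False
    transcript_reports_error: bool = False
    validation_artifacts_present: bool = False
    files_changed: List[str] = field(default_factory=list)
    files_mentioned: List[str] = field(default_factory=list)


def detect_contradictions(facts: ChannelFacts) -> List[str]:
    """Return risk flags for each transcript/workspace mismatch found."""
    flags = []
    if facts.transcript_claims_success and not facts.files_changed:
        flags.append("success_claimed_but_diff_empty")
    if facts.transcript_claims_tests_run and not facts.validation_artifacts_present:
        flags.append("tests_claimed_but_no_validation_artifacts")
    unmentioned = sorted(set(facts.files_changed) - set(facts.files_mentioned))
    if unmentioned:
        flags.append("changes_outside_stated_scope: " + ", ".join(unmentioned))
    if facts.transcript_reports_error and facts.files_changed:
        flags.append("error_reported_but_files_modified")
    return flags
```

An empty result means no contradiction was found by these checks; it does not by itself establish success.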
Planner Evaluation Loop¶
After each sub-agent step, tnh-conductor invokes a planner evaluation using a trusted, higher-level model.
Planner is Stateless (No Conversational Memory)¶
The Planner does not maintain conversational memory across evaluations. Each evaluation is independent: the Planner does not "remember" previous interactions or accumulate context over time.
However, to support multi-step workflows and prevent retry loops, the Planner receives a provenance window: explicit historical context from the last K steps (typically 2-3).
| Input | Source | Purpose |
|---|---|---|
| `provenance_ids` | Last 2-3 step records | Prevent retry loops, understand multi-step context |
| `prior_statuses` | Previous step outcomes | Detect repeated failures |
| `prior_blockers` | Blockers from recent steps | Avoid re-attempting known blockers |
This is not memory; it is explicit, bounded context fed as structured inputs. The Planner cannot access arbitrary history; it receives only what the kernel explicitly provides.
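A minimal sketch of how the kernel might assemble that bounded window from its step records. The `StepRecord` type and the `build_provenance_window` helper are hypothetical names; only the three input keys come from the table above.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class StepRecord:
    """Hypothetical shape of one step's provenance record."""
    provenance_id: str
    status: str
    blockers: Tuple[str, ...] = ()


def build_provenance_window(records: Sequence[StepRecord], k: int = 3) -> Dict[str, List[str]]:
    """Return structured planner inputs covering only the last k steps."""
    # records[-0:] would return everything, so guard k <= 0 explicitly.
    recent = list(records[-k:]) if k > 0 else []
    return {
        "provenance_ids": [r.provenance_id for r in recent],
        "prior_statuses": [r.status for r in recent],
        "prior_blockers": [b for r in recent for b in r.blockers],
    }
```

Because the window is rebuilt from the ledger on every evaluation, the Planner itself stays stateless.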
Planner Consumes¶
- Step intent (what we asked for)
- Transcript channel (what the agent said/did)
- Workspace diff summary (filesystem changes)
- Validation results (tests, lints)
- Provenance window (last K step records, for multi-step context)
Planner Emits¶
| Output | Description |
|---|---|
| status | success / partial / blocked / unsafe / needs_human |
| blockers | What prevented completion |
| side_paths | Discoveries or alternatives found |
| risk_flags | Concerns requiring attention |
| next_step | Proposed next step ID |
| review_entries | Items for human review queue |
This is the only decision-making locus in the system. Sub-agents are performers; the Planner is the evaluator.
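The kernel could validate the Planner's output before acting on it. The sketch below assumes the Planner's response has already been parsed into a dictionary; `PlannerResult`, `parse_planner_output`, and the convention that a missing next_step means STOP are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set

ALLOWED_STATUSES = {"success", "partial", "blocked", "unsafe", "needs_human"}


@dataclass
class PlannerResult:
    status: str
    next_step: str
    blockers: List[str] = field(default_factory=list)
    side_paths: List[str] = field(default_factory=list)
    risk_flags: List[str] = field(default_factory=list)
    review_entries: List[str] = field(default_factory=list)


def parse_planner_output(raw: Dict[str, Any], allowed_next_steps: Set[str]) -> PlannerResult:
    """Reject unknown statuses and next steps outside the workflow's allowed set."""
    status = raw.get("status")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown planner status: {status!r}")
    next_step = raw.get("next_step", "STOP")  # assumed default
    if next_step != "STOP" and next_step not in allowed_next_steps:
        raise ValueError(f"next_step {next_step!r} is not in the allowed set")
    return PlannerResult(
        status=status,
        next_step=next_step,
        blockers=list(raw.get("blockers", [])),
        side_paths=list(raw.get("side_paths", [])),
        risk_flags=list(raw.get("risk_flags", [])),
        review_entries=list(raw.get("review_entries", [])),
    )
```

Checking next_step in code is one way to enforce the "no open-ended branching" constraint regardless of what the model emits.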
Workflow Bytecode Model¶
Workflows are compiled intent: explicit, declarative sequences that the kernel executes step-by-step with full provenance.
They define:
- Ordered steps (opcodes)
- Which sub-agent is invoked per step
- Required validation
- Review or gating semantics
Workflows do not self-modify and do not branch implicitly. Intelligence lives in the prompts they reference; control lives in the kernel.
Opcode Layers¶
The system distinguishes two opcode surfaces:
Kernel Opcodes (Orchestration Primitives)¶
The kernel executes a small, fixed set of orchestration opcodes:
| Opcode | Description |
|---|---|
| `RUN_AGENT` | Invoke sub-agent with prompt; capture transcript + workspace |
| `RUN_VALIDATION` | Execute tests/lint/typecheck; capture results |
| `EVALUATE` | Invoke planner to assess status and determine next step |
| `GATE` | Queue for review or block for approval |
| `ROLLBACK` | Reset work branch to pre-step state (deterministic cleanup) |
| `STOP` | Halt workflow (success, failure, or needs_human) |
Workflow YAML compiles to this opcode sequence. The kernel executes opcodes; it does not interpret intent.
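The compile step can be sketched as a validation pass that turns an already-parsed workflow definition into an ordered opcode sequence. The `Opcode` enum mirrors the table above; `compile_workflow` and the (step id, opcode) tuple representation are assumptions for this example, and YAML loading is left out.

```python
from enum import Enum
from typing import Any, Dict, List, Tuple


class Opcode(Enum):
    RUN_AGENT = "RUN_AGENT"
    RUN_VALIDATION = "RUN_VALIDATION"
    EVALUATE = "EVALUATE"
    GATE = "GATE"
    ROLLBACK = "ROLLBACK"
    STOP = "STOP"


def compile_workflow(workflow: Dict[str, Any]) -> List[Tuple[str, Opcode]]:
    """Validate step ids and opcodes; return the ordered opcode sequence."""
    program: List[Tuple[str, Opcode]] = []
    seen = set()
    for step in workflow["steps"]:
        step_id = step["id"]
        if step_id in seen:
            raise ValueError(f"duplicate step id: {step_id!r}")
        seen.add(step_id)
        try:
            opcode = Opcode(step["opcode"])
        except ValueError:
            raise ValueError(f"unknown opcode in step {step_id!r}: {step['opcode']!r}") from None
        program.append((step_id, opcode))
    return program
```

Failing at compile time on an unknown opcode keeps the executable surface fixed to the six primitives listed above.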
Agent-Facing CLI Opcodes (Tool Surface)¶
The agent-facing tool surface is a set of repo-local CLI commands that sub-agents (or the kernel on their behalf) can invoke:
| Command | Purpose |
|---|---|
| `tnh context` | Print minimal repo context + ADR pointers |
| `tnh diff` | Diff stat + key hunks for current changes |
| `tnh test` | Run canonical test suite |
| `tnh lint` | Run linting checks |
| `tnh typecheck` | Run type checker |
| `tnh adr open <id>` | Open/display ADR by ID |
| `tnh adr list` | List available ADRs |
| `tnh runlog append` | Append structured entry to run log |
These commands are:
- Composable: Pipe-friendly, Unix-style
- Versioned: Live in-repo, evolve with codebase
- Self-documenting: `--help` provides all context agents need
- Testable: Normal CLI testing patterns apply
Rationale: This keeps the tool surface small, auditable, and avoids the complexity of maintaining server-based tool interfaces.
ROLLBACK Semantics¶
ROLLBACK is a deterministic cleanup opcode triggered when the Planner returns unsafe or when policy violations are detected. It is narrowly scoped to git operations:
| Trigger | ROLLBACK Action |
|---|---|
| `unsafe` status from Planner | Reset work branch to pre-step commit |
| Policy violation in diff | Discard worktree changes on work branch |
| Explicit workflow definition | Reset to specified checkpoint |
Constraints:
- ROLLBACK only affects the work branch, never main or protected branches
- ROLLBACK is deterministic: same inputs produce same git state
- ROLLBACK does not "undo" arbitrary state; it resets to a known git checkpoint
- All ROLLBACK actions are recorded in provenance
Not a time machine: ROLLBACK is a hygiene mechanism for recovering from unsafe steps, not a general-purpose "undo to any point" capability.
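One way to express these constraints in code is to have the kernel build the git commands for a rollback and refuse outright on protected branches. The helper name, the `PROTECTED_BRANCHES` set (including "master"), and the choice of `git clean -fd` to drop untracked files are assumptions; the ADR only specifies a reset to a known checkpoint on the work branch. The function returns the commands rather than running them, so they can be logged to provenance first.

```python
from typing import List

# Assumed protected set; a real deployment would load this from configuration.
PROTECTED_BRANCHES = frozenset({"main", "master"})


def rollback_commands(branch: str, checkpoint: str) -> List[List[str]]:
    """Build the git commands that reset a work branch to a known checkpoint."""
    if branch in PROTECTED_BRANCHES:
        raise PermissionError(f"refusing to roll back protected branch {branch!r}")
    if not checkpoint:
        raise ValueError("a checkpoint commit is required")
    return [
        ["git", "checkout", branch],
        ["git", "reset", "--hard", checkpoint],  # restore tracked files
        ["git", "clean", "-fd"],                 # remove untracked files and directories
    ]
```

Given the same branch and checkpoint the command list is always identical, which matches the determinism constraint above.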
Example Workflow Definition¶
    workflow: implement_adr
    version: 1
    description: "Implement an ADR with review and approval gates"
    steps:
      - id: implement
        opcode: RUN_AGENT
        agent: claude-code
        prompt: task.implement_adr.v2
        inputs: [adr_id, target_paths]
        policy: policy.workspace_safety.v1
      - id: validate
        opcode: RUN_VALIDATION
        run: [tests, lint, typecheck]
        on_fail: STOP
      - id: evaluate
        opcode: EVALUATE
        prompt: planner.evaluate_step.v1
        # Planner determines: success -> continue, blocked -> STOP, needs_human -> GATE
      - id: review
        opcode: RUN_AGENT
        agent: claude-code
        prompt: task.review_diff.v1
        inputs: [diff, adr_id]
      - id: approval
        opcode: GATE
        gate: queue_for_review
Schema Versioning¶
Workflow and prompt schemas are explicitly:
- Versioned in the schema definition
- Validated before execution
- Stored (schema version) in provenance records
Workflow as "Bytecode"¶
This framing clarifies responsibilities:
- Workflow YAML = source code (human-readable intent)
- Opcode sequence = bytecode (kernel-executable steps)
- Prompts = the "functions" called by opcodes
- Kernel = the runtime that executes bytecode
- CLI opcodes = the "syscalls" available to sub-agents
The system is auditable because you can read the workflow (what we intended), the prompts (how we instructed), and the provenance (what actually happened).
Human-in-the-Loop Model¶
Human oversight is periodic and asynchronous by default.
- Sub-agent work accumulates in a daily or periodic review journal
- Humans review diffs, planner assessments, and risk flags
- Humans decide whether to merge, revise, discard, or redirect work
Real-time approval is optional and explicit.
Gate Types¶
| Gate | Behavior |
|---|---|
| `queue_for_review` | Non-blocking; continues workflow, adds to journal |
| `requires_daily_review` | Same as above, explicit naming |
| `blocking_approval` | Halts workflow until human approves |
Daily Review Journal¶
    # TNH-Journal: 2026-01-14
    ## Completed Tasks
    - [x] Implemented ADR-AT03 refactor (3 files changed)
    - [x] Fixed type errors in text_object.py
    ## Pending Review
    - [ ] Diff: src/tnh_scholar/ai_text_processing/ (142 lines)
    - [ ] Risk flag: Breaking change to public API
    ## Planner Assessments
    implement_adr.v1 / step: implement
      Status: partial
      Blocker: "Unclear which error handling pattern to use"
      Side path: "Suggested Result type over exceptions"
    ## Provenance
    - Workflow: implement_adr.v1
    - Agent: claude-code (claude-sonnet-4)
    - Transcript: .tnh-gen/transcripts/2026-01-14-001.md
Rollback and Safety Model¶
Rollback is defined strictly via git.
- All work occurs on designated work branches
- Code-level guardrails prevent commits to main
- Recovery uses standard git operations
- No additional snapshot or sandbox system is required
This keeps the safety model simple, reliable, and aligned with standard engineering practice.
Role of tnh-gen¶
tnh-gen is the provenance and narrative substrate of the system.
It records:
- Workflow executions
- Sub-agent transcripts
- Workspace diffs
- Planner decisions
- Human review outcomes
- Human feedback events
It enables auditability, review generation, and long-term system memory without agent memory.
tnh-gen does not make decisions. It is the ledger, not the judge.
HUMAN_FEEDBACK Event Type¶
To accumulate human preferences and decisions without introducing "agent memory," the provenance ledger includes a dedicated event type:
    event_type: HUMAN_FEEDBACK
    timestamp: 2026-01-14T10:30:00Z
    workflow_id: implement_adr.v1
    step_id: review
    content:
      decision: "approved_with_changes"
      comments: "Good approach, but prefer Result type over exceptions"
      rationale: "Aligns with project error-handling patterns"
    related_items:
      - provenance_id: abc123
      - provenance_id: def456
Purpose:
- Capture human decisions, comments, and rationale from daily journal reviews
- Provide structured feedback that can inform future triage and evaluation prompts
- Enable pattern discovery over time (e.g., "human frequently overrides X classification")
- Avoid agent memory while preserving institutional knowledge in the ledger
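A sketch of how such an event might be serialized as one line of newline-delimited JSON for the ledger. The field names follow the example event above; the `human_feedback_event` helper, the use of JSON rather than YAML on disk, and the placement of related_items at the top level are assumptions.

```python
import json
from datetime import datetime, timezone
from typing import List, Optional


def human_feedback_event(
    workflow_id: str,
    step_id: str,
    decision: str,
    comments: str,
    rationale: str,
    related_ids: List[str],
    timestamp: Optional[datetime] = None,
) -> str:
    """Return a HUMAN_FEEDBACK event as a single NDJSON line (no trailing newline)."""
    ts = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    event = {
        "event_type": "HUMAN_FEEDBACK",
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "workflow_id": workflow_id,
        "step_id": step_id,
        "content": {"decision": decision, "comments": comments, "rationale": rationale},
        "related_items": [{"provenance_id": pid} for pid in related_ids],
    }
    return json.dumps(event, sort_keys=True)
```

Appending one self-contained line per event keeps the ledger append-only and easy to scan when looking for patterns in human overrides.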
Protocol Layer Specification¶
The Protocol Layer is explicitly bounded to:
| Responsibility | Description |
|---|---|
| Transcript capture | stdout/stderr capture (primary); PTY fallback for unexpected interactivity |
| Workspace capture | git diff / git status before/after |
| Progress events | Heartbeats, completion signals |
| Heartbeat monitoring | Detect stalled agents; kill and capture on timeout |
| Negative path capture | Handle hangs, prompts, crashes gracefully |
The Protocol Layer is NOT responsible for:
- Enforcing prompt contracts (post-hoc verification instead)
- Ensuring determinism (we claim "bounded," not "reproducible")
- Parsing structured output from agents (transcripts are semantic, not structured)
- Capturing agent UI panes (VS Code panels, Claude UI; too fragile)
Rationale: This avoids architectural collapse due to CLI idiosyncrasies across different agent tools.
Artifact Contract¶
The integration point between conductor and sub-agents is durable repo artifacts, not UI scraping:
- Required per run: AGENTLOG entry + run summary + git diffs
- Artifacts are the stable handoff for meta-agent evaluation and journaling
- CLI invocations are captured in provenance alongside artifacts
This ensures the system remains portable across different agent UIs and versions.
Negative Path Handling¶
The kernel must handle failure modes that prevent normal completion:
| Failure Mode | Detection | Response |
|---|---|---|
| Agent hang | No stdout/stderr output for N seconds | Kill process, capture transcript tail, mark blocked |
| Interactive prompt | Detected Y/N, auth, 2FA, confirmation patterns | Kill process, mark blocked, flag for review |
| Tool crash | Non-zero exit code | Capture stderr, mark blocked with cause |
| Timeout | Wall-clock limit exceeded | Kill process, capture state, mark blocked |
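The detection column can be sketched as a single classification function. The cause labels, the default limits, and especially the prompt-detection patterns are hypothetical; real agent CLIs may phrase interactive prompts differently, so the pattern list would need tuning during the spike.

```python
import re
from typing import Optional

# Assumed patterns for interactive prompts (Y/N, auth, 2FA, confirmation).
_INTERACTIVE = re.compile(
    r"\[y/n\]|\(y/n\)|password:|verification code|continue\?", re.IGNORECASE
)


def classify_failure(
    exit_code: Optional[int],
    idle_seconds: float,
    elapsed_seconds: float,
    transcript_tail: str,
    idle_limit: float = 60.0,
    wall_clock_limit: float = 1800.0,
) -> Optional[str]:
    """Return a cause label for a failed or stuck run, or None if nothing is wrong."""
    if exit_code is None:  # process is still running
        lines = transcript_tail.splitlines()
        last_line = lines[-1] if lines else ""
        if _INTERACTIVE.search(last_line):
            return "interactive_prompt"
        if idle_seconds >= idle_limit:
            return "agent_hang"
        if elapsed_seconds >= wall_clock_limit:
            return "timeout"
        return None
    return "tool_crash" if exit_code != 0 else None
```

Each non-None label corresponds to a row in the table, and in every case the response is to mark the step blocked and record the cause.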
Heartbeat Monitor¶
The kernel implements a heartbeat monitor with the following behavior:
- Monitor interval: Configurable timeout (default: 60 seconds)
- Signal: Any stdout/stderr output or filesystem event resets the timer
- On timeout:
  - Kill the sub-agent process
  - Capture last N lines of transcript
  - Write `blocked` provenance record with cause classification
  - Queue for human review
Rule: If no output or events for the configured interval, the kernel kills the process, marks the step as blocked, captures the transcript tail, and queues for review.
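A minimal sketch of the heartbeat rule using only the standard library. It treats any stdout/stderr line as a heartbeat and kills the process after a period of silence. Filesystem events, provenance writes, and the review queue are omitted; the function name, the returned dictionary, and the cause labels are assumptions.

```python
import queue
import subprocess
import threading


def run_with_heartbeat(cmd, inactivity_timeout=60.0, tail_lines=20):
    """Run cmd; kill it if no output arrives for inactivity_timeout seconds."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()

    def pump():
        # Forward each output line to the queue; None signals end of stream.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    transcript = []
    cause = None
    while True:
        try:
            line = lines.get(timeout=inactivity_timeout)  # timer resets per line
        except queue.Empty:
            proc.kill()
            cause = "agent_hang"
            break
        if line is None:
            break
        transcript.append(line.rstrip("\n"))

    exit_code = proc.wait()
    reader.join(timeout=1.0)
    if cause is None and exit_code != 0:
        cause = "tool_crash"
    return {
        "status": "blocked" if cause else "completed",
        "cause": cause,
        "exit_code": exit_code,
        "transcript_tail": transcript[-tail_lines:],
    }
```

Reading output on a background thread lets the main loop use a simple queue timeout as the inactivity timer, without polling.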
Prompt Library as Standard Library¶
The prompt library is the standard library of system behavior. Just as a programming language's stdlib provides reusable functions, the prompt library provides reusable behavior definitions.
Prompts are named, versioned artifacts organized by type:
Prompt Types¶
| Type | Namespace | Purpose |
|---|---|---|
| Task | `task.*` | Instructions for sub-agents (the actual work) |
| Policy | `policy.*` | Allowed/forbidden behaviors, workspace constraints |
| Evaluation | `planner.*` | Criteria for assessing outcomes |
| Triage | `triage.*` | Route tasks to workflows |
| Risk | `risk.*` | Classify changes by risk level |
| Journal | `journal.*` | Generate human-readable summaries |
| Agent | `agent.*` | Agent capability descriptions for routing |
Example: Task Prompt¶
    prompt: task.implement_adr
    version: 2
    type: task
    inputs:
      - name: adr_id
        type: string
      - name: target_paths
        type: list[path]
    instruction: |
      Implement the specified ADR according to its design decisions.
      Focus on:
      - Following existing code patterns in the target paths
      - Minimal changes to achieve the ADR's goals
      - Clear commit messages referencing the ADR
      Do NOT:
      - Expand scope beyond what the ADR specifies
      - Refactor unrelated code
      - Add features not in the ADR
    outputs:
      - name: diff
        type: unified_diff
Example: Policy Prompt¶
    prompt: policy.workspace_safety
    version: 1
    type: policy
    description: |
      Defines allowed and forbidden workspace operations.
    allowed_paths:
      - "src/tnh_scholar/**"
      - "tests/**"
      - "docs/**"
    forbidden_paths:
      - "*.lock"
      - "pyproject.toml"
      - ".github/**"
      - "*.env*"
    forbidden_operations:
      - "Creating new top-level directories"
      - "Deleting existing files without explicit instruction"
      - "Modifying CI/CD configuration"
    # Kernel enforces via post-hoc diff check
Example: Evaluation Prompt¶
    prompt: planner.evaluate_step
    version: 1
    type: evaluation
    instruction: |
      Evaluate the sub-agent's work based on transcript and workspace diff.
    inputs:
      - step_intent: "What we asked the agent to do"
      - transcript: "Full conversational output"
      - diff_summary: "Files changed, lines added/removed"
      - validation_results: "Test/lint/typecheck output"
    criteria:
      success: |
        The diff directly addresses the step intent.
        Tests pass. No new lint warnings.
        Agent reported completion without uncertainty.
      partial: |
        Some progress toward intent, but not complete.
        Agent reported blockers or expressed uncertainty.
        Tests pass but work is incomplete.
      blocked: |
        Agent could not proceed.
        Hard blocker: missing information, architectural decision needed,
        dependency issue, or unclear requirements.
      needs_human: |
        Changes touch sensitive areas (public API, security, config).
        Agent expressed low confidence.
        Risk flags present in diff.
      unsafe: |
        Policy violations detected in diff.
        Forbidden paths modified.
        Scope expansion beyond intent.
    outputs:
      status: "success | partial | blocked | needs_human | unsafe"
      blockers: "List of blocking issues"
      side_paths: "Discovered alternatives or opportunities"
      risk_flags: "Concerns requiring attention"
      next_step: "Proposed next step ID or STOP"
      review_entries: "Items for human review queue"
Example: Triage Prompt¶
    prompt: triage.route_task
    version: 1
    type: triage
    instruction: |
      Given a task description, determine which workflow applies.
    routing_rules: |
      - If task references an ADR and asks to implement it -> implement_adr
      - If task is a bug fix with clear reproduction -> fix_bug
      - If task asks for code review -> review_code
      - If task asks for analysis or research (no code changes) -> research
      - If task involves documentation only -> update_docs
      - If unclear or ambiguous -> needs_human (queue for clarification)
    outputs:
      workflow_id: "Selected workflow"
      confidence: "high | medium | low"
      reasoning: "Why this workflow was selected"
Example: Risk Assessment Prompt¶
    prompt: risk.assess_changes
    version: 1
    type: risk
    instruction: |
      Classify the risk level of proposed changes.
    criteria:
      high_risk: |
        - Changes to authentication, authorization, or security code
        - Modifications to public API signatures
        - Database schema changes
        - Changes affecting data integrity or persistence
        - Dependency version changes
      medium_risk: |
        - Breaking changes to internal APIs
        - New dependencies added
        - Configuration changes
        - Changes to CLI interfaces
      low_risk: |
        - Documentation changes
        - Test additions or fixes
        - Internal refactoring with no API changes
        - Comment and docstring updates
    outputs:
      risk_level: "high | medium | low"
      risk_factors: "List of specific concerns"
      recommended_review: "blocking | daily | none"
Example: Journal Generation Prompt¶
    prompt: journal.generate_daily
    version: 1
    type: journal
    instruction: |
      Generate a human-readable daily journal from workflow executions.
    inputs:
      - executions: "Today's workflow runs with outcomes"
      - assessments: "Planner evaluations"
      - review_items: "Queued review entries"
    format: |
      # TNH-Journal: {date}
      ## Completed Tasks
      [List tasks with status=success, one line each with file counts]
      ## In Progress
      [List tasks with status=partial, noting blockers]
      ## Pending Review
      [List items queued for review, with paths and risk levels]
      ## Blockers & Decisions Needed
      [List items with status=blocked or needs_human]
      ## Planner Assessments
      [Summarize key evaluations, especially for non-success outcomes]
      ## Provenance
      [Workflow IDs, agent versions, transcript paths]
Verification is Post-Hoc¶
Prompts define intent and expectations. Enforcement happens after execution:
- Diff the filesystem before/after
- Check for policy violations (forbidden paths, operations)
- Validate outputs against expected structure
- Flag discrepancies for human review
No sandbox in Phase 1. Rely on branch isolation + diff-policy + human review.
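The "diff the filesystem before/after" step needs a list of touched paths. One possible source is the output of `git status --porcelain`, where each line is a two-character status code, a space, and a path (renames appear as `old -> new`). The sketch below parses that text; the `changed_paths` helper is hypothetical, and it deliberately ignores git's quoting of paths with unusual characters.

```python
from typing import List


def changed_paths(porcelain: str) -> List[str]:
    """Extract touched paths from `git status --porcelain` (v1) output."""
    paths = set()
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY <path>"
        status, path = line[:2], line[3:]
        if status[0] in "RC" and " -> " in path:
            # Renames/copies list both names; both count as touched.
            old, new = path.split(" -> ", 1)
            paths.update((old, new))
        else:
            paths.add(path)
    return sorted(paths)
```

The raw output must not be stripped before parsing, because a leading space is part of the status code for unstaged modifications.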
Policy Enforcement Model¶
Policy prompts are English definitions of allowed and forbidden behaviors. However, enforcement is code-based:
| Layer | Role |
|---|---|
| Policy prompts | Human-readable definitions of constraints (the "what") |
| Kernel enforcement | Code that checks diffs against policy rules (the "how") |
Enforcement flow:
- Sub-agent completes step, producing workspace diff
- Kernel parses policy prompt's `allowed_paths`, `forbidden_paths`, `forbidden_operations`
- Kernel checks diff against these rules (deterministic code check)
- If violations detected: kernel auto-marks status as `unsafe` before Planner evaluation
- Planner receives violation report as input; cannot override kernel safety decisions
Key principle: The kernel can autonomously mark a step unsafe when forbidden paths are touched. This happens before, and independent of, Planner/human review. Policy definitions are English; enforcement is code.
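The deterministic path check in the enforcement flow can be sketched with the standard library's `fnmatch` module. This is an illustration under assumptions: the `check_policy` helper is hypothetical, forbidden patterns are given precedence over allowed ones, and `fnmatch` lets `*` match across `/`, so `*.lock` also matches lock files in subdirectories. A real implementation might want stricter glob semantics, and `forbidden_operations` are not covered here.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def check_policy(
    changed: Iterable[str], allowed: Iterable[str], forbidden: Iterable[str]
) -> List[str]:
    """Return a violation message for each changed path that breaks the policy."""
    allowed = list(allowed)
    forbidden = list(forbidden)
    violations = []
    for path in changed:
        if any(fnmatchcase(path, pattern) for pattern in forbidden):
            violations.append(f"forbidden path modified: {path}")
        elif not any(fnmatchcase(path, pattern) for pattern in allowed):
            violations.append(f"path outside allowed set: {path}")
    return violations
```

A non-empty result would cause the kernel to mark the step unsafe before the Planner is invoked, with the messages passed along as the violation report.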
Anti-Goals and Constraints¶
To prevent scope creep, the following are explicit anti-goals:
Anti-Goal: No Heavyweight Tool Servers¶
Don't build or maintain heavyweight "tool servers" (like MCP) unless there is a clear win.
Prefer "tools as commands" first. Only escalate to server-based tooling when:
- Streaming/async is required
- Stateful service is unavoidable
- Performance demands it
Anti-Goal: No UI Scraping¶
Don't attempt to capture agent UI panes (VS Code panels, Claude UI). This is fragile and non-portable. Rely on artifacts and transcripts instead.
Anti-Goal: No Agent Memory¶
Don't introduce conversational memory that persists across runs. Use explicit provenance windows instead.
Anti-Goal: No API Codex Runner¶
Don't build API-based Codex execution surfaces. Phase-0 targets CLI-based invocation (Codex CLI) for consistency with the "tools as commands" principle.
Implementation Roadmap¶
Phase 0: Protocol Layer Spike (De-risking)¶
Goal: Prove headless agent invocation + transcript capture works reliably
Rationale: The Protocol Layer is the highest-risk component. If we cannot reliably capture stdout/stderr and git diffs from headless agent sessions, the entire architecture is blocked. This spike de-risks before committing to full implementation. API-based Codex runner experiments are superseded; Phase-0 now explicitly targets Codex CLI as the production execution surface.
Note on PTY: Early spikes explored PTY capture for transcript completeness. Current findings show both Claude Code CLI and Codex CLI emit bounded, non-interactive stdout/stderr in normal operation. PTY is now a fallback only: it is retained to detect/kill unexpected interactive prompts (auth, Y/N), but is not a primary dependency.
Spike Scope:
- Headless invocation of Claude Code CLI (claude --print) and Codex CLI
- Transcript capture (raw + normalized)
- Git diff capture before/after
- Heartbeat monitoring + inactivity timeout kill
- Minimal provenance event emission into tnh-gen
- Work-branch isolation for each run
Target Surfaces:
| Agent | Invocation Method | Capture Method |
|---|---|---|
| Claude Code CLI | claude --print mode | stdout/stderr (PTY fallback) |
| Codex CLI | CLI invocation | stdout/stderr (PTY fallback; confirm in spike) |
Spike Deliverable:
# Minimal CLI that proves the capture chain
tnh-conductor-spike run --agent claude-code --task "List files in src/"
# Outputs:
# transcript.md (full session)
# diff.patch (git changes, if any)
# run.json (metadata)
# events.ndjson (provenance stream)
Success Criteria (Pass/Fail):
- Headless invocation completes without manual interaction
- Full transcript captured (not truncated)
- Git diff accurately reflects workspace changes
- Provenance record written with correct metadata
- Handles agent hang: kills process after timeout
- Captures last N lines of transcript on failure
- Writes blocked provenance record with cause classification on failure
- Emits progress events independent of filesystem changes
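The hang-handling criteria above (kill after timeout, keep the last N transcript lines, classify the cause) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the function name `run_with_inactivity_timeout`, the result keys, and the cause labels `inactivity_timeout` and `nonzero_exit` are hypothetical, and treating any non-zero exit as `blocked` is an illustrative choice rather than something the spike has decided.

```python
import queue
import subprocess
import threading
from collections import deque


def run_with_inactivity_timeout(cmd, inactivity_s=30.0, tail_lines=20):
    """Run cmd headlessly; kill it if it emits no output for inactivity_s seconds."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # never wait on interactive input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into a single transcript
        text=True,
    )
    lines = queue.Queue()

    def pump():
        # Reader thread: forward each output line, then a sentinel at end of stream.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    transcript = []
    status, cause = "completed", None
    while True:
        try:
            line = lines.get(timeout=inactivity_s)
        except queue.Empty:
            # No output within the inactivity window: treat the agent as hung.
            proc.kill()
            status, cause = "blocked", "inactivity_timeout"
            break
        if line is None:
            break
        transcript.append(line)

    proc.wait()
    if status == "completed" and proc.returncode != 0:
        status, cause = "blocked", "nonzero_exit"
    return {
        "status": status,
        "cause": cause,
        "returncode": proc.returncode,
        "transcript": "".join(transcript),
        "tail": list(deque(transcript, maxlen=tail_lines)),
    }
```

Each line read from the child process acts as a heartbeat, so progress is detected from output rather than from filesystem changes. A real controller would likely also need an overall wall-clock limit, which this sketch omits.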
Spike Deliverables:
| Artifact | Description |
|---|---|
| tnh_conductor_spike.py | CLI module implementing the spike |
| transcript.md | Full stdout/stderr session log |
| diff.patch | Git changes (if any) |
| run.json | Run metadata |
| events.ndjson | Event stream (newline-delimited JSON) |
| SPIKE_REPORT.md | Findings, gotchas, recommendations |
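To make the artifact set concrete, the following is a minimal sketch of how a run directory could be written. The file names come from the table above; everything else (the `emit_event` and `write_run_artifacts` helpers, the `ts` and `type` event fields, and the metadata keys) is a hypothetical shape chosen for illustration, since this ADR does not fix the schemas.

```python
import json
import time
from pathlib import Path


def emit_event(events_path: Path, event_type: str, **fields) -> dict:
    """Append one JSON object per line to the NDJSON event stream."""
    # "ts" and "type" are assumed field names, not a defined schema.
    event = {"ts": time.time(), "type": event_type, **fields}
    with events_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return event


def write_run_artifacts(run_dir: Path, transcript: str, diff: str, metadata: dict) -> None:
    """Write the per-run artifacts named in the deliverables table."""
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "transcript.md").write_text(transcript, encoding="utf-8")
    (run_dir / "diff.patch").write_text(diff, encoding="utf-8")  # empty if no changes
    (run_dir / "run.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
```

Appending one line per event keeps the stream readable even if a run is killed midway, since every completed line is already a valid JSON object.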
Decision Point: If spike fails, evaluate alternative approaches (SDK, API-only, different agent surface) before proceeding.
Phase 1: Headless Controller (Walking Skeleton)¶
Goal: CLI command that wraps a claude-code session with full kernel capabilities
Deliverables:
- tnh-conductor CLI entry point
- stdout/stderr capture for transcripts (PTY fallback for interactivity)
- Git diff capture for workspace changes
- Work-branch creation and management
- Provenance indexing
Walking Skeleton:
tnh-conductor --task "Summarize current progress"
# Creates work branch: task/summarize-progress-001
# Sends to Claude Code via CLI
# Captures transcript + diff
# Records in provenance store
# Returns to original branch
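The branch handling in the walking skeleton (create a work branch, run, return to the original branch) can be sketched as a context manager around standard git commands. This is an illustrative sketch only: `git` and `work_branch` are hypothetical helper names, and the injectable `run` parameter exists so the sequence can be exercised without a real repository.

```python
import subprocess
from contextlib import contextmanager


def git(*args, cwd="."):
    """Run a git command and return its stdout; raises on non-zero exit."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


@contextmanager
def work_branch(name, cwd=".", run=git):
    """Create and switch to a work branch, then return to the original branch."""
    # Remember where we started ("HEAD" is returned if the checkout is detached).
    original = run("rev-parse", "--abbrev-ref", "HEAD", cwd=cwd).strip()
    run("checkout", "-b", name, cwd=cwd)
    try:
        yield original
    finally:
        # Always switch back, even if the agent run raised.
        run("checkout", original, cwd=cwd)
```

Inside the `with` block a controller would invoke the agent and capture the diff. Note that uncommitted changes are not tied to a branch, so a real implementation would need to commit (or otherwise persist) the agent's changes on the work branch before switching back; that step is omitted here.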
Success Metric: Output recorded in provenance without manual copy-paste; work isolated on branch
Phase 2: Planner Evaluation Loop¶
Goal: Implement the decision-making component
Deliverables:
- Planner that consumes transcript + diff + validation
- Status classification logic
- Next-step determination
- Review queue generation
Phase 3: Workflow Engine¶
Goal: Declarative workflow execution
Deliverables:
- YAML workflow parser
- Step sequencing
- Validation integration
- Gate semantics
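As a rough illustration of step sequencing and gate semantics, the sketch below interprets an already-parsed workflow as a list of opcode steps. The step shape (`op`, `name`), the `"ok"` outcome convention, the state names, and the `run_workflow` function are all assumptions made for this example; the actual schema and opcode semantics are deferred to ADR-OA04. YAML parsing is skipped so the sketch stays within the standard library.

```python
def run_workflow(steps, handlers, approved_gates=frozenset()):
    """Execute steps in order; return (final_state, log)."""
    log = []
    for index, step in enumerate(steps):
        op = step["op"]
        if op == "STOP":
            log.append((index, op, "stopped"))
            return "stopped", log
        if op == "GATE":
            # A gate halts the workflow until a human has approved it.
            if step["name"] not in approved_gates:
                log.append((index, op, "awaiting_review"))
                return "awaiting_review", log
            log.append((index, op, "approved"))
            continue
        # Other opcodes (e.g. RUN_AGENT, VALIDATE, EVALUATE) go to injected handlers.
        outcome = handlers[op](step)
        log.append((index, op, outcome))
        if outcome != "ok":
            return "halted", log
    return "completed", log
```

The interpreter itself carries no task knowledge: handlers and the prompts they reference would supply the behavior, while the loop only sequences steps and enforces gates.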
Phase 4: Prompt Library¶
Goal: First-class prompt management
Deliverables:
- Prompt versioning and storage
- Template rendering
- Post-hoc contract verification
Phase 5: Multi-Agent Support¶
Goal: Coordinate multiple agent types
Deliverables:
- Codex integration
- Gemini integration (optional)
- Agent capability routing
Phase 6: VS Code Integration¶
Goal: Visual control surface
Deliverables:
- Provenance graph visualization
- Real-time workflow monitoring
- Review journal browser
Future: Prompt Regression Testing¶
Since prompts are "code" in this architecture, prompt changes can shift system behavior in unexpected ways. A future capability (Phase 7+) is prompt regression testing:
Concept:
- Maintain a test bench of recorded transcripts, diffs, and expected classifications ("golden runs")
- When evaluation or policy prompts are updated, run them against golden runs
- Detect if prompt changes shift classifications unexpectedly (e.g., previously success, now partial)
- Flag regressions for human review before deploying prompt updates
Goal: Treat prompt versioning with the same rigor as code versioning, so that changes are testable and regressions are detectable.
This is a planned capability, not Phase 1 scope.
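The golden-run concept can be sketched as a comparison between recorded classifications and what an updated prompt produces. This is a hedged illustration: `GoldenRun`, `find_regressions`, and the `classify(transcript, diff)` callable are hypothetical names, and the callable stands in for whatever Planner invocation would apply the updated evaluation prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenRun:
    # One recorded run with the classification it is expected to receive.
    run_id: str
    transcript: str
    diff: str
    expected_status: str


def find_regressions(golden_runs, classify):
    """Return (run_id, expected, actual) for each run whose classification changed."""
    regressions = []
    for run in golden_runs:
        actual = classify(run.transcript, run.diff)
        if actual != run.expected_status:
            regressions.append((run.run_id, run.expected_status, actual))
    return regressions
```

A non-empty result would be flagged for human review before the prompt update is deployed, rather than failing automatically, because some classification shifts may be intended.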
Consequences¶
Positive¶
- Clear separation of concerns: Conductor coordinates; agents perform; humans decide
- Robust to agent quirks: Dual-channel output handles diverse agent behaviors
- Scales with complexity: Workflow model supports arbitrarily complex sequences
- Provider-agnostic: Same architecture works with Claude Code, Codex, Gemini
- Auditable by design: Full provenance trail for every action
- Aligns with engineering practice: Git branches, async review, standard tooling
- Smaller tool surface: CLI opcodes are simpler than server-based tool interfaces
- Better provenance: CLI invocations + artifacts are first-class, versioned, auditable
- Less brittleness: No server schemas to drift; tools are self-documenting via --help
Tradeoffs¶
- Less "magical" autonomy: Explicit workflows over emergent behavior
- Slower than unsupervised agents: Human review adds latency
- Requires disciplined workflow definitions: No ad-hoc task discovery
These tradeoffs are intentional. The goal is reliable, auditable progress, not maximum speed.
Risks and Mitigations¶
| Risk | Mitigation |
|---|---|
| CLI wrapping fragility | Protocol Layer only captures transcript + diffs; no structured parsing |
| Planner evaluation quality | Use higher-capability model; iterate on evaluation prompts |
| Workflow rigidity | Support workflow versioning and evolution; human can redirect |
| Review backlog | Risk flags prioritize urgent items; blocking gates for critical paths |
Open Questions¶
1. Claude Code CLI Wrapping¶
Question: Can Claude Code be reliably wrapped for headless operation?
Options: --print mode (confirmed), SDK approach, PTY fallback
Decision needed by: Phase 1
2. Planner Model Selection¶
Question: Should the Planner use the same model as sub-agents or a different one?
Options: Same (consistency), cheaper (cost), more capable (quality)
Decision needed by: Phase 2
3. Progress Signal Thresholds¶
Question: What constitutes a "stalled" agent?
Options: Time-based, event-based, agent-reported
Decision needed by: Phase 2
Terminology¶
| Term | Definition |
|---|---|
| Conductor | Workflow coordinator and supervisor (tnh-conductor) |
| Kernel | Minimal code layer: enforcement, capture, opcode execution |
| Prompt-Program | Versioned prompts that define system behavior (the "English code") |
| Sub-Agent | External AI system performing bounded tasks (Claude Code, Codex) |
| Planner | Trusted evaluator model that interprets outcomes |
| Kernel Opcode | Orchestration primitive: RUN_AGENT, VALIDATE, EVALUATE, GATE, STOP |
| CLI Opcode | Agent-facing tool: tnh test, tnh diff, tnh context, etc. |
| Transcript Channel | Conversational output from sub-agent |
| Workspace Channel | Filesystem effects (diffs, new files) |
| Daily Review | Periodic human gating and approval |
| Provenance Ledger | tnh-gen's record of all actions and decisions |
| Policy Prompt | English definition of allowed/forbidden behaviors |
| Evaluation Prompt | English criteria for planner status classification |
| Artifact Contract | Required outputs per run: AGENTLOG + summary + diffs |
ADR Roadmap¶
This strategy ADR establishes the foundation. Implementation details are captured in follow-on ADRs.
Current / Active¶
| ADR | Title | Scope |
|---|---|---|
| ADR-OA02 | Phase 0 Protocol Spike | Headless capture contract and safety controls |
| ADR-OA03 | Agent Runner Architecture | Kernel + adapter pattern, runner contracts |
| ADR-OA03.1 | Claude Code Runner | --print mode, stdout/stderr-first capture |
| ADR-OA03.2 | Codex Runner (API) | Responses API approach (historical) |
| ADR-OA03.3 | Codex CLI Runner | CLI-based execution via codex exec; supersedes OA03.2 |
| ADR-OA04 | Workflow Schema + Opcode Semantics | YAML format, opcode definitions, ROLLBACK semantics |
| ADR-OA05 | Prompt Library Specification | Prompt artifact format, versioning, template rendering |
| ADR-OA06 | Planner Evaluator Contract | Input/output schemas, contradiction checks, provenance window |
| ADR-OA07 | Diff-Policy + Safety Rails | Allowed/forbidden paths, dependency changes, escalation rules |
| ADR-OA08 | Prompt Regression Testing Harness | Golden runs, classification drift detection (future) |
Historical / Superseded¶
| ADR | Title | Status |
|---|---|---|
| ADR-OA01 | TNH-Conductor Strategy (v1) | Superseded by this document |
| ADR-OA03.2 | API-Based Codex Runner | Superseded by ADR-OA03.3 (CLI approach) |
Each ADR will be created as implementation progresses through the phases defined in this strategy.
Related ADRs¶
Related Existing ADRs¶
- ADR-OA01: Original strategy (superseded by this document)
- adr-pv01-provenance-tracing-strat.md: Foundation provenance infrastructure
- adr-tg01-cli-architecture.md: CLI patterns for tnh-gen
- adr-at04-ai-text-processing-platform-strat.md: Related orchestration patterns
References¶
- Armin Ronacher, "Agents are Hard": https://lucumr.pocoo.org/2025/11/21/agents-are-hard/
- Mario Zechner, "What if you don't need MCP?": https://mariozechner.at/posts/2025-11-02-what-if-you-dont-need-mcp/
- Claude Code CLI documentation: https://code.claude.com/docs/en/cli-reference
- OpenAI Codex CLI documentation: https://developers.openai.com/codex/cli
Summary¶
tnh-conductor is a prompt-program runtime: a minimal enforcement kernel that executes workflows defined in English.
The system is written in English. Code exists for capture, enforcement, and execution, not for encoding behavior.
Key principles:
- Kernel is minimal (~500 lines): branch management, stdout/stderr capture, diff capture, policy enforcement, opcode execution
- Behavior lives in prompts: task instructions, policies, evaluation criteria, triage rules, journal formats
- Workflows are bytecode: YAML compiles to opcodes; intelligence lives in the prompts they reference
- CLI opcodes over tool servers: prefer small, composable repo-local commands as the agent tool surface
- Artifact contract: CLI invocations + durable artifacts are the stable handoff, not UI scraping
- Enforcement is post-hoc: verify diffs against policies after execution, not during
- Humans remain authors: daily review, blocking gates for high-risk changes, full provenance trail
tnh-conductor enables semi-autonomous progress without surrendering control. It coordinates, listens, evaluates, and records, while humans remain responsible authors of the system.
This ADR establishes the strategic foundation for TNH-Scholar's agent coordination system: a prompt-program runtime enabling bounded, auditable, provenance-driven development workflows with CLI opcode tooling.