
OA07.1 PR-8 Bootstrap Headless Entry Plan

Purpose: define the implementation plan for PR-8 in TODO.md: one maintained local/headless entry point that loads a workflow, creates a managed worktree context, executes the maintained kernel end to end, and returns canonical run outputs without pulling GitHub review automation into the bootstrap slice.

This note is intentionally detailed so the implementation can be reviewed against repo standards before coding starts.

Standards Check

The implementation plan below is constrained by the current project rules in:

The main standards implications are:

  • application-layer entry points must stay thin and call typed services rather than accumulate orchestration logic inline,
  • domain/service orchestration belongs in typed services and protocols, with infrastructure isolated behind adapters,
  • no dict-shaped ad hoc app logic,
  • config-at-init and params-per-call should be explicit,
  • mutable execution must stay in the managed worktree root, while canonical provenance stays in the run directory,
  • git safety rules prohibit destructive cleanup shortcuts; the bootstrap path must rely on the managed worktree lifecycle already landed in PR-7.

PR-8 Goal

PR-8 should land exactly one maintained operational loop:

  1. load one workflow definition from disk,
  2. compose the maintained kernel dependencies,
  3. create the managed worktree context from a committed base ref,
  4. execute the workflow end to end,
  5. leave canonical artifacts, metadata, manifests, events, and final state in the run directory,
  6. return a stable machine-readable run summary from one local/headless entry point.

Out of scope:

  • commit creation,
  • push,
  • PR open/update,
  • richer OA05 prompt-library integration,
  • deeper OA06 planner intelligence beyond a bootstrap-safe deterministic default,
  • additional rollback checkpoints beyond pre_run.

Current Code Reality

PR-7 landed the workspace runtime boundary, but the maintained codebase still has no real composition root for the OA07.1 path.

What already exists:

  • src/tnh_scholar/agent_orchestration/kernel/service.py
  • src/tnh_scholar/agent_orchestration/kernel/adapters/workflow_loader.py
  • src/tnh_scholar/agent_orchestration/workspace/service.py
  • src/tnh_scholar/agent_orchestration/runners/adapters/codex_cli.py
  • src/tnh_scholar/agent_orchestration/runners/adapters/claude_cli.py
  • src/tnh_scholar/agent_orchestration/validation/service.py
  • src/tnh_scholar/agent_orchestration/run_artifacts/filesystem_store.py

What is still missing:

  • one maintained application-layer service that wires those pieces together,
  • one maintained CLI entry point for local/headless execution,
  • one stable bootstrap configuration model,
  • one stable result model for the headless entry surface,
  • a real end-to-end test that exercises the maintained composition using the real git worktree service.

Proposed Design

1. Add a maintained application-layer bootstrap package

Create a new thin application-layer package under:

  • src/tnh_scholar/agent_orchestration/app/

Planned contents:

  • models.py
  • service.py
  • __init__.py

Reasoning:

  • This keeps the CLI thin, which matches ADR-OS01 and the repo’s “thin app / orchestration in service layer” guidance.
  • It prevents the maintained CLI from becoming an ad hoc dependency assembler with mixed parsing, config, and runtime logic.

2. Use explicit typed config and params

Planned models:

  • HeadlessBootstrapConfig
  • HeadlessBootstrapParams
  • HeadlessBootstrapResult

Expected responsibilities:

  • HeadlessBootstrapConfig: construction-time values such as repo_root, runs_root, workspace_root, base_ref, branch prefix, optional runner executable overrides, and bootstrap-safe planner/gate defaults.
  • HeadlessBootstrapParams: per-call inputs such as the workflow path to load.
  • HeadlessBootstrapResult: stable summary fields such as run_id, workflow_id, status, run_directory, metadata_path, final_state_path, and the persisted workspace_context.

This follows the project’s Settings/Config/Params split without overbuilding an environment-settings layer before it is needed.
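A minimal sketch of how the three models could look, assuming plain frozen dataclasses. The field types, the default values, the two runner-override field names, and the WorkspaceContextSummary stand-in are illustrative assumptions, not the repo's actual definitions:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class HeadlessBootstrapConfig:
    """Construction-time values, fixed for the life of the service."""

    repo_root: Path
    runs_root: Path
    workspace_root: Path
    base_ref: str = "HEAD"  # assumed default, not taken from the plan
    branch_prefix: str = "bootstrap"  # assumed default, not taken from the plan
    codex_executable: Optional[Path] = None  # hypothetical runner override
    claude_executable: Optional[Path] = None  # hypothetical runner override


@dataclass(frozen=True)
class HeadlessBootstrapParams:
    """Per-call inputs."""

    workflow_path: Path


@dataclass(frozen=True)
class WorkspaceContextSummary:
    """Hypothetical stand-in for the persisted workspace context."""

    worktree_path: Path
    branch_name: str
    base_ref: str


@dataclass(frozen=True)
class HeadlessBootstrapResult:
    """Stable summary returned by the headless entry surface."""

    run_id: str
    workflow_id: str
    status: str
    run_directory: Path
    metadata_path: Path
    final_state_path: Path
    workspace_context: WorkspaceContextSummary
```

Freezing the models keeps config immutable after construction, which makes the config-at-init versus params-per-call boundary hard to blur by accident.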

3. Keep kernel contracts unchanged where possible

The preferred PR-8 shape is composition, not another contract rewrite.

The application service should assemble the existing maintained collaborators:

  • YamlWorkflowLoader
  • KernelRunService
  • FilesystemRunArtifactStore
  • GitWorktreeWorkspaceService
  • DelegatingRunnerService
  • ValidationService

New code should avoid widening the kernel surface unless an actual bootstrap blocker appears during implementation.
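The composition idea can be sketched as constructor injection over narrow protocols. Everything named here (the three *Port protocols, their method names, RunSummary, and the service signature) is a hypothetical stand-in chosen so the sketch runs on its own; the real collaborators listed above will have their own signatures:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class WorkflowLoaderPort(Protocol):
    """Stand-in for the role played by YamlWorkflowLoader."""

    def load(self, path: Path) -> str: ...  # returns a workflow id in this sketch


class WorkspacePort(Protocol):
    """Stand-in for the role played by GitWorktreeWorkspaceService."""

    def create_context(self, base_ref: str) -> Path: ...  # returns the worktree path


class KernelPort(Protocol):
    """Stand-in for the role played by KernelRunService."""

    def run(self, workflow_id: str, worktree: Path) -> str: ...  # returns a status


@dataclass(frozen=True)
class RunSummary:
    workflow_id: str
    worktree: Path
    status: str


class HeadlessBootstrapService:
    """Thin composition root: collaborators at init, inputs per call."""

    def __init__(
        self,
        loader: WorkflowLoaderPort,
        workspace: WorkspacePort,
        kernel: KernelPort,
        base_ref: str,
    ) -> None:
        self._loader = loader
        self._workspace = workspace
        self._kernel = kernel
        self._base_ref = base_ref

    def run(self, workflow_path: Path) -> RunSummary:
        # Load first, then create the worktree context, then execute.
        workflow_id = self._loader.load(workflow_path)
        worktree = self._workspace.create_context(self._base_ref)
        status = self._kernel.run(workflow_id, worktree)
        return RunSummary(workflow_id=workflow_id, worktree=worktree, status=status)
```

Because the service only depends on protocols, tests can pass fakes for the runner while still using the real worktree service.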

4. Add deterministic bootstrap defaults for planner and gate

The maintained kernel requires PlannerEvaluatorProtocol and GateApproverProtocol, but PR-8 does not need full OA06 sophistication.

Plan:

  • provide a deterministic bootstrap evaluator that returns a typed PlannerDecision,
  • provide a deterministic bootstrap gate approver that returns a typed GateOutcome,
  • keep both in the application layer unless reuse pressure makes a separate package clearly worth it.

Bootstrap-safe defaults:

  • planner default: PlannerStatus.success
  • planner next_step: None unless explicitly overridden later
  • gate default: GateOutcome.gate_approved

Rationale:

  • this satisfies the maintained kernel contract,
  • keeps bootstrap local/headless,
  • avoids pretending OA06 is complete,
  • preserves the later path to replace these defaults with richer evaluator/gate implementations without reworking the CLI surface.
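As an illustration of the bootstrap-safe defaults, the sketch below defines local stand-ins for PlannerStatus, PlannerDecision, and GateOutcome so it is self-contained. The enum members beyond those named in this plan, the class names, and the evaluate/approve method names are assumptions; the real protocols may use different signatures:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PlannerStatus(Enum):
    """Local stand-in; only `success` is named in the plan."""

    success = "success"
    blocked = "blocked"  # hypothetical extra member


class GateOutcome(Enum):
    """Local stand-in; only `gate_approved` is named in the plan."""

    gate_approved = "gate_approved"
    gate_rejected = "gate_rejected"  # hypothetical extra member


@dataclass(frozen=True)
class PlannerDecision:
    status: PlannerStatus
    next_step: Optional[str] = None


class BootstrapPlannerEvaluator:
    """Always reports success and never redirects the workflow."""

    def evaluate(self, step_id: str) -> PlannerDecision:  # hypothetical signature
        return PlannerDecision(status=PlannerStatus.success, next_step=None)


class BootstrapGateApprover:
    """Always approves, so a headless run never waits on a human."""

    def approve(self, gate_id: str) -> GateOutcome:  # hypothetical signature
        return GateOutcome.gate_approved
```

Both classes ignore their input on purpose: the output is deterministic, which makes end-to-end bootstrap tests reproducible.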

5. Provide interpreter-backed builtin validator mappings

The maintained validation service already supports builtin validator identifiers, but PR-8 needs a maintained resolver mapping for actual bootstrap runs.

Planned approach:

  • resolve builtin validators to sys.executable -m ... style commands,
  • avoid shell execution,
  • keep the mapping typed via BuiltinCommandEntry.

Initial planned mappings (python here stands for the sys.executable interpreter path):

  • tests -> python -m unittest discover -s tests -p test_*.py -q
  • lint -> python -m ruff check .
  • typecheck -> python -m mypy src

Reasoning:

  • avoids relying on shell aliases,
  • keeps execution service constraints intact,
  • works with the repo’s typed execution subsystem,
  • stays local/headless.
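One way the mapping could be expressed, assuming BuiltinCommandEntry can be modelled as an argv tuple. The stand-in class and the build_builtin_validators helper are hypothetical; the real BuiltinCommandEntry may carry different fields:

```python
import sys
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class BuiltinCommandEntry:
    """Local stand-in: an argv tuple executed without a shell."""

    argv: Tuple[str, ...]


def build_builtin_validators(python: str = sys.executable) -> Dict[str, BuiltinCommandEntry]:
    """Resolve builtin validator ids to interpreter-backed commands."""
    return {
        "tests": BuiltinCommandEntry(
            (python, "-m", "unittest", "discover", "-s", "tests", "-p", "test_*.py", "-q")
        ),
        "lint": BuiltinCommandEntry((python, "-m", "ruff", "check", ".")),
        "typecheck": BuiltinCommandEntry((python, "-m", "mypy", "src")),
    }
```

Passing the command as an argv sequence means the test_*.py pattern reaches unittest literally instead of being expanded by a shell, and anchoring on sys.executable keeps the tools bound to the interpreter that is running the conductor.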

6. Default bootstrap execution policy to real mutable worktree use

The bootstrap entry must actually permit mutable execution in the managed worktree.

Plan:

  • assemble ExecutionPolicySettings in the application service,
  • default the bootstrap run to workspace_write,
  • include a non-empty allowed path scope rooted at the managed workspace root so policy validation does not fail before execution,
  • keep approval posture headless-safe.

Important nuance:

  • this plan is for the bootstrap application surface only,
  • it should not silently weaken the existing policy contract,
  • if policy composition proves more nuanced during implementation, the fix should stay typed and explicit rather than falling back to literals or adapter-local hacks.
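The policy default above might be composed roughly as follows. ExecutionPolicySettings is modelled here as a local stand-in with only two fields, and the ExecutionPosture enum, the validation rule, and the bootstrap_policy helper are assumptions made for illustration; the real settings type has its own fields and validation:

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Tuple


class ExecutionPosture(Enum):
    """Hypothetical posture enum; only workspace_write is named in the plan."""

    read_only = "read_only"
    workspace_write = "workspace_write"


@dataclass(frozen=True)
class ExecutionPolicySettings:
    """Local stand-in for the real policy settings type."""

    posture: ExecutionPosture
    allowed_paths: Tuple[Path, ...]

    def __post_init__(self) -> None:
        # Assumed rule: writable execution needs an explicit path scope.
        if self.posture is ExecutionPosture.workspace_write and not self.allowed_paths:
            raise ValueError("workspace_write requires at least one allowed path")


def bootstrap_policy(workspace_root: Path) -> ExecutionPolicySettings:
    """Bootstrap default: writable, scoped to the managed workspace root."""
    return ExecutionPolicySettings(
        posture=ExecutionPosture.workspace_write,
        allowed_paths=(workspace_root,),
    )
```

Building the policy in one typed helper keeps the default visible in the application layer rather than scattered as literals across adapters.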

7. Add one maintained CLI entry point

Create a new CLI package:

  • src/tnh_scholar/cli_tools/tnh_conductor/

Planned files:

  • __init__.py
  • tnh_conductor.py

Planned command shape:

tnh-conductor run --workflow <path> [--repo-root <path>] [--runs-root <path>] [--workspace-root <path>] [--base-ref <ref>]

Planned behavior:

  • load one workflow,
  • run the application bootstrap service,
  • emit a concise machine-readable summary to stdout,
  • exit non-zero on bootstrap failure.

This gives the maintained codebase the headless entry point it currently lacks. It should not inherit the spike CLI’s architecture or names beyond what is genuinely useful.
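A thin CLI along these lines could look like the sketch below. It assumes argparse from the standard library (the repo may prefer another CLI framework), JSON as the machine-readable format, and a hypothetical run_bootstrap callable standing in for the application service:

```python
import argparse
import json
import sys
from pathlib import Path
from typing import Callable, Mapping, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Parser for: tnh-conductor run --workflow <path> [options]."""
    parser = argparse.ArgumentParser(prog="tnh-conductor")
    commands = parser.add_subparsers(dest="command", required=True)
    run = commands.add_parser("run", help="run one workflow headlessly")
    run.add_argument("--workflow", type=Path, required=True)
    # Optional overrides default to None so the service owns the defaults.
    run.add_argument("--repo-root", type=Path, default=None)
    run.add_argument("--runs-root", type=Path, default=None)
    run.add_argument("--workspace-root", type=Path, default=None)
    run.add_argument("--base-ref", default=None)
    return parser


def main(
    argv: Sequence[str],
    run_bootstrap: Callable[[argparse.Namespace], Mapping[str, str]],
) -> int:
    """Parse args, delegate to the service, print a JSON summary."""
    args = build_parser().parse_args(argv)
    try:
        summary = run_bootstrap(args)  # hypothetical service hook
    except Exception as exc:  # any bootstrap failure maps to a non-zero exit
        print(f"bootstrap failed: {exc}", file=sys.stderr)
        return 1
    print(json.dumps(dict(summary), sort_keys=True))
    return 0
```

The mapping only appears at the serialization boundary; in the planned design the service would return the typed result and the CLI would convert it for stdout.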

Expected File Touches

Primary new code:

  • src/tnh_scholar/agent_orchestration/app/__init__.py
  • src/tnh_scholar/agent_orchestration/app/models.py
  • src/tnh_scholar/agent_orchestration/app/service.py
  • src/tnh_scholar/cli_tools/tnh_conductor/__init__.py
  • src/tnh_scholar/cli_tools/tnh_conductor/tnh_conductor.py

Likely supporting changes:

  • pyproject.toml

Tests:

  • new maintained bootstrap service test under tests/agent_orchestration/
  • new CLI test under tests/cli_tools/

Potential minor export updates if needed:

  • src/tnh_scholar/agent_orchestration/__init__.py
  • src/tnh_scholar/cli_tools/__init__.py

Testing Plan

1. Real worktree bootstrap service test

Add one focused test that:

  1. creates a temporary git repo,
  2. commits a base state,
  3. writes a minimal workflow YAML,
  4. uses the real GitWorktreeWorkspaceService,
  5. uses a stubbed headless Codex executable,
  6. runs the maintained bootstrap service,
  7. asserts that:
       • the run completes,
       • metadata includes the persisted workspace context,
       • the worktree path is distinct from the run directory,
       • canonical artifacts exist,
       • mutable output landed in the worktree, not the run directory root.

This is the main proof that PR-8 actually creates the missing operational loop.
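The fixture side of such a test can be sketched with plain git commands. The helper names are hypothetical, and add_worktree only stands in for the real GitWorktreeWorkspaceService so the sketch runs without the repo's code; the actual test should call the real service:

```python
import subprocess
from pathlib import Path


def _git(cwd: Path, *args: str) -> str:
    """Run git without a shell and fail loudly on a non-zero exit."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def make_base_repo(root: Path) -> Path:
    """Create a temporary repo with one committed base state."""
    repo = root / "repo"
    repo.mkdir()
    _git(repo, "init", "-q")
    (repo / "README.md").write_text("base\n", encoding="utf-8")
    _git(repo, "add", "README.md")
    _git(
        repo,
        "-c", "user.name=Test",
        "-c", "user.email=test@example.invalid",
        "-c", "commit.gpgsign=false",
        "commit", "-q", "-m", "base",
    )
    return repo


def add_worktree(repo: Path, worktree: Path, branch: str, base_ref: str = "HEAD") -> Path:
    """Stand-in for the managed worktree service: new branch from base_ref."""
    worktree.parent.mkdir(parents=True, exist_ok=True)
    _git(repo, "worktree", "add", "-q", "-b", branch, str(worktree), base_ref)
    return worktree


def assert_roots_are_separate(worktree: Path, run_dir: Path) -> None:
    """Worktree and run directory must be distinct and not nested."""
    w, r = worktree.resolve(), run_dir.resolve()
    assert w != r
    assert r not in w.parents and w not in r.parents
```

Passing the identity via -c flags avoids depending on a global git config in CI, and the separation check covers both the distinct-path assertion and the requirement that mutable output does not land under the run directory.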

2. Thin CLI test

Add one CLI test that:

  • invokes the new tnh-conductor run entry point,
  • passes explicit repo/workflow/stub executable paths,
  • asserts successful exit,
  • parses the emitted summary,
  • confirms it points at the expected run metadata/final state.

This keeps CLI coverage aligned with the maintained entry point without depending on real Codex/Claude binaries in test.

3. Changed-file checks after implementation

Before considering the slice done:

  • run focused tests for the new bootstrap modules,
  • run the changed-file checks recommended by make pr-check,
  • run broader project checks only if the diff or touched surfaces justify it.

Risks and Watchpoints

1. CLI bloat risk

The easiest mistake is to put all bootstrap wiring directly in the CLI module.

Mitigation:

  • keep CLI argument parsing and stdout formatting in the CLI,
  • keep runtime assembly in the application service.

2. Policy mismatch risk

The kernel’s policy assembly has real constraints around workspace_write and allowed_paths.

Mitigation:

  • make the bootstrap default policy explicit in typed config,
  • prove it in the end-to-end bootstrap test,
  • avoid patching around policy failures inside runner adapters.

3. Overreaching into OA06 risk

It would be easy to overbuild evaluator/gate behavior while chasing “complete” runtime fidelity.

Mitigation:

  • keep deterministic bootstrap defaults,
  • document them clearly in code and tests,
  • defer richer evaluator intelligence to the intended follow-on work.

4. Repo-surface creep risk

Adding the CLI may tempt broader documentation or packaging cleanup in the same slice.

Mitigation:

  • keep PR-8 centered on the maintained bootstrap path only,
  • avoid unrelated docs cleanup or review automation work in the same implementation.

Implementation Sequence

Recommended coding order:

  1. Add typed application-layer models.
  2. Implement the bootstrap service and deterministic planner/gate collaborators.
  3. Wire maintained runner, validation, artifact, loader, and workspace dependencies in that service.
  4. Add the thin tnh-conductor CLI.
  5. Register the script entry in pyproject.toml.
  6. Add the real-worktree bootstrap service test.
  7. Add the CLI test.
  8. Run focused validation and adjust only if the tests expose a real contract gap.

This order keeps the first proof point in the service layer and makes the CLI a small final wrapper, not the driver of the design.

Exit Criteria

PR-8 should be considered done when:

  • the repo contains one maintained local/headless tnh-conductor entry point,
  • that entry point loads one workflow definition from disk,
  • it composes the maintained kernel with the real worktree service,
  • it executes a workflow end to end and returns a stable typed summary,
  • canonical run metadata, manifests, events, and final state are written to the run directory,
  • the bootstrap path is covered by at least one real-worktree service test and one CLI test,
  • no PR-9 review automation behavior has been smuggled into the slice.

Non-Goals For This Document

This note is a coding workplan only. It does not:

  • approve implementation details beyond the planned slice,
  • change any ADR authority,
  • replace later design decisions for OA05, OA06, or OA07 review automation,
  • authorize destructive git operations outside the repo’s safety rules.