TNH Scholar TODO List

Roadmap tracking the highest-priority TNH Scholar tasks and release blockers.

Last Updated: 2026-04-26 (CI workflow cleanup and docs validation split landed; notebook repo cleanup and release validation next)

Version: 0.4.2 (Alpha)

Status: Active Development - bootstrap viable, release validation and packaging phase

Style Note: Tasks use descriptive headers (not numbered items) to avoid renumbering churn when reorganizing. Use #### (h4) for task headers within priority sections.


Progress Summary

Bootstrap Path Status: ✅ COMPLETE — VS Code integration working, AI-assisted development enabled.

Agent-Orch Bootstrap Status: ✅ USABLE PROTOTYPE — the maintained tnh-conductor path has now produced and landed a bounded repo-native implementation through the worktree-backed headless path.

Next Steps:

  1. 🚧 Prepare 0.4.0 bootstrap release framing for tnh-conductor prototype alpha
  2. 🚧 Run 0.4.0 release validation pass and finalize version-bump scope/release notes
  3. 🚧 Reduce notebook/test clutter before release: archive junk notebooks and move real test coverage into pytest where practical
  4. 🔮 JVB VS Code Parallel Viewer (P1, design phase) — ADR-JVB02 strategy + UI-UX design
  5. 🔮 Finish yt-dlp reliability suite + monthly ops trigger (P1, reliability)
  6. 🔮 Finish ytt-fetch robustness hardening (P1, reliability)
  7. 🚧 GenAIService Final Polish - promote policy_applied typing (P1, minor)
  8. 🚧 Prompt Catalog Safety - manifest validation + schema docs (P2, critical infrastructure)
  9. 🚧 Knowledge Base Implementation (P2, design complete)
  10. 🚧 Expand Test Coverage with refreshed baseline and current gaps (P2)
  11. 🚧 OpenAI registry-driven request profiles for model-specific controls (P2, follow-up hardening)

Recent release hardening completed:

  • ✅ CI workflow cleanup split GitHub validation into PR, main, scheduled full-test, and docs-specific flows; docs PR validation is now read-only
  • ✅ Legacy directory-tree generation/drift checks were removed from routine CI and local validation; tnh-tree remains available only as a manual developer utility
  • ✅ Added make branch-preflight so new work can start from a clean branch based on current origin/main
  • ✅ Docs tooling aligned to restore stable API docs builds under the current CI/read-only MkDocs path
  • ✅ Final Dependabot remediation slice removed prototype-only LangChain packages from the root Poetry graph, patched yt_dlp in the standalone audio env, and refreshed the remaining vulnerable manifests
  • ✅ Removed dead query/v2_cleaning_scripts.py artifact and dropped its unused transformers dependency
  • ✅ Dependabot Stage 6 retired the legacy local Whisper / torch audio path so maintained audio support is limited to current API-backed surfaces
  • ✅ Dependabot Stage 5 optional GUI graph refresh landed for Poetry-managed langchain, langchain-community, langchain-core, langchain-text-splitters, and aiohttp
  • ✅ pattern_share app explicitly marked as an exploratory legacy prompt-sharing prototype, with prompt terminology restored in the UI
  • ✅ Dependabot Stage 4 pattern_share manifest refresh landed for streamlit, langchain, and langchain-community
  • ✅ Dependabot Stage 3 dev/tooling refresh landed for flask, jinja2, black, pytest, and patched notebook/tooling transitive packages
  • ✅ Repo-wide Ruff backlog reduced to zero; make lint now passes
  • ✅ Repo-wide mypy backlog reduced to zero; make type-check now passes

For completed items: See Archive section at end.


Priority Roadmap

This section organizes work into three priority levels based on criticality for production readiness.

Priority 1: VS Code Integration Enablement (Bootstrap Path)

Goal: Enable AI-assisted development of TNH Scholar itself via VS Code extension. Prioritizes foundational work for tnh-gen + extension integration.

Status: Foundation Complete (tnh-gen CLI ✅, Registry System ✅)

✅ tnh-gen CLI Implementation — See Archive

✅ File-Based Registry System (ADR-A14) — See Archive

✅ VS Code Extension Walking Skeleton — See Archive

✅ Pattern→Prompt Migration — See Archive

✅ Provenance Format Refactor (YAML Frontmatter) — See Archive

🚨 Agent-Orch OA07 Runtime Implementation Sequence

  • Status: IN PROGRESS - maintained execution/validation/kernel slice landed and tested
  • Priority: HIGH (foundation work for durable MVP)
  • Context: The accepted OA07 ADR set defines the maintained runtime architecture. The current conductor_mvp/ and spike/ code remains useful as migration source/reference, but should not receive forward-path feature growth.
  • Why This Matters:
  • current implementation readiness is medium, but in-place extension readiness is low
  • the highest-risk boundary is still subprocess execution and typed validation/runner contracts
  • coding should proceed by subsystem extraction, not by continuing prototype package growth
  • Implementation Order:
  • Build agent_orchestration/execution/
    • typed invocation families
    • cwd/env/timeout policy
    • termination/result taxonomy
    • final argv rendering boundary
  • Build agent_orchestration/validation/ on top of execution/
    • preserve OA04 external YAML compatibility by normalizing source shapes into typed internal models
    • migrate behavior out of conductor_mvp/providers/validation_runner.py
  • Extract agent_orchestration/kernel/
    • WorkflowCatalog
    • WorkflowValidator
    • KernelState
    • KernelRunService
  • Introduce agent_orchestration/workspace/ and agent_orchestration/run_artifacts/
    • move rollback/state capture and durable run record ownership out of prototype packages
  • Migrate maintained runner behavior into agent_orchestration/runners/
    • use reference/spike/ only as reference material
    • no new forward-path runner work in spike code
  • Current Slice Completed:
  • Added maintained execution/, validation/, kernel/, workspace/, run_artifacts/, and runners/ package scaffolding
  • Added focused OA07 regression coverage and validated the new slice plus legacy conductor_mvp kernel tests
  • Sourcery installed successfully via poetry install --with local, but the CLI currently hangs even for --help, so local Sourcery review remains blocked by Sourcery runtime behavior rather than repo config
  • Migration Rules:
  • Do not add substantive new feature work to conductor_mvp/
  • Do not add new forward-path implementation work to spike/
  • Treat conductor_mvp/ as a temporary migration-source package to be deleted after subsystem extraction
  • Treat codex_harness/ and spike/ as reference packages during OA07 migration
  • Initial Files in Scope:
  • src/tnh_scholar/agent_orchestration/conductor_mvp/
  • src/tnh_scholar/agent_orchestration/spike/
  • src/tnh_scholar/agent_orchestration/common/
  • tests/agent_orchestration/
  • docs/architecture/agent-orchestration/adr/adr-oa03-agent-runner-architecture.md
  • docs/architecture/agent-orchestration/adr/adr-oa03.1-claude-code-runner.md
  • docs/architecture/agent-orchestration/adr/adr-oa03.3-codex-cli-runner.md
  • docs/architecture/agent-orchestration/adr/adr-oa04-workflow-schema-opcode-semantics.md
  • docs/architecture/agent-orchestration/adr/adr-oa04.1-implementation-notes-mvp-buildout.md
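
The execution/ items listed under Implementation Order (typed invocations, cwd/env/timeout policy, a termination taxonomy, and a single argv rendering boundary) could look roughly like the sketch below. This is an illustration under stated assumptions, not the maintained implementation: Invocation, Termination, ExecutionResult, and execute are hypothetical names chosen for the example.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class Termination(Enum):
    """Hypothetical termination taxonomy for one subprocess invocation."""

    COMPLETED = "completed"
    NONZERO_EXIT = "nonzero_exit"
    TIMED_OUT = "timed_out"
    LAUNCH_FAILED = "launch_failed"


@dataclass(frozen=True)
class Invocation:
    """Typed invocation: executable, arguments, and execution policy."""

    executable: str
    args: tuple[str, ...] = ()
    cwd: Path | None = None
    env: dict[str, str] | None = None  # None inherits the parent environment
    timeout_s: float = 30.0

    def render_argv(self) -> list[str]:
        """Single place where the final argv list is assembled."""
        return [self.executable, *self.args]


@dataclass(frozen=True)
class ExecutionResult:
    termination: Termination
    exit_code: int | None
    stdout: str
    stderr: str


def _text(data: bytes | str | None) -> str:
    """Normalize partial output attached to a timeout exception."""
    if data is None:
        return ""
    return data.decode(errors="replace") if isinstance(data, bytes) else data


def execute(invocation: Invocation) -> ExecutionResult:
    """Run the invocation and map every outcome onto the taxonomy."""
    try:
        proc = subprocess.run(
            invocation.render_argv(),
            cwd=invocation.cwd,
            env=invocation.env,
            timeout=invocation.timeout_s,
            capture_output=True,
            text=True,
            check=False,
        )
    except subprocess.TimeoutExpired as exc:
        return ExecutionResult(
            Termination.TIMED_OUT, None, _text(exc.stdout), _text(exc.stderr)
        )
    except OSError as exc:  # e.g. executable not found
        return ExecutionResult(Termination.LAUNCH_FAILED, None, "", str(exc))
    termination = (
        Termination.COMPLETED if proc.returncode == 0 else Termination.NONZERO_EXIT
    )
    return ExecutionResult(termination, proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    result = execute(Invocation(sys.executable, ("-c", "print('ok')")))
    print(result.termination.value, result.stdout.strip())
```

Keeping argv assembly in render_argv and outcome mapping in execute means callers never handle raw subprocess exceptions, which is one way to narrow the subprocess boundary flagged above as highest-risk.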

🚨 OA07.1 Bootstrap Worktree Slice

  • Status: MILESTONE REACHED — PR-7 and PR-8 are merged on main, and the first bounded bootstrap-proof workflow outcome is now landed through tnh-conductor status --watch
  • Priority: HIGHEST (prove real maintained bootstrap usefulness)
  • Context: The maintained OA04.x runtime contracts now include the real OA07.1 worktree runtime boundary and the maintained headless entry path. Bootstrap is no longer blocked on substrate. The next blocker is proving one useful repo-native workflow through the maintained path. Follow ADR-OA07 and ADR-OA07.1.
  • Bootstrap Goal:
  • create a managed git worktree from a committed base ref
  • run RUN_AGENT and RUN_VALIDATION against the worktree root
  • keep canonical run artifacts in the run directory
  • support ROLLBACK(pre_run) to recorded base state
  • establish the headless path needed for later commit/push/PR automation
  • Why This Is Next:
  • the worktree runtime boundary and maintained headless app-layer entry are now implemented on main
  • the system still needs one clean end-to-end proof that it can complete a useful repo task through the maintained path
  • OA05/OA06 depth work should follow a live bootstrap proof, not precede it
  • Recent related docs work:
  • documented the current low-noise Codex headless path, native subagent confirmation, and first supervisory shell-trial findings in /docs/architecture/agent-orchestration/notes/experiments/ and /docs/architecture/agent-orchestration/supervisory-shell-trial/
  • SPIKE-10 comparison result now records the current practical recommendation: keep tnh-conductor as the main coordination substrate, harden native subagent launch reliability, and treat codex-assistant / claude-assistant worker paths as experimental until runtime bootstrap and auth are dependable
  • direct-vs-conductor follow-up review selected the direct-arm tnh-conductor status --watch implementation as the merge candidate while preserving the maintained conductor run as the bootstrap-viability proof
  • next focus is release-prep cleanup and packaging for a 0.4.0 prototype-alpha tnh-conductor milestone, not more orchestration-path comparison spikes first
  • Recommended PR sizing:
  • Prefer 2 PRs to stay comfortably under diff-size guidance
  • A single PR is possible only if the implementation stays narrow and avoids CLI/app-layer work
  • PR Sequence:
  • PR-7 feat/oa07.1-worktree-workspace-service — Worktree runtime boundary (medium)
    • replace NullWorkspaceService as the forward-path maintained implementation with a real git-backed workspace service
    • add typed workspace context models: repo_root, worktree_path, branch_name, base_ref, base_sha
    • implement managed branch + worktree creation from committed base ref
    • update the workspace protocol so pre-run setup returns structured workspace context and does not rely on the run directory as the workspace handle
    • pass the worktree root as working_directory to runner and validation services for mutable steps
    • implement ROLLBACK(pre_run) by discarding and recreating the managed worktree from recorded base_sha
    • persist workspace context into canonical run artifacts or run metadata extension
    • tests for worktree creation, mutable-step execution in the worktree root, recorded base state, and ROLLBACK(pre_run) semantics
    • keep NullWorkspaceService only for tests or explicit non-operational contexts
  • PR-8 feat/oa07-bootstrap-headless-entry — Maintained headless bootstrap entry (small/medium, merged)
    • load one workflow
    • create worktree context
    • execute workflow end to end
    • write canonical artifacts and final state
    • keep the initial entry local/headless; no GitHub automation required
  • Bootstrap Proof feat/tnh-conductor-status-watch — Real repo-task bootstrap proof (small/medium)
    • add one maintained workflow definition for a narrow useful repository task
    • exercise the current maintained subset: RUN_AGENT, RUN_VALIDATION, STOP, with ROLLBACK(pre_run) available only as fallback
    • prove the run yields a reviewable repo diff plus canonical metadata, manifests, events, and final state
    • keep semantic-control depth and review automation out of scope unless they become true blockers
  • Release Prep feat/oa07-bootstrap-release-prep — Prototype-alpha cleanup and packaging (small/medium, next)
    • document the bootstrap milestone and known limitations clearly
    • add maintained tnh-conductor CLI reference and operator-facing usage docs
    • reduce notebook/test clutter so the release ships with a cleaner repo state
    • prune stale temporary artifacts and clarify operator workflow defaults
    • decide exact 0.4.0 scope before version bump and release notes
  • Claude CLI worker hardening — Robust non-interactive execution and write scoping (small/medium)
    • confirm current claude CLI flags and non-interactive behavior still match maintained adapter assumptions
    • add explicit write-scoping / permission-mode policy so bounded docs and code tasks do not stall on unexpected prompts
    • persist raw stream-json output and termination details in canonical artifacts for easier failure triage
    • add regression coverage for stalled-write / permission-request paths in the Claude runner adapter
  • PR-9 feat/oa07-review-automation — Commit/push/PR automation (optional, small/medium)
    • create local commits on the managed branch
    • push the work branch
    • open or update a PR
    • keep protected-branch merge human-only
  • Explicit deferrals for this slice:
  • commit/push/PR automation if it causes PR-7 or PR-8 to exceed preferred diff size
  • strict OA05 compile-validation as a blocker for bootstrap
  • full OA06 planner fixture/vector suite beyond the bootstrap path
  • maintained EVALUATE / GATE support before the first useful bootstrap proof
  • maintained tnh-gen evaluator or review-agent integration before the current orchestrator comparison proves out the control-surface path
  • non-script harness backends
  • stacked PR orchestration
  • multi-agent mutable collaboration inside one worktree
  • pre_step rollback and named checkpoints
  • Files likely in scope:
  • src/tnh_scholar/agent_orchestration/workspace/
  • src/tnh_scholar/agent_orchestration/kernel/service.py
  • src/tnh_scholar/agent_orchestration/run_artifacts/
  • tests/agent_orchestration/test_oa07_execution_validation_kernel.py
  • docs/architecture/agent-orchestration/adr/adr-oa07-diff-policy-safety-rails.md
  • docs/architecture/agent-orchestration/adr/adr-oa07.1-worktree-lifecycle-and-rollback.md
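
The PR-7 items above (a typed workspace context, managed branch plus worktree creation from a committed base, and ROLLBACK(pre_run) by discarding and recreating the worktree from base_sha) can be sketched as plain argv builders. The field names come from the PR-7 bullet; the function names and the exact git command sequence are assumptions for illustration, not the maintained workspace service.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class WorkspaceContext:
    """Typed workspace context using the fields named in the PR-7 scope."""

    repo_root: Path
    worktree_path: Path
    branch_name: str
    base_ref: str
    base_sha: str


def create_worktree_argv(ctx: WorkspaceContext) -> list[str]:
    """Create a new branch and worktree pinned to the recorded base commit."""
    return [
        "git", "-C", str(ctx.repo_root),
        "worktree", "add", "-b", ctx.branch_name,
        str(ctx.worktree_path), ctx.base_sha,
    ]


def rollback_pre_run_argvs(ctx: WorkspaceContext) -> list[list[str]]:
    """Discard the worktree, then recreate it from the recorded base_sha.

    -B resets the existing managed branch back to base_sha on re-creation.
    """
    repo = str(ctx.repo_root)
    path = str(ctx.worktree_path)
    return [
        ["git", "-C", repo, "worktree", "remove", "--force", path],
        ["git", "-C", repo, "worktree", "add", "-B", ctx.branch_name, path, ctx.base_sha],
    ]


if __name__ == "__main__":
    ctx = WorkspaceContext(
        repo_root=Path("/repo"),
        worktree_path=Path("/runs/run-001/worktree"),  # hypothetical layout
        branch_name="conductor/run-001",               # hypothetical naming
        base_ref="origin/main",
        base_sha="0123abc",                            # placeholder, not a real commit
    )
    print(create_worktree_argv(ctx))
    for argv in rollback_pre_run_argvs(ctx):
        print(argv)
```

Pinning both creation and rollback to base_sha rather than base_ref keeps the rollback target stable even if the base ref moves during a run.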

✅ OA04 Contract Family — PR Sequence (Complete)

  • Status: COMPLETE — contract ADRs implemented in maintained code; the bootstrap blocker noted at the time (OA07.1 worktree execution) has since landed on main (see OA07.1 Bootstrap Worktree Slice above)
  • Context: OA04.2–OA04.5 are the contract-layer ADRs between the OA07 runtime foundations and the maintained runner/policy/provenance implementations. That contract family is now landed in code and should no longer be treated as pending. See implementation notes in ADR-OA04.1 Addendum 2026-03-27 for the original scaffolding gaps and ADR-OA04.1 Addendum 2026-04-05 for the bootstrap-first reprioritization.
  • Dependency chain:
  • OA04.3 (run dir + manifests + evaluator evidence seam) → OA04.2 (runners normalize into canonical evidence)
  • OA04.4 (policy taxonomy + requested/effective split) → OA04.2 (runner request carries typed requested policy)
  • OA04.5 (harness backend) → validation/ subsystem (extends empty package)
  • OA04.2 (runner adapters) → milestone: first real agent invocations
  • Implementation Notes (default choices for implementers):
  • Apply OS01 pragmatically: add structure where it protects a real boundary or likely evolution seam, not just to mirror the taxonomy mechanically.
  • Prefer moving maintained code toward the ADR contracts when the migration path is clean; do not preserve stub shapes just for short-term compatibility inside maintained packages.
  • Treat run_artifacts/ as the canonical evidence boundary. If a choice arises between storing data in runner-local files versus canonical artifact roles + manifests, choose canonical artifact roles + manifests.
  • Keep evaluator assembly strict: evaluators read metadata.json, events.ndjson, manifest.json, and canonical artifact roles only. Do not add evaluator dependencies on adapter-local raw capture filenames.
  • Keep manifests thin and stable. Put compact cross-step evidence in evidence_summary; put detailed per-step policy data in canonical policy_summary.json.
  • Keep persistence ownership in run_artifacts/. Runner adapters and validation backends should return typed normalized outputs and artifact payloads; they should not own final manifest writing policy.
  • Evolve existing maintained code where it already matches the target shape. In particular, refactor validation/service.py toward the script backend/resolver seam rather than replacing it wholesale.
  • Expand kernel/service.py by extraction, not accretion. If per-step provenance writing starts to crowd the kernel, extract focused collaborators rather than growing one large procedural service.
  • Use explicit mapper/normalizer classes whenever native CLI or harness output is translated into maintained models. Do not hide parsing, normalization, termination mapping, and persistence decisions in one adapter class.
  • Keep policy taxonomy aligned with OS01: init-time settings/config, per-step requested policy, execution-time effective policy, persisted PolicySummary. Avoid “policy blob” models that mix those concerns.
  • Do not add ceremony without benefit: avoid speculative service/factory layers, unnecessary mappers for nearly identical shapes, or package splits that do not improve testability, replaceability, or clarity.
  • Existing thin models in run_artifacts/, runners/, and validation/ are scaffolding, not target architecture. It is acceptable to break those internal shapes in favor of cleaner maintained contracts during this implementation sequence.
  • PR Sequence:
  • PR-1 feat/oa04-contract-adrs — ADR acceptance (docs only)
    • Commit new OA04.2, OA04.3, OA04.4, OA04.5 files; later implementation has since moved those decimal ADRs to implemented
    • Carry in already-modified OA03.1/OA03.3 addendums + OA04 update + index.md
  • PR-2 feat/oa04.3-run-artifact-contract — Run-artifact domain contract + store (medium)
    • Expand run_artifacts/models.py: RunMetadata, RunEventRecord, ArtifactRole enum, StepArtifactEntry, StepManifest
    • Add manifest-level evidence_summary with compact canonical evidence references
    • Add canonical policy_summary artifact role for detailed requested/effective policy records
    • Expand run_artifacts/protocols.py: write_step_manifest, artifact_step_dir, canonical artifact persistence APIs
    • Update run_artifacts/filesystem_store.py to implement both
    • Keep filesystem concerns behind the store; no evaluator-facing filename dependencies
    • Tests for manifest writing, event stream fields, and canonical artifact-role lookup
  • PR-3 feat/oa04.3-kernel-provenance-integration — Kernel provenance integration (medium)
    • Update kernel/runtime services to write enriched run metadata, canonical events, and per-step manifests
    • Persist compact manifest summaries and canonical artifact references only; no adapter-local evidence lookup in evaluator assembly
    • Capture workspace diff/status and policy summary references through canonical artifact roles
    • Tests for manifest/event creation across RUN_AGENT, RUN_VALIDATION, EVALUATE, and GATE
    • Depends on PR-2
  • PR-4 feat/oa04.4-policy-contract — Execution policy package (medium)
    • New agent_orchestration/execution_policy/ package
    • models.py: ExecutionPolicySettings, RequestedExecutionPolicy, EffectiveExecutionPolicy, PolicyViolationClass, PolicyViolation, PolicySummary
    • assembly.py: ExecutionPolicyAssembler for system settings → workflow → step requested policy → runtime override/effective policy derivation
    • protocols.py: ExecutionPolicyAssemblerProtocol
    • Update runners/models.py: retire PromptInteractionPolicy stub; link RunnerTaskRequest to RequestedExecutionPolicy
    • Persist detailed policy_summary.json via canonical policy_summary artifact role; keep only compact summary data in manifests
    • Tests for assembly precedence, requested/effective policy derivation, and hard-fail behavior
    • Can run in parallel with PR-3
  • PR-5 feat/oa04.2-runner-adapters — Runner adapters (largest PR)
    • Expand runners/models.py: AdapterCapabilities (capability declaration per OA04.2 §3a)
    • Add explicit mapper/normalizer classes for native CLI output → maintained runner-domain models
    • Add runners/adapters/claude_cli.py: claude --print --output-format stream-json --permission-mode dontAsk, stream-json parsing, normalization, termination mapping
    • Add runners/adapters/codex_cli.py: codex exec --json --output-last-message, JSONL capture, normalization, termination mapping
    • Adapters return typed normalized artifact payloads; canonical persistence is owned by run_artifacts
    • Evaluators consume manifests and canonical artifact roles only, never runner-local raw capture files
    • Tests for both adapters (subprocess mocking, normalization, mapper behavior, termination paths)
    • Depends on PR-2, PR-3, and PR-4
  • PR-6 feat/oa04.5-harness-backend — Script harness backend (medium)
    • Build out agent_orchestration/validation/: BackendFamily enum, HarnessBackendRequest, HarnessBackendResult, HarnessBackendProtocol
    • backends/script.py: migrate from conductor_mvp/providers/validation_runner.py; normalize to validation_report/validation_stdout/validation_stderr artifact roles
    • Add backend resolver seam, but defer cli and web implementation until a concrete maintained consumer exists
    • Tests for script backend, resolver seam, and artifact role normalization
    • Depends on PR-2; independent of PR-4 and PR-5
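
The PR-4 precedence chain (system settings → workflow → step requested policy → runtime override, yielding an effective policy) can be illustrated with a small layered merge. The class names follow the PR-4 bullet, but every field and the assemble function are hypothetical; the real ExecutionPolicyAssembler may differ substantially.

```python
from __future__ import annotations

from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RequestedExecutionPolicy:
    """One requested layer. None means the layer has no opinion on that field.

    The fields are illustrative assumptions, not the maintained model.
    """

    timeout_s: int | None = None
    allow_network: bool | None = None
    max_diff_lines: int | None = None


@dataclass(frozen=True)
class EffectiveExecutionPolicy:
    """Fully resolved policy actually applied at execution time."""

    timeout_s: int
    allow_network: bool
    max_diff_lines: int


def assemble(
    system_defaults: EffectiveExecutionPolicy,
    *layers: RequestedExecutionPolicy,
) -> EffectiveExecutionPolicy:
    """Apply layers in order; later layers override earlier ones."""
    effective = system_defaults
    for layer in layers:
        overrides = {
            f.name: getattr(layer, f.name)
            for f in fields(layer)
            if getattr(layer, f.name) is not None
        }
        effective = replace(effective, **overrides)
    return effective


if __name__ == "__main__":
    defaults = EffectiveExecutionPolicy(timeout_s=600, allow_network=False, max_diff_lines=400)
    workflow = RequestedExecutionPolicy(timeout_s=900)
    step = RequestedExecutionPolicy(allow_network=True)
    runtime_override = RequestedExecutionPolicy(timeout_s=120)
    print(assemble(defaults, workflow, step, runtime_override))
```

Keeping requested and effective policy as separate types mirrors the requested/effective split described above and avoids a single "policy blob" that mixes the two concerns.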

🔮 JVB VS Code Parallel Viewer (ADR-JVB02)

  • Status: NOT STARTED (Design Phase)
  • Priority: HIGH (flagship feature, builds on VS Code integration foundation)
  • Context: The JVB (Journal of Vietnamese Buddhism) parallel viewer enables scholars to view scanned historical journal pages alongside OCR text and English translations. v1 was a bespoke browser-based prototype; v2 will integrate into the tnh-scholar VS Code extension.
  • Project Paused: This work was on hold while VS Code integration and tnh-gen were developed. Now that the walking skeleton is complete, we can resume with a fresh design.

Related Documentation:

  • v1 As-Built: ADR-JVB01 — Browser-based prototype architecture
  • v2 Strategy (Draft): JVB Viewer V2 Strategy — Pre-ADR strategy note (good foundations, needs formalization)
  • VS Code Platform Strategy: VS Code as UI Platform — Overall UI-UX direction
  • VS Code Integration: ADR-VSC01 — CLI-first extension strategy (implemented)

Proposed ADR Structure:

docs/architecture/jvb-viewer/adr/
├── adr-jvb01_as-built_jvb_viewer_v1.md              # ✅ Exists
├── adr-jvb02-vscode-parallel-viewer-strategy.md     # 🆕 Main strategy ADR
├── adr-jvb02.1-ui-ux-design.md                      # 🆕 Mockups, pane layout, workflows
├── adr-jvb02.2-data-model-api-contract.md           # 🆕 JSON schema, extension↔backend API
└── adr-jvb02.3-implementation-guide.md              # 🆕 Phase-by-phase implementation

Key Design Decisions Needed:

  1. VS Code Pane Architecture: Which panes for scan overlay, text views, reconciliation controls, navigation?
  2. Webview vs Custom Editor: Custom editor for .jvb.json files or webview panel approach?
  3. Backend Integration: Python service via CLI (tnh-gen patterns) or dedicated HTTP service?
  4. Data Model: Refine per-page JSON schema from v2 strategy, define API contract
  5. Dual OCR Reconciliation UI: How users choose between Google OCR vs AI vision sources

Deliverables:

  • ADR-JVB02: Main strategy ADR (formalize v2 strategy, VS Code integration focus)
  • ADR-JVB02.1: UI-UX design with mockups/screen visualizations
  • ADR-JVB02.2: Data model and API contract specification
  • ADR-JVB02.3: Implementation guide with milestones
  • M0 Prototype: Static HTML mockup in VS Code webview (validate approach)

Implementation Milestones (from v2 strategy, to be refined):

  • M0: Static prototype — HTML showing page image, word bboxes, selectable sentences
  • M1: VS Code extension — load/save per-page JSON, overlay modes, section breadcrumb
  • M2: Dual-source UI — GOCR vs AI diff chooser, batch adoption, "reviewed" status
  • M3: Structure cues — columns, heading levels, emphasis flags captured and rendered
  • M4: Beta — section-level navigation, export HTML, light theming

✅ Add --prompt-dir Global Flag to tnh-gen — Completed 2026-04-18

  • Status: COMPLETE
  • Priority: HIGH (improves tnh-gen UX for one-off operations and testing)
  • Estimate: 1-2 hours
  • Context: Users need a convenient way to override the prompt catalog directory for one-off CLI calls without setting environment variables or creating temp config files
  • ADR: ADR-TG01 Addendum 2026-01-02
  • Why Important: Enables clean one-off operations (tnh-gen --prompt-dir ./test-prompts list) for testing, CI/CD, and development workflows
  • Current Workarounds:
  • Environment variable: TNH_PROMPT_DIR=/path tnh-gen list (awkward)
  • Temp config file: tnh-gen --config /tmp/config.yaml list (verbose)
  • Deliverables:
  • Add --prompt-dir flag to cli_callback() in src/tnh_scholar/cli_tools/tnh_gen/tnh_gen.py:26
  • Update config_loader.py to handle prompt directory override at CLI precedence level
  • Update ConfigData type to accept prompt_catalog_dir override
  • Add unit tests for flag precedence (CLI flag > workspace > user > env)
  • Update help text and CLI reference documentation
  • Update docs/cli-reference/tnh-gen.md global flags section
  • Files to Modify:
  • src/tnh_scholar/cli_tools/tnh_gen/tnh_gen.py (add flag)
  • src/tnh_scholar/cli_tools/tnh_gen/config_loader.py (precedence handling)
  • src/tnh_scholar/cli_tools/tnh_gen/types.py (type definitions)
  • tests/cli_tools/test_tnh_gen.py (unit tests)
  • docs/cli-reference/tnh-gen.md (documentation)
  • Testing: Verify --prompt-dir flag overrides all other config sources (workspace, user, env)
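
The precedence listed in the deliverables (CLI flag > workspace > user > env) amounts to a first-match lookup. The sketch below assumes each source has already been loaded into an optional string; resolve_prompt_dir is a hypothetical helper, not the actual config_loader.py API, and TNH_PROMPT_DIR is the environment variable named in the workarounds above.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def resolve_prompt_dir(
    cli_flag: str | None,
    workspace: str | None,
    user: str | None,
    env: Mapping[str, str],
) -> Path | None:
    """Return the first configured prompt directory in precedence order.

    Order follows the deliverable: CLI flag > workspace > user > env.
    Empty strings are treated as "not set".
    """
    candidates = (cli_flag, workspace, user, env.get("TNH_PROMPT_DIR"))
    for candidate in candidates:
        if candidate:
            return Path(candidate)
    return None


if __name__ == "__main__":
    env = {"TNH_PROMPT_DIR": "/env/prompts"}
    print(resolve_prompt_dir("./test-prompts", "/ws/prompts", None, env))
    print(resolve_prompt_dir(None, None, None, env))
```

Passing the environment in as a mapping rather than reading os.environ directly keeps the precedence logic easy to unit test.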

🔮 Full-Coverage yt-dlp Test Suite + Monthly Ops Trigger

  • Status: IN PROGRESS
  • Priority: HIGH (external dependency instability)
  • Goal: Add full coverage for all yt-dlp usage modules (transcript, audio, metadata, video download), then run a scheduled monthly ops test to surface breakage early.
  • Scope (Code):
  • src/tnh_scholar/video_processing/video_processing.py
  • src/tnh_scholar/cli_tools/ytt_fetch/ytt_fetch.py
  • src/tnh_scholar/cli_tools/audio_transcribe/audio_transcribe.py
  • src/tnh_scholar/cli_tools/audio_transcribe/version_check.py
  • Testing Strategy:
  • Add integration tests that exercise live yt-dlp behavior (guarded, opt-in)
  • Add unit tests for runtime env inspection + yt-dlp option injection
  • Add offline unit tests with recorded fixtures for metadata + transcript parsing
  • Add failure-mode tests (missing captions, private video, geo-blocked)
  • Monthly Ops Trigger:
  • Add cron-ready ops check script + validation URL list
  • Document monthly cron usage and log locations
  • Add freshness-gated local status wrapper for active repo use (make update-health-check) plus explicit make health-check execution
  • Add failure notification workflow (issue creation or alerting)
  • Acceptance Criteria:
  • Coverage for all yt-dlp entry points + error paths
  • Monthly ops check runs without manual intervention (cron)
  • Clear failure report includes test URL, date, yt-dlp version

🔮 Patch ytt-fetch Robustness

  • Status: IN PROGRESS
  • Priority: HIGH (frequent breakage path)
  • Goal: Make ytt-fetch resilient to upstream changes and failures.
  • Test URL: https://youtu.be/iqNzfK4_meQ
  • Deliverables:
  • Add runtime preflight + yt-dlp runtime option injection
  • Verify transcript fetch on test URL (manual + test)
  • Add retries / improved error reporting
  • Ensure metadata embed + output path handling remain stable
  • Update docs and CLI reference if flags or behaviors change
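
For the "retries / improved error reporting" deliverable, one possible shape is a small retry wrapper with exponential backoff that collects every failure into the final error. This is a generic sketch: fetch_with_retries and FetchError are hypothetical names, and the fetch callable stands in for whatever ytt-fetch actually invokes.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FetchError(RuntimeError):
    """Raised when every attempt fails; the message lists each failure."""


def fetch_with_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch until it succeeds or attempts are exhausted.

    The delay doubles after each failed attempt (1x, 2x, 4x, ...).
    The sleep function is injectable so tests do not have to wait.
    """
    errors: list[str] = []
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # broad on purpose: upstream failures vary
            errors.append(f"attempt {attempt}: {exc!r}")
            if attempt < attempts:
                sleep(base_delay_s * 2 ** (attempt - 1))
    raise FetchError("; ".join(errors))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky() -> str:
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("transient upstream failure")
        return "transcript text"

    print(fetch_with_retries(flaky, sleep=lambda _delay: None))
```

Collecting all attempt errors into one message is a simple way to make failure reports actionable without changing the caller's control flow.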

🚧 GenAIService Core Components - Final Polish

  • Status: PRELIMINARY IMPLEMENTATION COMPLETE ✅ - Needs Polish & Registry Integration
  • Priority: MEDIUM (minor cleanup, not blocking)
  • What: Core GenAI service components (params_policy, model_router, safety_gate, completion_mapper) are implemented and working but need minor polish
  • Components Implemented:
  • params_policy.py — Policy precedence implemented ✅
    • ✅ Policy precedence: call hint → prompt metadata → defaults
    • ✅ Settings cached via @lru_cache (excellent optimization)
    • ✅ Strong typing with ResolvedParams Pydantic model
    • ✅ Routing diagnostics in routing_reason field
    • Score: 95/100 - Excellent implementation
  • model_router.py — Capability-based routing implemented ✅
    • ✅ Declarative routing table with _MODEL_CAPABILITIES
    • ✅ Structured output fallback (JSON mode capability switching)
    • ✅ Intent-aware architecture foundation
    • ⚠️ Intent routing currently placeholder (lines 98-101)
    • Score: 92/100 - Strong implementation
  • safety_gate.py — Three-layer safety checks implemented ✅
    • ✅ Character limit, context window, budget estimation
    • ✅ Typed exceptions (SafetyBlocked)
    • ✅ Structured SafetyReport with actionable diagnostics
    • ✅ Content type handling (string/list with warnings)
    • ✅ Prompt metadata integration (safety_level)
    • ⚠️ Price constant hardcoded (line 30: _PRICE_PER_1K_TOKENS = 0.005)
    • ⚠️ Post-check currently stubbed
    • Score: 94/100 - Excellent implementation
  • completion_mapper.py — Bi-directional mapping implemented ✅
    • ✅ Clean transport → domain transformation
    • ✅ Error details surfaced in policy_applied
    • ✅ Status handling (OK/FAILED/INCOMPLETE)
    • ✅ Pure mapper functions (no side effects)
    • ⚠️ policy_applied uses Dict[str, object] (should be more specific)
    • Score: 91/100 - Strong implementation
  • High Priority (Before Merging):
  • Add Google-style docstrings to public functions (see style-guide.md)
    • apply_policy(), select_provider_and_model(), pre_check(), post_check(), provider_to_completion()
  • Move _PRICE_PER_1K_TOKENS constant to Settings or registry (blocks ADR-A14)
    • Moved to Settings.price_per_1k_tokens; safety gate now consumes setting.
  • Type tightening in completion_mapper
    • Added PolicyApplied alias (dict[str, str | int | float]).
  • Medium Priority (V1 Completion):
  • Promote policy_applied typing to a shared domain type (CompletionEnvelope) to avoid loose dict usage across the service.
  • OpenAI registry-driven request profiles
    • follow current GPT-family adapter shaping with registry-backed request profiles instead of adapter-local matching
    • keep model-specific controls discoverable in provider metadata rather than CLI/runtime conditionals
  • Intent routing implementation
  • Post-check safety implementation
  • Low Priority (Future Work):
  • Warning enum system
    • Create typed warning codes instead of strings
    • Affects: safety_gate, completion_mapper, model_router
  • Enhanced diagnostics
    • More granular routing reasons
    • Detailed safety check diagnostics
  • Message.content Type Architecture Investigation (design quality, non-blocking)
    • Location: gen_ai_service/models/domain.py:92-96
    • Issue: Sourcery identifies Union[str, List[ChatCompletionContentPartParam]] as source of complexity
    • Context: Current design intentionally supports OpenAI's flexible content API (plain text OR structured parts with images/etc)
    • Investigation Areas:
    • Document current usage patterns across codebase
    • Assess downstream complexity: where are type checks needed?
    • Evaluate normalization strategies (always list? separate fields? utility methods?)
    • Consider provider compatibility (Anthropic, etc)
    • Draft ADR or addendum to existing GenAI ADRs if design change warranted
    • Impact: Affects message representation throughout GenAIService
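
The safety gate's character-limit and budget checks reduce to simple arithmetic once the per-1K-token price is passed in as a setting. The sketch below is an assumption-heavy illustration: the 4-characters-per-token heuristic, the BudgetEstimate type, and the parameter names are invented for the example; only SafetyBlocked, pre_check, and the 0.005 price figure appear in the notes above, and their real signatures may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


class SafetyBlocked(Exception):
    """Raised when a pre-check refuses the request."""


@dataclass(frozen=True)
class BudgetEstimate:
    estimated_tokens: int
    estimated_cost: float


def estimate_budget(
    prompt: str, max_output_tokens: int, price_per_1k_tokens: float
) -> BudgetEstimate:
    """Estimate total tokens and cost.

    Assumes roughly 4 characters per input token, which is a crude
    heuristic rather than a tokenizer-accurate count.
    """
    input_tokens = -(-len(prompt) // 4)  # ceiling division
    total_tokens = input_tokens + max_output_tokens
    return BudgetEstimate(total_tokens, total_tokens / 1000 * price_per_1k_tokens)


def pre_check(
    prompt: str,
    max_output_tokens: int,
    price_per_1k_tokens: float,
    max_chars: int,
    max_cost: float,
) -> BudgetEstimate:
    """Apply the character limit first, then the budget limit."""
    if len(prompt) > max_chars:
        raise SafetyBlocked(f"prompt has {len(prompt)} chars; limit is {max_chars}")
    estimate = estimate_budget(prompt, max_output_tokens, price_per_1k_tokens)
    if estimate.estimated_cost > max_cost:
        raise SafetyBlocked(
            f"estimated cost {estimate.estimated_cost:.4f} exceeds budget {max_cost:.4f}"
        )
    return estimate


if __name__ == "__main__":
    report = pre_check(
        "x" * 4000,
        max_output_tokens=1000,
        price_per_1k_tokens=0.005,
        max_chars=10_000,
        max_cost=0.05,
    )
    print(report)
```

Taking the price as a parameter instead of a module constant matches the direction noted above, where the value moved to Settings.price_per_1k_tokens.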

⏸️ GenAIService Thread Safety and Rate Limiting (ADR-A15)

  • Status: DEFERRED - Not needed for VS Code integration (process isolation)
  • Priority: MEDIUM (revisit when building Python batch pipelines)
  • Issue: #22
  • ADR: ADR-A15: Thread Safety and Rate Limiting
  • Why Deferred: VS Code extension uses process isolation (each tnh-gen call = separate GenAIService instance). Thread safety only matters for Python-native batch pipelines.
  • When to Revisit: When implementing concurrent corpus processing loops or batch translation pipelines
  • Estimate: 3-6 hours (Phase 1: 1-2 hours, Phase 2: 2-4 hours)
  • Quick Summary: Add thread-safe retry state, locked cache, and optional rate limiting for high-throughput scenarios
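
The locked cache and optional rate limiting named in the quick summary might look roughly like this. It is a sketch under stated assumptions, not the ADR-A15 design: `LockedCache` and `RateLimiter` are hypothetical names, and the limiter uses a simple sliding window with an injectable clock.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque, Dict


class LockedCache:
    """Dict-backed cache whose reads and writes share one lock (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, str] = {}

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        # Computing under the lock serializes callers but guarantees one call per key.
        with self._lock:
            if key not in self._data:
                self._data[key] = compute()
            return self._data[key]


class RateLimiter:
    """Allows at most `max_calls` acquisitions per `period` seconds (hypothetical)."""

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._lock = threading.Lock()
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Return True and record the call if the window has room, else False."""
        with self._lock:
            now = self._clock()
            # Drop timestamps that have left the sliding window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return True
            return False
```

Holding the lock while computing is the simplest correct choice; a per-key lock would reduce contention if cache misses turn out to be slow.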

Priority 2: Production Hardening (Post-Bootstrap)

Goal: Harden TNH Scholar for production use after VS Code integration enables AI-assisted development. Focuses on reliability, test coverage, and type safety.

🚧 OpenAI SDK 2.15.0 Validation (High Priority)

  • Status: NOT STARTED
  • Why: SDK bump impacts OpenAI adapter. (Codex harness suspended — see ADR-OA03.2 addendum)
  • Tasks:
  • Revalidate OpenAI adapter request/response mappings against 2.15.0
  • Update compatibility notes/docs if schema drift is found

🚧 Audio-Transcribe Service-Layer Refactor (P2)

  • Status: NOT STARTED
  • Goal: Align audio-transcribe with object-service pattern and ytt-fetch robustness.
  • Tasks:
  • Introduce typed service orchestrator + protocols (CLI becomes thin wrapper)
  • Extract audio source resolution into a typed resolver (yt_url/CSV/local file)
  • Replace dict options with Pydantic models (transcription + diarization params)
  • Move logging bootstrap out of module import path so audio-transcribe modules are import-safe in tests and sandboxed environments
  • Add runtime preflight (yt-dlp inspector + ffmpeg availability); keep version checks ops-only
  • Migrate CLI to Typer with minimal surface (smoke tests only)
  • Add service-layer tests for all audio-transcribe use cases
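
The runtime preflight task above could start from something like the following sketch. It covers only executable availability on `PATH`; `PreflightReport` and `run_preflight` are hypothetical names, and the project's yt-dlp inspector is not modeled here.

```python
import shutil
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PreflightReport:
    """Result of a preflight pass (hypothetical type)."""

    missing: List[str]

    @property
    def ok(self) -> bool:
        return not self.missing


def run_preflight(required_tools: Iterable[str] = ("ffmpeg", "yt-dlp")) -> PreflightReport:
    """Report which required executables are not on PATH; no version checks."""
    missing = [tool for tool in required_tools if shutil.which(tool) is None]
    return PreflightReport(missing=missing)
```

A CLI wrapper could call `run_preflight()` before any download or transcription work and exit early with the list of missing tools.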

🚧 Fully Usable audio-transcribe Hardening Path (P2)

  • Status: NOT STARTED
  • Goal: Make audio-transcribe safe for routine real-world use by eliminating stack dumps from expected fault paths and hardening the tool against edge-case and provider-shape failures.
  • Tasks:
  • Build a full-spectrum regression matrix covering local files, YouTube URLs, CSV input, diarization on/off, Whisper, and AssemblyAI
  • Add end-to-end CLI regression coverage for normal runs plus representative operator workflows used in practice
  • Add fault-injection coverage for provider shape drift, empty/partial transcript results, chunk-level failures, file-write failures, temp-dir cleanup failures, and missing dependency/runtime preconditions
  • Ensure expected runtime faults exit with user-facing errors instead of Python tracebacks
  • Capture and preserve real-world reproductions as named regression fixtures/cases whenever a live failure is found
  • Run a post-hardening soak pass against representative real inputs before the next minor release
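
One way to meet the "user-facing errors instead of Python tracebacks" task is a thin wrapper around the CLI entry point, sketched below. `ExpectedFault` and `run_cli` are hypothetical names; the real tool may well use a different exception hierarchy and exit codes.

```python
import sys
from typing import Callable, Optional, Sequence, TextIO


class ExpectedFault(Exception):
    """Hypothetical base class for faults the operator can act on."""


def run_cli(
    main: Callable[[Sequence[str]], None],
    argv: Sequence[str],
    stderr: Optional[TextIO] = None,
) -> int:
    """Run `main`, turning expected faults into a one-line message and exit code 1."""
    stream = stderr if stderr is not None else sys.stderr
    try:
        main(argv)
    except ExpectedFault as exc:
        # Expected faults get a short message; no traceback is printed.
        print(f"error: {exc}", file=stream)
        return 1
    return 0
```

Unexpected exceptions are deliberately not caught, so genuine bugs still surface with a traceback during development.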

⏸️ Agent Orchestration - Codex Runner (ADR-OA03.2)

  • Status: TABLED (2026-01-25)
  • ADR: ADR-OA03.2
  • Why Tabled:
    • Scope: Spike revealed that a proper Codex harness requires implementing full client-side agent orchestration (the VS Code extension uses a proprietary app server, not raw API calls)
    • Cost-benefit: Current human-in-the-loop workflow with Claude Code + VS Code Codex extension is effective and cost-efficient for project needs
    • No compelling need: Investment not justified when manual workflow works well
  • Findings: Codex Harness Spike Findings
  • Preserved Artifacts: src/tnh_scholar/agent_orchestration/codex_harness/, src/tnh_scholar/cli_tools/tnh_codex_harness/
  • Conditions for Resumption: Further insight or clear business need that justifies full agent orchestration investment

🚧 Expand Test Coverage

  • Status: NOT STARTED
  • Current Coverage: Refresh baseline before planning. The old "~5% / 4 test modules" snapshot is obsolete.
  • Target: 50%+ for gen_ai_service
  • Tasks:
  • GenAI service flows: prompt rendering, policy resolution, provider adapters
  • CLI integration tests (option parsing, environment validation)
  • Configuration loading edge cases
  • Error handling scenarios
  • Pattern catalog validation
  • Full CLI test suite with refreshed priorities
    • focus on active CLI surfaces still lacking strong coverage, not already-hardened tnh-gen basics
    • prioritize tnh-conductor, audio-transcribe, ytt-fetch, and edge-case regression coverage

🚧 Consolidate Environment Loading

🚧 Configuration Tech Debt — Migrate to ADR-CF01/CF02 Three-Layer Model

Migration Phases:

  1. Phase 1: Extend TNHContext for Prompts ✅ COMPLETE
    • Add PromptPathBuilder analogous to RegistryPathBuilder (src/tnh_scholar/configuration/context.py:165-191)
    • Define three-layer prompt discovery: workspace → user → built-in
    • Create runtime_assets/prompts/ with minimal built-in set (3 prompts + _catalog.yaml)
    • Unit tests for prompt path resolution (tests/configuration/test_prompt_discovery.py)

  2. Phase 2: Migrate GenAISettings ✅ COMPLETE
    • Update GenAISettings.prompt_dir to use lazy TNHContext resolution (settings.py:89-102)
    • Legacy TNH_DEFAULT_PROMPT_DIR constant removed from __init__.py
    • tnh-gen config_loader works with new resolution

  3. Phase 3: Eliminate Module-Level Constants ✅ COMPLETE
    • TNH_CONFIG_DIR, TNH_LOG_DIR, TNH_DEFAULT_PROMPT_DIR removed from __init__.py
    • Only structural constants remain (TNH_ROOT_SRC_DIR, TNH_PROJECT_ROOT_DIR, TNH_CLI_TOOLS_DIR)
    • No FileNotFoundError raises at import time for config paths

  4. Phase 4: Unify Subsystem Settings (Medium Priority) - NOT STARTED
    • Audit all BaseSettings classes across subsystems
    • Deprecate PromptSystemSettings.tnh_prompt_dir in favor of unified approach
    • Standardize env var prefixes (e.g., TNH_GENAI_*, TNH_AUDIO_*)

  5. Phase 5: Propagate tnh-gen Config Pattern (Low Priority) - NOT STARTED
    • Create shared CLIConfigLoader base for all CLI tools
    • Add config show/get/set subcommands to major CLI tools
    • Standardize workspace config file format

Success Criteria:

  • [x] No module-level config Path constants in __init__.py
  • [x] Prompt path discovery flows through TNHContext
  • [x] Prompt directories follow three-layer precedence (workspace → user → built-in)
  • [ ] At least tnh-gen and audio-transcribe share config loader pattern
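
The three-layer precedence (workspace → user → built-in) can be illustrated with a small resolver. This is a sketch, not the TNHContext implementation: `resolve_prompt` is a hypothetical helper, and the `.md` file extension is an assumption.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_prompt(name: str, layers: Sequence[Path]) -> Optional[Path]:
    """Return the first existing `<layer>/<name>.md`, searching layers in order.

    Callers pass layers from highest to lowest precedence, for example
    [workspace_dir, user_dir, builtin_dir].
    """
    for layer in layers:
        candidate = layer / f"{name}.md"
        if candidate.is_file():
            return candidate
    return None
```

Because the first match wins, a workspace prompt shadows a user prompt of the same name, which in turn shadows the built-in one.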

🚧 Clean Up CLI Tool Versions

  • Status: PARTIAL (old versions removed, utilities pending)
  • Location: cli_tools/audio_transcribe/
  • Tasks:
  • Remove legacy audio_transcribe0.py
  • Remove audio_transcribe1.py
  • Remove audio_transcribe2.py
  • Keep only current version
  • Create shared utilities (argument parsing, environment validation, logging)

✅ Documentation Reorganization (ADR-DD01 & ADR-DD02) — See Archive

Phase 1 COMPLETE - Remaining Phase 2 tasks:

  • Doc metadata validation script (check_doc_metadata.py) - validate front matter
  • Docstring coverage (interrogate) - threshold on src/tnh_scholar
  • Archive index + legacy ADR migration to docs/archive/**
  • Backlog: populate docs/docs-ops/roadmap.md with missing topics
  • User guides for new features, architecture component diagrams

🚧 Type System Improvements

  • Status: PARTIAL
  • Current: 58 errors across 16 files
  • High Priority: Fix audio processing boundary types, core text processing types, function redefinitions
  • Medium Priority: Add missing type annotations, fix Pattern class type issues
  • Low Priority: Clean up Any return types, standardize type usage

🚧 Prompt Catalog Safety

  • Status: IN PROGRESS
  • Priority: HIGH (critical infrastructure)
  • Problem: Catalog health reporting is now in place, but the prompt platform still needs stronger manifest/schema guarantees and clearer operator docs
  • Tasks:
  • Aggregate catalog parse/validation issues into typed health reports
  • Surface catalog health through tnh-gen list and tnh-gen config show --catalog-health
  • Fix relative --prompt-dir prompt catalog path resolution and remove bogus legacy output_mode: text warnings
  • Normalize legacy prompt frontmatter to PT05 baseline (role, inputs, explicit output_contract)
  • Add schema_ref coverage for maintained JSON prompts
  • Simplify overlapping legacy prompt bodies and retire redundant variants
  • Decide the release-facing default prompt home per ADR-PT05.1: repoint workspace discovery from prompts/ to tnh-prompts/, or explicitly keep tnh-prompts as an override-only prototype path with clear docs
  • Deprecate the external prompt-repo / setup-download path from the normative release workflow, or explicitly reclassify it as experimental-only per ADR-PT05.1
  • Retire PromptSystemSettings.tnh_prompt_dir from the active runtime path so prompt-directory resolution flows through GenAISettings + TNHContext
  • Finish demoting git-backed/shared prompt-catalog code and docs to explicit experimental status where it remains in-repo
  • Follow up on the prompt-platform docs/ADR triage captured in Prompt Platform Cleanup Follow-On after the current tnh-gen testing slice is stabilized
  • Replace broad JSON-visibility pain points with narrow .gitignore allowlists for prompt/test artifact directories as needed; do not reverse the repo-wide *.json ignore
  • Add manifest validation
  • Better error messages (unknown prompt, hash mismatch)
  • Frontmatter/schema validation guidance
  • Document prompt schema
  • Clean up ADR statuses and cross-ADR alignment for the prompt-platform / tnh-gen structured JSON contract area (TG04, TG04.1, PT05, related prompt-system notes) once the design settles

🚧 tnh-gen Operator UX

  • Status: NOT STARTED
  • Priority: LOW–MEDIUM
  • Problem: tnh-gen provides no feedback while the model is working and does not save run output automatically, creating a poor experience for interactive and long-running calls
  • Tasks:
  • Add heartbeat / progress indicator to stderr during model calls so operators know the run is alive (especially for long documents — 10–30 s wait with no output)
  • Add basic run logging: log prompt key, model, input path, and elapsed time at completion, even when --api mode is not enabled
  • Persist tnh-gen run output by default to a temp or run-artifact directory when no --output-file is provided
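
A heartbeat on stderr, as proposed in the first task, could be as small as the context manager below. It is a sketch only: `heartbeat` is a hypothetical helper, and the dot-per-interval output format is an assumption rather than a decided UX.

```python
import sys
import threading
from contextlib import contextmanager
from typing import Iterator, Optional, TextIO


@contextmanager
def heartbeat(interval: float = 5.0, stream: Optional[TextIO] = None) -> Iterator[None]:
    """Write a dot to `stream` every `interval` seconds until the block exits."""
    out = stream if stream is not None else sys.stderr
    stop = threading.Event()

    def _beat() -> None:
        # Event.wait returns False on timeout and True once `stop` is set.
        while not stop.wait(interval):
            out.write(".")
            out.flush()

    worker = threading.Thread(target=_beat, daemon=True)
    worker.start()
    try:
        yield
    finally:
        stop.set()
        worker.join()
        # Always end the progress line, even if no dot was written.
        out.write("\n")
        out.flush()
```

The model call would then run inside `with heartbeat():`, leaving stdout untouched so piped output stays clean.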

🚧 tnh-gen Review Context Ingestion

  • Status: NOT STARTED
  • Priority: MEDIUM
  • Problem: tnh-gen can run ad hoc review prompts via --prompt-dir, but it cannot yet gather bounded local document context on its own for review workflows such as docs language audits
  • Tasks:
  • Add repeatable local context inputs for tnh-gen run (for example --context-file or --context-dir)
  • Support bounded repo-local file loading for review prompts with explicit source allowlists
  • Emit included context sources in provenance and API output
  • Document a standard review-workflow pattern for docs, ADR, and architecture audits
  • Add follow-on conversation support for tnh-gen review/generation runs so a prompt can continue from prior output or thread state
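
Bounded, allowlisted context loading might be sketched as follows. The function name, the `max_bytes` default, and the choice to raise `ValueError` are all assumptions for illustration; the eventual `tnh-gen` flags and error types may differ.

```python
from pathlib import Path
from typing import Dict, Iterable


def load_context_files(
    paths: Iterable[Path],
    allowed_roots: Iterable[Path],
    max_bytes: int = 64_000,
) -> Dict[str, str]:
    """Load text files that sit under an allowed root and fit the size bound.

    Returns a mapping of resolved path to file text; the keys can double as
    the list of context sources reported in provenance.
    """
    roots = [root.resolve() for root in allowed_roots]
    loaded: Dict[str, str] = {}
    for path in paths:
        resolved = path.resolve()
        # Resolving first defeats `..` segments and symlink escapes.
        if not any(root == resolved or root in resolved.parents for root in roots):
            raise ValueError(f"{path} is outside the allowed context roots")
        if resolved.stat().st_size > max_bytes:
            raise ValueError(f"{path} exceeds the {max_bytes}-byte context limit")
        loaded[str(resolved)] = resolved.read_text(encoding="utf-8")
    return loaded
```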

🚧 Knowledge Base Implementation

  • Status: DESIGN COMPLETE
  • ADR: ADR-K01
  • Tasks:
  • Implement Supabase integration
  • Vector search functionality
  • Query capabilities
  • Semantic similarity search

🚧 Configuration & Data Layout

  • Status: NOT STARTED
  • Priority: HIGH (blocks pip install)
  • Problem: packaging and installed-wheel validation still need cleanup around prompt assets and repo-layout assumptions
  • Tasks:
  • Package prompt assets as resources where needed
  • Verify installed wheels work without repo-local prompt directories
  • Keep repo-layout assumptions out of import-time package initialization
  • Audit CLI entry points for any remaining repo-root-only assumptions

🚧 Repo-Root Docs Generation and CI Consistency

  • Status: NOT STARTED
  • Priority: HIGH
  • Problem: Documentation standards and generated docs link to /project/repo-root/*, but those generated files currently live under ignored paths and may be absent in clean remote CI checkouts, causing MkDocs/link-validation inconsistencies between local and GitHub builds
  • Tasks:
  • Decide whether docs/project/repo-root/ outputs should be tracked in git or generated in all CI docs-validation paths before link checks run
  • Align .gitignore, docs build scripts, and CI expectations so local and remote docs validation see the same repo-root docs set
  • Verify make docs-build, PR docs validation, and GitHub Actions all succeed from a clean checkout with no pre-existing generated repo-root docs
  • Document the intended contract for repo-root doc mirrors in docs ops guidance
  • Review upcoming MkDocs 2.0 / Material compatibility risk and define an upgrade stance before docs-tooling version changes are taken

🚧 Logging System Scope

  • Location: src/tnh_scholar/logging_config.py
  • Problem: Modules call setup_logging individually
  • Tasks:
  • Define single application bootstrap
  • Document logger acquisition pattern (get_logger only)
  • Create shared CLI bootstrap helper
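
The single-bootstrap pattern can be sketched with the standard library alone. The names `setup_logging` and `get_logger` come from the tasks above, but the signatures, the `tnh_scholar` logger namespace, and the format string here are assumptions, not the contents of logging_config.py.

```python
import logging

_CONFIGURED = False


def setup_logging(level: int = logging.INFO) -> None:
    """Configure the package logger once; later calls are no-ops."""
    global _CONFIGURED
    if _CONFIGURED:
        return
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    package_logger = logging.getLogger("tnh_scholar")
    package_logger.addHandler(handler)
    package_logger.setLevel(level)
    _CONFIGURED = True


def get_logger(name: str) -> logging.Logger:
    """Return a child logger; never attaches handlers itself."""
    return logging.getLogger(f"tnh_scholar.{name}")
```

Under this split, only the application or CLI entry point calls `setup_logging()`, while library modules call `get_logger(...)` and stay import-safe.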

🚧 Comprehensive CLI Reference Documentation

  • Status: IN PROGRESS (tnh-gen complete, tnh-conductor now documented, other CLIs still uneven)
  • Tasks:
  • Update user-guide examples to use tnh-gen
  • Document other CLI tools with maintained/operator-facing scope (audio-transcribe, ytt-fetch, nfmt, etc.)
  • Consider automation for CLI reference generation

🔮 Shared CLI UI Module (tnh_cli_ui)

  • Status: NOT STARTED (Research/Exploration)
  • Priority: MEDIUM (UX consistency across CLI tools)
  • ADR: ADR-ST01.1: tnh-setup UI Design
  • Context: The tnh-setup UI redesign (Rich library) could be extracted into a shared module for consistent styling across all tnh-scholar CLI tools.
  • Research Questions:
  • Survey CLI tools for shared UI patterns (headers, status indicators, progress, tables)
  • Evaluate Rich vs alternatives (click-extra, questionary, etc.)
  • Design minimal API surface for common operations
  • Consider Typer + Rich integration patterns
  • Potential Scope:
  • Styled section headers with step progress
  • Standardized status indicators (✓/⚠/✗/○/•) with color vocabulary
  • Spinner wrappers for async operations
  • Summary table generators
  • Banner/header utilities
  • Affected Tools: tnh-setup, tnh-gen, ytt-fetch, audio-transcribe, nfmt, token-count, tnh-tree

🚧 Document Success Cases

  • Status: NOT STARTED
  • Goal: Document TNH Scholar's successful real-world applications
  • Cases: Deer Park Cooking Course (SRTs), 1950s JVB Translation (OCR), Dharma Talk Transcriptions, Sr. Dang Nhiem's talks
  • Tasks:
  • Create docs/case-studies/ directory structure
  • Document each case with context, tools, challenges, outcomes

🚧 Notebook System Overhaul

  • Status: NOT STARTED
  • Priority: HIGH
  • Goal: Ship a cleaner release repo by reducing notebook clutter and keeping only intentional examples/research assets
  • Tasks:
  • Audit & categorize all notebooks
  • Remove or archive junk/testing notebooks that no longer justify repo overhead
  • Convert notebook-discovered tests into pytest where the behavior still matters
  • Keep only core example/research notebooks that are intentional release artifacts
  • Add context notes for archived notebooks that still have historical value

Priority 3: Future Work & Advanced Features

Goal: Long-term sustainability, advanced features, and nice-to-have improvements. Address after bootstrap loop is working.

🚧 Refactor Monolithic Modules

🚧 Complete Provider Abstraction

  • Status: NOT STARTED
  • Tasks:
  • Implement Anthropic adapter
  • Add provider-specific error handling
  • Test fallback/retry across providers
  • Provider capability discovery
  • Multi-provider cost optimization

🚧 Developer Experience Improvements

  • Status: PARTIAL (hooks and Makefile exist, automation pending)
  • Tasks:
  • Add pre-commit hooks (Ruff, notebook prep)
  • Create Makefile for common tasks (lint, test, docs, format, setup)
  • Add MyPy to pre-commit hooks
  • Add contribution templates (issue/PR templates)
  • CONTRIBUTING.md exists and is documented
  • Release automation
  • Changelog automation

🚧 Historical ADR Status Audit

  • Status: NOT STARTED
  • Context: 25 ADRs marked with status: current from pre-markdown-standards migration
  • Tasks:
  • Review each ADR to determine actual status (implemented/superseded/rejected)
  • Update status field in YAML frontmatter
  • Cross-reference with newer ADRs for superseded decisions

🚧 Package API Definition

  • Status: Deferred during prototyping
  • Tasks:
  • Review and document all intended public exports
  • Implement __all__ in key __init__.py files
  • Verify exports match documentation
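
Part of the "verify exports" task could be automated with a small check like the one below. It is a sketch: `missing_exports` is a hypothetical helper, and it verifies only that names in `__all__` exist on the module, not that they match the documentation.

```python
from types import ModuleType
from typing import List


def missing_exports(module: ModuleType) -> List[str]:
    """Return names listed in `__all__` that the module does not define."""
    declared = getattr(module, "__all__", [])
    return [name for name in declared if not hasattr(module, name)]
```

A pytest case could import each key package and assert that `missing_exports(package)` is empty.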

🚧 Repo Hygiene

  • Problem: Generated artifacts in repo (build/, dist/, site/, *.txt)
  • Tasks:
  • Add to .gitignore
  • Document regeneration process
  • Rely on release pipelines for builds

🚧 Notebook & Research Management

  • Location: notebooks/, docs/research/
  • Problem: Valuable but not curated exploratory work
  • Tasks:
  • Adopt naming/linting convention
  • Publish vetted analyses to docs/research via nbconvert
  • Archive obsolete notebooks

Recently Completed Tasks (Archive)

tnh-gen CLI Implementation ✅

  • Completed: 2025-12-27
  • ADR: ADR-TG01, ADR-TG01.1
  • What: Protocol-driven CLI replacing tnh-fab, dual modes (human-friendly default, --api for machine consumption)
  • Documentation: tnh-gen CLI Reference (661 lines)

File-Based Registry System (ADR-A14) ✅

  • Completed: 2026-01-01 (PR #24)
  • ADR: ADR-A14, ADR-A14.1
  • What: JSONC-based registry with multi-tier pricing, TNHContext path resolution, staleness detection
  • Key Deliverables: openai.jsonc registry, RegistryLoader, Pydantic schemas, JSON Schema for VS Code, refactored model_router.py and safety_gate.py, 264 tests passing

VS Code Extension Walking Skeleton ✅

  • Completed: 2026-01-07
  • ADR: ADR-VSC01, ADR-VSC02
  • What: TypeScript extension enabling "Run Prompt on Active File" workflow
  • Capabilities: QuickPick prompt selector, dynamic variable input, tnh-gen run subprocess execution, split-pane output, unit/integration tests
  • Validation: Proves bootstrapping concept - extension ready to accelerate TNH Scholar development

Pattern→Prompt Migration ✅

  • Completed: 2026-01-19
  • ADR: ADR-PT04
  • What: Pattern→Prompt terminology migration and directory restructuring
  • Key Changes: patterns/ → prompts/ (standalone tnh-prompts repo), TNH_PATTERN_DIR → TNH_PROMPT_DIR, removed legacy tnh-fab CLI
  • Breaking: TNH_PATTERN_DIR env var removed, tnh-fab CLI removed

Provenance Format Refactor ✅

  • Completed: 2026-01-19
  • ADR: ADR-TG01 Addendum 2025-12-28
  • What: Switched tnh-gen from HTML comments to YAML frontmatter for provenance metadata
  • Files Modified: provenance.py, test_tnh_gen.py, tnh-gen.md

OpenAI Client Unification ✅

  • Completed: 2025-12-10
  • ADR: ADR-A13
  • What: Migrated from legacy openai_interface/ to modern gen_ai_service/providers/ architecture (6 phases)

Core Stubs Implementation ✅

  • Completed: 2025-12-10
  • What: Implemented params_policy, model_router, safety_gate, completion_mapper with strong typing
  • Grade: A- (92/100) - Production ready with minor polish

Documentation Reorganization Phase 1 ✅

  • Completed: 2025-12-05
  • ADR: ADR-DD01, ADR-DD02
  • What: Absolute links, MkDocs strict mode, filesystem-driven nav, lychee link checking

Packaging & CI Infrastructure ✅

  • Completed: 2025-11-20
  • What: pytest in CI, runtime dependencies declared, pre-commit hooks, Makefile targets

Remove Library sys.exit() Calls ✅

  • Completed: 2025-11-15
  • What: Library code raises ConfigurationError instead of exiting process
  • Completed: 2025-12-05 (PR #14)
  • What: Converted 964 links to absolute paths, enabled MkDocs strict link validation, integrated link verification

NumberedText Section Boundary Validation ✅

  • Completed: 2025-12-12
  • ADR: ADR-AT03.2 (status: accepted → should be implemented)
  • What: Implemented validate_section_boundaries() and get_coverage_report() methods for robust section management
  • Commits: cf99375 (docs), 798a552 (refactor unused methods)

TextObject Robustness Improvements ✅

  • Completed: 2025-12-14
  • ADR: ADR-AT03.3 (status: accepted → should be implemented)
  • What: Implemented merge_metadata() with MergeStrategy enum, validate_sections() with fail-fast, converted to Pydantic v2, added structured exception hierarchy
  • Commits: 096e528 (implementation), 03654fe (docstrings)