TNH Scholar¶
TNH Scholar is intended to support a community-aligned, open-source, multilingual digital ecosystem for studying, translating, and engaging with the teachings of Thích Nhất Hạnh and the Plum Village Community of Engaged Buddhism.
This document contains deeper onboarding and architectural context. For a more concise intro to the project, see the README.
Vision & Aspirations¶
TNH Scholar is a long-term effort to support the living Plum Village tradition with trustworthy, transparent digital tools. Both its development and its use are meant to deeply respect the tradition and practice of Thích Nhất Hạnh and the Plum Village community.
- Support the building of multilingual corpora of Thích Nhất Hạnh's and the Plum Village community's teachings with high-fidelity text, rich metadata, and sentence-level alignment across languages.
- Provide AI-assisted research tools that expose their reasoning and keep human judgment central, serving monastics, practitioners, teachers, and researchers.
- Support cross-lingual research across sources in Vietnamese, English, French, Chinese, Pāli, Sanskrit, Tibetan, and other languages.
- Enable rich interactive environments like bilingual readers combining scans, text, translations, and audio.
- Enable human-supervised AI workflows for corpus processing, translation, and evaluation.
This work is envisioned on a multi-year to multi-decade timescale. The CLI tools and GenAI Service in this repository are the early infrastructure for that larger arc.
For the full vision, including scope, non-scope, relationship to spin-offs, and time horizon, see:
Note on Terminology: Earlier versions of TNH Scholar referred to engineered AI prompts as "Patterns" to emphasize their nature as engineering patterns. Current documentation uses "Prompt" to align with industry usage. References to "Pattern" in legacy documentation should be read as "Prompt".
What TNH Scholar Makes Possible¶
TNH Scholar aims to support the community in:
- Exploring teachings with bilingual text and translation side-by-side
- Searching themes and teachings across languages and periods using semantic search and retrieval
- Discovering related teachings, concepts, and practices through advanced search capabilities
- Reviewing and refining translations collaboratively with transparent history
- Connecting practitioners, researchers, and teachers with reliable digital resources
- Preserving teaching materials for future generations with clarity and care
These are aspirational but active development goals aligned with the needs of the Plum Village community.
Current Features¶
- Audio and transcript processing: audio-transcribe with diarization and YouTube support
- Text formatting and translation: tnh-gen CLI for prompt-driven text processing with human-friendly defaults and API mode for programmatic use
- Acquisition utilities: ytt-fetch for transcripts; token-count and nfmt for prep and planning
- Setup and configuration: tnh-setup plus guided config in Getting Started
- Prompt system: See Prompt System Architecture and ADR-PT03 for current status and roadmap
✅ tnh-gen v1.0 Available: The tnh-gen CLI is now fully implemented with dual output modes (human-friendly by default, --api flag for machine-readable output). See tnh-gen CLI Reference for complete documentation.
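As a rough illustration of the programmatic mode, the sketch below shows how a script might invoke tnh-gen with the --api flag and parse the result. Several details are assumptions rather than documented behavior: that machine-readable output is JSON on stdout, that --api is accepted directly after the program name, and the helper names `build_tnh_gen_command` and `run_tnh_gen`, which are hypothetical. Check the tnh-gen CLI Reference for the actual subcommands and output schema.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def build_tnh_gen_command(args: Sequence[str], api: bool = True) -> list[str]:
    """Build an argv list for tnh-gen (hypothetical helper).

    Assumption: --api is a global flag placed right after the program name.
    """
    cmd = ["tnh-gen"]
    if api:
        cmd.append("--api")
    cmd.extend(args)
    return cmd


def run_tnh_gen(
    args: Sequence[str],
    runner: Callable[..., Any] = subprocess.run,
) -> Any:
    """Run tnh-gen in API mode and decode its stdout.

    Assumption: the machine-readable output is a single JSON document.
    The runner is injectable so the function can be exercised without
    the CLI installed.
    """
    cmd = build_tnh_gen_command(args, api=True)
    # check=True raises CalledProcessError on a non-zero exit status.
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Keeping command construction separate from execution makes it easy to log or inspect the exact argv before anything runs, which fits the project's emphasis on transparent, human-supervised workflows.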
Getting Started¶
Choose your path based on your primary interest:
Path 1: Use the Tools¶
For practitioners, translators, and researchers ready to work with TNH Scholar:
Get up and running with TNH Scholar's CLI tools for transcription, translation, and text processing:
- Install from PyPI:
- Configure credentials per Configuration
- Follow the Quick Start Guide for your first workflow
- Explore task-oriented workflows in the User Guide
Path 2: Understand the Vision & Principles¶
For community members, stakeholders, and those exploring how this project fits within Plum Village initiatives:
Explore the project's foundation, values, and long-term direction:
- Vision & Scope: Project Vision – multi-year aspirations, community alignment, and what's in/out of scope
- Philosophy: Philosophy – ethical foundations and mindful technology principles
- Principles: Design Principles – transparency, human judgment, and architectural values
- Community Context: Parallax Overview – relationship to broader Plum Village digital initiatives
Path 3: Contribute to Development¶
For developers, architects, and contributors:
Understand the technical foundation and start contributing:
- Setup: DEV_SETUP.md – development environment and workflows
- Architecture: System Design and Architecture Overview – core patterns and technical decisions
- Standards: Style Guide and Contributing – code quality and PR workflow
- Key ADRs: Start with GenAI Service Strategy and Prompt System Status
- Research: Research Index – experiments, evaluations, and exploratory work
- Future Directions: Long-term Vision – planned research directions and architectural horizons
- Common commands: , , , ,
Documentation Overview¶
- Getting Started: Installation, configuration, first-run guidance
- User Guide: Task-oriented workflows and practical how-tos
- CLI Reference: Auto-generated command documentation for every CLI entry point
- API: Python API reference (mkdocstrings)
- Architecture: ADRs, design docs, system diagrams by component
- Development: Contributor guides, design principles, engineering practices
- Docs Ops: Style guides, ADR template, documentation maintenance
- Research: Experiments, evaluations, exploratory notes
Project Status¶
TNH Scholar is currently in alpha. Expect ongoing API and workflow changes during active development.
Support & Community¶
- Bug reports & feature requests: GitHub Issues
- Questions & discussions: GitHub Discussions
License¶
This project is licensed under the GPL-3.0 License.
Documentation Map¶
Auto-generated map of the documentation hierarchy. Regenerated during docs builds; edit source content instead of this file.
Getting Started¶
User Guide¶
Project¶
- Conceptual Architecture of TNH-Scholar
- Future Directions of TNH-Scholar
- TNH Scholar CHANGELOG
- TNH Scholar CONTRIBUTING
- TNH Scholar README
- TNH Scholar Release Checklist
- TNH Scholar TODO List
- TNH Scholar Versioning Policy
- TNH-Scholar DEV_SETUP
- TNH-Scholar Project Philosophy
- TNH-Scholar Project Principles
- TNH-Scholar Project Vision
Community¶
CLI Reference¶
- audio-transcribe
- Command Line Tools Overview
- json-to-srt
- nfmt
- sent-split
- srt-translate
- tnh-gen
- tnh-setup
- token-count
- ytt-fetch
Architecture¶
- ADR-A01: Adopt Object-Service for GenAI Interactions
- ADR-A02: PatternCatalog Integration (V1)
- ADR-A08: Configuration / Parameters / Policy Taxonomy
- ADR-A09: V1 Simplified Implementation Pathway
- ADR-A11: Model Parameters and Strong Typing Fix
- ADR-A12: Prompt System & Fingerprinting Architecture (V1)
- ADR-A13: Migrate All OpenAI Interactions to GenAIService
- ADR-A14.1: Registry Staleness Detection and User Warnings
- ADR-A14: File-Based Registry System for Provider Metadata
- ADR-A15: Thread Safety and Rate Limiting
- ADR-AT01: AI Text Processing Pipeline Redesign
- ADR-AT02: TextObject Architecture Decision Records
- ADR-AT03.1: AT03→AT04 Transition Plan
- ADR-AT03.2: NumberedText Section Boundary Validation
- ADR-AT03.3: TextObject Robustness and Metadata Management
- ADR-AT03: Minimal AI Text Processing Refactor for tnh-gen
- ADR-AT04: AI Text Processing Platform Strategy
- ADR-CF01: Runtime Context & Configuration Strategy
- ADR-CF02: Prompt Catalog Discovery Strategy
- ADR-DD01: Documentation System Reorganization Strategy
- ADR-DD02: Documentation Main Content and Navigation Strategy
- ADR-DD03: Pattern to Prompt Terminology Standardization
- ADR-DD03: Phase 1 Execution Punch List
- ADR-JV03: Canonical XML AST for English Parsing
- ADR-JVB01: JVB Parallel Viewer v1 As-Built
- ADR-K01: Preliminary Architectural Strategy for TNH Scholar Knowledge Base
- ADR-MD01: Adoption of JSON-LD for Metadata Management
- ADR-MD02: Metadata Infrastructure Object-Service Integration
- ADR-OA01.1: TNH-Conductor — Provenance-Driven AI Workflow Coordination (v2)
- ADR-OA01: TNH-Conductor — Provenance-Driven AI Workflow Coordination
- ADR-OA02: Phase 0 Protocol Layer Spike
- ADR-OA03.1: Claude Code Runner
- ADR-OA03.2: Codex Runner
- ADR-OA03: Agent Runner Architecture
- ADR-OS01: Object-Service Design Architecture V3
- ADR-PP01: Rapid Prototype Versioning Policy
- ADR-PT03: Prompt System Current Status & Roadmap
- ADR-PT04: Prompt System Refactor Plan (Revised)
- ADR-PV01: Provenance & Tracing Infrastructure Strategy
- ADR-ST01.1: tnh-setup UI Design
- ADR-ST01: tnh-setup Runtime Hardening
- ADR-TG01.1: Human-Friendly CLI Defaults with --api Flag
- ADR-TG01: tnh-gen CLI Architecture
- ADR-TG02: TNH-Gen CLI Prompt System Integration
- ADR-TR01: AssemblyAI Integration for Transcription Service
- ADR-TR02: Optimized SRT Generation Design
- ADR-TR03: Standardizing Timestamps to Milliseconds
- ADR-TR04: AssemblyAI Service Implementation Improvements
- ADR-VP01: Video Processing Return Types and Configuration
- ADR-VP02: yt-dlp Operational Strategy
- ADR-VSC01: VS Code Integration Strategy (TNH-Scholar Extension v0.1.0)
- ADR-VSC02: VS Code Extension Architecture
- ADR-VSC03.2: Real-World Survey Addendum (VS Code as a UI/UX Platform)
- ADR-VSC03.3: Investigation Synthesis - Validation of Design Choices
- ADR-VSC03: Preliminary Investigation Findings
- ADR-VSC03: Python-JavaScript Impedance Mismatch Investigation
- ADR-YF00: Early yt-fetch Transcript Decisions (Historical)
- ADR-YF01: YouTube Transcript Source Handling
- ADR-YF02: YouTube Transcript Format Selection
- Architecture Overview
- Audio Chunking Algorithm Design Document
- Codex Harness End-to-End Test Report
- Codex Harness Spike Findings
- Design Strategy: VS Code as UI/UX Platform for TNH Scholar
- Diarization Algorithms
- Diarization Chunker Module Design Strategy
- Diarization System Design
- Documentation Design
- GenAI Service — Design Strategy
- Generate Markdown Translation JSON Pairs
- Generate Markdown Vietnamese
- Interval-to-Segment Mapping Algorithm
- JVB Viewer — Version 2 Strategy & High‑Level Design
- Language-Aware Chunking Orchestrator Notes
- LUÂN-HỒI
- minimal but extensible setup tool for the prototyping phase
- Modular Pipeline Design: Best Practices for Audio Transcription and Diarization
- Object-Service Design Gaps
- Object-Service Design Overview
- Object-Service Implementation Status
- OpenAI Interface Migration Plan
- Package Version Checker Design Document
- Practical Language-Aware Chunking Design
- Prompt System Architecture
- Simplified Language-Aware Chunking Design
- Speaker Diarization Algorithm Design
- Speaker Diarization and Time-Mapped Transcription System Design
- TextObject Original Design
- TextObject System Design Document
- TimelineMapper Design Document
- TNH Configuration Management
- TNH-Scholar Agent Orchestration System
- TNH‑Scholar Utilities Catalog
- Versioning Policy Documentation Additions
- YouTube API vs yt-dlp Evaluation
Development¶
- Contributing to TNH Scholar (Prototype Phase)
- Development Documentation
- Fine Tuning Strategy
- Forensic Analysis: December 7, 2025 Git Data Loss Incident
- Git Workflow & Safety Guide
- Human-AI Software Engineering Principles
- Implementation Summary: Git Safety Improvements
- Improvements / Initial structure
- Incident Report: Git Recovery - December 7, 2025
- Proposed Updates to Incident Report
- Release Workflow
- TNH Scholar Design Principles
- TNH Scholar Style Guide
- TNH Scholar System Design
- v0.2.0 Tag Correction Plan
- yt-dlp Ops Check
Docs Ops¶
- ADR Template
- Markdown Standards
- MkDocs Strict Warning Backlog
- Preview TNH Scholar Theme
- TNH Scholar Theme Design
Research¶
- 1-3 Word Queries
- GPT Development Convos
- Passage Test
- Preliminary Feasibility Study
- RAG Research Directions for TNH Scholar
- Structural-Informed Adaptive Processing (SIAP) Methodology
- Summary Report on Metadata Extraction, Source Parsing, and Model Training for TNH-Scholar
- TNH Scholar Knowledge Base: Design Document