
Future Directions of TNH-Scholar

Explores long-horizon possibilities for TNH-Scholar and related systems—non-committal scenarios informed by the project’s philosophy, architecture, and trajectory.

TNH-Scholar is intentionally designed as a foundational system — a clean corpus, structured text models, provenance-rich transformations, agent-ready pipelines, and a pattern-driven GenAI interface. This foundation enables not only the current scholarly workflows, but also a number of long-horizon possibilities.

The sections below outline potential future evolutions; the diagram that follows maps the foundational layer onto these horizons.


flowchart TD

    A[Foundational System<br/>Clean Corpus • Structured Text • Provenance • Patterns] --> B[Semi-Autonomous Agent Loops]
    A --> C[Autonomous Corpus Pipelines]
    A --> D[Intelligent Scholarly Assistants]
    A --> E[Corpus-Aware Model Training / Evaluation]
    A --> F[Agentic Application Framework]
    A --> G[Ecosystem-Level Integrations]

    B --> B1[Code Maintenance Agents<br/>Refactor • Test • Evaluate • Plan]
    B --> B2[Research / Data Agents<br/>Cleaning • Sectioning • Alignment]

    C --> C1[Continuous Ingest + Cleanup]
    C --> C2[Metadata + Sectioning Pipelines]
    C --> C3[Translation + Evaluation Loops]

    D --> D1[Semantic Research Companion]
    D --> D2[Interactive Dharma Exploration]

    E --> E1[Domain-Specific Models]
    E --> E2[Corpus-Aligned Evaluation Loops]

    F --> F1[General Codebase Agents]
    F --> F2[Document / Data Transformation Systems]

    G --> G1[Advanced UX Layers<br/>VS Code • Web • Jupyter]
    G --> G2[Distributed Scholarly Tools<br/>APIs • Collaborators • Multi-modal]

    style A fill:#fdf6e3,stroke:#b58900,stroke-width:2px
    style B fill:#eee8d5,stroke:#b58900
    style C fill:#eee8d5,stroke:#b58900
    style D fill:#eee8d5,stroke:#b58900
    style E fill:#eee8d5,stroke:#b58900
    style F fill:#eee8d5,stroke:#b58900
    style G fill:#eee8d5,stroke:#b58900

1. Semi-Autonomous, Long-Running Agent Loops

A natural evolution of the GenAIService + PromptCatalog + provenance system is the creation of long-running, semi-autonomous agents that execute sequences of tasks with human oversight.

These loops could support:

1.1 Code-oriented agents (descendant projects)

Agents that can:

  • Parse ADRs and design docs
  • Generate or refactor code patches
  • Evaluate quality using pattern-based evaluation prompts
  • Run tests
  • Detect architectural drift
  • Open pull requests
  • Summarize changes for humans
  • Make plans such as (a minimal loop sketch follows this list):
      • “new-problem-encountered”
      • “design revision required”
      • “refactor recommended”
      • “evaluation failure—request human review”
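
To make the shape of such a loop concrete, here is a minimal Python sketch. The names run_pattern, request_human_review, and the PlanState values are hypothetical stand-ins for the GenAIService and PromptCatalog abstractions described above; none of them exist in the current codebase.

    # Minimal sketch of a semi-autonomous agent loop with a human checkpoint.
    # `run_pattern` and `request_human_review` are hypothetical stand-ins.

    from dataclasses import dataclass
    from enum import Enum


    class PlanState(Enum):
        NEW_PROBLEM = "new-problem-encountered"
        DESIGN_REVISION = "design revision required"
        REFACTOR = "refactor recommended"
        EVALUATION_FAILURE = "evaluation failure"


    @dataclass
    class StepResult:
        state: PlanState
        summary: str


    def agent_loop(tasks, run_pattern, request_human_review):
        """Run each task through a pattern; escalate to a human on evaluation failure."""
        for task in tasks:
            result = run_pattern(task)                      # e.g. refactor, test, or evaluate
            if result.state is PlanState.EVALUATION_FAILURE:
                request_human_review(task, result.summary)  # human stays in the loop
            yield result

The human-review call is the key design point: on an evaluation failure, the loop pauses for a person rather than proceeding autonomously.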

This forms the basis of:

An AI-augmented software engineering assistant capable of maintaining complex codebases using structured, documented intent.

A direct descendant of TNH-Scholar could be a general-purpose agentic software engineering platform using these same abstractions.


2. Autonomous Corpus Processing Pipelines

TNH-Scholar’s data layer (OCR → structured text → metadata → alignment) can be expanded into autonomous pipelines that continuously refine the corpus.

These pipelines could (see the sketch after this list):

  • Automatically detect new scans or materials
  • Run cleanup/normalization stages
  • Apply sectioning & metadata tagging patterns
  • Align bilingual or trilingual segments
  • Evaluate translation quality
  • Surface anomalies or inconsistencies for human review
  • Trigger model updates or fine-tuning rounds
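
As a rough illustration of the orchestration, the sketch below treats a pipeline as an ordered list of stages, each recording a simple provenance entry for its output. The stage names and the shape of the provenance record are placeholders, not existing TNH-Scholar APIs.

    # Hypothetical pipeline sketch: each stage is a plain function from text to
    # text, and every transformation appends a provenance record.

    from typing import Callable, List, Tuple

    Stage = Tuple[str, Callable[[str], str]]


    def run_pipeline(raw_text: str, stages: List[Stage]) -> Tuple[str, List[dict]]:
        """Apply stages in order, keeping a minimal provenance trail."""
        provenance: List[dict] = []
        text = raw_text
        for name, stage in stages:
            text = stage(text)
            provenance.append({"stage": name, "output_chars": len(text)})  # placeholder record
        return text, provenance


    # Example wiring with placeholder stages:
    stages: List[Stage] = [
        ("cleanup", lambda t: t.strip()),
        ("sectioning", lambda t: t),   # would apply a sectioning pattern
        ("alignment", lambda t: t),    # would align bilingual segments
    ]
    cleaned, trail = run_pipeline("  raw OCR text  ", stages)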

This becomes:

A living, evolving scholarly corpus with transparent, traceable transformations and continuous improvement.


3. Intelligent Scholarly Assistants

Once the corpus is structured and richly annotated, future systems could support:

3.1 Semantic research companions

Agents that:

  • Trace a concept (e.g., interbeing, emptiness, mindfulness) across decades of talks
  • Construct cross-lingual concept graphs (sketched below)
  • Surface related sutras, commentaries, and historical contexts
  • Link concepts across Vietnamese, English, Chinese, Pali, Sanskrit, Tibetan sources
  • Generate reading paths, study plans, or commentary maps
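
One way to picture the underlying data structure is a concept node that gathers terms across languages and points back into the corpus. The sketch below is illustrative only; the class and field names are not part of the current design.

    # Hypothetical cross-lingual concept node: one concept, its terms in several
    # languages, and the corpus passages in which it occurs.

    from dataclasses import dataclass, field
    from typing import Dict, List


    @dataclass
    class ConceptNode:
        concept_id: str
        terms: Dict[str, str] = field(default_factory=dict)    # language code -> term
        occurrences: List[str] = field(default_factory=list)   # document/section identifiers
        related: List[str] = field(default_factory=list)       # other concept_ids


    interbeing = ConceptNode(
        concept_id="interbeing",
        terms={"en": "interbeing"},           # Vietnamese, Chinese, Pali terms would be added here
        occurrences=["talk-0000-00-00#s1"],   # placeholder identifier scheme
    )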

3.2 Interactive Dharma exploration

Higher-level interfaces could enable:

  • Interactive Q&A grounded in verifiable citations (sketched below)
  • Multilingual guided meditation or sutra explanations
  • Diachronic examination of teachings over time
  • Timeline exploration of Thích Nhất Hạnh’s writings and talks
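
As a small sketch of what “grounded in verifiable citations” could mean structurally, every answer would carry the passages it relies on. The types below are hypothetical.

    # Hypothetical shape of a citation-grounded answer: claims stay checkable
    # because each answer carries its supporting corpus passages.

    from dataclasses import dataclass
    from typing import List


    @dataclass
    class Citation:
        source_id: str   # e.g. a talk, sutra, or book-section identifier
        excerpt: str


    @dataclass
    class GroundedAnswer:
        question: str
        answer: str
        citations: List[Citation]   # an empty list would signal "no grounded answer available"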

This brings the tradition into rich conversation with practitioners and scholars, grounded in accuracy and transparency.


4. Model Training & Corpus-Aware AI Systems

TNH-Scholar could become the foundation for:

4.1 Domain-specific models

  • Multilingual Buddhist embedding models
  • Custom translation models fine-tuned on Plum Village sources
  • Topic-specific summarizers
  • Dialogue systems grounded in verifiable citations

4.2 Corpus-aligned model evaluation loops

With provenance and pattern-driven evaluation in place, it becomes possible to build (see the sketch after this list):

  • Continuous training pipelines
  • Regression tests for translation or summarization accuracy
  • Style- and lineage-aware evaluation criteria
  • Model quality dashboards
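
A minimal sketch of a translation-quality regression check follows, assuming a hypothetical evaluate_translation callable backed by a pattern-based evaluation prompt; the threshold value is illustrative, not a project standard.

    # Hypothetical regression check for translation quality. `evaluate_translation`
    # stands in for a pattern-based evaluation prompt; it is not an existing API.

    QUALITY_THRESHOLD = 0.85  # illustrative value only


    def find_translation_regressions(pairs, evaluate_translation):
        """Return the (source, candidate, score) triples that fall below the threshold."""
        failures = []
        for source, candidate in pairs:
            score = evaluate_translation(source, candidate)   # assumed to return 0.0 - 1.0
            if score < QUALITY_THRESHOLD:
                failures.append((source, candidate, score))
        return failures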

These models would not replace human teachers but would enhance research, translation, and accessibility.

4.3 Training Pipeline Research Direction

Status: Research spike planned (see GitHub Issue #6)

The processed content generated by TNH-Scholar’s AI workflows could serve as training data for model fine-tuning:

Research Questions:

  • How can training pairs be extracted effectively from processed content?
  • Which fine-tuning approaches are most suitable (OpenAI fine-tuning, open-source alternatives)?
  • What are the resource requirements for training?
  • How should training effectiveness be evaluated?
  • What infrastructure is needed?

Potential Approaches:

  • Extract human-reviewed translation pairs for fine-tuning (sketched below)
  • Use sectioning outputs as examples for structure-aware models
  • Create domain-specific evaluation datasets from validated outputs
  • Develop feedback loops between model performance and corpus quality
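
As one concrete form the first approach could take, the sketch below writes human-reviewed translation pairs to JSONL in a chat-style fine-tuning format. The record shape and system prompt are assumptions; the actual format would depend on whichever fine-tuning service the research spike selects.

    # Hypothetical export of reviewed translation pairs to a chat-style JSONL file.
    # Field names follow common provider conventions but are not a committed format.

    import json


    def write_finetune_pairs(pairs, path, system_prompt="Translate this Dharma talk passage into English."):
        """pairs: iterable of (source_text, reviewed_translation) tuples."""
        with open(path, "w", encoding="utf-8") as f:
            for source, translation in pairs:
                record = {
                    "messages": [
                        {"role": "system", "content": system_prompt},
                        {"role": "user", "content": source},
                        {"role": "assistant", "content": translation},
                    ]
                }
                f.write(json.dumps(record, ensure_ascii=False) + "\n")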

Considerations:

  • Balance between prototype phase priorities and long-term research
  • Resource constraints (compute, storage, API costs)
  • Quality assurance for training data
  • Community involvement in evaluation and validation

This research direction aligns with the long-term vision of corpus-aware AI systems while remaining grounded in current prototype capabilities.


5. Agentic Application Development Framework

TNH-Scholar’s architecture (patterns → GenAIService → provenance → structured data) could generalize to:

A modular agentic automation framework for any domain.

Possible future descendant projects:

  • A codebase-maintaining agent system
  • A domain-specific document-processing AI
  • A pattern-driven data transformation engine
  • A provenance-preserving automation fabric

The philosophical and architectural foundations of TNH-Scholar (structured data, documented intent, provenance-first design, and pattern-based prompting) make it an ideal parent project for a broader agentic ecosystem.
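
To suggest how the pattern → GenAIService → provenance chain could generalize beyond this project, here is a domain-agnostic sketch of the core interface. The Protocol definitions and the service.complete call are illustrative assumptions, not the current API.

    # Hypothetical, domain-agnostic sketch of the pattern -> service -> provenance chain.
    # None of these names exist in TNH-Scholar today; they illustrate the abstraction only.

    from typing import Any, Dict, Protocol


    class Pattern(Protocol):
        """A documented, versioned prompt template."""
        name: str

        def render(self, **fields: Any) -> str: ...


    class ProvenanceLog(Protocol):
        def record(self, pattern_name: str, inputs: Dict[str, Any], output: str) -> None: ...


    def transform(pattern: Pattern, service: Any, log: ProvenanceLog, **fields: Any) -> str:
        """Run one pattern-driven transformation and record its provenance."""
        prompt = pattern.render(**fields)
        output = service.complete(prompt)   # `complete` is assumed; substitute the real GenAI call
        log.record(pattern.name, fields, output)
        return output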


6. Ecosystem-Level Integrations

Future possibilities include:

6.1 Advanced UX layers

  • VS Code development agent integration
  • In-browser corpus exploration environments
  • Interactive bilingual study interfaces
  • Multi-panel JVB + text + translation + metadata views
  • Notebook-based agent workflows (e.g., Jupyter, VS Code notebooks)

6.2 Distributed scholarly tools

  • APIs for universities or monasteries
  • Collaborative annotation environments
  • Integrations with digital humanities platforms
  • Cross-repository semantic search
  • Multi-modal study tools for audio/video/text composites

7. Long-Term Vision

Many of these horizons converge into a singular possibility:

A living, evolving, transparent, agent-assisted repository of Plum Village teachings and related Buddhist sources — continually cleaned, translated, aligned, evaluated, and enriched, with humans guiding the meaning and quality.

This is the highest vision of TNH-Scholar:

  • A bridge between ancient wisdom and modern AI practice.
  • A platform that supports, rather than automates, interpretation.
  • A system that grows with care, clarity, and purpose.

This document is intentionally speculative.
As the project matures, some directions will solidify into real designs; others may remain guiding inspirations.
It should be updated when major new horizons emerge or when certain horizons become active workstreams.