Prompt System Architecture

Current and planned architecture for the TNH Scholar prompt system, including VS Code integration and PromptCatalog service design.

Overview

The TNH Scholar prompt system provides engineered text inputs (prompts) for AI models, enabling consistent, reproducible, and customizable text processing. This document describes the current implementation and planned enhancements.

Current Status: See ADR-PT03: Current Status & Roadmap for implementation details. Historical prototype notes are summarized in the Historical References section.

Design Goals

  1. Simplicity: Git-based storage, human-readable Jinja2 templates
  2. Transparency: All prompts are plain text files, version-controlled
  3. Versioning: Git lineage provides collaborative editing and reproducibility
  4. Flexibility: Template-based customization with context variables
  5. Discoverability: Structured metadata enables browsing and selection
  6. Reproducibility: Fingerprinting ensures identical inputs produce identical outputs

Current Architecture (V1)

Storage and Versioning

Implementation:

  • Prompts stored as Jinja2 .md files in ~/.config/tnh_scholar/patterns/
  • Git version control for prompt history
  • Simple file-based discovery

Access Pattern:

from tnh_scholar.ai_text_processing.patterns import LocalPatternManager

# Singleton access (prototype phase)
manager = LocalPatternManager()
prompt = manager.get_pattern("default_punctuate")
rendered = prompt.apply_template(context={"language": "en"})

Components

graph TD
    A[CLI Tools] --> B[LocalPatternManager]
    B --> C[PatternManager]
    C --> D[Prompt Files<br/>*.md]
    D --> E[Git Repository]

    style B fill:#f9f,stroke:#333
    style C fill:#bbf,stroke:#333
    style D fill:#bfb,stroke:#333

Key Classes:

  • LocalPatternManager: Singleton wrapper for global access
  • PatternManager: Core prompt loading and rendering
  • Jinja2 templates: Prompt files with variable substitution
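
For illustration, loading and rendering could be as simple as reading the file and rendering it with Jinja2. A minimal sketch, not PatternManager's actual internals, which may differ:

from pathlib import Path

from jinja2 import Template

PATTERN_DIR = Path.home() / ".config" / "tnh_scholar" / "patterns"

def render_pattern(name: str, context: dict) -> str:
    """Read a pattern file and substitute context variables."""
    source = (PATTERN_DIR / f"{name}.md").read_text(encoding="utf-8")
    return Template(source).render(**context)

# e.g. render_pattern("default_punctuate", {"language": "en"})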

Limitations

  • No structured metadata: Limited discoverability
  • Singleton access: Difficult to test and incompatible with dependency injection
  • No fingerprinting: Can't verify reproducibility
  • Limited validation: No enforcement of required variables
  • No caching: Repeated renders re-parse templates

Planned Architecture (V2)

PromptCatalog Service

Following the Object-Service Design pattern:

graph TD
    A[Application Layer] --> B[PromptCatalog Protocol]
    B --> C[GitPromptCatalog]
    C --> D[PromptRepository]
    D --> E[PromptLoader]
    D --> F[PromptCache]
    E --> G[Prompt Files]
    G --> H[Git Storage]

    style B fill:#bbf,stroke:#333
    style C fill:#bfb,stroke:#333
    style D fill:#fbb,stroke:#333

Service Protocol:

from typing import Any, Optional, Protocol

# Domain models such as TaskType, PromptMetadata, and RenderedPrompt are
# defined below; Prompt and PromptValidation live elsewhere in the package.

class PromptCatalog(Protocol):
    """Protocol for prompt catalog implementations."""

    def get_prompt(self, name: str, version: Optional[str] = None) -> Prompt:
        """Retrieve prompt by name and optional version."""
        ...

    def list_prompts(
        self,
        task_type: Optional[TaskType] = None,
        tags: Optional[list[str]] = None
    ) -> list[PromptMetadata]:
        """List available prompts with optional filtering."""
        ...

    def render_prompt(
        self,
        name: str,
        context: dict[str, Any],
        version: Optional[str] = None
    ) -> RenderedPrompt:
        """Render prompt with context and return fingerprinted result."""
        ...

    def validate_prompt(self, name: str) -> PromptValidation:
        """Validate prompt template and metadata."""
        ...
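
Because consumers type against the protocol rather than a concrete class, any structurally compatible object satisfies it; no inheritance is needed. A hypothetical sketch (the helper name is illustrative):

def translate_text(catalog: PromptCatalog, text: str) -> str:
    """Depends only on the protocol; production injects a GitPromptCatalog,
    tests can inject any stub with a matching render_prompt signature."""
    rendered = catalog.render_prompt(
        "default_translate",
        context={"text": text, "target_language": "en", "source_language": "vi"},
    )
    return rendered.content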

Prompt Metadata Model

from datetime import datetime
from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel, Field

class TaskType(str, Enum):
    """Types of processing tasks."""
    PUNCTUATE = "punctuate"
    TRANSLATE = "translate"
    SECTION = "section"
    CUSTOM = "custom"

class PromptMetadata(BaseModel):
    """Metadata for a prompt template."""

    name: str = Field(..., description="Unique prompt identifier")
    version: str = Field(..., description="Semantic version (e.g., '1.0.0')")
    task_type: TaskType = Field(..., description="Type of processing task")
    description: str = Field(..., description="Human-readable description")

    # Variable requirements
    required_variables: list[str] = Field(default_factory=list)
    optional_variables: dict[str, Any] = Field(default_factory=dict)

    # Model constraints
    recommended_model: Optional[str] = None
    min_context_window: Optional[int] = None
    max_tokens: Optional[int] = None

    # Organizational
    tags: list[str] = Field(default_factory=list)
    author: Optional[str] = None
    created: datetime
    modified: datetime

    # Git provenance
    git_commit: Optional[str] = None
    git_branch: Optional[str] = None
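
The metadata is intended to live in YAML frontmatter on each prompt file (see the authoring workflow below). A hedged sketch of parsing it into the model, assuming a ----delimited frontmatter block:

from pathlib import Path

import yaml  # PyYAML

def load_prompt_file(path: Path) -> tuple[PromptMetadata, str]:
    """Split a prompt file into (metadata, template body).

    Pydantic validation fails fast if required fields are missing.
    """
    text = path.read_text(encoding="utf-8")
    _, frontmatter, body = text.split("---", 2)
    metadata = PromptMetadata(**yaml.safe_load(frontmatter))
    return metadata, body.strip()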

Prompt Fingerprinting

from dataclasses import dataclass
from datetime import datetime
from hashlib import sha256
from typing import Optional

@dataclass(frozen=True)
class PromptFingerprint:
    """Immutable fingerprint for rendered prompt."""

    template_hash: str  # Hash of template content
    metadata_hash: str  # Hash of metadata
    context_hash: str   # Hash of context variables
    git_commit: Optional[str]  # Git commit of template

    @property
    def full_hash(self) -> str:
        """Combined hash of the template, metadata, and context hashes.

        The git commit is provenance metadata and is not part of the hash.
        """
        combined = f"{self.template_hash}:{self.metadata_hash}:{self.context_hash}"
        return sha256(combined.encode()).hexdigest()

@dataclass(frozen=True)
class RenderedPrompt:
    """Rendered prompt with provenance."""

    content: str
    metadata: PromptMetadata
    fingerprint: PromptFingerprint
    rendered_at: datetime
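
How each component hash is computed is an implementation choice; one reasonable approach is to hash the raw template text and a canonical JSON serialization of the context, so that dict key order cannot change the fingerprint. A sketch (make_fingerprint is a hypothetical helper):

import json

def _hash_text(text: str) -> str:
    return sha256(text.encode("utf-8")).hexdigest()

def make_fingerprint(
    template: str,
    metadata_yaml: str,
    context: dict,
    git_commit: Optional[str] = None,
) -> PromptFingerprint:
    # sort_keys canonicalizes the serialization: the same context always
    # hashes identically regardless of insertion order.
    canonical = json.dumps(context, sort_keys=True, ensure_ascii=False)
    return PromptFingerprint(
        template_hash=_hash_text(template),
        metadata_hash=_hash_text(metadata_yaml),
        context_hash=_hash_text(canonical),
        git_commit=git_commit,
    )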

Use Case: Caching AI responses by fingerprint:

# Check cache before expensive AI call
rendered = catalog.render_prompt("translate", context)
cached_response = response_cache.get(rendered.fingerprint.full_hash)

if cached_response:
    return cached_response

# Make AI call with the rendered content and cache by fingerprint
response = ai_service.generate(rendered.content)
response_cache.set(rendered.fingerprint.full_hash, response)

VS Code Integration Requirements

See ADR-VSC01 and ADR-VSC02 (pending) for full context.

Interactive Prompt Selection

Command Palette Integration:

// VS Code command: "TNH Scholar: Select Prompt"
const prompts = await promptCatalog.listPrompts({
  taskType: "translate",
  tags: ["dharma", "vietnamese"]
});

const selected = await vscode.window.showQuickPick(
  prompts.map(p => ({
    label: p.name,
    description: p.description,
    detail: `v${p.version} | ${p.tags.join(", ")}`
  }))
);

Prompt Preview and Editing

Requirements:

  • Real-time rendering with sample context
  • Syntax highlighting for Jinja2 templates
  • Variable validation and autocomplete
  • Git diff view for prompt versions

Preview Workflow:

  1. User selects prompt in VS Code
  2. Extension loads prompt metadata
  3. User provides context variables (or uses defaults; see the sketch after this list)
  4. Extension renders preview with current values
  5. User can edit template and see live updates
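
Step 3's "or uses defaults" can come straight from the metadata: the optional_variables mapping doubles as sample values. A sketch, assuming the V2 metadata model above:

def preview_context(meta: PromptMetadata, user_values: dict) -> dict:
    """Merge user-supplied values over metadata defaults for a live preview."""
    context = dict(meta.optional_variables)   # defaults first
    context.update(user_values)               # user overrides win
    for var in meta.required_variables:
        context.setdefault(var, f"<{var}>")   # visible placeholder, not an error
    return context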

Prompt Authoring Workflow

Create New Prompt:

  1. User runs command: "TNH Scholar: Create Prompt"
  2. VS Code prompts for metadata (name, task type, description)
  3. Extension generates scaffold with metadata frontmatter:
---
name: custom_summarize
version: 1.0.0
task_type: custom
description: Summarize dharma talk with key insights
required_variables: [text, max_words]
optional_variables: {language: en, style: concise}
tags: [summarize, dharma]
---

# Summarize Dharma Talk

Please summarize the following dharma talk in {{max_words}} words or less.

**Language**: {{language}}
**Style**: {{style}}

## Text to Summarize

{{text}}

## Summary Instructions

Focus on:
- Core dharma teachings
- Practical applications
- Key insights for practitioners
  4. User edits template
  5. Extension validates on save (required variables, Jinja2 syntax; see the sketch below)
  6. User commits to Git
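
Step 5's validation can be done statically with Jinja2's parser: jinja2.meta.find_undeclared_variables reports every variable a template references, which can then be checked against the declared metadata. A minimal sketch (validate_prompt_file is a hypothetical helper):

from jinja2 import Environment, TemplateSyntaxError, meta

def validate_prompt_file(body: str, metadata: PromptMetadata) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    env = Environment()
    try:
        ast = env.parse(body)
    except TemplateSyntaxError as exc:
        return [f"Jinja2 syntax error on line {exc.lineno}: {exc.message}"]

    used = meta.find_undeclared_variables(ast)
    declared = set(metadata.required_variables) | set(metadata.optional_variables)
    errors = [f"Undeclared variable in template: {v}" for v in sorted(used - declared)]
    errors += [
        f"Required variable never used: {v}"
        for v in sorted(set(metadata.required_variables) - used)
    ]
    return errors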

Migration Path

Phase 1: Enhanced Metadata (Current)

  • ✅ Add YAML frontmatter to existing prompts
  • ✅ Document required/optional variables
  • ✅ Add task_type and tags

Phase 2: PromptCatalog Service (Q1 2026)

  • Implement PromptCatalog protocol
  • Create GitPromptCatalog implementation
  • Add fingerprinting and caching
  • Maintain backwards compatibility with LocalPatternManager

Phase 3: VS Code Integration (Q1-Q2 2026)

  • Implement command palette integration
  • Add prompt preview/editing UI
  • Create authoring workflow
  • Enable git-based versioning UI

Phase 4: Production Hardening (Q2 2026)

  • Replace LocalPatternManager singleton with dependency injection
  • Full prompt validation and error reporting
  • Usage analytics and cost tracking
  • Performance optimization

Implementation Examples

Current Usage (V1)

# CLI tool using singleton
from tnh_scholar.ai_text_processing.patterns import LocalPatternManager

manager = LocalPatternManager()
prompt = manager.get_pattern("default_translate")
rendered = prompt.apply_template(context={
    "text": input_text,
    "target_language": "en",
    "source_language": "vi"
})

# Pass to AI service
result = ai_service.process(rendered)

Planned Usage (V2)

# Service with dependency injection
from tnh_scholar.gen_ai_service import GenAIService
from tnh_scholar.prompt_system import GitPromptCatalog

# Initialize with config
catalog = GitPromptCatalog(config=prompt_config)
ai_service = GenAIService(config=ai_config)

# Render with fingerprinting
rendered = catalog.render_prompt(
    name="default_translate",
    context={
        "text": input_text,
        "target_language": "en",
        "source_language": "vi"
    }
)

# Check cache
cached = response_cache.get(rendered.fingerprint.full_hash)
if cached:
    return cached

# Process with AI
result = ai_service.generate(rendered.content)

# Cache by fingerprint
response_cache.set(rendered.fingerprint.full_hash, result)

Security and Safety

Prompt Injection Prevention

  • Variable validation: Only allow expected variables
  • Template sandboxing: Restrict Jinja2 capabilities (no code execution; see the sketch below)
  • Input sanitization: Escape special characters in context values
  • Audit logging: Track prompt modifications and usage
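
Jinja2 ships a sandboxed environment for exactly this purpose; combined with StrictUndefined, rendering fails loudly on unexpected attribute access or missing variables instead of silently degrading. A sketch:

from jinja2 import StrictUndefined
from jinja2.sandbox import SandboxedEnvironment

# SandboxedEnvironment blocks unsafe attribute/method access at render time;
# StrictUndefined turns any missing context variable into a hard error.
env = SandboxedEnvironment(undefined=StrictUndefined)

def safe_render(template_source: str, context: dict) -> str:
    return env.from_string(template_source).render(**context)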

Secrets Management

  • Never store secrets in prompts
  • Use environment variables for API keys
  • Git-ignore sensitive context files
  • Prompt templates are public artifacts

Performance Considerations

Caching Strategy

  • Template cache: Parsed Jinja2 templates (in-memory)
  • Metadata cache: Prompt metadata (in-memory, TTL: 5 min)
  • Response cache: AI responses by fingerprint (Redis/disk, TTL: 24 hours)
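
A small in-memory TTL cache covers the template and metadata tiers; the response tier would expose the same get/set interface but back onto Redis or disk. An illustrative sketch:

import time
from typing import Any, Optional

class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]   # lazy eviction on read
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

metadata_cache = TTLCache(ttl_seconds=5 * 60)   # 5 min, per the list above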

Lazy Loading

  • Load prompts on demand, not at startup
  • Cache only frequently-used prompts
  • Invalidate cache on Git updates

Testing Strategy

Unit Tests

  • Prompt rendering with various contexts
  • Metadata validation
  • Fingerprint uniqueness and stability
  • Variable substitution edge cases
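
The fingerprint tests reduce to two properties: identical inputs hash identically (regardless of dict key order), and any change to the context changes the hash. A pytest sketch using the hypothetical make_fingerprint helper from earlier:

def test_fingerprint_is_stable_across_key_order():
    ctx = {"language": "en", "text": "hello"}
    fp1 = make_fingerprint("tmpl", "meta", ctx)
    fp2 = make_fingerprint("tmpl", "meta", dict(reversed(list(ctx.items()))))
    assert fp1.full_hash == fp2.full_hash  # key order must not matter

def test_fingerprint_changes_with_context():
    fp1 = make_fingerprint("tmpl", "meta", {"text": "a"})
    fp2 = make_fingerprint("tmpl", "meta", {"text": "b"})
    assert fp1.full_hash != fp2.full_hash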

Integration Tests

  • Git repository operations
  • Cache invalidation
  • VS Code extension integration
  • End-to-end authoring workflow

Historical References

Superseded design documents (for maintainers/contributors)

Note: These documents are archived and excluded from the published documentation. They provide historical context for the current design.

Earlier Design Explorations

  • [Core Pattern Architecture](/architecture/prompt-system/archive/core-pattern-architecture.md) (2024-11-09)
    Status: Replaced by this architecture document. Uses legacy "Pattern" terminology before standardization to "Prompt" nomenclature.