ADR-A09: V1 Simplified Implementation Pathway¶

Outlines the pared-down GenAI Service build that keeps public seams intact but stubs complex internals for faster delivery.

Status: Accepted
Date: 2025-10-05
Author: Aaron Solomon, ChatGPT (GPT‑5)
Linked Docs: GenAI Service — Design Strategy (v0.2), ADR‑A01 (Domain & Object Service)

Purpose¶

This ADR captures the simplified implementation pathway for the first working release of the GenAIService. It preserves the architectural intent and interfaces of the strategy document but scales back complexity to achieve a usable prototype quickly. The goal is to reach a point where the system can “work on itself” via VS Code integration for AI‑assisted development.

Guiding Principles¶

Preserve all seams (router, safety, observability, policy) but stub or simplify internal logic.
Prioritize usability, determinism, and end‑to‑end working flow over completeness.
Keep the public shapes and interfaces stable for future ADRs.

Simplification Overview¶

Subsystem	Simplified for v1	Deferred / Full in Future
Routing	Single provider (`OpenAIAdapter`); model selected by Pattern `model_hint` if available, otherwise falls back to default (`gpt-4o-mini`).	Capability filtering, multi-provider, latency/budget heuristics
Params / Policy	Keep `temperature`, `max_output_tokens`, `output_mode`, and `max_dollars`. Use constants for other fields.	Full taxonomy ADR‑A08 (ownership, precedence, defaults)
Preflight	Use fixed constants: `MAX_INPUT_CHARS`, naive token estimator (`len(text)/3`), `MAX_CONTEXT_TOKENS=128k`, `MAX_DOLLARS`.	Model‑aware tokenizers and price tables
Images	Support basic JPEG/PNG via `Path`; generic handling (no image count limit, simple type validation).	Byte uploads, EXIF stripping, type registry
JSON mode	Prompt for JSON; `json.loads` parse; warn on failure.	Schema validation, structured field coercion
Observability	Single log line with `{cid, pattern_id, model, latency_ms, dollars}`; ULID correlation only.	Metrics registry, retention, structured tracing
Errors	Implement `PolicyError`, `TransportError`, `FormatError`.	Extended error taxonomy, redaction, retries w/ jitter
Retries	One retry on 429/5xx, 250 ms backoff.	Configurable policy, exponential/jitter, rate limiting
PatternCatalog	Only `render()` and `introspect()` returning `{task_kind, default_model_hint, output_mode}`.	Linting, forbidden tokens, metadata schema
ProviderRequest	Minimal fields: text, images, model, temperature, max_output_tokens, output_mode, correlation_id.	System messages, raw response, full transport
Convenience API	Add `openai_generate()` helper to simplify testing and notebook/extension use.	None—core UX improvement

Convenience API definition suggestion:

# ai_service/convenience.py
def openai_generate(pattern_name: str, variables: dict[str, Any], *, temperature: float = 0.2, max_tokens: int = 1024, images: list[Path] | None = None) -> str:
    service = GenAIService.from_env()  # factory to be added later
    req = CompletionRequest(
        pattern=PatternRef(id=pattern_name),
        variables=variables,
        images=[MediaAttachment(path=p, mime="image/jpeg") for p in (images or [])],
        params=CompletionParams(temperature=temperature, max_output_tokens=max_tokens),
    )
    resp = service.generate(req)
    if resp.status != "succeeded" or not resp.result:
        raise RuntimeError(f"Generation failed: {resp.error}")
    return resp.result.text

Definition of Done (Prototype Ready)¶

GenAIService.generate() works end‑to‑end for:
Text → Text (translation, summarization)
Text + Image → Text (vision‑in)
Constants enforce budget, context, and size limits.
OpenAIAdapter functional; AnthropicAdapter skeleton compiles.
JSON mode yields warnings=["json‑parse‑failed"] when needed.
Correlation ID and basic structured log emitted.
openai_generate() convenience API available for direct IDE/extension use and rapid testing.

Development Milestones¶

Skeleton compile pass: domain + provider base + OpenAI adapter.
Pattern render + fingerprint (example pattern).
Generate() text flow + log + usage.
Add image support (Path).
Add JSON mode.
Add budget/context guards.
Add simple_generate().
Add Anthropic skeleton.

Rationale¶

This approach enables a walking, testable skeleton within minimal engineering effort.
It allows for immediate integration with VS Code extensions and the PatternCatalog system, creating an environment where AI agents can progressively enhance their own supporting infrastructure.

Service-level policy only in v1; no per-request overrides.
Fingerprinting used for observability hooks only, no caching yet.
Static price table in config; dynamic pricing registry deferred.

Status & Next Steps¶

Review for approval by TNH Scholar core devs.
Upon acceptance, this ADR governs the immediate implementation sequence for GenAIService v1.
Follow‑ups:
ADR‑A04 (Routing Rules & Guardrails)
ADR‑A08 (Config/Params/Policy taxonomy) complete