ADR-YF02: YouTube Transcript Format Selection

Selects a canonical transcript format for yt-fetch to keep downstream processing deterministic during early releases.

  • Status: Proposed
  • Date: 2025-01-15

Context

The yt-fetch CLI tool needs to download YouTube transcripts/captions. YouTube offers multiple formats:

- VTT (Web Video Text Tracks)
- TTML (Timed Text Markup Language)
- srv1/2/3 (YouTube internal formats)
- json3 (YouTube JSON format)

While yt-dlp offers format conversion capabilities, these are:

- Poorly documented
- Inconsistent in behavior
- Subject to change across versions

Decision

Standardize on VTT format output because:

1. It is a web standard, so it is likely to remain stable
2. It is human-readable and well documented
3. It has wide library support if needed
4. It is already the default format from yt-dlp
5. It is available for both manual and auto-generated captions
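To illustrate reasons 2 and 3, here is a minimal sketch of how a downstream tool might read VTT cues using only the Python standard library. The names `Cue` and `parse_vtt` are hypothetical and not part of yt-fetch or yt-dlp. The parser covers only the common case of timing lines followed by text and ignores cue settings, so treat it as a starting point rather than a complete WebVTT implementation.

```python
import re
from typing import NamedTuple


class Cue(NamedTuple):
    """One caption cue (hypothetical helper type, not a yt-fetch API)."""
    start: str
    end: str
    text: str


# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 align:start".
# The hours component is optional in WebVTT timestamps.
TIMESTAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
TIMING = re.compile(rf"^({TIMESTAMP})\s+-->\s+({TIMESTAMP})")
# Inline markup such as <c>...</c> or <00:00:01.500> word timestamps.
TAG = re.compile(r"<[^>]+>")


def parse_vtt(content: str) -> list[Cue]:
    """Parse WebVTT text into cues; blocks without a timing line are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Blocks (header, NOTE, STYLE, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.match(line)
            if match:
                # Everything after the timing line is the cue payload.
                parts = (TAG.sub("", part).strip() for part in lines[i + 1:])
                text = " ".join(parts).strip()
                cues.append(Cue(match.group(1), match.group(2), text))
                break
    return cues
```

Calling `parse_vtt(path.read_text(encoding="utf-8"))` on a downloaded `.vtt` file would return a list of cues that can be joined into plain text or re-serialized into another format. Auto-generated captions may repeat text across consecutive cues, so a real consumer would likely need a de-duplication step on top of this.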

Implementation approach:

- Use minimal yt-dlp options (writesubtitles, writeautomaticsub, subtitleslangs)
- Accept VTT as default output without trying format conversion
- Let downstream tools handle any needed format conversion

```python
# Example minimal implementation
opts = {
    "writesubtitles": True,     # manual captions
    "writeautomaticsub": True,  # auto-generated captions
    "subtitleslangs": ["en"],
    "skip_download": True,      # subtitles only, no video
}