SPIKE-02 Execution Context Comparison

SPIKE-02 compares the practical headless execution contexts already exercised during the OA01.x communication experiments.

Experiment ID

SPIKE-02

Which execution context is the most reliable and least noisy for headless Codex use?

Contexts compared:

Primary artifacts:

Best practical context:

Best machine-oriented context available from the live agent environment:

Observed ranking:

Key observations:

the direct user shell produced clean JSON on stdout and empty stderr
the agent-launched contexts all remained workable, but they carried noisy state-db and plugin warnings
launching from the repo root, a subdirectory, or outside the repo changed details, but not the main conclusion
the wrapper improved normalization and capture, but it did not recreate user-shell cleanliness
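
The cleanliness criterion used in these observations (every stdout line parses as JSON, and stderr is empty) can be checked mechanically. The following is a minimal sketch, not the wrapper used in the experiments: `assess_capture` is a hypothetical helper name, and it assumes the captured stdout is line-delimited JSON.

```python
import json


def assess_capture(stdout_text: str, stderr_text: str) -> dict:
    """Summarize how clean one captured run is.

    Hypothetical helper: assumes stdout is line-delimited JSON (one JSON
    value per non-blank line) and treats any non-blank stderr line as noise.
    """
    stdout_lines = [ln for ln in stdout_text.splitlines() if ln.strip()]

    # Count stdout lines that fail to parse as JSON.
    non_json = 0
    for ln in stdout_lines:
        try:
            json.loads(ln)
        except json.JSONDecodeError:
            non_json += 1

    stderr_lines = [ln for ln in stderr_text.splitlines() if ln.strip()]

    return {
        "stdout_lines": len(stdout_lines),
        "stdout_non_json": non_json,
        "stderr_lines": len(stderr_lines),
        # "Clean" here means: some JSON output, nothing unparseable, no stderr.
        "clean": bool(stdout_lines) and non_json == 0 and not stderr_lines,
    }
```

Under this definition, a run with valid JSON on stdout but any warning text on stderr is reported as workable but not clean, which matches the distinction drawn above between the user shell and the agent-launched contexts.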

tmp/codex-user-stderr.log is the clearest evidence that user-shell launch is materially cleaner
tmp/codex-matrix-root-stderr.log, tmp/codex-matrix-subdir-stderr.log, and tmp/codex-matrix-outside-stderr.log show the recurring warning pattern in tool-launched contexts
tmp/codex-script-stdout.jsonl and tmp/codex-script-stderr.log show the wrapper's value as a stable capture surface
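
One way to make the "recurring warning pattern" concrete is to list the stderr lines that appear in every captured log. This is a rough sketch under stated assumptions: `recurring_lines` is a hypothetical helper, it compares lines by exact text after trimming whitespace, and it would miss warnings whose text varies between runs (for example, lines carrying timestamps or process IDs).

```python
from collections import Counter


def recurring_lines(log_texts: dict[str, str]) -> list[str]:
    """Return the non-blank lines that appear in every provided log.

    Hypothetical helper: `log_texts` maps a label (such as a context name)
    to the full text of that context's stderr log.
    """
    seen = Counter()
    for text in log_texts.values():
        # De-duplicate within a log so each log counts a line at most once.
        unique = {ln.strip() for ln in text.splitlines() if ln.strip()}
        seen.update(unique)

    # Keep only lines present in all logs, sorted for stable output.
    return sorted(ln for ln, n in seen.items() if n == len(log_texts))
```

Passing in the contents of the three matrix stderr logs listed above would yield the lines they share; an empty result would suggest the warnings differ textually between contexts and need looser matching.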

Treat direct user-shell launch as the highest-value comparison path when testing supervisory behavior.

For automation from the live agent environment, continue using: